Abstract
Objective:
The current international trend is to create large datasets with existing data and/or deposit newly collected data into repositories accessible to the scientific community. These practices lead to more efficient data sharing, better detection of small effects, modelling of confounders, establishment of sample generalizability and identification of differences between any given disorders. In Australia, to facilitate such data-sharing and collaborative opportunities, the Neurobiology in Youth Mental Health Partnership was created. This initiative brings together specialised researchers from around Australia to work towards a better understanding of the cross-diagnostic neurobiology of youth mental health and the translation of this knowledge into clinical practice. One of the mandates of the partnership was to develop a protocol for harmonised prospective collection of data across research centres in the field of youth mental health in order to create large datasets.
Methods:
Four key research modalities were identified: clinical assessments, brain imaging, neurocognitive assessment and collection of blood samples. This paper presents the consensus set of assessments/data collection that has been selected by experts in each domain.
Conclusion:
The use of this core set of data will facilitate the pooling of psychopathological and neurobiological data into large datasets allowing researchers to tackle important questions requiring very large numbers. The aspiration of this transdiagnostic approach is a better understanding of the mechanisms underlying mental illnesses.
Introduction
About 75% of mental health conditions emerge before age 25 (Kessler et al., 2007), and just under one in four young Australians meet criteria for a probable serious mental illness (Mission Australia and the Black Dog Institute, 2017). Despite a range of privately and publicly funded mental health services as well as evidence-based therapies being available for young people, many continue to suffer because they do not easily respond to social, psychological and biological interventions (Bhui, 2017).
It is widely recognised that psychiatry has not yet achieved detailed, satisfactory models of how neurobiological, psychological and social factors interact and drive the onset of psychiatric disorder. Our understanding of the neurobiological mechanisms underlying symptoms of mental illnesses and associated neurocognitive impairments has largely been obtained from the study of adults with long-established and differentiated syndromes. Adult cases tend to present with long histories of illness and treatment that can confound interpretation. Furthermore, studies in adults, by definition, do not allow investigation of normative development and deviations in neurodevelopmental trajectories that contribute to mental disorders, despite this clearly being of great importance given that half of all mental illnesses begin by the age of 14 and three-quarters by mid-20s according to the World Health Organization.
In order to provide the most targeted and successful treatment to young people, a better understanding of the neurobiological changes occurring during the onset and early course of mental illnesses is needed. Key issues that have contributed to our limited understanding of the exact neurobiological underpinnings of mental health are underpowered studies and the lack of reproducible findings. Underpowered studies have led to inconsistent and poorly replicated neurobiological findings in psychiatry. To circumvent these issues, large datasets are required. Creating large datasets leads to (1) better detection of small effects, (2) modelling of confounders, (3) establishment of sample generalizability and (4) identification of differences between any given disorders.
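To illustrate why the detection of small effects requires large samples, the following is a minimal sketch of a textbook sample-size calculation for a two-group comparison of means. It uses the normal approximation, and the effect sizes, significance level and power are illustrative assumptions rather than values taken from the Partnership's protocol.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size_d is the standardised mean difference (Cohen's d).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


if __name__ == "__main__":
    # Illustrative effect sizes only (conventionally "large", "medium", "small").
    for d in (0.8, 0.5, 0.2):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under these assumptions, the approximation gives roughly 25, 63 and 393 participants per group for d = 0.8, 0.5 and 0.2, respectively. An exact t-test calculation yields slightly larger numbers, but the pattern is the same: halving the effect size roughly quadruples the required sample.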
Globally, researchers tend to work in silos, focusing on a theoretical construct of interest for a given categorical diagnosis or looking at a specific psychological, functional or biological outcome. As a consequence, the data they collect cannot always be shared because different tools or approaches have been used. Site-specific data often need imputation, interpolation, etc. in order to be pooled with data from another site, inevitably adding to the variance in the data that cannot be attributed to the studied disorder. Other difficulties arising from the pooling of existing data are the ethical and computational issues with regard to data sharing, as well as science and data-sharing policies that may vary from one research institute to another. This paper offers a solution to these issues by suggesting an approach to harmonise the collection of data across research centres in Australia.
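As a simple illustration of why a shared core set eases pooling, the sketch below merges records from several sites only when they contain every variable of an agreed core set. The variable names, site labels and record layout are hypothetical and are not part of the recommended batteries.

```python
from typing import Dict, List, Tuple

# Hypothetical core variables, for illustration only.
CORE_VARIABLES = frozenset(
    {"participant_id", "age_years", "depression_score", "anxiety_score"}
)

Record = Dict[str, object]


def pool_sites(
    sites: Dict[str, List[Record]],
) -> Tuple[List[Record], List[Tuple[str, object, List[str]]]]:
    """Pool records across sites, keeping only the core variables.

    Returns the pooled records and a list of rejected records, each given as
    (site, participant_id, missing core variables).
    """
    pooled: List[Record] = []
    rejected: List[Tuple[str, object, List[str]]] = []
    for site, records in sites.items():
        for record in records:
            missing = sorted(CORE_VARIABLES - set(record))
            if missing:
                # Without harmonised collection, such records would need
                # imputation or would have to be dropped.
                rejected.append((site, record.get("participant_id"), missing))
            else:
                row: Record = {"site": site}
                row.update({name: record[name] for name in sorted(CORE_VARIABLES)})
                pooled.append(row)
    return pooled, rejected
```

When every site collects the same core set prospectively, the rejected list stays empty and no site-specific imputation is required before pooling.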
Another matter that is addressed by the current proposal is the categorisation of patients based on their clinical presentation, e.g. using the Diagnostic and Statistical Manual of Mental Disorders (DSM) and the International Classification of Diseases (ICD). These categorical systems imply that a patient can be diagnosed with a disorder that is distinct and independent from other disorders, assumptions that are not justified based on clinical experience and empirical research. To better understand the neurobiological changes occurring as a young person’s mental health deteriorates, a transdiagnostic approach is more appropriate. Specifically, in order to better understand the nature of mental health and illness, the National Institute of Mental Health (NIMH) suggests the use of a Research Domain Criteria (RDoC) approach, which integrates many levels of information (from genomics and circuits to behaviour and self-reports).
This paper presents an initiative and methods to harmonise the collection of data across research centres in the field of youth mental health in Australia using an RDoC approach. The proposed assessments and measures were selected and recommended by experts in the field of youth mental health from across Australia in the following key domains: clinical assessment, neurocognitive assessment, neuroimaging data collection and biospecimens collection. Using the same protocol to collect data improves data sharing between collaborators and will enable participating researchers to answer questions that cannot be tackled currently due to the lack of statistical power.
Methodology
A consortium of researchers and clinicians working in the field of youth mental health in Australia, the Neurobiology in Youth Mental Health Partnership, was created. Partners were selected based on their expertise as researchers or clinicians in the field of youth mental health. Twenty-six individuals including professors, psychiatrists, senior and junior researchers, as well as research support staff participated in the selection of measures and assessments for the harmonisation of prospective data collection. They represented 13 independent institutions (Brisbane Institute for Molecular Bioscience; Brisbane Metro North Mental Health Service; Deakin University; Melbourne Neuropsychiatry Centre [The University of Melbourne]; Monash University; Orygen, The National Centre of Excellence in Youth Mental Health [The University of Melbourne]; Orygen Youth Health; QIMR Berghofer; Sunshine Coast Mind and Neuroscience Thompson Institute [The University of Sunshine Coast]; Telethon Kids Institute; The University of Melbourne; The University of Sunshine Coast; Brain and Mind Centre [The University of Sydney]) across Australia (New South Wales, Queensland, Victoria, Western Australia). See Supplementary Table 1 for a full list of the partners involved.
Their task was to make recommendations about the type of data and corresponding measures that should be collected in young people with mental health issues participating in research, with the aim of producing large datasets and attempting to answer important questions in relation to the neurobiology of illnesses.
Panels of experts were formed to work on four specific areas: clinical presentation, neurocognition, neuroimaging and biospecimens. Panel members met face-to-face, via teleconference or through emails over a period of 2 years, in order to reach consensus within each domain. The technique used to reach consensus was a variant of the Nominal Group Technique, whereby participants were asked to answer the following question: ‘If you had 30 minutes with a research participant, what assessments would you conduct?’ Experts were invited to actively participate in idea-generating activities, basing their propositions on the literature. Every suggested assessment was carefully considered by the panel members until everyone agreed to include it or not based on the relevance and importance of the assessment, and on the time restriction. Assessments also needed to be applicable across diagnoses and be relevant to young people aged 12 to 25 years. Wherever possible, cost-effectiveness had to be considered. The battery was to provide a comprehensive neurobiological, psychological and neurocognitive snapshot of a given individual at a given point in time regardless of their clinical presentation. The experts were asked to choose measurements accessible in the public domain whenever possible. However, it was expected that costs would be associated with the collection of neuroimaging and some neurocognitive data, as well as for the collection, storage and processing of biospecimens. Effort was made to choose assessments that have been validated in young people, while keeping in mind that, where possible, measures should ideally leverage recent technological advances in each of the relevant domains.
In order to keep the primary assessment battery short, recommendations were made to collect information about equally important domains in secondary sets of assessments or measures. This will enable harmonisation of data collection in these domains too.
Clinical assessment
The primary clinical battery addresses the domains most commonly assessed across diagnoses. The broad symptom domains used in the primary battery were based on the Psychiatric Domain of the Phenotypes and Exposures project (PhenX), funded by the National Institutes of Health’s (NIH) National Human Genome Research Institute. The domains were depression, mania, suicidal ideation, anxiety, psychological distress, psychosis, personality disorders, quality of life, sleep and substance use. By assessing the mental state of young people in all those domains, researchers can obtain a high-quality snapshot of the clinical picture of the participants.
The recommended clinical assessments were conceptualised to complement interviewer-administered assessments for determining past and current clinical diagnoses (e.g. Structured Clinical Interview for DSM-5 [SCID-5], Comprehensive Assessment of At-Risk Mental States [CAARMS], Social and Occupational Functioning Assessment Scale [SOFAS], Clinical Global Impressions [CGI]). It is also recommended that clinical phenotyping be enhanced by multiple sources (e.g. relatives, clinical file audit). The Clinical Assessment Panel used the following criteria for recommending specific measures for each domain: valid measures that adequately cover the chosen domains, appropriate for youth populations and brief. The Clinical Assessment Panel also made the decision to recommend only assessments which are available in the public domain (i.e. no financial/licensing costs involved in usage) in order to ensure ease of access.
Table 1 presents the assessments that constitute the primary battery of clinical assessments, which should be administered in full. The total administration time for the primary clinical assessment battery varies between 36 and 58 minutes, which is longer than the targeted 30 minutes. However, it was agreed by the experts that the self-report assessments can be taken home by the research participants and can be filled out in their own time. Self-report assessments are practical, cost-efficient and enable the collection of larger amounts of data. However, the downside is that the data collected may not be as accurate as data collected in face-to-face assessments due to the ‘social desirability bias’ and the protection of privacy. Reduced reliability can be minimised by reassuring the participant that their privacy is protected. Another issue with the use of self-reports is that participants may not understand a given question or the interpretation of a given statement may vary between participants. To minimise this issue, the panel members have chosen assessments that are well validated in the targeted population.
Table 1. Primary clinical assessment battery.
YMRS: Young Mania Rating Scale; DSM: Diagnostic and Statistical Manual of Mental Disorders.
In order to keep the primary battery as targeted and efficient as possible, it was developed to include prevalent disorders or broader clinical syndromes. More specialised areas can be addressed in the secondary battery of assessments. The assessments included in this secondary battery are presented in Table 2.
Table 2. Secondary clinical assessment battery.
OCD: obsessive compulsive disorder; PTSD: post-traumatic stress disorder; UCLA: University of California, Los Angeles; ADHD: attention-deficit/hyperactivity disorder; DSM: Diagnostic and Statistical Manual of Mental Disorders.
The Clinical Assessment Panel has set a long-term vision of developing and validating a new, transdiagnostic self-report tool that would have the ability to map on to the clinical stages of psychiatric illnesses. The panel’s vision is for this new measure to take a broad youth-specific assessment approach, furthering the model adopted by the Brief Psychiatric Rating Scale (BPRS). This would align with recent conceptualisations of the structure of psychopathology, based on analysis of longitudinal birth cohort data (Caspi et al., 2014). The Clinical Assessment Panel views the establishment of a nationally adopted minimum dataset as a significant step towards this.
Neurocognitive assessment
Neurocognitive impairments are common and associated with poorer functional outcomes for individuals with a mental illness (Lee et al., 2017). Comprehensive meta-analyses indicate that clinically and statistically significant neurocognitive impairments (e.g. in attention, learning, memory, processing speed and executive function) are evident early in the course of mental illnesses, including psychosis (Mesholam-Gately et al., 2009), bipolar disorder (Lee et al., 2014a), eating disorders (Zakzanis et al., 2010) and depression (Goodall et al., 2018). However, there is significant heterogeneity in neurocognitive functioning between and within individuals (regardless of the diagnosis) (Lee et al., 2014b), hence the need to move towards coordinated transdiagnostic approaches to research into neurocognition in mental illness.
The aim of the neurocognition battery is to assess the domains of neurocognition that are most relevant transdiagnostically in youth aged 12 to 25 years, including processing speed, attention, working memory, verbal learning and memory, executive functioning and social cognition (Cotter et al., 2018; Lee et al., 2015). A number of criteria were considered in designing the battery including (1) brevity, (2) ease of administration and scoring by non-specialists, (3) applicability to ages 12 to 25 years, (4) applicability to various cultures, (5) preference for computerised over paper-and-pencil tests, (6) availability of alternate forms for repeat testing and (7) sensitivity to change. A number of alternatives were considered by the panel including Cogstate, CANTAB, IntegNeuro, WebNeuro, WebCNP, NIH Toolbox and individual paper-and-pencil tasks. Several of the tests in the final chosen battery are from Cogstate because these tasks met all of the above criteria. Furthermore, support is provided within Australia.
The chosen assessments (Table 3) have broad applicability and have been used in populations of diverse cultural backgrounds (including Aboriginal and Torres Strait Islander peoples) (Cairney et al., 2007; Pearce et al., 2014), ages, and clinical presentations (Maruff et al., 2009). Apart from the Rey Auditory Verbal Learning Test (RAVLT), they do not rely heavily on language and verbal ability. The tasks have been shown to be sensitive to change, which is important in clinical trials and longitudinal studies (Collie et al., 2007). They have also been shown to be sensitive to subtle neurocognitive impairment, which is of high importance in young people with emerging or subthreshold mental disorder (Patel et al., 2017).
Table 3. Primary neurocognitive assessment battery.
The primary cognitive battery was designed to have utility for predicting or informing a range of clinical outcomes, including psychopathology, psychosocial functioning, severity of illness, state, trait and ‘scar’/progressive markers of illness and treatment response. However, it should be noted that the Cogstate assessments have associated costs.
Table 3 presents the primary neurocognitive tasks to be administered in full. As with the clinical assessments, to ensure the primary battery remains as targeted and efficient as possible, additional neurocognitive domains can be assessed using the secondary battery.
The administration of the secondary battery is supplementary and optional, and it can also be administered in part. The assessments included in the secondary battery are presented in Table 4.
Table 4. Secondary neurocognitive assessment battery.
Neuroimaging data collection
The Neuroimaging panel identified two important aspects for the decision-making process: (1) the extent of sequence harmonisation across scan sites, and (2) the selection of imaging modalities. Several models for scan sequence harmonisation were identified, ranging from full harmonisation to a very permissive model. The ENIGMA consortium is a good example of the permissive model: it uses the same imaging modalities, exerts no control over scan protocols and applies a common pipeline for quality assurance (QA), data processing and statistical analysis. It involves data sharing at the level of summary statistics or derived imaging measures. The restrictive model, or ‘Simple up’, implies identical scan protocols in the same imaging domains and the same common pipeline for QA. The restrictive model mandates a simple common set of scan sequences and aims for complete data sharing with the purpose of building a central database. Finally, the aspirational model, or ‘Smart down’, uses the same imaging modalities, with a set of scan protocols optimised for high-end scanner platforms but designed to be broadly compatible with lower-end scanner platforms.
It was agreed by the neuroimaging panel that although the permissive model is very useful for the purpose of retrospective pooling of already collected data, we should aim for more extensive harmonisation than the permissive model. However, full harmonisation and standardisation of scan sequences across all sites (i.e. restrictive model) is likely not realistic, since this would necessarily restrict participation to sites where a particular scanner vendor or platform was available. It was therefore agreed that the aspirational model would be used, i.e. a set of sequences has been defined with the protocols optimised for high-end platforms, with a second set of similar sequences that can be implemented should the chosen vendor/operating system not support the high-end sequences. The requirements for high-end platforms are (1) high-density head coils (32-channel or 64-channel head and neck), (2) fast gradient switching and (3) slice acceleration (for further details, see below). Sites with access to scanners that cannot support these capabilities should aim to acquire data at a similar spatial resolution while compromising in other aspects, e.g. temporal resolution for functional magnetic resonance imaging (fMRI), number of encoding directions and/or maximum b-value for diffusion MRI.
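The choice between the two sets of sequences under the aspirational model can be sketched as a simple capability check. The class, field and protocol names below are hypothetical, and treating any coil with at least 32 channels as ‘high-density’ is an assumption made for illustration; the actual acquisition parameters are those described in the supplemental material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scanner:
    """Hypothetical description of a site's scanner capabilities."""
    head_coil_channels: int
    fast_gradient_switching: bool
    slice_acceleration: bool


def supports_high_end(scanner: Scanner) -> bool:
    """Check the three stated requirements for a high-end platform."""
    return (
        scanner.head_coil_channels >= 32  # assumed threshold for 'high-density'
        and scanner.fast_gradient_switching
        and scanner.slice_acceleration
    )


def select_protocol(scanner: Scanner) -> str:
    """Return the label of the sequence set a site would run.

    'high_end' stands for the sequences optimised for high-end platforms;
    'compatible' stands for the second set, which keeps a similar spatial
    resolution and compromises elsewhere (e.g. temporal resolution).
    """
    return "high_end" if supports_high_end(scanner) else "compatible"
```

For example, a scanner described as Scanner(64, True, True) would be assigned the high-end set, whereas one lacking slice acceleration would fall back to the compatible set.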
We chose to include state-of-the-art (or ‘high-end’) sequences to optimise the balance between high image quality and limited scanning time, and to be consistent and further harmonise with other major international big data neuroimaging initiatives such as UK Biobank, the Human Connectome Project (HCP) and the Adolescent Brain Cognitive Development Study. The proposed brain imaging data acquisition includes six modalities, covering structural, diffusion and resting-state functional imaging.
The advantages of harmonising diffusion-weighted imaging and resting-state fMRI are the consistency of structural and functional brain networks across different subject groups, analysis methods and types of scanning protocols; in addition, both modalities place minimal demands on patient compliance and have short acquisition times. Sufficiently high spatial resolution of structural MRI enables the user to choose between traditional volume-based analyses and HCP-like surface-based analyses. Compulsory modalities are presented in Table 5. The current paper presents an overview of selected modalities, but a more detailed description of the acquisition parameters and scan protocols can be found in the supplemental material.
Table 5. Primary set of neuroimaging modalities.
MRI: magnetic resonance imaging.
To keep the total scan time of the primary imaging assessments within 30 minutes, it was decided that task-based fMRI would only be included in the secondary set of modalities. The choice of a specific task included in the secondary set of modalities will depend on the personal interest of researchers and/or on the aims of individually funded projects. However, to promote data sharing of task-related fMRI data, the fMRI tasks that are currently being used by the different members of the neuroimaging panel can be found on the Neurobiology Partnership website. The secondary set of modalities also includes T2-weighted structural imaging, a structural technique with contrast dominated by signal decay from interactions between water molecules (T2 relaxation times). T2 images depict alterations to tissue compartments typically associated with pathology (e.g. white matter lesions). The final modality is susceptibility-weighted imaging, a structural technique that is sensitive to magnetised tissue constituents (magnetic susceptibility). Data from one scan (including phase and magnitude images from two echo times) can be processed in multiple ways to reflect venous vasculature, microbleeds or aspects of microstructure (e.g. iron, calcium and myelin).
Collection of biospecimens
Peripheral biomarkers, i.e. biomarkers that can be quantified from peripheral tissue (such as plasma, blood cells or hair), are promising indicators of the pathophysiological processes that underlie mental disorders (McGorry et al., 2013). As such, peripheral biomarkers may serve multiple purposes, including indicating risk for onset or progression to a more advanced stage, delineating diagnostic entities or informing treatment choice.
Several meta-analyses of cross-sectional studies support alterations in peripheral biomarkers in schizophrenia/psychosis (Berger et al., 2016; Goldsmith et al., 2016) and in mood and anxiety disorders (Berk, 2015). However, it is less clear whether these group-level differences extend to earlier illness stages and how the significant within-group heterogeneity can be accounted for; longitudinal studies are also needed to delineate the role of biomarkers over time. The harmonisation of biospecimen collection has the potential to address these questions, as it allows the detection of smaller effects, better control of confounders and cross-validation of findings.
The primary purpose of the harmonisation of the biospecimen collection is to agree on a set of specimens that are supported as candidate biomarkers by the literature (McGorry et al., 2013), can be collected in ambulatory settings and are of relevance to a broad spectrum of disorders. As such, the selected biomarkers consist of commonly reported and novel biomarkers, and those that have shown prospective associations with illness trajectories. The selection criteria used by the biospecimens panel were as follows: established association with risk and clinical outcomes, relevance to the pathophysiology of mental disorders (in particular considering a transdiagnostic approach), minimising the burden to the participant, and limiting the invasiveness of the specimen collection.
The collection of the selected primary biomarker simply requires a blood draw (Table 6). The secondary biospecimens collection requires a supplementary blood tube and the collection of a hair sample. Together, these can be completed within approximately 5 minutes.
Table 6. Primary biospecimens collection.
EDTA: ethylenediaminetetraacetic acid.
Secondary biospecimen collection includes the collection of blood using the PAXgene tube, which enables the quantification of transcription products of genes relevant to psychiatric disorders. RNA quantification can be performed to determine differences in gene expression between affected and unaffected participants, as a response to treatment and more. The panel also recommends the collection of hair for cortisol measurement. Cortisol is the end product of the hypothalamic–pituitary–adrenal axis (HPA axis) and a major mediator of the body’s response to stress. Meta-analyses show elevated morning cortisol levels (Girshkin et al., 2016) and flattened awakening response in patients with schizophrenia and first-episode psychosis (Berger et al., 2016). Hair cortisol reflects the average cortisol concentrations in the past month and is robust to the sampling bias inherent to plasma cortisol (sample collection may evoke a stress response).
Discussion
In the present paper, a list of assessments or measures is presented with a view to inviting our collaborators working in the field of youth mental health to collect a core set of data in a harmonised way. The measures have been thoughtfully selected by experts in the field following multiple consultations and using a transdiagnostic approach. In the study of the neurobiology of mental illness, large datasets are necessary due to the heterogeneity of the studied populations and, too often, pooling data from independent studies is not possible because the data collected are not comparable. The harmonised collection of data will enable the production of a large dataset holding data on the clinical and neurobiological presentation of young people in the early stages of a mental illness. For example, large datasets will enable the identification of biomarkers in a more efficient way. Biomarkers can predict risk, delineate diagnostic entities (diagnostic or trait biomarkers), correlate with severity (state biomarkers), correlate with staging (stage biomarkers), allow prediction of treatment response (treatment biomarkers), and/or predict the course and outcome of an illness (prognostic biomarkers) (McGorry et al., 2013). An important aim of studying biomarkers is to systematically evaluate their potential with regard to the above classifications and in particular with relevance to transdiagnostic conceptualisations and clinical staging.
Further to the assessments presented here, standardised demographics, history of medical conditions and history of treatment are also available on the Neurobiology Partnership website.
A limitation to the use of the suggested batteries of assessment in youth with mental ill health is that psychopathology is a ‘western’ construct, and within the Australian context, these measures have not been validated in Aboriginal and Torres Strait Islander populations, which is an important consideration for data harmonisation.
According to data-sharing policies currently being prepared and pending ethics and governance clearance, contributors to the dataset will have access to the data and also to the expertise of other researchers and/or clinicians involved in the Neurobiology in Youth Mental Health Partnership. The Neurobiology Partnership will provide support in the implementation of one or more of the four suggested core sets of data into collaborators’ research projects. Any interested collaborator is invited to visit the website https://NeurobiologyPartnership.org.au.
Supplemental Material
SuppMaterial – Supplemental material for Harmonised collection of data in youth mental health: Towards large datasets
Supplemental material, SuppMaterial for Harmonised collection of data in youth mental health: Towards large datasets by Suzie Lavoie, Kelly Allott, Paul Amminger, Cali Bartholomeusz, Maximus Berger, Michael Breakspear, Anjali K Henders, Rico Lee, Ashleigh Lin, Patrick McGorry, Simon Rice, Lianne Schmaal and Stephen J Wood in Australian & New Zealand Journal of Psychiatry
Supp_Table1 – Supplemental material for Harmonised collection of data in youth mental health: Towards large datasets
Supplemental material, Supp_Table1 for Harmonised collection of data in youth mental health: Towards large datasets by Suzie Lavoie, Kelly Allott, Paul Amminger, Cali Bartholomeusz, Maximus Berger, Michael Breakspear, Anjali K Henders, Rico Lee, Ashleigh Lin, Patrick McGorry, Simon Rice, Lianne Schmaal and Stephen J Wood in Australian & New Zealand Journal of Psychiatry
Footnotes
Acknowledgements
The authors would like to acknowledge the members of the Neurobiology in Youth Mental Health Partnership for their participation in the various panels of experts.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
K.A. and A.L. are supported by National Health and Medical Research Council (NHMRC) Career Development Fellowships nos. 1141207 and 1148793, respectively.
Supplemental Material
Supplemental material for this article is available online.
