Abstract

Background

Early-onset depression contributes significantly to long-term disability and suicide, making high-quality healthcare for young people with depression a critical concern. Medical record review (MRR) is widely used to assess healthcare quality. However, its application to depression care for children and adolescents appears underexplored, with no consensus on conceptualising or measuring quality. This systematic review aimed to evaluate how quality was operationalised in primary studies using MRR in this context.

Methods

A structured search in PubMed, CINAHL, and PsycInfo following PRISMA guidelines identified 1,690 unique articles. Studies using MRR to evaluate outpatient depression healthcare quality for patients ≤17 years were included. Thematic synthesis was applied, focusing on methods, indicator themes, and quality framework alignment.

Results

Six studies published in 2005–2022 were included. These used 3–32 indicators covering risk assessments, diagnostic assessment, treatment, and monitoring, but indicators and operationalisation methods varied widely. Two studies reported using consensus methods. None incorporated the Institute of Medicine or World Health Organization quality frameworks. Binary opportunity indicator assessments were standard, but methods for deriving composite measures differed.

Conclusions

Despite shared themes, heterogeneity and lack of framework alignment limit comparability. Robust operationalisation methods ensuring indicator reliability and validity would strengthen future measurement of depression healthcare quality.

Plain Language Summary

Why This Study Was Done

Depression in young people is a major cause of long-term health problems and suicide, making access to high-quality care especially important. Researchers often use medical records to understand how well care is delivered, but this approach appears to be rarely used to assess the quality of depression care for children and adolescents. There is also no clear agreement on how to define or measure quality in this context.

What Was Done

Researchers searched three major databases (PubMed, CINAHL, and PsycInfo) for studies that used medical records to assess the quality of outpatient depression care for young people aged 17 and under. After screening 1,690 articles, six studies published between 2005 and 2022 met the inclusion criteria. The researchers examined how each study defined and measured quality, which areas of care they evaluated, and whether established quality frameworks were used to guide the assessments.

What Was Found

The six included studies used 3–32 indicators each to assess areas such as risk evaluation, diagnosis, treatment, and monitoring. However, the indicators themselves and how they were defined varied widely. Only two studies described using consensus-based methods to develop their indicators, and none used quality frameworks from the Institute of Medicine or the World Health Organization. All studies used simple yes/no checklists to determine whether specific actions had been documented, but the ways they grouped or summarised indicators differed.

What This Means

Although the studies addressed similar areas of care, the lack of standardisation in how quality was defined and measured makes the findings difficult to compare. To support better care for young people with depression, future research would benefit from clearer and more reliable ways to define and measure healthcare quality.

Keywords

child adolescent depression care healthcare record chart review quality indicator

Introduction

Depression is a major cause of disability-adjusted life years among adolescents, significantly contributing to the global burden of disease (Ferrari, 2022; WHO, 2021). Since the turn of the millennium, the prevalence of depressive disorders and symptoms among young people has nearly doubled (Shorey et al., 2021). Early-onset depression increases the risk of future somatic, psychosocial, and psychiatric complications (Thapar et al., 2022). It also heightens the risk of suicide, a leading cause of death in adolescents (Moitra et al., 2021). Despite this, many children and adolescents with depression do not receive adequate care. Mental healthcare for young people suffers from insufficient and unevenly distributed resources and often fails to adhere to best practices (Kruk et al., 2018; Moitra et al., 2022; Mora Ringle et al., 2019; Wickersham et al., 2024). These inadequacies impose substantial societal costs (Bodden et al., 2018), underscoring the need for high-quality outpatient depression care for young people. However, knowledge of care processes within child and adolescent mental health services (CAMHS) remains limited, and quality assessment lacks consensus on conceptualisation, measurement, and use of established global healthcare quality frameworks (Kilbourne et al., 2018; Leslie et al., 2018; Quinlan-Davidson et al., 2021; Skokauskas et al., 2019; Williams & Beidas, 2019).

Quality Conceptualisation and Operationalisation

Healthcare quality is an abstract concept with definitions varying by context and perspective (Wensing et al., 2020). It is therefore essential to define, conceptualise, and operationalise quality clearly within its specific context (Vassar & Holzmann, 2013; Wensing et al., 2020). The Donabedian model (Donabedian, 2005) has long shaped healthcare quality research by categorising quality into three elements: Structure – resources, environments, or organisational conditions; Process – healthcare activities and system performance; and Outcome – results and effects following interventions, as illustrated in Figure 1. The model posits that structures influence processes, which in turn affect outcomes. While patient outcomes are the ultimate quality measure, they often take time to manifest and are influenced by factors beyond healthcare processes (Lewandowski et al., 2013). Consequently, process measures play a critical role in quality assessment.

Figure 1.

The Donabedian model for quality improvement, with suggested indicators (Donabedian, 2005)

Process measures can be based on empirical standards (how processes are) or normative standards (how processes should be). When valid and reliable, these measures can reveal associations between changes in structure or processes and specific patient outcomes (Kilbourne et al., 2018; Wensing et al., 2020). The Institute of Medicine (IOM, 2001) outlined six quality aims – healthcare should be: Safe, Effective, Efficient, Patient-centered, Equitable, and Timely. These aims are the outcome measures within the Implementation Outcomes Framework (Proctor et al., 2011), aligning with the World Health Organization (WHO, 2006) quality dimensions. However, their implementation has been hindered by limited evidence on quality measures in CAMHS that align with these frameworks, highlighting a global evidence gap (Quinlan-Davidson et al., 2021).

Quality Measurement

Medical record review (MRR) is a well-established method for evaluating healthcare quality, allowing for retrospective analysis of care processes (Gearing et al., 2006; Vassar & Holzmann, 2013; Worster & Haines, 2004). Direct observation is an alternative but risks influencing care delivery (Donabedian, 2005). While administrative data enables long-term process measurement for large populations, it provides limited insights into detailed care processes (Kilbourne et al., 2018). Operationalising healthcare quality measures, particularly for depression care, is challenging due to variability in conceptualisation and measurement. The aim of quality measurement is expected to guide indicator development (Wensing et al., 2020), as illustrated in Figure 2. Yet, efforts to establish universal psychiatric quality indicators have yielded few measures suitable for outpatient CAMHS (Fisher et al., 2013; Iyer et al., 2016). Quality indicators for adolescent depression care, presented in Appendix Table A1, have been applied to electronic health records (Lewandowski et al., 2013). However, further research is needed to validate these measures, which do not align with other standards, as outlined in Appendix Table A2 (NICE, 2013).

Figure 2.

Quality indicator characteristics based on aim: quality improvement or implementation research (Wensing et al., 2020)

The validity and reliability of MRR depend on the use of comparable measures and consistent analytical methods, making a two-step operationalisation essential (Vassar & Holzmann, 2013). The first step involves defining indicators that are relevant, evidence-based, applicable, and feasible (McGlynn, 2010; Vassar & Holzmann, 2013; Wensing et al., 2020). Consensus methods (Bourrée et al., 2008), such as the RAND Appropriateness Method (RAM), Delphi method, consensus development conferences, or nominal groups, can strengthen indicator validity (Wensing et al., 2020), with RAM demonstrating evidence of validity (McGlynn, 2010). The second step aims to enhance validity and reliability by reviewing the literature to identify how comparable studies have operationalised similar quality concepts and indicators (Vassar & Holzmann, 2013). Despite its utility, MRR for outpatient CAMHS depression care appears underexplored, with existing tools and studies primarily addressing inpatient safety or pharmacological aspects (Hetrick et al., 2012; Nilsson et al., 2020; Socialstyrelsen, 2021). Specific, agreed-upon, and contextualised indicators aligned with quality frameworks have been called for (Kilbourne et al., 2018; Leslie et al., 2018; Quinlan-Davidson et al., 2021; Skokauskas et al., 2019; Williams & Beidas, 2019). Without such standardisation, the use of various instruments can lead to inconsistent assessments with a low degree of comparability, increasing diversity.

Objectives

The primary objective was to systematically review how quality has been operationalised in primary studies using MRR to assess the quality of outpatient depression healthcare for children and adolescents – focusing on indicator development methods, thematic areas, analytical approaches, and alignment with quality frameworks – while also evaluating consistency and proposing strategies to strengthen the validity and reliability of future quality improvement research.

Method

This review was based on primary studies published in peer-reviewed journals. The systematic literature search, review, and thematic synthesis followed the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) 2020 statement (Page et al., 2021; Rethlefsen et al., 2021) and the SWiM (Synthesis Without Meta-Analysis) reporting guideline (Campbell et al., 2020).

Search Strategy and Eligibility Criteria

A structured search was conducted in PubMed, CINAHL, and PsycInfo using a Population/Problem, Intervention, Comparison/Control, Outcome (PICO) framework, as detailed in Table 1. Search blocks comprised free-text terms, relevant keywords, and database-specific indexing terms. No time constraints or language restrictions were applied, although all search terms were in English. Search sensitivity was calibrated against key articles (Ellis et al., 2019; Hermens et al., 2015; Pile et al., 2020), which were selected by two authors (SR and HJ) who also defined the eligibility criteria and themes for data extraction.

Table 1.

PICO Framework Used to Design the Search Strategy

PICO	Description	Search terms
Population/Problem	Children or adolescents receiving outpatient care for depression	Adolescent, adolescence, child, children, young people, young adults, depression, depressive disorder
Intervention	Medical record review	Medical record review, retrospective chart review, records, data
Comparison/Control	Clinical practice guidelines or quality indicators	Assessing, assessment, guidelines
Outcome	Depression healthcare quality	Quality of care, care quality, safe care, evaluation, adherence

Articles were eligible if they focused on children or adolescents aged up to 17 years diagnosed with depression, evaluated outpatient depression quality using MRR, and included quantitative quality indicators. Primary care studies were included if the healthcare processes were comparable to those in outpatient CAMHS. Studies focusing exclusively on primary care screening, pharmacological treatment (not typically recommended as first-line treatment for this age group), or subgroups with depression associated with specific somatic conditions or traumatic events were excluded.

Selection, Data Abstraction, and Analysis

Searches were conducted on 30 November 2024. The first author performed a stepwise manual selection process, starting with titles and abstracts, followed by full-text assessment for eligibility. Reference lists of key articles (Ellis et al., 2019; Hermens et al., 2015; Pile et al., 2020) were also hand-searched by two authors (SR and HJ) to identify any additional eligible studies. An updated search was conducted by the first author on 31 March 2025, covering the period from 1 December 2024 to 31 March 2025. As no further eligible studies were identified, the PRISMA flow diagram and all associated tables reflect the original search results. Because this review focused exclusively on methodological approaches to quality operationalisation, no formal evidence grading was conducted. Data extraction was performed by the first author, with the included studies and extracted data independently reviewed and verified by another author (BH). Consensus was confirmed through author discussions.

Extracted data included basic study characteristics (authorship, publication year, study country, setting, patient age, and sample size), quality indicator development methods (quality standards and consensus methods), quality indicators used in MRR (individual indicators and composite measures/domains with occurrences), and analytical approaches (how indicators were assessed and reported, relevant complementary measures or analyses, and use of IOM or WHO frameworks). To ensure clarity, “indicators” in this review refer to individual quality measures, and “domains” to composite measures, regardless of primary study terminology. Supplementary materials of the included articles were also reviewed where available. A thematic synthesis, conducted by the first author, and verified by another author (BH), identified common indicator themes and methodological variations.

Results

The initial systematic literature search identified 1,690 unique articles across the three databases. Results for each search string are presented in Table 2, reflecting the original search. An updated search on 31 March 2025 resulted in no additional eligible studies. The specific search strings used for PubMed, CINAHL, and PsycInfo are detailed in Appendix Table A3.

Table 2.

Database Searches Conducted on 30 November 2024; Updated in March 2025 with No Further Eligible Studies Identified

Search number	Search strings (title or abstract)	PubMed	CINAHL	PsycInfo
#1	Adolescen* OR child* OR young people OR young adults	2,087,907	707,975	1,050,921
#2	Depression OR depressive disorder	482,476	157,140	329,081
#3	Assessing OR assessment OR guideline*	2,165,191	660,020	573,538
#4	Quality of care OR care quality OR safe care OR evaluation OR adherence	1,804,630	562,783	389,604
#5	Medical record review OR retrospective chart review OR records OR data	5,436,448	1,152,243	5,429,797
#6	#1 AND #2 AND #3 AND #4 AND #5	478	326	1,467

Exclusions

The majority of articles met none or only one of the inclusion criteria: children or adolescents, depressive disorder, or MRR methodology. Several of these studies focused on adult patients, pharmacological treatment alone, mental health conditions other than depression, and/or depression associated with specific somatic conditions – neurological, oncological, endocrine, and cardiac diseases – or traumatic events, where interventions primarily targeted the somatic condition or trauma. The selection process is outlined in Figure 3. Among the 12 full-text articles assessed, which met more than one inclusion criterion, five were excluded for not using MRR, while one focused on pharmacological treatment (Hetrick et al., 2012).

Figure 3.

Identification and selection of peer-reviewed articles for systematic review. Initial searches conducted on 30 November 2024

Search Outcomes

Six articles met the eligibility criteria (Ellis et al., 2019; Garbutt et al., 2022; Hermens et al., 2015; Kramer et al., 2008; Pile et al., 2020; Zima et al., 2005). The overall precision of the database search was 0.4% (6/1,690), as all included articles were identified in at least one of the databases. One additional article was identified through the reference lists of key articles but was excluded following the full-text review (Hetrick et al., 2012). Since all included articles were retrieved through the database search, the recall rate was 100% (6/6). Three of the included articles were present in all databases (Ellis et al., 2019; Hermens et al., 2015; Pile et al., 2020). Precision and recall for each database are detailed in Table 3.

Table 3.

Precision and Recall of Database Search Results – Findings Specified by First Authors of Included Articles

Databases	Precision (P)	Recall (R)	Garbutt	Pile	Ellis	Hermens	Kramer	Zima
PubMed	4/478 = 0.8%	4/6 = 67%	–	+	+	+	–	+
CINAHL	5/326 = 1.5%	5/6 = 83%	+	+	+	+	+	–
PsycInfo	5/1,467 = 0.3%	5/6 = 83%	+	+	+	+	–	+

Note. R = Recall: The proportion of included articles retrieved from a specific database out of all included articles.

P = Precision: The proportion of included articles retrieved from a database out of the total number of articles identified in that database.
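The precision and recall figures in Table 3 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the hit counts come from Table 2 and the retrieval pattern from Table 3, while the variable and function names are assumptions introduced here and are not part of the review.

```python
# Included studies, named by first author as in Table 3.
INCLUDED = ["Garbutt", "Pile", "Ellis", "Hermens", "Kramer", "Zima"]

# Number of records returned by search #6 in each database (Table 2).
RETRIEVED = {"PubMed": 478, "CINAHL": 326, "PsycInfo": 1467}

# Which included studies each database returned (the "+" cells in Table 3).
FOUND = {
    "PubMed": {"Pile", "Ellis", "Hermens", "Zima"},
    "CINAHL": {"Garbutt", "Pile", "Ellis", "Hermens", "Kramer"},
    "PsycInfo": {"Garbutt", "Pile", "Ellis", "Hermens", "Zima"},
}


def precision(database: str) -> float:
    """Included articles retrieved from the database / all records it returned."""
    return len(FOUND[database]) / RETRIEVED[database]


def recall(database: str) -> float:
    """Included articles retrieved from the database / all included articles."""
    return len(FOUND[database]) / len(INCLUDED)


for db in RETRIEVED:
    print(f"{db}: P = {precision(db):.1%}, R = {recall(db):.0%}")

# Studies retrieved by every database, and by at least one database.
in_all = set.intersection(*FOUND.values())
in_any = set.union(*FOUND.values())
print("In all databases:", sorted(in_all))
print("Combined recall:", f"{len(in_any) / len(INCLUDED):.0%}")
```

Because each database missed at least one included study, the combined search reaches full recall only when all three sources are used together.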

Study Characteristics

The six included studies were conducted in the United Kingdom (Pile et al., 2020), Australia (Ellis et al., 2019), the Netherlands (Hermens et al., 2015), and the United States (Garbutt et al., 2022; Kramer et al., 2008; Zima et al., 2005). Patient ages ranged from 0 to 21 years, with sample sizes varying between 45 and 655 medical records. The articles were published between 2005 and 2022. Detailed characteristics for each study are provided in Appendix Table A4.

Indicator Development and Operationalisation

Four studies developed their indicators based on clinical guidelines (Ellis et al., 2019; Garbutt et al., 2022; Hermens et al., 2015; Pile et al., 2020). The other two used a literature review (Kramer et al., 2008) and consensus methods, including RAM and a modified Delphi process (Zima et al., 2005). Two studies explicitly detailed their use of established consensus methodologies (Ellis et al., 2019; Zima et al., 2005). These studies also evaluated healthcare quality across multiple diagnoses, presenting depression-specific results separately in the main text or supplementary materials. Two studies reported both pre- and post-intervention proportions of approved indicators (Garbutt et al., 2022; Pile et al., 2020).

The studies reported a total of 73 indicators, ranging from 3–32 per study, with a median of 8.5. Indicators covered the themes of risk assessment, diagnostic assessment, treatment, and monitoring/follow-up. Binary opportunity indicator assessment (approved/not approved) was universally applied. Hence, continuous scales were not used. The operationalisation methodologies were inconsistently described, and none of the included studies linked their quality concepts or indicators to the IOM or WHO frameworks (IOM, 2001; WHO, 2006). Full data abstraction is available in Appendix Table A4.
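As a quick consistency check, the indicator totals and the median can be recomputed from the per-study counts in Table 5 and the per-theme counts in the summary row of Table 4. This is a minimal sketch; the dictionary and variable names are assumptions made for illustration.

```python
from statistics import median

# Indicators per study, as listed in Table 5.
indicators_per_study = {
    "Garbutt et al. (2022)": 3,
    "Pile et al. (2020)": 7,
    "Ellis et al. (2019)": 15,
    "Hermens et al. (2015)": 10,
    "Kramer et al. (2008)": 6,
    "Zima et al. (2005)": 32,
}

# Indicators per theme, as listed in the summary row of Table 4.
indicators_per_theme = {
    "Risk assessments": 9,
    "Diagnostic assessments": 18,
    "Depression monitoring": 18,
    "Psychosocial/social interventions": 10,
    "Psychotherapy interventions": 5,
    "Pharmacology interventions": 13,
}

total_by_study = sum(indicators_per_study.values())
total_by_theme = sum(indicators_per_theme.values())
median_per_study = median(indicators_per_study.values())

print("Total by study:", total_by_study)
print("Total by theme:", total_by_theme)
print("Median per study:", median_per_study)
```

Both totals agree with the 73 indicators reported in the text, and the median of the six per-study counts is 8.5.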

Quality Indicators

The distribution of indicators by common themes is presented in Table 4. Indicators for risk assessment, diagnostic assessment, and the provision of appropriate treatment based on guideline indications (psychosocial, psychotherapy, and/or pharmacological) were included in all but one study (Garbutt et al., 2022). All studies included indicators for monitoring or follow-up. Risk assessment indicators were the most homogeneous, whereas diagnostic assessment and monitoring/follow-up indicators were the most common and diverse. Indicator heterogeneity was evident, except for risk assessments. Psychotherapy indicators focused on appropriate treatment indications, components, and session frequency. Pharmacology indicators addressed proper medication selection, written consent, provision of information, follow-up schedules, and monitoring parameters. Additional details are available in Appendix Table A4.

Table 4.

Distribution of Quality Indicators by Themes Identified Through Thematic Synthesis

Authors Year	Risk assessments	Diagnostic assessments	Depression monitoring	Psychosocial/social interventions	Psychotherapy interventions	Pharmacology interventions
Garbutt et al. (2022)	—	—	Follow-up: -After diagnosis -Once stable -Self-rating used	—	—	—
Pile et al. (2020)	-Completed as recommended -Full risk screen cases	Parental mental health issues: -Considered -Identified	-Self-rating used	—	-Evidence-based intervention offered	-Current or past antidepressant prescription
Ellis et al. (2019)	-Self-harm/suicidal intent	-Functional level -Other causes Circumstances: -Family -Personal/ interpersonal	-Mental state -Treatment goals -Home functioning -School functioning	-Information/provision -Community support -Emergency plan	-First treatment if moderate/severe depression	-Not first treatment for mild depression -Side effects monitoring
Hermens et al. (2015)	-Suicide risk	-Screening tool used -Semi-structured interview used -Depression severity	Response monitoring: -First intervention -Second intervention -Adjustment if needed	-Provided for mild depression	-Provided for moderate depression	-Provided as adjunct treatment for severe depression
Kramer et al. (2008)	-Suicide risk	-Substance use	-Treatment duration at least eight sessions	-Family intervention	-At least one CBT component included	-Prescribed
Zima et al. (2005)	-Aggressive behaviour -Child abuse screening Suicidal: -Risk factor -Ideation re-assessed	-Child strength -Target symptom -School symptoms -Academic functioning -Psychosocial stressor -Family mental health -Substance use -Medical problem	-Symptom ratings^a -School contact when needed -Physical examination in past year -Monthly sessions -Informed consent to release information -Harm risk- adjusted treatment setting	-Caregiver informed -Family intervention/Parent referral -Provided within the first 90 days if no psychotherapy -Monthly (6–9 months) -Suspected child abuse reported	-Behavioural therapy session^a	-Referral if needed -Written consent -Caregiver informed Monitoring^b: -Monthly (3 months) -Licensed provider -Target symptom -Side effects -Physical if needed
Indicator ∑	9	18	18	10	5	13

Note. ∑ = Total number of indicators per theme.

^aReferring to externalising disorders.

^bMedication-specific monitoring.

Quality Domains

Three of the six studies grouped their quality indicators into domains (Ellis et al., 2019; Hermens et al., 2015; Zima et al., 2005), each comprising between one and 10 indicators. The number of domains in these studies ranged from two to nine, covering two main areas: assessment and treatment/monitoring. One study also reported six indicators without domain affiliation (Ellis et al., 2019), although these were all related to treatment/monitoring. This grouping highlights variations in how studies categorised their indicators.

In contrast, studies that did not categorise indicators into domains reported fewer quality indicators overall (Garbutt et al., 2022; Kramer et al., 2008; Pile et al., 2020). The total number of indicators ranged from three to seven, whereas studies using domains had 10–32 indicators. Table 5 presents the quality domains with their respective number of indicators, alongside indicators from studies without domains for comparability. Further details are available in Appendix Table A4.

Table 5.

Quality Domains Classified Into Assessment and Treatment/Monitoring (Indicators per Domain)

Authors Year	Indicator total	Assessment domains -Indicators (if no domains)		Treatment/monitoring domains -Indicators (if no domains)
Garbutt et al. (2022)	3	—	—	-At diagnosis, follow-up care within 6 weeks -Once stable, follow-up care within 3 months -Self-rating recorded during follow-up	— — —
Pile et al. (2020)	7	-Risk assessment appropriately completed -Cases requiring a full risk screen -Consideration of parental mental health -Parental mental health issues identified	— — — —	-Self-report questionnaire -Evidence-based psychological intervention -Current or past antidepressant prescription	— — —
Ellis et al. (2019)	15	Assessment	(5)	Information, treatment, and management -Antidepressant medication correctly offered -Psychotherapy correctly offered -Monitoring of adverse drug reactions -Monitoring of mental state during medication -Home functioning outcome -School functioning outcome	(4) — — — — — —
Hermens et al. (2015)	10	Screening Diagnosis Severity assessment	(1) (1) (1)	Stepped care treatment provided Monitoring and treatment adjustment	(4) (3)
Kramer et al. (2008)	6	-Suicide assessment -Substance use assessment	— —	-Family intervention -Antidepressant medication prescribed -Treatment duration, minimum number of visits -Psychological treatment component	— — — —
Zima et al. (2005)	32	Initial clinical assessment	(10)	Service sector linkage Basic treatment principles Psychosocial treatment Medication referral Safety – patient protection Safety – informed medication decision Safety – medication monitoring Safety – medication-specific monitoring	(3) (4) (3) (1) (4) (2) (2) (3)

Analytical Approaches

All studies utilised binary opportunity scoring for quality indicators. Since some indicators applied only to subsets of records, adherence rates were calculated based on the relevant medical records for each indicator. This approach was used for indicators related to pharmacological treatment. Two studies further examined variations in quality between subgroups defined by patient or clinical characteristics using statistical tests, such as Chi-square (Kramer et al., 2008) and logistic regression (Zima et al., 2005).
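Binary opportunity scoring can be illustrated with a small sketch in which each indicator is scored only for the records it applies to. The records, indicator names, and the `adherence_rate` helper below are hypothetical and are not taken from any of the included studies.

```python
from typing import Optional

# One dict per medical record. True = approved, False = not approved,
# None = the indicator did not apply to this record (no "opportunity").
Record = dict[str, Optional[bool]]

# Hypothetical data for illustration only.
records: list[Record] = [
    {"suicide_risk_assessed": True, "side_effects_monitored": None},
    {"suicide_risk_assessed": True, "side_effects_monitored": True},
    {"suicide_risk_assessed": False, "side_effects_monitored": False},
    {"suicide_risk_assessed": True, "side_effects_monitored": None},
]


def adherence_rate(data: list[Record], indicator: str) -> Optional[float]:
    """Approved records divided by the records where the indicator applied."""
    eligible = [r[indicator] for r in data if r.get(indicator) is not None]
    if not eligible:
        return None  # no record offered an opportunity to meet the indicator
    return sum(eligible) / len(eligible)


for name in ("suicide_risk_assessed", "side_effects_monitored"):
    print(name, adherence_rate(records, name))
```

In this toy data the medication-related indicator applies to only two of the four records, so its denominator is two rather than four.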

Studies presenting quality domains adopted different approaches to analysing these composite measures (Ellis et al., 2019; Hermens et al., 2015; Zima et al., 2005), none of which were empirically derived. One study calculated domain adherence as the average proportion of approved indicators within the domain (Hermens et al., 2015), while the other two defined thresholds for approved quality (Ellis et al., 2019; Zima et al., 2005). Of these, one required all indicators within a domain to be approved to meet the quality threshold (Ellis et al., 2019). The other presented three measures: the proportion with any domain indicator approved, the proportion with all domain indicators approved, and the proportion with a domain indicator mean at or above the 75th percentile, which was defined as the threshold for “probable acceptable care” (Zima et al., 2005).
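The following sketch shows how the same binary indicator data can yield quite different domain-level results depending on the composite rule. The data and function names are hypothetical. The mean is computed per record and then averaged, which is one possible reading of "average proportion", and a fixed cut-off supplied by the caller stands in for the percentile-based threshold, whose exact derivation is not described here.

```python
from typing import Callable

# Each inner list holds the binary scores of one record for the indicators
# of a single domain (hypothetical data, all indicators applicable).
domain_records: list[list[bool]] = [
    [True, True, True, True],
    [True, False, True, True],
    [False, False, False, False],
    [True, False, False, False],
]


def record_mean(scores: list[bool]) -> float:
    """Proportion of approved indicators within one record."""
    return sum(scores) / len(scores)


def share(data: list[list[bool]], rule: Callable[[list[bool]], bool]) -> float:
    """Proportion of records that satisfy a composite rule."""
    return sum(1 for scores in data if rule(scores)) / len(data)


# Average proportion of approved indicators across records.
mean_adherence = sum(record_mean(s) for s in domain_records) / len(domain_records)

# All-or-none rule: every indicator in the domain must be approved.
all_approved = share(domain_records, all)

# Any rule: at least one indicator in the domain is approved.
any_approved = share(domain_records, any)

# Threshold rule: record mean at or above a cut-off (assumed value).
CUTOFF = 0.75
above_cutoff = share(domain_records, lambda s: record_mean(s) >= CUTOFF)

print(mean_adherence, all_approved, any_approved, above_cutoff)
```

With these four records the all-or-none rule approves a quarter of them while the any rule approves three quarters, which illustrates why domain results reported under different rules are hard to compare.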

Two studies incorporated mixed methods, adding complementary qualitative data from healthcare providers (Garbutt et al., 2022; Hermens et al., 2015). Another study analysed service access in relation to expected prevalence (Pile et al., 2020), while one examined concordance between MRR diagnoses and those derived from structured diagnostic interviews (Kramer et al., 2008). The level of methodological detail varied across studies, complicating comparisons of analytical approaches. Further details are provided in Appendix Table A4, which also presents the approved proportions for each quality indicator.

Discussion

This systematic review, using no date restrictions, identified only six primary studies assessing outpatient depression healthcare quality for children and adolescents using quality indicators in medical records. All were conducted in Western countries with well-developed psychiatric services for this population, as reflected in clinical practice guidelines (Hickie et al., 2019; NICE, 2019; Walter et al., 2023). However, quality measures were inconsistent, and quality conceptualisation was evidently insufficient. The operationalisation of healthcare quality was sparsely described in some studies, and methodologies for quality indicator development varied. While all studies reported some form of content validity, four based their indicators on clinical practice guidelines. This approach can be challenging, as high-quality interventions in complex clinical settings may not always align with guideline recommendations.

Two studies explicitly mentioned using established consensus methods (RAM and modified Delphi), potentially enhancing indicator validity (Ellis et al., 2019; Zima et al., 2005). These studies also provided more detailed methodological descriptions. Three studies reported very few quality indicators (Garbutt et al., 2022; Kramer et al., 2008; Pile et al., 2020), whereas the others included more detailed and numerous indicators – in line with characteristics linked to research aims, as illustrated in Figure 2 – and encompassed broader aspects of healthcare quality (Ellis et al., 2019; Hermens et al., 2015; Zima et al., 2005). These latter studies summarised quality indicators into domains, which may facilitate linkage to global quality frameworks. All studies employed opportunity scoring, allowing for contextual appropriateness – for example recommending pharmacological treatment for severe depression but not for mild depression. However, they all relied on binary indicator assessment, limiting granularity in evaluating healthcare quality.

Quality Indicators and Domains

Common themes among quality indicators included risk assessments, diagnostic assessments, monitoring/follow-up, and various treatment interventions. Risk assessment indicators were the most homogeneous – perhaps reflecting the influence of existing tools focusing on patient safety. In contrast, diagnostic assessment and monitoring/follow-up indicators showed the greatest variability. Complex clinical conditions remain challenging to operationalise through indicators, which may explain the heterogeneity among indicators. Overall, indicators were rarely described consistently or aligned with the nationally advocated quality indicators presented in Appendix Tables A1 and A2 (Lewandowski et al., 2013; NICE, 2013). This lack of consensus confirms findings from previous research (Kilbourne et al., 2018; Leslie et al., 2018; Lewandowski et al., 2013; Quinlan-Davidson et al., 2021; Skokauskas et al., 2019; Williams & Beidas, 2019).

Three studies utilised quality domains related to assessment or treatment/monitoring (Ellis et al., 2019; Hermens et al., 2015; Zima et al., 2005). None employed empirical methodologies for summarising indicators, and domain adherence was reported differently across studies. Two studies defined thresholds for domain approval, but their approaches varied: one required all indicators within a domain to be met (Ellis et al., 2019), while the other empirically defined a threshold for “probable acceptable care” (Zima et al., 2005). Although empirical methodology is appealing, the latter approach introduces variability in standards, as it depends on measurement rather than predefined criteria. Conversely, high normative standards may lead to overly stringent assessments, which a continuous quality scale could mitigate. However, different approaches can lead to incomparability between studies. Thus, providing multiple adherence measures may offer a practical solution.

Global Quality Frameworks

None of the studies linked their quality concepts to the IOM or WHO quality frameworks (IOM, 2001; WHO, 2006), confirming their infrequent use and highlighting the lack of standardised frameworks in operationalisation. Retrospective application of such frameworks is impractical given the nature of existing indicators. Determining how best to represent the IOM aims (Safe, Effective, Efficient, Patient-centered, Equitable, and Timely) remains challenging, as domains and indicators often overlap across multiple aims. Nonetheless, applying global frameworks in future research may be essential for enabling cross-contextual comparisons of healthcare quality across diagnoses, treatments, settings, and demographics.

Complementary Measurements

Four studies employed additional methods alongside MRR (Garbutt et al., 2022; Hermens et al., 2015; Kramer et al., 2008; Pile et al., 2020), consistent with the Donabedian model’s three elements: Structure, Process, and Outcome (Donabedian, 2005), as outlined in Figure 1. For instance, one study examined the concordance between medical record diagnoses and those from structured diagnostic interviews, linking process and outcome measures (Kramer et al., 2008). Another combined provider reports, MRR follow-up indicators, and remission rates based on patient self-ratings, addressing all three elements (Garbutt et al., 2022). This approach aligns with the Implementation Outcomes Framework (Proctor et al., 2011), where complementary measures provide a comprehensive understanding of intervention outcomes.

MRR findings may be skewed by insufficient documentation, which could inaccurately suggest poor healthcare quality. However, high-quality documentation is itself a key component of healthcare quality. Patient and provider assessments of perceived healthcare quality are valuable complements to process indicators but require sufficiently validated methods (Wensing et al., 2020). For instance, dissatisfaction with ineffective treatment may not necessarily indicate poor quality, but addressing such experiences remains an essential aspect of healthcare quality. Understanding the purpose, interpretation, and interrelation of different measures is crucial for accurate quality assessments (Kilbourne et al., 2018; Wensing et al., 2020).

Implications

This systematic review highlights substantial heterogeneity in how MRR quality indicators are operationalised across studies. Despite some common themes, the lack of standardisation and alignment with global quality frameworks limits comparability and risks a proliferation of concepts and frameworks that yield inconsistent assessments. Quality indicators based on guidelines and scientific literature have been proposed, as detailed in Appendix Table A1, yet they appear to have had little influence on quality measurement within this field and provide limited insight into the quality of care processes. This underscores the need to adopt more rigorous methodologies for operationalising quality, especially in research settings (Wensing et al., 2020). Strengthening the evidence base for healthcare quality measurement is essential, and the recommended two-step operationalisation of MRR indicators is crucial in achieving consensus. However, addressing processes tailored to complex clinical conditions remains a challenge.

Although identifying systemic barriers to the use of MRR in CAMHS was beyond the scope of this review, several remarks and reflections can nonetheless be made. The two-step operationalisation of quality indicators, which enhances the validity and reliability of MRR (Vassar & Holzmann, 2013), places substantial demands on knowledge, skills, and time. Another complicating factor is the variable quality and consistency of CAMHS medical records concerning depression – particularly regarding therapeutic interventions, which are often inconsistently documented in free-text clinical notes (Wickersham et al., 2024). In addition, a thorough understanding of both what information is recorded and how it is recorded is crucial when conducting MRR. Based on the authors’ own experience, CAMHS medical records relating to depression care present considerable interpretative challenges. Greater standardisation of record systems across different providers has been recommended (Wickersham et al., 2024). Taken together, these observations suggest that resource limitations may be an important factor explaining the infrequent use of MRR in outpatient depression care, as well as discrepancies between indicators. This interpretation aligns with previous research showing that CAMHS services generally suffer from insufficient and unevenly distributed resources (Moitra et al., 2022), and that non-adherence to clinical guidelines is associated with such resource constraints (Mora Ringle et al., 2019).

Using clinical guidelines as a foundation for quality measurement offers the advantage of clearly defined intervention descriptions, provided that the guidelines themselves are sufficiently detailed (Wensing et al., 2020). However, existing tools and studies have primarily focused on patient safety and pharmacological aspects (Hetrick et al., 2012; Nilsson et al., 2020; Socialstyrelsen, 2021), likely reflecting that such indicators are easier to operationalise – an observation supported by the homogeneity of the risk assessment indicators identified in this review. Since patient outcomes depend on multiple factors, many of which are unrelated to care processes, process indicators remain essential for quality measurement (Lewandowski et al., 2013). Quality indicators should be clearly aligned with their intended aims and areas of study, ensuring valid and reliable links between frameworks, indicators, and patient outcomes. However, this is complicated by the limited evidence on CAMHS quality measures that align with established quality frameworks (Quinlan-Davidson et al., 2021). Yet, such frameworks are fundamental not only for conceptualising quality but also for integrating rigour into operationalisation and enabling cross-study comparisons (IOM, 2001; WHO, 2006). Incorporating these frameworks would enhance validity and expand knowledge of the frameworks themselves. Nevertheless, linking quality indicators to quality frameworks can be challenging, as tensions and overlaps exist among the dimensions – Safe, Effective, Efficient, Patient-centred, Equitable, and Timely – and there is no established consensus on how these frameworks should be applied (Quinlan-Davidson et al., 2021).

The inherent difficulties in operationalising healthcare quality indicators may underscore the importance of linking these indicators to established global quality frameworks, particularly in implementation research utilising the Implementation Outcomes Framework (Proctor et al., 2011; Wensing et al., 2020). Incorporating global quality frameworks early in the operationalisation process lays the groundwork for developing appropriate quality indicators and ensures that all critical aspects are addressed. However, certain dimensions of these frameworks may be more effectively and efficiently measured through methods other than medical record review, highlighting the need for complementary approaches.

Quality assessments serve diverse purposes and rely on data sources with varying levels of availability. Ideally, indicators operationalised in research contexts could provide a foundation from which quality improvement initiatives focused on clinical care delivery may select fewer, approximate, yet still valid measures. Identifying links between specific process indicators, quality frameworks, qualitative aspects, and health outcomes could streamline future operationalisation and measurement of healthcare quality, while maintaining validity. Conversely, reliance on low-validity indicators risks displacement effects, including resource misallocation and potential adverse clinical consequences. Emerging techniques, including artificial intelligence, offer opportunities for more detailed and nuanced measurements but require robust, evidence-based quality indicators to achieve their full potential.

Strengths and Limitations

The sensitivity (recall) and precision of the searches suggest that the thorough process likely captured most available studies in this field. However, only six heterogeneous studies met the eligibility criteria, highlighting the rarity of outpatient MRR studies on depression healthcare quality for children and adolescents, likely due to the method’s resource-intensive nature. The limited number of included studies and the heterogeneity of indicators constrain the generalisability of findings. Additionally, the homogeneity of healthcare settings in these studies further restricts broader applicability. Not including studies addressing non-depressive psychiatric conditions may have limited the scope of insights, although indicators for other diagnoses would likely not align with depression-specific care. As this review focused on the operationalisation of healthcare quality in MRR, it excluded studies using electronic health records, as well as methods for assessing patient outcomes and experiences or provider perspectives on care processes. These methods, however, are important complements.

One article was initially considered eligible for inclusion but was ultimately excluded due to its primary focus on pharmacological treatment, following author discussion (Hetrick et al., 2012). The study was conducted in Australia and included a sample of 150 participants aged 15–25 years. Four quality indicators, derived from a clinical guideline, were used. However, inclusion of this article would not have affected the overall results or conclusions of the present systematic review.

Having a single investigator (SR) responsible for study selection posed a potential risk of bias and omission of relevant studies. This risk could not be fully mitigated, although another author (HJ) hand-screened the key articles to identify any additional eligible studies. The included studies and extracted data were then independently reviewed and verified by another author (BH) to minimise errors in data extraction. Due to practical constraints, detailed reasons for exclusion were not documented for studies excluded during the title and abstract screening stage.

Conclusions

This systematic review reveals substantial variation in how quality is conceptualised and measured, highlighting the need for more rigorous and standardised approaches to operationalising healthcare quality indicators for depression care in CAMHS. Aligning these efforts with established global healthcare quality frameworks can enhance the validity, reliability, and comparability of future studies. Strengthening the evidence base for quality indicators is essential not only to advance research, but also to support clinical quality improvement efforts and improve patient outcomes in this vulnerable population. The use of MRR and validated quality indicators enables the identification of specific deficiencies in healthcare delivery. As tailored approaches are often required, this can enhance the effectiveness of quality improvement initiatives aimed at optimising recommended, evidence-based care, ultimately leading to improved clinical outcomes and measurable resource efficiency. Beyond the benefits for patients and healthcare services, preventing suicide and other severe consequences contributes to wider, sustainable societal development.

Footnotes

Acknowledgement

This research was supported by Linus Brorsson, research librarian at Lund University, Lund, Sweden, whose expertise was invaluable.

ORCID iDs

Susanne Remvall

Håkan Jarbin

Consent to Participate

No procedures involving human participants were performed.

Author Contributions

SR and HJ conceptualised the research questions, selected the key articles, defined the eligibility criteria and themes for data extraction, and hand-searched the key articles for additional eligible studies. SR and BH specified the methodology and discussed the article that was initially considered eligible for inclusion but ultimately excluded. SR conducted the systematic search, screening, review, and synthesis, and drafted the manuscript. BH independently reviewed the included studies and verified the extracted data. All authors (SR, BH, and HJ) revised the manuscript, provided critical commentary, and approved the final version.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Financial support was provided by Region Skåne and the Stiftelsen Lindhaga (2024-12-04).

Declaration of Conflicting Interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Following the updated search on 31 March 2025, a medical record review by the same research team, Quality of depression assessments in child and adolescent psychiatry: Findings from a nationwide Swedish medical record review, was published in June 2025 (after submission of this review). Although it met the inclusion criteria, it was not included to avoid self-citation bias. This systematic review also formed part of the two-step operationalisation of quality indicators underpinning that study. The article is available at .

Data Availability Statement

No datasets were generated or analysed. Search strings are provided in .*

Author Biographies

Susanne Remvall is a senior consultant in child and adolescent psychiatry and a PhD student. Her research focuses on depression, healthcare quality, and guideline implementation. She is currently involved as a researcher in the nationwide Swedish multicentre research and implementation programme Deplyftet, which utilises mixed methods to assess the nationwide implementation of a clinical practice guideline for depression.

Håkan Jarbin is a senior consultant in child and adolescent psychiatry, a psychotherapist, and holds a PhD. He serves as a clinical medical director and is the principal investigator of a national implementation programme focused on improving healthcare for depression in young people. His clinical research focuses on depression, rating scales, and guideline implementation. He is also the principal investigator of a randomised controlled trial comparing aerobic exercise with leisure group activities in adolescent depression.

Björn Hofvander is a senior clinical psychologist, Associate Professor, and head of the LU-CRED research group at Lund University. His research primarily aims to understand the development of aggressive and antisocial behaviour across the lifespan, with a particular focus on longitudinal trajectories from early childhood to older age, and their interaction with mental health, neurocognition, somatic health, and adaptive functioning. His work is situated within forensic psychiatry and draws on theories and methods from multiple disciplines. He also contributes as a researcher to the implementation programme Deplyftet.

Appendix

Table A1.

Operational Definitions of Quality Indicators for Adolescent Depression Care (Lewandowski et al., 2013)

Quality indicators	Target population	Operational definitions
Depression screening	Adolescents presenting to healthcare services	Screening using a validated depression screening instrument
Depression diagnosis	Adolescents with a positive depression screening result or presenting to specialist services with depression-related symptoms	Diagnostic assessment for depressive disorder
Suicide assessment	Adolescents with depression or reporting self-harm	Suicide risk assessment
Initial counselling	Adolescents with mild depression	A minimum of two supportive sessions within two months, including psychoeducation, validation and problem-focused strategies
Treatment initiation	Adolescents with moderate, severe, or persistent mild depression	Initiation of psychotherapy or antidepressant medication, or referral to specialist services
Healthcare coordination	Adolescents with depression receiving care across service levels	Coordinated communication between involved care providers
Pharmacological treatment	Adolescents with depression initiating pharmacological treatment	Antidepressant medication for a minimum of two months
Psychological treatment	Adolescents with depression initiating psychotherapy	A minimum of eight psychotherapy sessions within four months
Symptom reassessment	Adolescents with depression	Depressive symptom monitoring using a validated instrument within two to three months
Remission achievement	Adolescents with depression	Symptom levels below the clinical screening threshold for depression or no longer meeting diagnostic criteria within six months
Treatment modification	Adolescents with depression showing insufficient symptom reduction and remaining above the clinical threshold on a validated instrument, or continuing to meet diagnostic criteria after three months of treatment	Treatment adjustment, including addition or intensification of medication or psychotherapy, or referral to specialist services

Note. Adapted and synthesised from Lewandowski et al. (2013).

Table A2.

Quality Requirements derived from the NICE Depression in Children and Young People Quality Standard (NICE, 2013)

Target population	Quality requirements
Children and adolescents with suspected depression	Formal documentation of a confirmed diagnosis in the medical record
Children and adolescents with depression	Provision of age-appropriate information about depression and available treatment options
Children and adolescents with suspected severe depression and elevated suicide risk	Assessment by CAMHS within 24 hours of referral, with safety arrangements in place while awaiting assessment, where required
Children and adolescents with suspected severe depression without elevated suicide risk	Assessment by CAMHS within two weeks of referral
Children and adolescents undergoing depression treatment	Clinical presentation recorded at the initiation and completion of each treatment phase

Note. NICE = National Institute for Health and Care Excellence; CAMHS = Child and Adolescent Mental Health Services. Adapted from NICE, 2013, Depression in Children and Young People: Quality Standard (QS48). © NICE 2013.

Table A3.

Database Searches in PubMed, CINAHL, and PsycInfo Conducted on 30 November 2024. An Updated Search on 31 March 2025 Resulted in No Additional Eligible Studies

Databases	Searches	Search string	Results
PubMed
	#1	adolescent[Title/Abstract] OR child[Title/Abstract] OR young people[Title/Abstract] OR young adults[Title/Abstract]	2,087,907
	#2	Depression [Title/Abstract] OR depressive disorder[Title/Abstract]	482,476
	#3	Assessing [Title/Abstract] OR assessment[Title/Abstract] OR guideline*[Title/Abstract]	2,165,191
	#4	Quality of care [Title/Abstract] OR care quality[Title/Abstract] OR safe care[Title/Abstract] OR evaluation[Title/Abstract] OR adherence[Title/Abstract]	1,804,630
	#5	Medical record review [Title/Abstract] OR retrospective chart review[Title/Abstract] OR records[Title/Abstract] OR data[Title/Abstract]	5,436,448
	#6	#1 AND #2 AND #3 AND #4 AND #5	478
CINAHL
	#1	TI (adolescent* OR child* OR young people OR young adults) OR AB (adolescen* OR child* OR young people OR young adults)	707,975
	#2	TI (depression OR depressive disorder) OR AB (depression OR depressive disorder)	157,140
	#3	TI (assessing OR assessment OR guideline) OR AB (assessing OR assessment OR guideline)	660,020
	#4	TI (quality of care OR care quality OR safe care OR evaluation OR adherence) OR AB (quality of care OR care quality OR safe care OR evaluation OR adherence)	562,783
	#5	TI (medical record review OR retrospective chart review OR records OR data) OR AB (medical record review OR retrospective chart review OR records OR data)	1,152,243
	#6	#1 AND #2 AND #3 AND #4 AND #5	326
PsycInfo
	#1	TI (adolescent* OR child* OR young people OR young adults) OR AB (adolescen* OR child* OR young people OR young adults)	1,050,921
	#2	TI (depression OR depressive disorder) OR AB (depression OR depressive disorder)	329,081
	#3	TI (assessing OR assessment OR guideline) OR AB (assessing OR assessment OR guideline)	573,538
	#4	TI (quality of care OR care quality OR safe care OR evaluation OR adherence) OR AB (quality of care OR care quality OR safe care OR evaluation OR adherence)	389,604
	#5	TI (medical record review OR retrospective chart review OR records OR data) OR AB (medical record review OR retrospective chart review OR records OR data)	5,429,797
	#6	#1 AND #2 AND #3 AND #4 AND #5	1,467

Note. The updated search on 31 March 2025 (covering 1 December 2024–31 March 2025) yielded 10 hits in PubMed, 14 in CINAHL, and 22 in PsycInfo, with no additional eligible studies identified.

Table A4.

Thematic Synthesis of Articles Included in the Systematic Review (n = 6)

Authors (publication year) Country – Study setting Patient age (sample size)	Quality indicator (QI) development: Quality standard -Consensus method	Quality measures: -Quality domains (QD) = composite measures -Quality indicators (QI) for medical record review (MRR)	Proportion approved (actual %)	Analysis approaches and result reporting Complementary measures and analyses Use of IOM or WHO quality frameworks
Garbutt et al. (2022) United States – Primary care 12-17 years (n = 199/217)	QIs derived from guidelines -Expert consensus -Literature review	-At diagnosis; follow-up care (call or visit) within 6 weeks -Once stable; follow-up care (call or visit) within 3 months -Self-rating PHQ-A recorded at least once during follow up	40/81 30/60 34/76	QI assessment: % of all cases approved Other measures: Remission rates; provider reports Quality framework: Not mentioned
Pile et al. (2020) United Kingdom – CAMHS 5-18 years (n = 45/45)	QIs derived from guidelines -NICE guideline consensus method	-Risk assessment appropriately completed -Cases requiring a full risk screen -Consideration of parental mental health -Parental mental health issues identified -Self-report questionnaire administered -Evidence-based psychological intervention offered -Currently or previously prescribed antidepressant medication	96/67 29/11 96/93 33/51 62/82 69/76 22/42	QI assessment: % of eligible cases approved Other measures: Service access per expected prevalence Quality framework: Not mentioned
Ellis et al. (2019) Australia – Inpatient care – Emergency departments – General practices – Outpatient paediatrics 3-15 years (n = 133) Depression subsample	QIs derived from guidelines -RAND Appropriateness Method	Suspected depression receiving appropriate assessment: -Family circumstances assessed -Personal and interpersonal circumstances assessed -Functional level assessed -Assessed for self-harm and/or suicidal intent -Assessed for other causes Appropriate information, treatment and management offered: -Information and resources concerning evidence-based care -Offered community supports -Treatment/management goals set -Emergency safety plan -Mild depression; not prescribed SSRI as first treatment -Moderate/severe depression; psychotherapy first treatment -Prescribed SSRI; monitoring of adverse drug reactions -Prescribed SSRI; monitoring of mental state -Home functioning treatment outcome assessed in 8 weeks -School functioning treatment outcome assessed in 8 weeks	33 67 75 78 64 59 35 79 75 78 44 91 81 69 90 74 74	QI assessment: % of eligible cases approved QD assessment: % of cases meeting all QIs within each QD Quality variation: % of QIs or QDs or approved per healthcare setting Quality framework: Not mentioned
Hermens et al. (2015) Netherlands – CAMHS 0-21 years (n = 655)	QIs derived from guidelines -Expert consensus	Screening: -Use of a screening tool Diagnosis: -Use a semi-structured interview to diagnose depression Assess the severity of the depression: -Structured severity assessment at team meeting registered Provide stepped care treatment: -Mild depression; psychosocial intervention, watchful waiting -Moderate depression; psychotherapy (IPT/CBT) -Severe depression; addition of antidepressant (Fluoxetine) -Suicide risk assessment weekly, first 6 weeks with medication Monitor treatment response – reallocate if necessary: -First-step intervention response monitoring after 4-8 weeks -Psychotherapy/SSRI response monitoring after 10-14 weeks -Insufficient response; switch to a more intensive treatment	72 38 77 41 33 66 2 25 32 46 30 69	QI assessment: % of eligible cases approved QD assessment: % of QIs approved within each QD Other measures: Provider interviews Quality framework: Not mentioned
Kramer et al. (2008) United States – CAMHS 11-18 years (n = 208)	QIs based on evidence -Literature review	-Suicide assessment -Substance use assessment -Family intervention -Antidepressant medication prescribed -Treatment duration at least eight outpatient sessions -Treatment including CBT component	77 71 80 51 41 73	QI assessment: % of eligible cases approved Quality variation: Chi-square tests for subgroup comparisons Other measures: Concordance between record-based and structured interview diagnoses Quality framework: Not mentioned
Zima et al. (2005) United States – CAMHS 6-16 years (n = 308) Depression subsample (supplementary material)	QIs derived by consensus -RAND Appropriateness Method -Modified Delphi method	Initial clinical assessment (first three months): -Child strength -Target symptom by primary caregiver -School-related symptoms report -Academic functioning -Psychosocial stressor -Mental health problem in family -Suicide risk factor -Aggressive behaviour symptom -Alcohol or illicit drug use -Medical problem Service sector linkage: -ADHD; parent-/teacher-reported symptom ratings -School contact or school records requested, when needed -Physical examination in past year or in assessment period Basic treatment principles: -Caregiver informed of child’s diagnosis and given information -Mental health visits at least monthly -If clinical information released, informed consent obtained -Family intervention or parent referral made Psychosocial treatment: -Recommended in first 90 days, if not receiving psychotherapy -ADHD or conduct disorder; behavioural therapy session -Depression; psychosocial treatment monthly for 6–9 months Medication referral: -ADHD/depression; evaluation/non-referral rationale Safety – patient protection: -Screen for history of physical/sexual abuse first 3 months -If current abuse suspected; report filed -Suicidal ideation in past 4 weeks; new assessment in 3 months -Risk of harm to self/others; more restrictive treatment setting Safety – informed medication decision: -Written consent from caregiver before starting medication -Caregiver informed of mandated medication information Safety – medication monitoring: -Monthly or more frequent monitoring during first 3 months -Monitoring conducted by provider licensed to prescribe Safety – medication-specific monitoring: -Status of target symptom noted each medication visit -Status of side effect noted each medication visit -Vital sign monitored/laboratory study ordered, as indicated	86 92 70 89 96 79 91 58 80 83 — 41 37 21 49 62 47 97 — 53 78 78 79 84 64 73 38 55 81 33 20 29	QI assessment: % of eligible cases approved QD assessment 1: % of cases meeting any QI within each QD QD assessment 2: % of cases meeting all QIs within each QD QD assessment 3: % with “probable acceptable care” (mean QI per QD ≥75th percentile) Data weighted for: Two-stage probability sampling; clinic response; adjustment for missing records Quality variation: Logistic regression – “probable acceptable care” per QD (dependent variable) and subgroup predictors (patient characteristics) Quality framework: Briefly mentioned, not utilised

Note. QI = quality indicator; QD = quality domain; MRR = medical record review; IOM = Institute of Medicine; WHO = World Health Organization; PHQ-A = patient health questionnaire – adolescent; CAMHS = child and adolescent mental health services; NICE = National Institute for Health and Care Excellence; SSRI = selective serotonin reuptake inhibitor; CBT = cognitive behavioural therapy; IPT = interpersonal therapy; ADHD = attention deficit hyperactivity disorder.

References

Bodden, D. H. M., Stikkelbroek, & Dirksen, C. D. (2018). Societal burden of adolescent depression, an overview and cost-of-illness study. Journal of Affective Disorders, 241, 256–262. https://doi.org/10.1016/j.jad.2018.06.015

Bourrée, Michel, & Salmi, L. R. (2008). Méthodes de consensus: revue des méthodes originales et de leurs grandes variantes utilisées en santé publique. Revue d’Epidemiologie et de Sante Publique, 56(6), 415–423. https://doi.org/10.1016/j.respe.2008.09.006

Campbell, McKenzie, J. E., Sowden, Katikireddi, S. V., Brennan, S. E., Ellis, Hartmann-Boyce, Ryan, Shepperd, Thomas, Welch, & Thomson (2020). Synthesis without meta-analysis (SWiM) in systematic reviews: Reporting guideline. BMJ, 368, l6890. https://doi.org/10.1136/bmj.l6890

Donabedian (2005). Evaluating the quality of medical care. 1966. The Milbank Quarterly, 83(4), 691–729. https://doi.org/10.1111/j.1468-0009.2005.00397.x

Ellis, L. A., Wiles, L. K., Selig, Churruca, Lingam, Long, J. C., Molloy, C. J., Arnolda, Ting, H. P., Hibbert, Dowton, S. B., & Braithwaite (2019). Assessing the quality of care for paediatric depression and anxiety in Australia: A population-based sample survey. Australian and New Zealand Journal of Psychiatry, 53(10), 1013–1025. https://doi.org/10.1177/0004867419866512

Ferrari, A. J. (2022). Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. The Lancet Psychiatry, 9(2), 137–150. https://doi.org/10.1016/S2215-0366(21)00395-3

Fisher, C. E., Spaeth-Rublee, Alan Pincus, & IIMHL Clinical Leaders Group. (2013). Developing mental health-care quality indicators: Toward a common framework. International Journal for Quality in Health Care, 25(1), 75–80. https://doi.org/10.1093/intqhc/mzs074

Garbutt, Dodd, Rook, Graham, Wang, Sterkel, & Plax (2022). Improving follow-up for adolescents with depression in primary care. Pediatrics, 149(6), e2021051107. https://doi.org/10.1542/peds.2021-051107

Gearing, R. E., Mian, I. A., Barber, & Ickowicz (2006). A methodology for conducting retrospective chart review research in child and adolescent psychiatry. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 15(3), 126–134.

Hermens, M. L., Oud, Sinnema, Nauta, M. H., Stikkelbroek, van Duin, & Wensing (2015). The multidisciplinary depression guideline for children and adolescents: An implementation study. European Child & Adolescent Psychiatry, 24(10), 1207–1218. https://doi.org/10.1007/s00787-014-0670-4

Hetrick, S. E., Thompson, Yuen, Finch, & Parker, A. G. (2012). Is there a gap between recommended and ‘real world’ practice in the management of depression in young people? A medical file audit of practice. BMC Health Services Research, 12, 178. https://doi.org/10.1186/1472-6963-12-178

Hickie, I. B., Scott, E. M., Cross, S. P., Iorfino, Davenport, T. A., Guastella, A. J., Naismith, S. L., Carpenter, J. S., Rohleder, Crouse, J. J., Hermens, D. F., Koethe, Markus Leweke, Tickell, A. M., Sawrikar, & Scott (2019). Right care, first time: A highly personalised and measurement-based care model to manage youth mental health. Medical Journal of Australia, 211(9), 3–46. https://doi.org/10.5694/mja2.50383

IOM. (2001). Crossing the quality chasm: A new health system for the 21st century. National Academies Press (US).

Iyer, S. P., Spaeth-Rublee, & Pincus, H. A. (2016). Challenges in the operationalization of mental health quality measures: An assessment of alternatives. Psychiatric Services, 67(10), 1057–1059. https://doi.org/10.1176/appi.ps.201600198

Kilbourne, A. M., Beck, Spaeth-Rublee, Ramanuj, O'Brien, R. W., Tomoyasu, & Pincus, H. A. (2018). Measuring and improving the quality of mental health care: A global perspective. World Psychiatry, 17(1), 30–38. https://doi.org/10.1002/wps.20482

Kramer, T. L., Miller, T. L., Phillips, S. D., & Robbins, J. M. (2008). Quality of mental health care for depressed adolescents. American Journal of Medical Quality, 23(2), 96–104. https://doi.org/10.1177/1062860607310919

Kruk, M. E., Gage, A. D., Arsenault, Jordan, Leslie, H. H., Roder-DeWan, Adeyi, Barker, Daelmans, Doubova, S. V., English, García-Elorrio, Guanais, Gureje, Hirschhorn, L. R., Jiang, Kelley, Lemango, E. T., Liljestrand, Twum-Danso, N. A. Y. (2018). High-quality health systems in the sustainable development goals era: Time for a revolution. Lancet Global Health, 6(11), e1196–e1252. https://doi.org/10.1016/s2214-109x(18)30386-3

18. Leslie, H. H., Hirschhorn, L. R., Marchant, Doubova, S. V., Gureje, Kruk, M. E. (2018). Health systems thinking: A new generation of research to improve healthcare quality. PLoS Medicine, 15(10), Article e1002682. https://doi.org/10.1371/journal.pmed.1002682

19. Lewandowski, R. E., Acri, M. C., Hoagwood, K. E., Olfson, Clarke, Gardner, Scholle, S. H., Byron, Kelleher, Pincus, H. A., Frank, Horwitz, S. M. (2013). Evidence for the management of adolescent depression. Pediatrics, 132(4), e996–e1009. https://doi.org/10.1542/peds.2013-0600

20. McGlynn (2010). Measuring clinical quality and appropriateness. In Smith, E. M., Papanicolas, Leatherman (Eds.), Performance measurement for health system improvement: Experiences, challenges and prospects. Cambridge University Press. https://doi.org/10.1017/CBO9780511711800.005

21. Moitra, Santomauro, Collins, P. Y., Vos, Whiteford, Saxena, Ferrari, A. J. (2022). The global gap in treatment coverage for major depressive disorder in 84 countries from 2000-2019: A systematic review and Bayesian meta-regression analysis. PLoS Medicine, 19(2), Article e1003901. https://doi.org/10.1371/journal.pmed.1003901

22. Moitra, Santomauro, Degenhardt, Collins, P. Y., Whiteford, Vos, Ferrari (2021). Estimating the risk of suicide associated with mental disorders: A systematic review and meta-regression analysis. Journal of Psychiatric Research, 137, 242–249. https://doi.org/10.1016/j.jpsychires.2021.02.053

23. Mora Ringle, V. A., Hickey, J. S., Jensen-Doss (2019). Patterns and predictors of compliance with utilization management guidelines supporting a state policy to improve the quality of youth mental health services. Children and Youth Services Review, 96, 194–203. https://doi.org/10.1016/j.childyouth.2018.11.035

24. NICE. (2013). Depression in children and young people. Quality standard [QS48]. https://www.nice.org.uk/guidance/qs48

25. NICE. (2019). Depression in children and young people: identification and management. NICE guideline [NG134]. National Institute for Health and Care Excellence.

26. Nilsson, Borgstedt-Risberg, Brunner, Nyberg, Nylén, Ålenius, Rutberg (2020). Adverse events in psychiatry: A national cohort study in Sweden with a unique psychiatric trigger tool. BMC Psychiatry, 20(1), 44. https://doi.org/10.1186/s12888-020-2447-2

27. Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, Hoffmann, T. C., Mulrow, C. D., Shamseer, Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, Glanville, Grimshaw, J. M., Hróbjartsson, Lalu, M. M., Loder, E. W., Mayo-Wilson, McDonald, Whiting (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Systematic Reviews, 10(1), 89. https://doi.org/10.1186/s13643-021-01626-4

28. Pile, Shammas, Smith (2020). Assessment and treatment of depression in children and young people in the United Kingdom: Comparison of access to services and provision at two time points. Clinical Child Psychology and Psychiatry, 25(1), 119–132. https://doi.org/10.1177/1359104519858112

29. Proctor, Silmere, Raghavan, Hovmand, Aarons, Bunger, Griffey, Hensley (2011). Outcomes for implementation research: Conceptual distinctions, measurement challenges, and research agenda. Administration and Policy in Mental Health, 38(2), 65–76. https://doi.org/10.1007/s10488-010-0319-7

30. Quinlan-Davidson, Roberts, K. J., Devakumar, Sawyer, S. M., Cortez, Kiss (2021). Evaluating quality in adolescent mental health services: A systematic review. BMJ Open, 11(5), Article e044929. https://doi.org/10.1136/bmjopen-2020-044929

31. Rethlefsen, M. L., Kirtley, Waffenschmidt, Ayala, A. P., Moher, Page, M. J., Koffel, J. B., Blunt, Brigham, Chang, Clark, Conway, Couban, de Kock, Farrah, Fehrmann, Foster, Fowler, S. A., Glanville, Group, P. S. (2021). PRISMA-S: An extension to the PRISMA statement for reporting literature searches in systematic reviews. Systematic Reviews, 10(1), 39. https://doi.org/10.1186/s13643-020-01542-z

32. Shorey, Ng, E. D., Wong, C. H. J. (2021). Global prevalence of depression and elevated depressive symptoms among adolescents: A systematic review and meta-analysis. British Journal of Clinical Psychology, 61(2), 287–305. https://doi.org/10.1111/bjc.12333

33. Skokauskas, Fung, Flaherty, L. T., von Klitzing, Pūras, Servili, Dua, Falissard, Vostanis, Moyano, M. B., Feldman, Clark, Boričević, Patton, Leventhal, Guerrero (2019). Shaping the future of child and adolescent psychiatry. Child and Adolescent Psychiatry and Mental Health, 13, 19. https://doi.org/10.1186/s13034-019-0279-y

34. Socialstyrelsen. (2021). Handbok i markörbaserad journalgranskning inom barn-och ungdomspsykiatri [Handbook for marker-based medical record review in child and adolescent psychiatry]. https://www.socialstyrelsen.se/globalassets/sharepoint-dokument/artikelkatalog/handbocker/2021-8-7468.pdf

35. Thapar, Eyre, Patel, Brent (2022). Depression in young people. Lancet, 400(10352), 617–631. https://doi.org/10.1016/s0140-6736(22)01012-1

36. Vassar, Holzmann (2013). The retrospective chart review: Important methodological considerations. Journal of Educational Evaluation for Health Professions, 10, 12. https://doi.org/10.3352/jeehp.2013.10.12

37. Walter, H. J., Albright, A. R., Bukstein, O. G., Diamond, Keable, Ripperger-Suhler, Rockhill (2023). Clinical Practice Guideline for the Assessment and Treatment of Children and Adolescents With Major and Persistent Depressive Disorders. J Am Acad Child Adolesc Psychiatry, 62(5), 479–502. https://doi.org/10.1016/j.jaac.2022.10.001

38. Wensing, Grol, Grimshaw (2020). Improving patient care: The implementation of change in health care. John Wiley & Sons.

39. WHO. (2006). Quality of care: A process for making strategic choices in health systems.

40. WHO. (2021). Comprehensive mental health action plan 2013–2030.

41. Wickersham, Westbrook, Colling, Downs, Govind, Kornblum, Lewis, Smith, Ford (2024). The patient journeys of children and adolescents with depression: A study of electronic health records. European Child & Adolescent Psychiatry, 33(4), 1093–1101. https://doi.org/10.1007/s00787-023-02232-6

42. Williams, N. J., Beidas, R. S. (2019). Annual research review: The state of implementation science in child psychology and psychiatry: A review and suggestions to advance the field. Journal of Child Psychology and Psychiatry, 60(4), 430–450. https://doi.org/10.1111/jcpp.12960

43. Worster, Haines (2004). Advanced statistics: Understanding medical record review (MRR) studies. Academic Emergency Medicine, 11(2), 187–192. https://doi.org/10.1111/j.1553-2712.2004.tb01433.x

44. Zima, B. T., Hurlburt, M. S., Knapp, Ladd, Tang, Duan, Wallace, Rosenblatt, Landsverk, Wells, K. B. (2005). Quality of publicly-funded outpatient specialty mental health care for common childhood psychiatric disorders in California. Journal of the American Academy of Child & Adolescent Psychiatry, 44(2), 130–144. https://doi.org/10.1097/00004583-200502000-00005

Medical Record Review Methodology for Assessing Child and Adolescent Depression Healthcare Quality: A Systematic Review and Thematic Synthesis
