Abstract

Objective: With the advent of routine outcome measurement across Australia and New Zealand, clinicians, managers, parents and children will be interested in change on these measures. This paper presents a number of approaches to assessing change and their implications.

Method: Health of the Nation Outcome Scales for Children and Adolescents (HoNOSCA) ratings collected during clinical practice for 911 patients were examined for changes over time, clinical significance, treatment status, effect size, and reliable and clinically significant change.

Results: Statistically significant changes in symptom severity were found related to treatment status and to changes in the number of clinically significant scales. An effect size of almost one standard deviation was noted and the proportion of patients who improved was examined. While the reliable change index was calculated, there are clinical complications with this approach. The impact of the capacity to change on specific scales illustrates a critical issue in describing outcomes.

Conclusion: From a number of perspectives, change in HoNOSCA total and scale scores is valid. However, several clinical dilemmas must be faced in deciding which approach should be used. The implications of these choices may affect clinicians, patients, carers and managers in understanding change.

Keywords

adolescent, child, outcomes, mental health services, routine measures

The development of Australian mental health services has been guided by the National Mental Health Strategy. This strategy included the objective ‘To institute regular review of client outcomes of services provided to persons with serious mental health problems and mental disorders as a central component of mental health service delivery’ [1, p. 45]. Political questions of ‘value for money’ were seen as impossible to address without ‘system level outcomes based on aggregated individual consumer data as well as broader indicators’ [2, p. 35].

Subsequently, all states and territories have implemented routine outcome measurement (ROM) across public mental health services [3]. While completion rates vary, over 140 000 Health of the Nation Outcome Scales for Children and Adolescents (HoNOSCA) records in child and adolescent mental health services (CAMHS) had been collected by mid-2007. These records contained over 28 000 pairs of HoNOSCA records (http://wdst.mhnocc.org/ cited 10/08/2009). While governments may examine change to inform decisions about value [4], clinicians may be interested in change to inform treatment decisions [5], and consumers to help them consider the relative merits of different approaches or services [6].

While Australia is one of the first nations to have adopted ROM, internationally an increase in published outcomes data related to CAMHS continues [7]. Routine collection across all CAMHS has occurred in Denmark (Bilenberg, personal communication, August 2005) and New Zealand [8]. Other states (e.g. Ohio, Nova Scotia) and service clusters (e.g. Norway, UK, California) have implemented ROM as part of clinical practice [9–12].

A core measure in Australia, New Zealand, Denmark, Norway and the UK is HoNOSCA, a brief clinician measure of the mental health symptoms and functioning of children and adolescents [13,14]. It comprises 15 scales of which 13 are used to compute the total score. The remaining two scales address problems with knowledge and understanding. Most studies do not report these two scales and it has been suggested the instrument be restricted to the 13 clinical scales and the total score [15].

Clinicians rate each scale on a 0–4 point rating from ‘no problems’ to ‘severe problems’. When HoNOSCA is measured on two or more occasions, the difference score is considered a measure of change without assuming the cause of any such change. The instrument has been found generally to have moderate to good inter-rater reliability [13,14,16] and to be valid [16–19].

Published research has typically reported difference scores, or percentage reduction, in mean total, individual scale or section scores over time as an indicator of change [13,19–23]. Others have described change through the percentage of patients whose difference in total score increased, decreased or remained unaltered [11,24].

Investigations of change have been accompanied by debate about the clinical meaning of any observed change. ‘Clinical significance’ is often contrasted with ‘statistical significance’ and refers to the extent to which change is clinically meaningful [25]. Jacobson and Truax [25] had two components to their method for assessing clinically significant change. First, observed post-treatment means should be more likely to be derived from a functional than dysfunctional population (clinical significance). Secondly, the observed change should be sufficiently large that it is unlikely to have arisen from the imprecision inherent in the instrument (reliable change). As with HoNOS, the adult counterpart of HoNOSCA, ‘the lack of any normative data in non-clinical populations’ [26, p. 720], currently restricts the possibility of determining a cut-off between a functional and dysfunctional population using HoNOSCA scores. Publications that only report statistical significance are often criticized for failing to address the issue of clinical significance: typically whether a patient shifts from a dysfunctional to a functional population [27]. While the method proposed to assess clinical significance has been criticised [27], e.g. some patients will not be ‘cured’ and some measures have no ‘functional’ population norms, the concept remains influential.

Given that ROM has been implemented and that stakeholders may have differing interests and interpretations of outcomes, this study aims to explicate different approaches to change and consider their limitations. This paper will analyse change in HoNOSCA scores at a real world clinical service, Eastern Health CAMHS (EHCAMHS), from a number of perspectives, and discuss the implications. EHCAMHS services a metropolitan and fringe rural area of over 800 000 people, offering community-based treatments for both children and adolescents, and an adolescent day programme and inpatient unit. The service has been guided by the learning organization model and this has supported ROM [28].

Method

All referrals had HoNOSCA completed by the case-managing clinician following assessment, again at six-monthly reviews and at discharge. An 18-month sample revealed a completion rate of 76%.

During the study period, a total of 911 patients had two or more HoNOSCA records completed. The mean age was 11.5 and 60% were male. The most frequent rating pair was assessment-discharge (43.9%), followed by assessment-review (34.2%), first review-second review (11.3%), and first review-discharge (6.6%).

A total of 11% of patients had no diagnosis recorded. Of the remainder, 46.3% and 33.2% had one or two diagnoses recorded respectively. The most frequent was disruptive behaviour disorders (18.3%) with mood, anxiety and adjustment disorders present in more than 10%. More severe presentations were less frequent: e.g. eating (2.5%), pervasive developmental (2.4%), personality (2%) and psychotic disorders (1.9%). All diagnoses were coded according to the DSM-IV groupings [29] with the exception that attention deficit and disruptive behaviour disorders were not merged, and separation anxiety was grouped with anxiety disorders. Duplicate diagnostic groups were removed and a heuristic guide was used incorporating severity and CAMHS frequency to identify the key diagnosis (Table 1).

Table 1. Heuristic hierarchy for determining diagnostic order

Descending order for recording key diagnosis:

1. Personality disorder
2. Schizophrenia
3. Pervasive developmental disorder
4. Eating disorders
5. Mood disorders
6. Mental retardation
7. Substance related disorders
8. Anxiety disorders
9. Disruptive behaviour disorders
10. Impulse control disorders
11. Sexual and gender identity disorders
12. Attention deficit hyperactivity disorders
13. Somatoform disorders
14. Other disorders of infancy, childhood or adolescence
15. Mental disorders due to general medical conditions not elsewhere classified
16. Tic disorders
17. Elimination disorders
18. Feeding and eating disorders of infancy or early childhood
19. Communication disorders
20. Learning disorders
21. Adjustment disorders
22. Sleep disorders
23. Delirium, dementia, and amnestic and other cognitive disorders
24. Motor skills disorders
25. Psychological factors affecting medical condition
26. Problems related to abuse or neglect
27. Relational problems
28. Additional conditions that may be a focus of clinical attention

Missing data were excluded from analyses of individual scales and from calculations of the total score in line with the national protocol [30]. This is equivalent to treating missing data as zero when calculating the total score. All data were screened for errors using SPSS version 11.0. While there were no univariate outliers [31], a small proportion of multivariate outliers were noted. These outliers were clinically understandable being either elevated ratings in low prevalence symptom areas (e.g. substance misuse) or reduced ratings in common areas (e.g. family relationships). The threat to the integrity of the analyses arising from the unequal cell sizes that frequently occur in clinical research was minimized through conservative Bonferroni corrections [31] although this may increase the risk of missing an effect.
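The handling of missing ratings described above can be sketched in a few lines. This is a minimal illustration, assuming each record is a list of 13 clinical scale ratings with `None` marking a missing rating; the function name `honosca_total` and the example record are hypothetical and are not taken from the national protocol.

```python
from typing import Optional, Sequence


def honosca_total(ratings: Sequence[Optional[int]]) -> int:
    """Sum the 13 clinical scale ratings, skipping missing (None) values."""
    if len(ratings) != 13:
        raise ValueError("expected ratings for the 13 clinical scales")
    total = 0
    for rating in ratings:
        if rating is None:
            continue  # excluded from the sum, which is the same as adding zero
        if not 0 <= rating <= 4:
            raise ValueError(f"rating out of the 0-4 range: {rating}")
        total += rating
    return total


# Hypothetical record with two missing scale ratings.
example = [2, 1, None, 0, 3, 2, 0, 0, 1, 2, None, 3, 1]
print(honosca_total(example))  # 15
```

Because a skipped rating contributes nothing, a record with missing scales can only understate the total, which is worth keeping in mind when comparing totals across occasions.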

Analyses of change in the mean total and scale scores were conducted with analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA) respectively. Tests of statistical significance are often complemented by examinations of effect size. This can be established in a number of ways though there is little agreement about which method is most suited to which situation [32]. Two of the most common classes are those that belong to the family of correlations (‘r’ family) and those that express effect as a function of standard deviation units (‘d’ family) [32]. Weinfurt [33] describes the use of eta-squared as an index of the strength of an effect that is approximately comparable to r². Johnson [34] describes the calculation of ‘g’ as a standardized index of effect size that expresses the difference between the comparisons of interest in units of standard deviations.

The reliable change index was calculated in accordance with Parabiaghi et al. [26]. Those authors use Cronbach's α as the estimate of reliability (r), however in line with other authors, this analysis will use test–retest reliability as the appropriate parameter [25,27]. It should be noted that the test–retest reliability estimate was quite conservative, being obtained over a five-month period [35].

An examination of change in scales follows using the construct of ‘clinical significance’. While HoNOSCA has no norms for a ‘non-clinical’ population, the glossary indicates that a rating of two or more indicates a clinically significant symptom while a rating of zero or one indicates no clinical problem. ‘Clinically significant’ indicates any symptom the clinician considers worthy of clinical attention, such as further assessment, treatment, monitoring or documentation. Each scale at time 1 and 2 can be divided into a dichotomy of clinically significant/not significant, providing four change categories: ‘improved’; ‘deteriorated’; ‘problem, no change’; and, ‘no problem, no change’.
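The four change categories follow directly from the threshold of two. A minimal sketch, assuming integer ratings from 0 to 4 on a single scale at two occasions; the function names are illustrative only.

```python
def is_clinically_significant(rating: int) -> bool:
    """A rating of two or more counts as a clinically significant symptom."""
    return rating >= 2


def change_category(time1: int, time2: int) -> str:
    """Classify one scale's pair of ratings into one of the four categories."""
    before = is_clinically_significant(time1)
    after = is_clinically_significant(time2)
    if before and not after:
        return "improved"
    if not before and after:
        return "deteriorated"
    if before and after:
        return "problem, no change"
    return "no problem, no change"


print(change_category(3, 1))  # improved
print(change_category(4, 2))  # problem, no change
```

Note that a shift from 4 to 2 stays in ‘problem, no change’ under this scheme, because both ratings sit at or above the threshold.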

Results

Change in the total score

Repeated-measures ANOVA with time as the independent variable and total score as the dependent variable revealed a significant decrease from time 1 to time 2 (M₁ = 12.73, SD = 6.02 and M₂ = 9.46, SD = 6.38; F = 292.76, df = 1, 842, p < 0.001). Decreased severity was found for all four rating pairs except the review-review pair. The total score decreased from 12.4 at assessment to 7.76 at discharge (F = 294.25, df = 1, 839, p < 0.001), representing a drop in severity of 37%. The review-discharge pair decreased from 12.98 to 9.72 (F = 21.74, df = 1, 839, p < 0.001). The assessment–review pair had a smaller decrease from a mean of 13.4 to 10.72 (F = 76.23, df = 1, 839, p < 0.001). The review-ongoing review pair, being for patients who have been in treatment for at least six months and who remain in treatment, had a slight though non-significant increase in symptom severity from 11.78 to 12.11.
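The 37% figure is the drop in the mean total score expressed as a percentage of the assessment mean. A minimal sketch of that arithmetic, using the means reported above; the function name is illustrative.

```python
def percent_reduction(mean_time1: float, mean_time2: float) -> float:
    """Decrease in the mean score as a percentage of the time 1 mean."""
    return 100.0 * (mean_time1 - mean_time2) / mean_time1


# Assessment-discharge pair reported in the text.
print(round(percent_reduction(12.4, 7.76)))  # 37
```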

Change in the total score as an effect size

The sub-sample used to calculate effect size was patients who had completed treatment. Again the decrease in mean scores from 12.4 (SD = 5.96) to 7.76 (SD = 5.82) indicated a statistically significant decrease in symptom severity (F = 279.87, df = 1, 385, p < 0.001), with a partial eta-squared of 0.42. This effect size is best described as ‘large’ using Cohen's criteria for r² [33] although effect sizes of this magnitude have also been described as medium [36]. Criteria from Kraemer et al. [32] lead to a similar conclusion to Weinfurt [33], with the conversion of r² back to r denoting a ‘large’ to ‘larger than typical’ effect.
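Partial eta-squared can be recovered from a reported F value and its degrees of freedom using the standard identity F·df_effect / (F·df_effect + df_error). The sketch below applies that identity to the values reported above; it illustrates the arithmetic and is not a description of the software output the authors used.

```python
import math


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


eta_sq = partial_eta_squared(279.87, 1, 385)
print(round(eta_sq, 2))             # 0.42
print(round(math.sqrt(eta_sq), 2))  # 0.65, the value when converted back to r
```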

As correlations between means increase in repeated-measures designs, F values are less useful than means and standard deviations when calculating ‘g’ [37]. The correlation between the assessment and discharge HoNOSCA total scores was 0.57. Calculation of ‘g’ produces an estimate of the effect size as 0.79 (95% confidence interval 0.58 to 0.99) [34] which has the same descriptive label as the obtained r, i.e. a ‘large’ to ‘larger than typical’ effect [32]. The size of ‘g’ suggests completed treatment at CAMHS produces an average decrease of almost one standard deviation in symptom severity.
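The reported ‘g’ can be approximated from the means and standard deviations alone. The sketch below assumes the standardizer is the pooled standard deviation of the two occasions with equal weighting, which is one common choice; the exact computation in DSTAT [34], including any small-sample correction and the confidence interval, is not reproduced here.

```python
import math


def standardized_mean_difference(mean1: float, sd1: float,
                                 mean2: float, sd2: float) -> float:
    """Mean difference divided by the equally weighted pooled SD (assumed)."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


g = standardized_mean_difference(12.4, 5.96, 7.76, 5.82)
print(round(g, 2))  # 0.79
```

Working from means and standard deviations sidesteps the inflation that arises when a repeated-measures F value is converted directly to an effect size, which is the concern raised in [37].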

Change in scales

Repeated-measures MANOVA, where time [1,2] was the independent within-subject variable and the ratings on each of the 13 scales were the dependent variables, revealed significant effects due to time (F = 23.73, df = 13, 692, p < 0.001).

Subsequent univariate analyses revealed all scales with the exception of Abnormal perceptions showed significant decreases. Emotional, Family relationships and Disruptive showed the largest reduction in severity (Figure 1). This apparent lack of change in Abnormal perceptions will be explored subsequently.

Figure 1. Mean HoNOSCA scale scores at times 1 and 2.

Change in individual scales by treatment status

MANOVA was used with treatment status (completed or continuing treatment) as the between-subjects and time [1,2] as the within-subjects independent variable. The scale scores were the dependent variables. Scores reduced significantly over time (F = 5.76, df = 13, 382, p < 0.001) and differed between continuing and completed treatments (F = 2.69, df = 13, 382, p = 0.001). There was a significant interaction between time and treatment status (F = 5.48, df = 13, 382, p < 0.001) with post-hoc analysis of change between time 1 and time 2 using MANOVA, indicating that significant changes only occurred for those with completed treatment episodes (F = 21.56, df = 13, 382, p < 0.001). As the subsequent question of interest is which scales differed over time by treatment status, post-hoc univariate analyses with Bonferroni adjusted pairwise comparisons were conducted with an overall type I error rate of 0.05. Every HoNOSCA scale except Substance misuse showed significant change for those who had completed treatment. No scale showed change over time for those patients continuing in treatment.
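A Bonferroni adjustment holds the overall type I error rate at the chosen level by dividing it across the comparisons made. The sketch below assumes one comparison per scale supplied; the scale names are from HoNOSCA but the p-values are hypothetical and are not results from this study.

```python
from typing import Dict


def bonferroni_significant(p_values: Dict[str, float],
                           family_alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each comparison as significant at family_alpha / number of comparisons."""
    threshold = family_alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical p-values for three scales, for illustration only.
example_p = {"Disruptive": 0.0004, "Self-harm": 0.012, "Substance misuse": 0.40}
print(bonferroni_significant(example_p))
```

With 13 scales the per-comparison threshold would be 0.05 / 13, roughly 0.0038, which shows why the correction is described as conservative.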

The clinical significance of change in HoNOSCA scores

Clinical significance and the total score

While each scale has a criterion for clinical significance, there is no equivalent for the total score. However, it is possible to examine change in the number of clinically significant scales between assessment and discharge collection occasions. ANOVA revealed a decrease in the number of clinically significant scales from a mean of 3.82 (SD = 1.96) at assessment to 2.12 (SD = 2.17) at discharge (F = 189.11, df = 1, 304, p < 0.001). At discharge, patients have fewer areas in which there are clinically significant symptoms compared to assessment (Figure 2).
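Counting clinically significant scales per occasion is a simple tally against the threshold of two. A minimal sketch, assuming each occasion is a list of scale ratings with `None` for a missing rating; the two example records are hypothetical.

```python
from typing import Optional, Sequence


def count_significant_scales(ratings: Sequence[Optional[int]]) -> int:
    """Number of scales rated two or more; missing ratings are not counted."""
    return sum(1 for r in ratings if r is not None and r >= 2)


# Hypothetical ratings for one patient at assessment and at discharge.
assessment = [3, 2, 0, 0, 2, 1, 0, 0, 3, 1, 2, 0, 1]
discharge = [1, 2, 0, 0, 1, 1, 0, 0, 2, 0, 1, 0, 0]
print(count_significant_scales(assessment))  # 5
print(count_significant_scales(discharge))   # 2
```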

Figure 2. Number of clinically significant scales at assessment and discharge.

The reliable change index and the total score

The reliable change index (RCI) for the HoNOSCA total score can be calculated with a 95% confidence interval as [26]:
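In the standard Jacobson and Truax [25] formulation, assumed here to be the form intended, the index is 1.96 times the standard error of the difference between two scores. The symbols are notation chosen for this sketch rather than taken from the source: SD is the standard deviation of the total score (typically at baseline) and r is the reliability estimate, which in this analysis is test–retest reliability.

```latex
\mathrm{SE}_{\mathrm{diff}} = SD \, \sqrt{2} \, \sqrt{1 - r},
\qquad
\mathrm{RCI} = 1.96 \times \mathrm{SE}_{\mathrm{diff}}
```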

The obtained RCI for the total score in the current sample was 7.4; 1% of the sample would be considered to have reliably deteriorated. Reliable improvement occurred for 28.5% and the status of the remaining 70% would be classified as uncertain.
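Applying a reliable change threshold to individual patients can be sketched as follows. The sketch assumes the standard Jacobson and Truax form of the index; the reliability value of 0.8 is hypothetical and chosen only for illustration, since the test–retest estimate used in the study is not stated here. The function names and the choice to treat a change exactly equal to the threshold as reliable are also assumptions.

```python
import math


def reliable_change_index(sd: float, reliability: float, z: float = 1.96) -> float:
    """z times the standard error of the difference between two scores."""
    return z * sd * math.sqrt(2) * math.sqrt(1 - reliability)


def classify_change(total_time1: float, total_time2: float, rci: float) -> str:
    """Label a pair of total scores; lower scores mean less severe symptoms."""
    decrease = total_time1 - total_time2
    if decrease >= rci:
        return "reliable improvement"
    if decrease <= -rci:
        return "reliable deterioration"
    return "uncertain"


# Hypothetical reliability of 0.8 with the time 1 SD of 6.02 from the text.
print(round(reliable_change_index(6.02, 0.8), 2))
# Threshold of 7.4 as obtained in the study.
print(classify_change(12, 4, 7.4))   # reliable improvement
print(classify_change(10, 8, 7.4))   # uncertain
```

A threshold of this size means a patient starting below about seven points cannot register reliable improvement at all, which is one reason so many cases fall into the uncertain group.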

Who is included in estimates of change?

Previously it was noted that the mean scores for all scales decreased significantly from time 1 to time 2 with the exception of Abnormal perceptions. With completed treatment, Substance misuse did not appear to change. This is not solely a function of which collection occasions are included: the capacity of clients to change is important. The following analysis uses the clinically grounded dichotomy where ratings of two or greater indicate clinical significance (‘problem’) and ratings of zero or one indicate a lack of clinical significance (‘no problem’) on that scale.

Using this dichotomy with Abnormal perceptions as the example, only 3% of the sample improved from clinically significant symptoms at assessment to clinically insignificant symptoms at discharge. However, when only those who had or developed a problem are included, 44% improved. Including only those who started with a problem resulted in 57% of the sample improving to the extent of having no clinically significant symptoms. Including those for whom improvement is impossible (i.e. they neither had nor developed a problem in this symptom area) underestimates the effectiveness of the treatment or intervention for those who have a problem in this area (Figure 3). Figure 3 also illustrates this issue for Disruptive, Self-harm, Substance misuse and Emotional symptoms.
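The effect of the denominator can be made concrete with a small sketch. The same count of improved patients is divided by three different groups: all cases, those who had or developed a problem, and those who started with a problem. The ratings below are hypothetical and chosen only to show the pattern; they are not the study data.

```python
from typing import Dict, List, Tuple


def improvement_rates(pairs: List[Tuple[int, int]]) -> Dict[str, float]:
    """Percentage improved on one scale under three choices of denominator."""
    improved = sum(1 for t1, t2 in pairs if t1 >= 2 and t2 < 2)
    had_or_developed = sum(1 for t1, t2 in pairs if t1 >= 2 or t2 >= 2)
    had_problem = sum(1 for t1, _ in pairs if t1 >= 2)

    def pct(denominator: int) -> float:
        return 100.0 * improved / denominator if denominator else float("nan")

    return {
        "all cases": pct(len(pairs)),
        "had or developed a problem": pct(had_or_developed),
        "had a problem": pct(had_problem),
    }


# Hypothetical sample of 100 patients on a low-prevalence scale.
sample = [(3, 1)] * 4 + [(3, 3)] * 2 + [(0, 2)] * 1 + [(0, 0)] * 93
rates = improvement_rates(sample)
print({name: round(value, 1) for name, value in rates.items()})
```

In this hypothetical sample the improvement rate rises from 4% of all cases to about 57% of those who had or developed a problem and about 67% of those who started with one, which mirrors the direction of the shift reported for Abnormal perceptions.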

Figure 3. Percentage of patients improving from clinically significant symptoms for all cases, those who had or developed a problem, and those who had a problem.

Discussion

Irrespective of the method of evaluation, HoNOSCA is sensitive to change in clinical populations. There are statistically significant decreases in total score severity, decreases in the number of clinically significant scales, an effect size approaching one standard deviation and changes within individual scales. This sensitivity to change will be important to funding bodies seeking to ensure limited resources are directed efficaciously.

Using the RCI [25] appears to account for instruments’ reliability; however, there are important caveats. It is typically premised on higher confidence levels (95%) than characterizes clinical decision making. It has the usual statistical and logical problem where no reliable change occurs from assessment to review, from review to review, and from review to discharge, yet reliable change occurs from assessment to discharge. The clinical difficulty is the impact of feedback to patient, parent or clinician indicating there is not yet any reliable change. The suggestion that between assessment and review there has been no reliable change at the 95% confidence interval may be both accurate and simultaneously undermine everyone's motivation for continuing with treatment. It is prudent to account for the level of reliability of the instrument; however, privileging this approach to change is fraught with therapeutic risks. Intriguingly, the obtained RCI of seven from assessment to discharge is similar to the change required for clinicians to assess patients as ‘much better’ [16]. As HoNOSCA continues to appear reliable, valid and feasible enough for routine use, the calculation of reliable change across different parameters and conversion of this information to a form usable by consumers is an important challenge.

Wolpert et al. [24] used an RCI but noted the lack of an agreed method for assessing change that is ‘clinically significant’. The major challenge will be to establish which ‘endpoints’ indicate ‘wellness’ and are most useful to practitioners and patients [38]. As an example, children seen in general practice have been rated with HoNOSCA irrespective of whether they were referred to CAMHS [39]. Although that study was small, children seen by a GP and not referred on had lower mean total scores than CAMHS referrals. Conceptually that study obtained a distribution frequency more indicative of the community than of a clinical population.

Funders are likely to seek a single indicator. Reduction in total scores indicates a drop in symptom severity that is relatively easy to understand although there is no absolute reference point indicating the clinical meaning of any particular total score. Both funders and parents may find difficulties interpreting the meaning of an ‘average decrease of x points’. It may be useful for services to document changes as a percentage of the assessment score. Expressing change as a percentage reduction in symptoms or as changes in percentile rankings may be easier to interpret [36, Table 1. 1.4.1] and enhance comparisons with similar services [40]. Improvement in the number of clinically significant scales is an important approach as it reflects both a qualitative improvement in symptomatology and contextualizes the statistically significant shift in the total score. For example, while only 2.4% had no clinically significant scales at time 1, almost 30% had no clinically significant scales at time 2.

However, all approaches that exclusively use the total score risk ignoring important changes on individual scales. For example, of those for whom there was no reliable change in the total score, almost one third actually improved in non-accidental self-harm from clinically significant to clinically insignificant symptoms.

Typically, all patients with valid data are included in analyses of change. Routine outcome measures are designed for real-world clinical services where the impact of statements such as, ‘there is no significant change in Hallucinations, Delusions or Abnormal perception symptoms for those attending this service’ is likely to be demoralizing for patients and clinicians alike. While that statement is accurate when all are included, it needs to be accompanied by the equally accurate observation that ‘of those who come here with problems in Hallucinations, Delusions or Abnormal perceptions, 57% will not have a clinically significant problem at discharge’. Including patients with no capacity for change in a specific problem area or those who have not finished treatment will underestimate the benefit received by those who have problems.

Excluding the ‘no problem, no change’ group neither minimizes the importance of the overall comparison nor artificially inflates the estimated effect size: It is aimed at answering a parent's question ‘Can you help my child with his or her problem?’ HoNOSCA appears to be able to contribute to answering these questions but the answer requires selecting the patients relevant to the analysis in question. Compared with funders, parents may be more interested in problem areas relevant to their situation. Use of the appropriate sample is also relevant to parametric analyses. Analyses of mean scores will be affected by the size of the ‘no problem, no change’ group. As the ‘no problem, no change’ group will vary by scale, there can be no global exclusion of a particular group of patients from analysis of change.

For those who arrived with a clinically significant problem, substantial proportions left EHCAMHS without clinically significant symptoms. Kiser et al. [41] in a study of partial hospitalization concluded, ‘One year after treatment in our program, your child or adolescent will be having fewer problems overall, will most likely be making better grades in school, having fewer conduct problems, and will be a better friend’ [41, p. 88]. It would be useful for CAMHS to be transparent about expected outcomes. This information may help patients consider the relative merits and costs of different approaches (e.g. day programmes or outpatient appointments). Prior to implementing HoNOSCA, EHCAMHS could not have told parents that 49% of children with emotional difficulties or 80% of those with self-injury improve to the point of their symptoms becoming clinically insignificant from the clinician's perspective.

The approach of dividing HoNOSCA scale scores into clinically significant or insignificant categories may be more robust but it does risk obscuring other changes. A change in rating from severe disruptive problems to only mild problems is important yet will not occasion ‘improved’ status.

A limitation of the obtained effect size was the long test-retest period used to estimate reliability. Some authors consider one week to be optimal [42] for test-retest although the longer period may increase confidence that any effect is not overestimated. Importantly, while there is evidence of change, there was no control group and alternative interventions could have produced equal effect sizes [43]. Weisz and Jensen described effectiveness as relating to ‘evidence that a treatment has beneficial effects when delivered to heterogeneous samples of clinically referred individuals treated in clinical settings by clinicians rather than research therapists’ [44, p. 125]. While child psychotherapy studies have a mean effect size of 0.77, Weisz and Jensen noted that restricting estimates to actual effectiveness studies produced a much lower effect size (0.01). The effect size in the current study was much larger at 0.79, and comparable to the effect size of 0.88 on HoNOSCA scores recently reported from the first round of the Australian national outcomes dataset [36]. In any analysis where random allocation to contrasting and equivalent groups is not possible, caution in interpretation is essential [45]. Simply put, it is not possible to conclude that this, or any Australian CAMHS produced greater effects with real patients and real clinicians than the published clinical effectiveness studies. With mounting support for HoNOSCA's validity, future clinical research will be better placed to examine alternative approaches or compare groups in routine clinical settings using HoNOSCA.

In conclusion, routine outcome measurement is an important yet recent development in CAMHS. Whether using categorical or reliable change approaches, total scores or individual scales, services need to carefully consider the impact of outcomes information on parents, patients and clinicians. Selection of the correct denominator for assessing change is crucial. The impact of feedback on engagement and motivation is likely to be an important area for research. It may be premature to settle on one index of change. All have strengths and flaws. It is only through ongoing use and comparison of different approaches that we can truly understand what questions they may be addressing.

Footnotes

Acknowledgements

Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

References

1. Commonwealth Department of Health and Ageing. National mental health report 2002: Seventh report. Changes in Australia's mental health services under the first two years of the second national mental health plan 1998–2000. Canberra: Commonwealth of Australia; 2002.

2. Department of Health and Aged Care. Mental health information development: National information priorities and strategies under the second national mental health plan 1998–2003. Canberra: Commonwealth of Australia; 1999.

3. Callaly, Hyland, Coombs, Trauer. Routine outcome measurement in public mental health: results of a clinician survey. Aust Health Rev 2006; 30:164–173.

4. Hunter, Higginson, Garralda. Systematic literature review: Outcomes measures for child and adolescent mental health services. J Public Health Med 1996; 18:197–206.

5. Huxley. Outcomes management in mental health: a brief review. J Mental Health 1998; 7:273–283.

6. Rissel, Holt, Ward. Applying a health outcomes approach in a health service unit. Aust Health Rev 1998; 21:168–181.

7. Cottrell, Kraam. Growing up? A history of CAMHS (1987–2005). Child Adolesc Ment Health 2005; 10:111–117.

8. Chipps, Stewart, Humberstone. NZ mental health standard measures of assessment and recovery (MH-SMART) initiative. Information collection protocol. Auckland: The National Centre of Mental Health Research and Workforce Development. Te Pou o Te Whakaaro Nui; 2006.

9. Pirkis, Burgess, Coombs, Clarke, Jones-Ellis, Dickson. Routine measurement of outcomes in Australia's public sector mental health services. Aust N Z Health Policy 2005; 2:8.

10. Johnston, Gowers. Routine outcome measurement: a survey of UK child and adolescent mental health services. Child Adolesc Ment Health 2005; 10:133–139.

11. Kisely, Campbell, Crossman, Gleich, Campbell. Are the Health of the Nation Outcome Scales a valid and practical instrument to measure outcomes in North America? A three-site evaluation across Nova Scotia. Community Ment Health J 2007; 43:91–107.

12. Hanssen-Bauer, Aalen, Ruud, Heyerdahl. Inter-rater reliability of clinician-rated outcome measures in child and adolescent mental health services. Adm Policy Ment Health 2007; 34:504–512.

13. Gowers, Harrington, Whitton. Brief scale for measuring the outcomes of emotional and behavioural disorders in children. Health of the Nation Outcome Scales for Children and Adolescents (HoNOSCA). Br J Psychiatry 1999; 174:413–416.

14. Hanssen-Bauer, Gowers, Aalen. Cross-national reliability of clinician-rated outcome measures in child and adolescent mental health services. Adm Policy Ment Health 2007; 34:513–518.

15. Garralda, Yates. HoNOSCA: uses and limitations. Child Psychol Psychiatry Rev 2000; 5:131–132.

16. Brann, Coleman, Luk. Routine outcome measurement in a child and adolescent mental health service: an evaluation of HoNOSCA. Aust N Z J Psychiatry 2001; 35:370–376.

17. Garralda, Yates, Higginson. Child and adolescent mental health service use HoNOSCA as an outcome measure. Br J Psychiatry 2000; 177:52–58.

18. Bilenberg. Health of the Nation Outcome Scales for Children and Adolescents (HoNOSCA) - Results of a Danish field trial. Eur Child Adolesc Psychiatry 2003; 12:298–302.

19. Manderson, McCune. The use of HoNOSCA in a child and adolescent mental health service. Irish J Psychol Med 2003; 20:52–55.

20. Green, Kroll, Imrie. Health gain and outcome predictors during inpatient and related day treatment in child and adolescent psychiatry. J Am Acad Child Adolesc Psychiatry 2001; 40:325–332.

21. Kerfoot, Harrington, Rogers, Verduyn. A step too far? Randomized trial of cognitive-behaviour therapy delivered by social workers to depressed adolescents. Eur Child Adolesc Psychiatry 2004; 13:92–99.

22. Gowers, Smyth. The impact of a motivational assessment interview on initial response to treatment in adolescent anorexia nervosa. Eur Eat Disord Rev 2004; 12:87–93.

23. Harnett, Loxton, Sadler, Hides, Baldwin. The Health of the Nation Outcome Scales for Children and Adolescents in an adolescent in-patient sample. Aust N Z J Psychiatry 2005; 39:129–135.

24. Wolpert, Garralda, Baruch. Supporting documentation: emerging findings of outcomes subgroup of child and adolescent mental health external working group contributing to the development of the children's national service framework (NSF). London: Department of Health, 2003.

25. Jacobson, Truax. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991; 59:12–19.

26. Parabiaghi, Barbato, D'Avanzo, Erlicher, Lora. Assessing reliable and clinically significant change on Health of the Nation Outcome Scales: method for displaying longitudinal data. Aust N Z J Psychiatry 2005; 39:719–725.

27. Wise. Methods for analysing psychotherapy outcomes: a review of clinical significance, reliable change, and recommendations for future directions. J Pers Assess 2004; 82:50–59.

28. Birleson, Brann. Reviewing the learning organisation model in a child and adolescent mental health service. Aust Health Rev 2006; 30:181–194.

29. American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-IV-TR, text revision fourth edition. Washington DC: American Psychiatric Association, 2000.

30. Department of Health and Ageing. Mental Health National Outcomes and Casemix Collection: overview of clinician-rated and consumer self-report measures. Canberra: Department of Health and Ageing, 2003.

31. Tabachnick, Fidell. Using multivariate statistics, third edition. New York: Harper Collins College Publishers, 1996.

32. Kraemer, Morgan, Leech, Gliner, Vaske, Harmon. Measures of clinical significance. J Am Acad Child Adolesc Psychiatry 2003; 42:1524–1529.

33. Weinfurt. Multivariate analysis of variance. In: Grimm, Yarnold, eds. Reading and understanding multivariate statistics. Washington, DC: American Psychological Association, 1998.

34. Johnson. DSTAT: software for the meta-analytic review of research literatures. New Jersey: Lawrence Erlbaum Associates, 1989.

35. Brann. Routine outcome measurement in child adolescent mental health services: HoNOSCA: reliable enough, valid enough and feasible enough? PhD thesis, Monash University, Victoria, Australia, 2006.

36. Australian Mental Health and Outcomes Classification Network - AMHOCN. Child and adolescent national outcomes and casemix collection standard reports, version 1.1. Brisbane, Queensland: Government of South Australia, 2005.

37. Dunlap, Cortina, Vaslow, Burke. Meta-analysis of experiments with matched groups or repeated measures designs. Psychol Methods 1996; 1:170–177.

38. Moyé. Multiple analyses in clinical trials: fundamentals for investigators. New York: Springer-Verlag; 2003.

39. Luk, Brann, Sutherland, Mildred, Birleson. Training general practitioners in the assessment of childhood mental health problems. Clin Child Psychology Psychiatry 2002; 7:571–579.

40. Hodges, Wong, Latessa. Use of the child and adolescent functional assessment scale (CAFAS) as an outcome measure in clinical settings. J Behav Health Serv Res 1998; 25:325–336.

41. Kiser, Millsap, Hickerson. Results of treatment one year later: child and adolescent partial hospitalization. J Am Acad Child Adolesc Psychiatry 1996; 35:81–90.

42. Lambert. Use of psychological tests for assessing treatment outcome. In: Maruish, ed. The use of psychological testing for treatment planning and outcomes assessment, second edition. New Jersey: Lawrence Erlbaum Associates, 1999:115–152.

43. Weiss, Catron, Harris, Phung. The effectiveness of traditional child psychotherapy. J Consult Clin Psychol 1999; 67:82–94.

44. Weisz, Jensen. Efficacy and effectiveness of child and adolescent psychotherapy and pharmacotherapy. Ment Health Serv Res 1999:125–157.

45. Gliner, Morgan, Harmon. Pretest-posttest comparison group designs: Analysis and interpretation. J Am Acad Child Adolesc Psychiatry 2003; 42:500–503.

On the Meaning of Change in a Clinician's Routine Measure of Outcome: HoNOSCA
