Abstract
Several large-scale studies conducted over the last two decades have reported high prevalence rates for anxiety and mood disorders, both in the general community and in specialist units around the globe. In the first wave of the Epidemiologic Catchment Area (ECA) study conducted in the USA, 14.6% and 8.3% of adults were reported as suffering from anxiety disorders and mood disorders, respectively, at some point in their lifetime [1]. In Germany, the Upper Bavarian Study (UBS) found the 6-month prevalence rates in adults to be 1.6% for anxiety disorders and 6.8% for affective disorders [2]. In New Zealand, Wells et al. [3] reported lifetime prevalence figures of 10.5% for anxiety disorders and 14% for affective disorders in 18–64 year-olds. The Australian Bureau of Statistics [4] reported 12-month prevalence rates for the Australian adult population; 9.7% were found to have anxiety disorders and 5.8% were found to have affective disorders. These high prevalence rates have implications not only for individuals and their families, but also for the community. At the individual level, quality of life and family relationships may suffer. At the community level, treatment costs and days lost at work impose a financial burden. Unipolar depression, for instance, is one of the 10 leading causes of days out of role [5].
In such large-scale surveys, it is impractical to use experienced clinicians, given their expense, for the time-consuming interviews. Instead, such surveys typically employ lay interviewers using highly structured and standardized instruments such as the Composite International Diagnostic Interview (CIDI) [6] and its predecessor, the Diagnostic Interview Schedule (DIS) [7]. Computerized versions of these instruments have been developed which allow participants to self-administer the interviews. This raises several issues. First, do the non-clinicians or computerized instruments underdiagnose or overdiagnose patients and/or community participants? Second, do the non-clinicians or computerized interviews diagnose the same people as would clinicians? These issues raise the question of whether the prevalence figures reported by such official surveys are distorted or skewed because either non-clinicians or computerized interviews were used.
This is illustrated by the poor diagnostic agreement between experienced clinicians and instruments such as the CIDI in specialist treatment clinics. For instance, Peters and Andrews [8] assessed the procedural validity of the CIDI-Auto against the Longitudinal, Expert, All data (LEAD) diagnosis which served as the ‘gold standard’. They also compared one expert clinician's diagnosis against the CIDI-Auto (DSM-IV diagnoses) for agoraphobia, panic disorder, social phobia, generalized anxiety disorder (GAD), major depressive episode (MDE) and obsessive compulsive disorder (OCD). With the exception of OCD, for which agreement was good (κ = 0.78), agreement was poor for all other diagnoses (κ ≤ 0.40).
Similar findings have been reported elsewhere. Rosenman et al. [9] conducted a study within an acute psychiatry unit. They found poor overall agreement at the level of general diagnostic class between the psychiatrists’ principal diagnosis and that of the self-administered CIDI-Auto (using ICD-10) diagnosis (κ = 0.23). This stood in contrast with the good concordance between psychiatrists (κ = 0.69). (General diagnostic class corresponds to one digit of the multi-digit ICD-10 F codes and refers to major diagnostic classes such as mood disorder, substance abuse disorder and schizophrenic disorder.) The agreement rate decreased to κ = 0.14 when based on specific diagnostic category (corresponding to the two digits of the F codes).
The low reliability of clinical diagnoses made in routine practice has long been considered a problem in clinical and research work [10–12]. In an early study, Beck [12] reported great variability between clinicians’ diagnoses. A review of studies examining agreement between clinicians for specific diagnostic categories found poor kappa coefficients for anxiety and affective disorders [13]. It was the consistency of these findings that prompted development of the DSM-III [14] and semistructured diagnostic interviews such as the Structured Clinical Interview for DSM (SCID) [13].
Computerized versions of these structured interviews improve standardization of diagnosis, eliminate clinician bias and also offer high reliability and consistency of administration [15]. These methods are cost effective and time efficient and eliminate errors in data entry and scoring because these programmes automatically score results [15]. Patient comfort in disclosing sensitive information in interviews influences the process of making accurate diagnoses. However, evidence is mixed for participants’ comfort with clinicians or computers. Some studies have found that individuals were more comfortable in relating sensitive information to computers than clinicians [15, 16]. Others have reported that patients with psychiatric illness were more resentful and intimidated by computerized interviews or preferred clinician-based interviews because clinicians were sensitive to their needs and could ask specific questions about feelings [17, 18].
In this study, we report on the use of the CIDI-Auto in a specialist anxiety and mood disorders clinic. Specifically, we examined concordance between clinicians and the CIDI-Auto (using DSM-IV and ICD-10 criteria) diagnosis for six anxiety disorders: social phobia, OCD, panic disorder ± agoraphobia, posttraumatic stress disorder (PTSD), generalized anxiety disorder (GAD), and agoraphobia; and two mood disorders: major depressive episode (MDE) and dysthymia. We also report sensitivity and specificity indices for the clinicians’ diagnoses by treating the CIDI diagnoses as gold standards.
Method
Subjects
Subjects were 262 patients who presented to the Depression and Anxiety Disorders Research and Treatment (DART) Program outpatient clinic, based at the Royal Melbourne Hospital (RMH) in Melbourne, Australia. The sample came from a slightly larger cohort of 287 patients who were consecutively assessed at DART over the 23 months beginning March 1997 through to February 1998. Patients were referred to the clinic if they were thought to suffer from anxiety or depressive disorders. The majority of the referrals came from general practitioners, with the remainder referred by psychiatrists in private practice.
Setting
The clinic run by the DART Program was a tertiary referral service situated at the RMH. (At the time of this paper going to press, the clinic was relocated to a rural mental health setting.) The clinic was staffed by academic consultant psychiatrists and clinical psychologists employed by the Departments of Psychiatry and Psychology at the University of Melbourne and clinicians employed by the DART Program. The Program specialized in anxiety and mood disorders with a specific interest in OCD, panic disorder ± agoraphobia, social phobia, and major depressive disorders.
Measures
Clinical interview
Patients first underwent a 1-h clinical assessment conducted by an experienced psychiatrist or clinical psychologist. The interview followed a standard format whereby information was recorded on history of presenting complaint, history of psychiatric and medical treatment, history of drug use, nicotine and alcohol use, current medication, family background and history of illness, personal history (which included information on early development, school/work, relationships, children, current life circumstances and interests, and forensic history), premorbid personality and a mental state examination. Clinicians then provided a descriptive formulation and diagnostic assessment.
CIDI-Auto
The CIDI-Auto, version 2.1 [19] is a highly structured and standardized psychiatric interview. It produces diagnoses according to the ICD-10 [20] and the DSM-IV [21]. The procedural validity of the CIDI-Auto has been examined [8] and it has been shown to be more reliable than the standardized Diagnostic Interview Schedule (DIS) from which it was derived [22]. Over the last few years, it has progressed from being a purely epidemiological and research tool to having limited use in clinical practice. The CIDI has been developed from a paper-and-pencil version to a computerized format (CIDI-Auto) which has enabled extension of its use. Trained lay interviewers are able to administer the test and self-administration is now also an option. These less time-consuming and more cost-effective options for administering the CIDI-Auto make it an increasingly popular research and clinical tool.
Procedure
Following referral, prospective patients underwent the clinical interview. They then returned at a later date to complete the CIDI-Auto over a 1–2-h period. Approximately two-thirds of the patients chose to self-administer the CIDI-Auto, while a third preferred the research assistant to administer the CIDI-Auto because of their unfamiliarity with computers or difficulty in understanding the questions. For those patients who completed the self-administered form, the research assistant was always available to answer questions from the patients, but did not necessarily sit with the subjects throughout the completion of the interview. The CIDI-Auto-generated diagnoses of interest included in this study are those which occurred within the 6 months before assessment at the clinic and which met exclusion criteria. This allowed comparisons to be made with clinicians’ diagnoses, which were also based on exclusion criteria. For example, in considering a diagnosis of major depressive episode, both the CIDI and the clinicians were required to exclude bipolar disorder. Lifetime diagnoses are not reported in this paper.
Results
Two hundred and eighty-seven patients were seen at the clinic over 23 months. Twenty-four patients (non-completers) refused to do the CIDI-Auto and therefore did not have CIDI-Auto results that could be used in the study. An additional patient was also excluded from the study because the time lapse between his clinical and CIDI-Auto interviews was more than 2 months. He was therefore included in the non-completers group for the purpose of this study. Of the remaining 262 patients whose data have been included in the study, 93.4% completed the CIDI-Auto within 14 days of the clinical interview. For the remaining 6.6%, time taken to complete the CIDI-Auto ranged from 15 to 57 days. The mean time interval between clinical assessment and completion of the CIDI-Auto was 6.83 ± 8.1 days (median 6 days).
Completers versus non-completers
As Table 1 shows, the difference between the non-completers and completers was approaching significance for age, indicating that non-completers tended to be older than completers. Chi-squared analysis showed no significant differences between the two groups as regards sex or relationship status (see Table 1). Table 1 also indicates that the group which completed the assessment process was better educated and more likely to be employed than the non-completers.
Demographic data on completers and non-completers
Prevalence figures
Clinical diagnosis
Of the 262 patients who completed assessment, 82.5% were given a primary diagnosis of either an anxiety or depressive disorder by the clinicians. The primary clinical diagnoses for the disorders of interest were as follows: social phobia (10.6%), OCD (19.4%), panic disorder ± agoraphobia (20.9%), PTSD (1.9%), GAD (4.9%), agoraphobia (2.7%), MDE – single/recurrent (18.6%), and dysthymia (1.9%). Of the 262 patients, 11.4% were given a primary diagnosis other than an anxiety or depressive disorder by the clinicians. These included diagnoses of schizophrenia (n = 1), schizoaffective disorder (n = 3), hypochondriasis (n = 3), personality disorders (n = 11), substance abuse (n = 4), eating disorders (n = 2), adjustment disorder (n = 5) and relational problems (n = 1). A further 6.1% either had their diagnosis deferred (n = 3) or were given no diagnosis on axis I (n = 13).
Concordance data
Concordance was measured by Cohen's kappa. Kappas were calculated only for those disorders with a base rate greater than 10%, as it is generally accepted that kappas can be unstable when base rates are too low [23]. Base rate in this study refers to the proportion of patients who were endorsed by the CIDI-Auto as meeting criteria for a particular diagnosis. For MDE, agreement between clinicians and the CIDI-Auto results was measured at the level of diagnostic category (i.e. for ICD-10, up to two digits of the F code, and for DSM-IV, up to three digits). The CIDI–ICD-10 and CIDI–DSM-IV systems generated 2.6 and 2.0 diagnoses per person, respectively, while clinicians on average identified 1.4 diagnoses per person. The results are shown in Table 2 for DSM-IV and Table 3 for ICD-10. Kappa values less than 0.40 indicate poor to fair agreement, values of 0.40–0.50 indicate moderate agreement, and values greater than or equal to 0.60 can be taken to indicate good to very good agreement [24].
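For readers unfamiliar with the statistic, Cohen's kappa corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of the calculation for a single diagnosis treated as a 2 × 2 clinician-versus-CIDI table; the counts used are hypothetical and are not the study's data.

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2 x 2 agreement table.

    a = both raters positive, b = rater 1 only positive,
    c = rater 2 only positive, d = both raters negative.
    """
    n = a + b + c + d
    p_observed = (a + d) / n                  # raw proportion of agreement
    p1 = (a + b) / n                          # rater 1 positive rate
    p2 = (a + c) / n                          # rater 2 positive rate
    # chance agreement: both positive by chance + both negative by chance
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical counts for one disorder in a sample of 262 (not the study's data):
# 20 agreed positive, 15 clinician-only, 30 CIDI-only, 197 agreed negative.
kappa = cohens_kappa(20, 15, 30, 197)
```

Note that with a low base rate the chance-agreement term is dominated by the agreed negatives, which is why kappa becomes unstable for rare diagnoses, as the base-rate restriction above anticipates.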
Concordance between CIDI-Auto (DSM-IV) and clinicians’ diagnoses
Concordance between the CIDI-Auto (ICD-10) and clinicians’ diagnoses
Concordance between DSM-IV and clinicians
As can be seen from Table 2, where kappa could be calculated, agreement between clinicians’ diagnoses and the CIDI–DSM-IV diagnoses was poor (κ < 0.40) for social phobia, MDE and PTSD. There was better agreement between the CIDI-Auto and the clinicians when diagnosing OCD.
Concordance between ICD-10 and clinicians
As shown in Table 3, where kappa could be calculated, agreement between clinicians’ diagnoses and the CIDI–ICD-10 diagnoses was also poor for panic disorder ± agoraphobia, MDE, PTSD and GAD, but moderate for OCD and social phobia.
Sensitivity and specificity
Sensitivity is the proportion of positive cases correctly identified by the clinicians and specificity is the proportion of negative cases correctly identified by the clinicians, with the CIDI-Auto diagnoses treated as the gold standard. Table 2 shows that clinicians demonstrated poor ability to identify cases diagnosed according to the CIDI, with the exception of OCD, where clinicians identified 67% of the CIDI-diagnosed cases when DSM-IV diagnoses were used. Table 3 shows that for ICD-10 diagnoses, the best sensitivity figures were for OCD and social phobia, with clinicians identifying over 60% of the CIDI-diagnosed cases for both. For the other ICD-10 diagnoses, and where sensitivity figures could be calculated, clinicians identified between 11% and 58% of the cases diagnosed by the CIDI. Specificity was high for all cases where such data could be collected, for both DSM-IV and ICD-10.
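The two indices defined above can be sketched directly from the four cells of the clinician-versus-CIDI table. The counts below are hypothetical illustrations only, not figures from Tables 2 or 3.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Clinician performance against the CIDI-Auto treated as gold standard.

    tp = CIDI-positive cases the clinician also diagnosed
    fn = CIDI-positive cases the clinician missed
    tn = CIDI-negative cases the clinician also rejected
    fp = CIDI-negative cases the clinician diagnosed
    """
    sensitivity = tp / (tp + fn)   # proportion of CIDI 'cases' detected
    specificity = tn / (tn + fp)   # proportion of CIDI 'non-cases' rejected
    return sensitivity, specificity

# Hypothetical counts for one disorder (not the study's data):
sens, spec = sensitivity_specificity(tp=20, fn=30, tn=200, fp=12)
# here sens = 0.40 (low) while spec is above 0.94 (high)
```

The asymmetry in this toy example mirrors the pattern reported in the study: clinicians can reject non-cases reliably (high specificity) while still missing many instrument-defined cases (low sensitivity).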
Secondary concordance analyses
To determine whether applying hierarchical rules (i.e. diagnostic exclusion criteria in addition to diagnostic inclusion criteria) influenced the agreement rates between clinicians and the CIDI-Auto, further analyses were performed. Kappas were recalculated comparing clinicians’ multiple diagnoses to the CIDI-Auto diagnoses when no exclusion criteria were used by the CIDI-Auto, i.e. no hierarchical rules were applied to the CIDI-Auto diagnoses. The only diagnostic category influenced by removing the hierarchical rules was panic disorder ± agoraphobia (when DSM-IV criteria were used): the kappa value doubled from 0.17 to 0.39 when exclusion criteria were not used by the CIDI-Auto. There was little to no change for all other diagnostic categories.
Discussion
This study measured the rates of six anxiety disorders and two mood disorders as assessed by experienced clinicians at a specialist anxiety and mood disorders clinic. According to CIDI-ICD-10 criteria, MDE had the highest prevalence rate in this sample, followed by panic disorder ± agoraphobia, social phobia, OCD and GAD. According to CIDI-DSM-IV criteria, the disorders with the highest prevalence rates were social phobia, MDE, OCD and PTSD in that order.
Concordance between clinicians and the CIDI for both the ICD-10 and DSM-IV diagnostic codes was generally poor as all kappas were below 0.60.
Agreement between clinicians and the CIDI-Auto (DSM-IV) ranged from poor for social phobia and PTSD to moderate for OCD. Agreement between clinicians and the CIDI-Auto (ICD-10) ranged from poor for MDE to moderate for OCD. Disregarding hierarchical rules for diagnostic criteria did not influence concordance between clinicians and the CIDI-Auto for any of the diagnostic categories with the exception of DSM-IV panic disorder ± agoraphobia where the kappa value doubled.
The sensitivity and specificity values indicate the ability of the clinicians in this study to detect ‘cases’ as identified by the CIDI-Auto and to reject non-cases. Sensitivity values were particularly low for the diagnoses of PTSD and GAD, suggesting that clinicians used a different set of criteria for diagnosing these disorders compared to the CIDI-Auto. Specificity values on the other hand were high for all the disorders, indicating that the clinicians’ criteria for rejecting ‘non-cases’ for particular disorders were similar to those of the CIDI-Auto.
The results of the present study are consistent with those reported by Rosenman et al. [9] who also found poor agreement between the CIDI-Auto and clinicians’ diagnoses, albeit in an acute psychiatric clinic, and notwithstanding methodological variations between the two studies. However, the concordance rates in the present study are lower than those found in a similar study by Peters and Andrews [8] who used more standardized methods of clinical assessment and included fewer clinicians in assessing patients than the current study.
There are several possible factors that may have contributed to the low concordance figures in this study. One is the difference between the information-collecting methods of the CIDI and clinicians [25, 26], as described in our Method section. Another possibility is that in the current study, clinicians may have recorded only those diagnoses that they believed required treatment, whereas the CIDI-Auto will accord any diagnosis if sufficient criteria are met. A third possible factor concerns the number of diagnoses made for each person. The CIDI-Auto generated twice as many diagnoses as were made by the clinicians. This finding is consistent with the results of other studies [8, 9] which have found that the CIDI accorded twice as many diagnoses to patients as did psychiatrists. This lends support to the contention that the CIDI may have a lower threshold for diagnosing disorders than clinicians [8]. A fourth possible factor is related to the fact that patients completed the CIDI after the clinical interview. They may therefore have been primed to answer the CIDI questions more fluently; combined with the lack of a time limit for completing the CIDI, patients had more time to think about their answers. A fifth possible factor is that patients may have answered questions believing that particular symptoms might increase their chances of being accepted into the treatment programmes offered by the DART Clinic. However, the possibility that aspects of a patient's clinical presentation changed between the clinical and CIDI-Auto interviews is unlikely, as more than 90% of the patients completed the CIDI-Auto interview within 2 weeks of the clinical assessment.
In the current study the CIDI-Auto was used to make gold standard diagnoses. Clinician diagnoses were compared against the CIDI diagnoses and showed low sensitivity (< 0.70) for all the disorders except OCD (for ICD-10), but high specificity (> 0.70) for all the disorders. These values mean the instrument and the clinicians are not identifying the same people as having a specific condition. The major clinical implication of this is that patients could be directed to receive different treatments depending on whether the clinician or the CIDI-Auto made the diagnosis. So, for example, the CIDI-Auto (DSM-IV) might diagnose a patient with social phobia, whereas the clinician might diagnose that same person with panic disorder.
This study has several limitations. The first pertains to the generalizability of the results. The data emanated from a tertiary referral unit specializing in anxiety and mood disorders, so it is understandable that higher rates of these particular disorders would be reported in such a setting than in the community at large. Second, there was no randomization of the order of the clinical interviews and the CIDI interviews to adjust for any potential order effects. Finally, the clinic's referral base was limited to the north-west region of metropolitan Melbourne; it is not known whether the prevalence rates recorded for the anxiety and depressive disorders in the current study might differ across other regions.
Conclusions
Given increasing demands for research and clinical work against dwindling resources, it is not unreasonable to expect that the use of computerized interviews will increase and become more widespread. Already the CIDI-Auto is being used in some clinical settings [8]. However, as the current study has demonstrated, the results generated by such instruments must be viewed with caution, as there appear to be noticeable discrepancies between clinician-based diagnoses and those elicited by computerized instruments. A major implication is that if diagnosis alone directed treatment, then patients could receive different treatments depending on whether the computer or a clinician made the diagnosis.
