Sage Journals: Discover world-class research

Abstract

Objective: We aimed to redefine the optimal self-report symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or to develop a simplified screening tool for DSM-V.

Method: Four hundred subjects (200 patients with primary depression and 200 non-depressed subjects), living in Iraqi Kurdistan were interviewed. The Mini International Neuropsychiatric Interview (MINI) was used to define the presence of major depression (DSM-IV criteria). We examined symptoms of depression using four well-known scales delivered in Kurdish. The Partial Credit Model was applied to each instrument. Common-item equating was subsequently used to create an item bank and differential item functioning (DIF) explored for known subgroups.

Results: A symptom level Rasch analysis reduced the original 45 items to 24 items of the original after the exclusion of 21 misfitting items. A further six items (CESD13 and CESD17, HADS-D4, HADS-D5 and HADS-D7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items and produced an area under the curve (AUC) of 0.987. Using a bank restricted to the optimal nine items revealed only minor loss of accuracy (AUC = 0.989, sensitivity 96%, specificity 95%). Finally, when restricted to only four items accuracy was still high (AUC was still 0.976; sensitivity 93%, specificity 96%).

Conclusions: An item bank of 17 items may be useful in computer adaptive testing and nine or even four items may be used to develop a simplified screening tool for DSM-V major depressive disorder (MDD). Further examination of this item bank should be conducted in different cultural settings.

Keywords

depression Rasch item bank computer-adaptive testing DSM-V

There is extensive literature on screening and case-finding for depression but almost all evidence is from non-mental health specialists and primary care using English speaking patients [1]. Depression is a universal disorder, with high rates in both high and low income countries [2]. The prevalence of current major depression in Iraq is substantial, estimated at 7.2% [3].

Screening for depression has been supported by recommendations from the US Preventive Services Task Force [4], the UK National Institute of Clinical Excellence (NICE) [5] and the Canadian Task Force on Preventive Health Care [6]. The New Zealand Guidelines Group recommended focused screening in primary care for those at high risk [7]. A similar approach has been advocated by NICE [8]. An updated US Preventive Services Task Force consensus statement supported screening if staff were in place to assure accurate diagnosis, effective treatment and follow up [9]. A recent review of 16 studies found that the use of screening or case-finding instruments was associated with a modest increase in the recognition of depression by clinicians (relative risk 1.27, 95%CI 1.02 to 1.59), increasing the use of any intervention but not (significantly) prescription of antidepressant medications [10]. However, within primary care there is currently no consensus on which tool to use or whether existing measures are suitable for detecting depression and monitoring severity. Further, it is possible that none of the existing paper and pencil tools are sufficiently acceptable to allow high uptake in practice [11]. Practitioners typically want short or ultra-short tools taking only a few minutes to complete [12]. Paper-and-pencil tools are inflexible, forcing all participants to complete items that may be unnecessary. Modern psychometric methods, such as Rasch analysis may help with refinement of these tests. It has been demonstrated that items may be removed from existing depression instruments, using these models without affecting specificity and sensitivity [13,14]. Several item banks have already been successfully developed and applied to assessing psychological distress and anxiety [15–17]. These unidimensional item banks form the basis for computer adaptive testing (CAT) which may help to overcome the problem with selecting the optimal tool for detecting depression. CAT selects and presents targeted items from a calibrated item bank to the respondent, reducing test length without loss of precision and reliability [18] thereby significant reducing patient burden and potentially enhancing the detection of depression [19].

Our aim was to find the optimal set of symptoms of depression suitable for creation of an item bank that could be used in computer adaptive testing or one that could be considered for testing for (or even redefining) the diagnostic criteria for the forthcoming DSM-V. Ideally, we aimed to find the optimal set of symptoms with no more than nine items, equivalent to the length of the current DSM criteria as well as the optimal set of four symptoms that could be used as an abbreviated assessment method.

Materials and methods

Sample and patients

The recruitment of the sample was undertaken by trained psychiatrists from both inpatient and outpatient clinics in the psychiatric unit of the largest teaching hospital and two health centres which provide outpatient psychiatry service in Erbil. The controls were recruited randomly by dividing the city of Erbil (the capital of Kurdistan) into 10 regions. Standard random selection within each stratum was applied. The data were collected between April 2009 and March 2010.

Tools

We used a Kurdish translation of the following scales: the Center for Epidemiologic Studies Depression scale (CESD); the 9-item Patient Health Questionnaire (PHQ-9); the Hospital Anxiety and Depression Scale (HADS) and the Calgary Depression Scale for Schizophrenia (CDSS). The CESD scale is a short self-report scale designed to measure depressive symptoms in the general population [20]. When tested in household interview surveys and in psychiatric settings it was found to have a high internal consistency and adequate test–retest repeatability. The PHQ-9 [21] is based directly on the diagnostic criteria for major depressive disorder in the DSM-IV. It is viewed as a potentially valuable tool for assisting primary care clinicians in diagnosing depression as well as selecting and monitoring treatment. The HADS self-screening questionnaire for depression and anxiety [22] is a convenient instrument used by hospital patients with both somatic and non-somatic symptoms. It has equally good sensitivity and specificity as other commonly used self-rating screening instruments [23]. The CDSS was developed at the University of Calgary and its use has been evaluated for both relapsed and remitted patients with schizophrenia [24].

Rasch analysis

The family of Rasch models [25] measure latent traits; personal characteristics which are assessed indirectly from groups of questions and which cannot easily be evaluated directly. Rasch models provide estimates of where a person falls along a continuum for the trait, such as for instance, mental health (person ability). They also estimate the location of questions (‘items’) from questionnaires along the same continuum (item difficulty). The person ability and item difficulty parameters are estimated jointly to produce estimates (reported in ‘logits’ or log-odds), which are independent of both the items and sample employed.

Methods: general

Rasch analyses evaluate whether the items included are indeed measuring a unidimensional continuum, using principal components factor analysis. Items measuring a different latest trait are evaluated using ‘fit’ statistics, and misfitting items are removed from an item bank. It is also possible to ascertain whether the item bank is reliable in subgroups in the sample, using differential item functioning (DIF) to estimate item difficulty in subgroups of patients.

Methods: individual instruments

As all instruments contained polytomous response categories the Partial Credit Model was used in the Rasch analysis [26] using Winsteps software [27]. Each instrument was initially analysed separately. Response categories for each item (for each instrument) were evaluated to determine there was no threshold and category disordering. Following this item, fit was assessed against an infit mean square criterion of 1.3. Any items with infit mean square statistics >1.3 were deemed to be misfitting and removed from the analysis [28]. Unidimensionality was initially assessed by the percentage variance explained by the Rasch model, and the eigenvalues for the first contrast (of the residuals). Eigenvalues less than 2 may be assumed to indicate no additional dimensionality.

Methods: item bank

Common item equating was used to develop the item bank [29,30]. This method relies on the separate item pools sharing a minimum of five common items which are held constant (‘anchored’) whilst further items are added. At each stage of this process the item fit and dimensionality are assessed, and misfitting items removed. The process continues and a stable, unidimensional item bank may be achieved. At each iteration item fit was evaluated (as described above for the individual instruments). An item map was produced to highlight the distribution of item locations and person measures. The internal reliability (person separation in Rasch models) was also noted.

Dimensionality was assessed using the percentage variance explained and the eigenvalues. In addition, to this Smith [31] has recommended producing two sets of person measures: one derived from the highest loading positive items (correlation >0.3) onto the first factor derived from a principal components analysis (PCA) of the residuals, and the other from the highest negatively loading items. These can then be evaluated using a series of independent sample t-tests calculating the percentage falling outside the 95% confidence interval. However, the potential problem with this approach is that it is at risk of type I errors, particularly as the number of comparisons increases (i.e. larger sample sizes). We therefore applied a modified approach using a paired samples t-test to assess whether there was any statistical difference between the two sets of person estimates (including the 95% confidence interval). A Bland-Altman plot was also generated including the 95% level of agreement to assess the agreement between the two person measure estimates.

Finally, differential item functioning (DIF) was evaluated for age (<20, 21–30, 31–40, and 40++), gender and diagnosis (depression, no depression).

Results

Individual instruments

Each instrument (PHQ-9, HADS-D, CESD, CDSS) was initially analysed separately with a particular focus on threshold disordering, item misfit, and unidimensionality. When any threshold disordering was observed, categories were collapsed and the analysis re-run. Misfit items were removed and the analysis re-run. Principal components analysis was applied to the final remaining items to determine whether any additional dimensionality was present.

Threshold and category disordering

None of the items demonstrated any threshold disordering. In terms of category disorder, neither the PHQ-9 nor the CDSS demonstrated any disordering. Two items from the HADS (HADS-D1 and D3) and eight items from the CESD (CESD2, 4, 8, 10, 12, 15, 16 and 19) displayed category disordering. Collapsing across categories (to create binary categories) did not improve disordering for the two HADS items or four of the CESD items (CESD4, 8, 12 and 19).

Item misfit

Two items demonstrated misfit for the PHQ-9 (items 3 and 5). For the HADS-D there were three misfitting items (items 1, 3 and 6). Eleven CESD items demonstrated misfit (items 2, 4, 7, 8, 10–12, 15–17 and 19) and five items from the CDSS five items (items 3–7). All the remaining items from each instrument fell below the threshold.

Dimensionality

No further dimensionality was observed for any of the remaining items for all four instruments. The percentage variance explained by the items (i.e. the Rasch model) was as follows: PHQ-9 78%, CESD 75%, HADS 67%, and CDSS 85%). Similarly, the eigenvalues for the first contrasts were: PHQ-9 1.7, CESD 1.8, HADS 1.7 and CDSS 1.5.

Item bank

Item misfit

Following the initial individual Rasch analysis of the instruments there was a total of 24 items remaining of the original 45 (21 misfitting items removed). A further six items (CESD13 and CESD 17, HADS-D4, HADS-D 5 and HADS-D 7, and CDSS3 and CDSS4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis (see below) by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining item bank consisted of 17 items (Table 1). None of the HADS-D items were retained in the final item bank.

Table 1.

The final 17 item bank

Item	Description	Location	Infit MNSQ	Infit ZSTD	Outfit MNSQ	Outfit ZSTD
CDSS8	Suicide	1.69	0.96	−0.4	0.73	−0.9
CESD9	I thought my life had been a failure	1.36	0.99	0	1.22	1.5
PHQ8	I have been moving or speaking slowly or been very fidgety and restless	1.15	0.98	− 0.1	1.03	0.2
PHQ9	I had thoughts that I would be better off dead, or thoughts of hurting myself	1.06	0.75	−3	0.72	−1.8
PHQ6	I have been feeling bad about myself or feeling like a failure	0.78	1.27	3.4	1.14	1.5
CDSS2	Hopelessness	0.5	0.79	−2.7	0.67	−3.1
CESD14	I felt lonely	0.1	1.07	0.9	1.09	0.9
CESD1	I was bothered by things that usually don't bother me	−0.09	1.2	2.3	1.34	2.7
CESD3	I felt that I could not shake off the blues even with help from my family or friends	−0.11	1.09	1.1	0.9	−1
PHQ2	I was feeling down, depressed, or hopeless	−0.4	0.8	−2.7	0.72	−2.9
CDSS1	Depression	−0.46	0.56	−7.1	0.51	−6.5
CESD18	I felt sad	−0.59	0.85	−2.1	0.8	−2.5
PHQ4	I have been feeling tired or having little energy	−0.62	1.23	2.9	1.29	2.8
CESD6	I felt depressed	−0.64	0.8	−2.8	0.67	−3.7
PHQ1	I had little interest or pleasure in doing things	−1.01	1.19	2.5	1.09	1
PHQ7	I had trouble concentrating on things, such as reading the newspaper or watching TV	−1.28	1.22	2.7	1.29	1.7
CESD5	I had trouble keeping my mind on what I was doing	−1.45	1.1	1.4	1	0.1

Infit/outfit MNSQ, Infit/outfit mean square; Infit/outfit ZSTD, Infit/outfit standardized t. Items highlighted in bold were used to create the 9-item set; Items highlighted in bold and italicized were used to form the 4-item set.

Table 1 shows that one question regarding suicidal ideation was hardest to endorse, namely CDSS8 (‘Have you felt that life wasn't worth living?’). The next hardest question to endorse was CESD9 (‘I thought my life had been a failure’), whereas two questions relating to problems concentrating were the easiest to endorse: CESD5 (‘I had trouble keeping my mind on what I was doing’) and PHQ7 (‘I had trouble concentrating on things, such as reading the newspaper or watching TV’). This suggests good face validity for the item bank. Furthermore, the internal reliability of the 17 items was (Cronbach) a = 0.96.

The average person estimate was −1.26 logits (standard deviation (SD) = 3.30, range −7.07 to 8.18 logits), whereas item locations ranged from −1.45 to 1.69 logits. A closer look at the Rasch half-point thresholds indicated a range of threshold from −4.5 to 5.5 logits suggesting greater overlap with the distribution of person measures.

Dimensionality

The percentage variance explained was 78% and the eigenvalue for the first contrast was 2.4, suggesting some additional dimensionality. However, the results of the t-test (including 95%CI) indicated no statistical difference between the two person measure estimates (t(399) =−1.47, p = 0.14, 95% −0.49 to 0.07). Figure 1 shows the Bland-Altman plot. Just over 5% (5.5% or 22/400) of the pairs of estimates fell outside the 95% limits of agreement (−5.90 to 5.48) suggesting good agreement between the two estimates. There is therefore little evidence to suggest the presence of any additional dimensionality.

Figure 1.

Bland-Altman plot of means against the differences of person measures estimated from the positive and negatively loading items.

Differential item functioning

No differential item functioning was detected for age group, diagnosis or gender for the remaining 17 items.

Accuracy against MINI DSM-IV Criteria for MDD

We examined the accuracy of the full item bank (17 items) against DSM-IV MDD. The area under the curve was 0.987. Using a bank restricted to the optimal nine items revealed no loss of accuracy. The area under the curve (AUC) was 0.989 with the following nine items: PHQ4, PHQ6, PHQ8, PHQ9, CESD1, CESD3, CESD9, CDSS2, CDSS8. At the optimal cut point sensitivity was 96% and specificity 95%. Finally, when restricted to only four items, accuracy was still high (AUC was still 0.976; sensitivity 93%, specificity 96%) (Table 2). The four chosen items were the PHQ6, PHQ8, CESD9, CESD14.

Table 2.

ROC data area under the curve, and optimal thresholds, sensitivity and specificity

						Asymptotic 95% CI
Items	Area	Threshold	Sensitivity	Specificity	Standard error	Upper Bound	Lower Bound
17 item bank	0.987	14	0.95	0.98	0.005	0.977	0.997
9 items†	0.989	6	0.96	0.95	0.004	0.981	0.998
4 items‡	0.976	2	0.93	0.96	0.008	0.961	0.992

†

PHQ4, PHQ6, PHQ8, PHQ9, CESD1, CESD3, CESD9, CDSS2, CDSS8;

‡

PHQ6, PHQ8, CESD9, CESD14.

Discussion

We recruited 200 patients with primary depression and 200 non-depressed subjects living in the Kurdistan region of Iraq. A semi-structured interview, the MINI, as applied by trained psychiatrists, was used as a gold standard to define DSM-IV major depression. We found at the instrument level, the PHQ-9, CESD and CDSS performed reasonably well, but the HADS did not have strong diagnostic validity. Two items demonstrated misfit for the PHQ-9 (items 3 and 5) and for the HADS-D there were three misfitting items (items 1, 3 and 6). Eleven CESD items demonstrated misfit (items 2, 4, 7, 8, 10–12, 15–17 and 19) and five items from the CDSS five items (items 3–7). All the remaining items from each instrument fell below the threshold for misfit.

A symptom level Rasch analysis reduced the original 45 items to 24 items after the exclusion of 21 misfitting items. A further six items (CESD13 and 17, HADS-D4, D5 and D7, and CDSS3 and 4) were removed due to misfit as the items were added together to form the item bank, and two items were subsequently removed following the DIF analysis by diagnosis (CESD20 and CDSS9, both of which were harder to endorse for women). Therefore the remaining optimal item bank consisted of 17 items. Interestingly, none of the HADS-D items were retained in the final item bank. We tested these 17 items against DSM-IV criteria for MDD and found high concordance (AUC = 0.987). Indeed, a restricted list of nine symptoms was equally accurate (AUC = 0.989) and a brief set of four items was almost as good (AUC = 0.976). (Figure 2) The four chosen items were the PHQ6, PHQ8, CESD9, CESD14 (Table 2).

Figure 2.

ROC curve for items 17, 9 and 4. Diagonal segments are produced by ties.

Before continuing we wish to acknowledge several important limitations. We conducted this study in Iraq where the rates of illiteracy are high. We often relied upon our trained staff to deliver questions verbally rather than solely using self-report from the translated Kurdish version of the questionnaires. For this reason we did not attempt back translation of the questionnaires. It is therefore uncertain to what extent these findings would be replaced in other settings and other cultures.

One possible development of the item bank is computer-adaptive testing where items are selected from the bank based on individual patients' previous responses. The benefit of these systems is that not only is the assessment of depression potentially more accurate, each assessment is also more relevant to the individual patient. Computer-adaptive tests have already been developed for use identifying depression, for instance in psychiatric populations. Fliege and colleagues [32] have developed a system for measuring depression (D-CAT). This was derived from 11 mental health questionnaires and following an item-response analysis the final item bank comprised 64 items. Two simulation studies demonstrated that levels of depression could be estimated reliably from six items only with high precision and low patient burden. A further validation study demonstrated that the computer-adaptive test was able to distinguish well between diagnostic groups. Similar results have also been shown with a computer adaptive test derived from the 626-item Mood and Anxiety Spectrum Scales (MASS) to identify anxiety and mood disorders in patients attending outpatient clinics [33] From our results a CAT constructed from Table 1 would probably have an average person measure (on the latent trait) of −1.26 logits. The nearest item to this is PHQ7 (−1.28). The order could then be PHQ2 (difference of ++1 logit), then CESD14 (0.5 logit), PHQ9 (0.5 logit) and finally either CDSS8 or CESD9.

This calibrated item bank might be beneficial for developing new fixed-length paper and pencil tests for depression and for development of new criteria for the forthcoming DSM-V or ICD-11. Several authors have made empirical suggestions for redefining mood disorders in DSM-V [34–37]. However, little primary evidence concerning diagnostic symptoms has been published [38,39]. In an analysis of accuracy against DSM-IV criteria for MDD we found that nine items and four items from the bank would offer excellent psychometric properties. Given that these items are not simply a list of DSM/PHQ symptoms but also included CESD9 and CESD14, there appears to be scope for further work to re-examine the fundamental criteria in DSM-IV with a view to creating a new possibly simplified algorithm for DSM-V or a simplified screening and case-finding tool.

Declaration of interest: The initial draft of this paper was produced whilst ABS was employed by the University of Leeds, UK. The authors alone are responsible for the content and writing of the paper.

References

Mitchell

Vaze

Rao

. Clinical diagnosis of depression in primary care: a meta-analysis. Lancet 2009; 374:609–619.

Andrade

Caraveo-Anduaga

Berglund

. The epidemiology of major depressive episodes: results from the International Consortium of Psychiatric Epidemiology (ICPE) surveys. Int J Methods Psychiatr Res 2003; 12:3–21.

Alhasnawi

Sadik

Rasheed

. The prevalence and correlates of DSM-IV disorders in the Iraq Mental Health Survey (IMHS). World Psychiatry 2009; 8:97–109.

Agency for Healthcare Research and Quality. Screening for depression: systematic evidence review number 6. Rockville, MD: Agency for Healthcare Research and Quality, 2002.

National Institute for Clinical Excellence. Depression: core interventions in the management of depression in primary and secondary care. London: HMSO, 2004.

MacMillan

Patterson

CJS

Wathen

. The Canadian Task Force on Preventive Health Care. Screening for depression in primary care: recommendation statement from the Canadian Task Force on Preventive Health Care. Can Med Assoc J 2005; 172:33–35.

New Zealand Guidelines Group. Available at URL: http://www.spinz.org.nz/file/downloads/pdf/file_225.pdf

National Institute for Clinical Excellence. Depression: core interventions in the management of depression in primary and secondary care. London: HMSO, 2004, 2009.

US Preventive Services Task Force. Screening for depression in adults: US Preventive Services Task Force Recommendation Statement. Ann Int Med 2009; 151:784–792.

10.

Gilbody

Sheldon

House

. Screening and case-finding instruments for depression: a meta-analysis. Can Med Assoc J 2008; 178:997–1003.

11.

Mitchell

Coyne

. Screening for postnatal depression: barriers to success. Br J Ob Gyn 2009; 116:11–14.

12.

Mitchell

Coyne

. Do ultra-short screening instruments accurately detect depression in primary care? A pooled analysis and meta-analysis of 22 studies. Br J Gen Pract 2007; 57: 144–151.

13.

Smith

Wright

Rush

Stark

Velikova

Selby

. Rasch analysis of the dimensional structure of the Hospital Anxiety and Depression Scale. Psycho-Oncol 2006; 15: 817–827.

14.

Tang

Wong

Chiu

. The Geriatric Depression Scale should be shortened: results of Rasch analysis. Int J Geriatr Psychiatry 2005; 20:783–789.

15.

Fliege

Becker

Walter

Bjorner

Klapp

Rose

. Development of a computer-adaptive test for depression (D-CAT). Qual Life Res 2005; 14:2277–2291.

16.

Smith

Rush

Wright

Stark

Velikova

Sharpe

. Validation of an item bank for detecting & assessing psychological distress in cancer patients. Psycho-Oncol 2009; 18:195–199.

17.

Becker

Fliege

Kocalevent

. Functioning and validity of a computerized adaptive test to measure anxiety (A-CAT). Depress Anxiety 2008; 25:E182–194.

18.

Gershon

. Computer adaptive testing. J Appl Meas 2005; 6: 109–127.

19.

Forkmann

Boecker

Norra

. Development of an item bank for the assessment of depression in persons with mental illnesses and physical diseases using Rasch analysis. Rehabil Psychol 2009; 54:186–197.

20.

Radloff

. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas 1977; 1: 385–401.

21.

Spitzer

Williams

Kroenke

. Utility of a new procedure for diagnosing mental disorders in primary care. The PRIME-MD 1000 study. JAMA 1994; 272:1749–1756.

22.

Zigmond

Snaith

. The Hospital Anxiety and Depression Scale. Acta Psych Scand 1983; 67:361–370.

23.

Bjelland

Dahl

Haug

Neckelmann

. The validity of the Hospital Anxiety and Depression Scale. An updated literature review. J Psychosom Res 2002; 52:69–77.

24.

Addington

Maticka-Tyndale

Joyce

. Reliability and validity of a depression rating scale for schizophrenics. Schizophr Res 1992; 6:201–208.

25.

Rasch

. Probabilistic models for some intelligence and attainment tests. Chicago, IL: University of Chicago Press, 1980.

26.

Masters

. A Rasch model for partial credit scoring. Psychometrika 1982; 47:149–174.

27.

Linacre

. A user's guide to Winsteps/Ministeps Rasch-model programs. Chicago, IL: MESA Press, 2005.

28.

Smith

. Detecting and evaluation the impact of multidimensionality using item fit statistics and principal component analysis of residuals. J Appl Meas 2002; 3:205–231.

29.

Bode

Lai

Dineen

Heinemann

Shevrin

Von Roenn

Cella

. Expansion of a physical function item bank and development of an abbreviated form for clinical research. J Appl Meas 2006; 7:1–15.

30.

Smith

Rush

Velikova

. The initial development of an item bank to assess and screen for psychological distress in cancer patients. Psycho-Oncol 2007; 16:724–732.

31.

Smith

Rush

Fallowfield

Velikova

Sharpe

. Rasch fit statistics and sample size considerations for polytomous data. BMC Med Res Methodol 2008; 8:33.

32.

Fliege

Becker

Walter

Rose

Bjorner

Klapp

. Evaluation of a computer-adaptive test for the assessment of depression (D-CAT) in clinical application. Int J Methods Psychiatr Res 2009; 18:23–36.

33.

Gibbons

Weiss

Kupfer

. Using computerized adaptive testing to reduce the burden of mental health assessment. Psychiatr Serv 2008; 59:361–368.

34.

Regier

Narrow

Kuhl

. The conceptual development of DSM-V. Am J Psych 2009; 166: 645–650.

35.

Joyce

. Classification of mood disorders in DSM-V and DSM-VI. Aust N Z J Psych 2008; 42:851–862.

36.

Klein

. Classification of depressive disorders in the DSM-V: Proposal for a two-dimension system. J Abnorm Psychol 2008; 117:552–560.

37.

Parker

. How should mood disorders be modelled? Aust N Z J Psych 2008; 42:841–850.

38.

Mitchell

McGlinchey

Young

. Accuracy of specific symptoms in the diagnosis of major depressive disorder in psychiatric out-patients: data from the MIDAS project. Psychol Med 2009; 39:1107–1116.

39.

Zimmerman

Galione

Chelminski . A simpler definition of major depressive disorder. Psychol Med 2010; 40:451–457.

Redefining Diagnostic Symptoms of Depression Using Rasch Analysis: Testing an Item Bank Suitable for DSM-V and Computer Adaptive Testing

Abstract

Keywords

Materials and methods

Sample and patients

Tools

Rasch analysis

Methods: general

Methods: individual instruments

Methods: item bank

Results

Individual instruments

Threshold and category disordering

Item misfit

Dimensionality

Item bank

Item misfit

Dimensionality

Differential item functioning

Accuracy against MINI DSM-IV Criteria for MDD

Discussion

References