Abstract

Background: In population surveys, the assessment of mania is commonly done by trained lay interviewers using structured diagnostic instruments: the validity of this approach has been questioned. We examined the criterion validity and prevalence of lifetime mania in a survey of Swedish twins conducted with interview methodology usually applied in psychiatric epidemiology.

Methods: 41 838 individuals in the Swedish Twin Registry were evaluated via a telephone interview that included the eight DSM-IV mania items, and these data were merged with inpatient hospitalization discharge diagnoses from two comprehensive national registries (the criterion). An algorithm with eight cut-points was used to diagnose lifetime mania and was compared with the criterion by means of a receiver operating characteristic (ROC) curve. The algorithm requiring at least four positive items resembled a DSM-IV diagnosis.

Results: History of hospitalization for a psychiatric condition that included a manic episode was present for 0.7% of all living twins, and predicted non-response to the survey (OR = 0.5; 95% CI = 0.4–0.6). The incidence rate for first hospitalization was 2.1/10 000 year⁻¹. For ≥1 symptom (first cut-point), the prevalence, sensitivity and specificity were 3.6%, 39.0% and 96.6%; for ≥4 symptoms (DSM-IV-like cut-point) 2.6%, 36.5% and 97.6%; and for eight symptoms 0.3%, 18.0% and 99.8%. Positive predictive values were, respectively, 5.5%, 7.0% and 29.8%.

Conclusions: The performance of the telephone screening for mania by lay interviewers in terms of positive predictive power was not satisfactory; despite a high specificity, the false positive rate was high. The low population prevalence of mania, non-response bias, criterion choice and inherent limitations of the interviewing method are among the explanations. Assessment of a lifetime manic episode based on lay interviewer screening may yield misleading data.

Keywords

epidemiology, incidence, bipolar disorder, population screening, reliability, survey research

In psychiatric epidemiology, mania is commonly assessed via personal interviews with structured diagnostic instruments conducted by ‘lay’ personnel (i.e. individuals with some relevant training but who are not experienced clinicians). The validity of this approach has rarely been studied and some data suggest that it may be poor. In a small clinical reappraisal study, Kessler and coworkers [1] found that 70.8% of community respondents classified as bipolar type-I in the National Comorbidity Survey (NCS) [2] were not confirmed when reinterviews were carried out by trained clinicians using the Structured Clinical Interview for DSM-III-R (SCID) [3].

In the present report, we investigate the accuracy of the diagnosis of a lifetime manic episode from the telephone screening of a national survey of Swedish twins, the Screening Across the Lifespan Twin (SALT) study. In addition to evaluating the validity of lifetime prevalence estimates of mania in this sample of the general population, we sought to evaluate the performance of different diagnostic algorithms.

The Swedish Twin Registry (STR) is a longitudinal database as well as the largest twin registry in the world. It has been the source of a large number of epidemiological, genetic epidemiological, and molecular genetic studies. Its structure has been described in detail elsewhere: the registry is composed of three cohorts comprising all twins born in 1886–1925 (‘Old cohort’), 1926–1958 (‘Middle cohort’) and 1959–1990, respectively [4], [5].

This report has four goals: (i) to assess participation bias, with relation to hospitalization for a psychiatric condition that included a manic episode, by comparing participants and non-participants in the SALT survey; (ii) to measure the concordance between lifetime mania, as assessed by telephone interview, with hospitalization data (i.e. postdictive criterion validity analysis) [6]; (iii) to understand which individual characteristics (e.g. age and gender) predict agreement; and (iv) to estimate prevalence and incidence of hospitalization history for a psychiatric condition consistent with the presence of a manic episode, and the lifetime prevalence of mania at the interview in this nationally comprehensive twin sample.

Method

Sample, databases, and criterion assessment

All twins born in Sweden 1958 and earlier (the ‘Old and Middle cohorts’ of the STR) who were still alive at the beginning of 1998 were invited to participate in a computer-assisted telephone interview. Psychiatric conditions screened for include lifetime manic and depressive episodes. The criterion to which we compared diagnoses derived from the telephone interview was inpatient hospitalization for a psychiatric condition that included a manic episode. We linked, by means of personal identification numbers, all twins in the SALT target sample (regardless of participation in SALT) to hospitalization records derived either from the Swedish National Psychiatric Registry (years 1969–1986) or the National Inpatient Registry (years 1983-present): the 4-year overlap is due to a transition period in which the hospitalization data changed registry of destination following a regional pattern. These two registries are essentially a complete listing of all inpatient admissions in Sweden and have been the source of numerous publications on the epidemiology of psychiatric disorders [7], [8]. The criterion was considered present if the hospital discharge diagnoses included a standardized code consistent with the presence of a manic episode. During the years included in this study, for bipolar disorder (BD), these codes included ICD-8 (296.1, 296.2, and 296.3), ICD-9 (296.0, 296.1, 296.4-296.7), and ICD-10 (F30.0, 30.2 and 30.9 and F31.0-31.9). As schizoaffective disorder (SAD) can also include a manic episode, we included the relevant ICD-8/ICD-9 (295.7), and ICD-10 codes (F25.0) [9], [10]. The data collection procedures were reviewed and approved by the Swedish Data Inspection Board and the Regional Ethics Committee of the Karolinska Institutet. All subjects provided verbal informed consent during the telephone interview which was later confirmed by postcard.

Diagnostic assessment

The SALT interview was constructed in a branching format that allowed skipping follow-up questions whenever a person answered negatively to key introductory items. Lay interviewers, appropriately trained and possessing adequate medical background, assessed the presence or absence of manic symptoms. The eight questions used to pose a diagnosis of mania over the lifetime are based on the ICD-10 and the DSM-IV criteria of manic episode [10], [11] and were adapted from the Composite International Diagnostic Interview (CIDI) [4], [12]. The SALT interview, akin to the CIDI, has a stem question based on the DSM-IV ‘A’ criterion for mania: ‘Has there ever been a period of 1 week or more when you were so happy or excited or high that other people did not think you were your usual self?’ Respondents who answered positively were asked seven additional questions and those who answered negatively skipped to the next section. The other seven diagnostic items cover the DSM-IV ‘B’ criterion for a manic episode (excessive self-confidence, reduced need for sleep, increased talkativeness, flight of ideas, distractibility, augmented activity, and involvement in pleasurable/risky behaviours). The presence of mixed (‘C’ criterion) or psychotic features, impairment or hospitalization as consequences of the episode (‘D’), and the presence of concomitant medical conditions or treatments (‘E’) were not assessed.

A diagnostic algorithm with eight cut-points for the definition of mania was tested with the number of items ranging from one (criterion ‘A’ stem question only positive) to eight (criterion ‘A’ plus all seven items of the ‘B’ criterion positive). The algorithm requiring at least three positive ‘B’ items had a DSM-IV-like structure.
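The branching format and the eight cut-points described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SALT scoring code: the names `count_positive_items` and `meets_cut_point` and the example respondent are hypothetical.

```python
from typing import Sequence

# Stem ('A') question plus at least three 'B' items, as described in the text.
DSM_IV_LIKE_CUT_POINT = 4


def count_positive_items(stem_positive: bool, b_items: Sequence[bool] = ()) -> int:
    """Return the number of positive mania items (0 to 8).

    The seven 'B' items are only asked when the stem question is positive,
    so a negative stem always yields zero positive items.
    """
    if not stem_positive:
        return 0
    if len(b_items) != 7:
        raise ValueError("expected answers to the seven 'B' items")
    return 1 + sum(bool(answer) for answer in b_items)


def meets_cut_point(n_positive: int, cut_point: int) -> bool:
    """True when the respondent reaches the given cut-point (1 to 8)."""
    if not 1 <= cut_point <= 8:
        raise ValueError("cut-point must be between 1 and 8")
    return n_positive >= cut_point


# Hypothetical respondent: positive stem and three positive 'B' items.
respondent = count_positive_items(True, [True, True, True, False, False, False, False])
is_dsm_iv_like_case = meets_cut_point(respondent, DSM_IV_LIKE_CUT_POINT)
```

Treating the cut-point as a threshold on a single count keeps all eight algorithms in one function, which is convenient when every cut-point has to be compared against the same criterion.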

Analytic procedures

Characteristics of participants and non-participants were compared using simple statistics and logistic regression [13].

We conducted two separate receiver operating characteristic (ROC) analyses, one for the primary criterion ‘≥ 1 hospitalization for BD or SAD’ and another for the secondary criterion ‘≥ 2 hospitalizations’, to assess specificity and sensitivity at each of eight cut-points [14]. Positive and negative predictive values [15] and κ statistics [16] were calculated.
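For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, predictive values and Cohen's κ follow from a single 2 × 2 table of test result against criterion. The class name `TwoByTwo` and the counts are hypothetical and chosen only to resemble a low base-rate screening setting; they are not the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of a screening test against a criterion."""
    tp: int  # test positive, criterion present
    fp: int  # test positive, criterion absent
    fn: int  # test negative, criterion present
    tn: int  # test negative, criterion absent

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        """Positive predictive value (positive predictive power)."""
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        """Negative predictive value (negative predictive power)."""
        return self.tn / (self.tn + self.fn)

    def kappa(self) -> float:
        """Cohen's kappa: agreement beyond that expected by chance."""
        observed = (self.tp + self.tn) / self.n
        test_pos = (self.tp + self.fp) / self.n
        crit_pos = (self.tp + self.fn) / self.n
        expected = test_pos * crit_pos + (1 - test_pos) * (1 - crit_pos)
        return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
table = TwoByTwo(tp=80, fp=1300, fn=120, tn=38500)
summary = {
    "sensitivity": table.sensitivity(),
    "specificity": table.specificity(),
    "ppv": table.ppv(),
    "npv": table.npv(),
    "kappa": table.kappa(),
}
```

With counts of this shape, specificity and negative predictive value are both high while the positive predictive value and κ remain low, which is the pattern that a rare criterion tends to produce.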

Characteristics of true and false positives were then compared fitting a logistic regression model to the data [13], [17]. Models took into account the twin-structure of data and correlation of measures among family members. Discrimination of the final logistic models was tested through C-statistics, and calibration of the models through Hosmer-Lemeshow χ² goodness-of-fit statistics [13]. Lifetime prevalences were computed for individuals hospitalized with a discharge diagnosis consistent with the presence of a manic episode, as well as for those diagnosed with a lifetime manic episode through the diagnostic algorithm. Using first admission dates for BD or SAD, we calculated incidence rates (IR) of first hospitalization and smoothed hazard estimate curves. All analyses were performed using the Intercooled Stata 8.2 software package [18].

Results

A total of 60 236 twins were eligible for participation and 41 838 individuals (69.46%) had completed at least part of the SALT interview. As shown in Table 1, participants in the study were more likely than non-participants to be female, to be a member of a monozygotic twin pair, to be like-sexed when dizygotic twins, and not to have had a history of hospitalization with a discharge diagnosis consistent with the presence of a manic episode. These results did not change considerably when each association was adjusted in a multivariate fashion for the other measured covariates.

Table 1.

Characteristics of 60 236 twins who did and did not participate in the SALT study

	Participants 41 838 n (%)	Non participants 18 398 n (%)	OR (95% CI)
Female sex	22 487 (53.75)	9406 (51.13)	1.11 (1.07–1.15)
Age (mean ± SD)	58.05 ± 11.97 years	58.20 ± 13.30 years	–0.15 years (−0.36–0.06)^†
Twin zygosity
Monozygotic	10 155 (24.27)	3205 (17.42)	1.23 (1.18–1.29)
Dizygotic
Same sex	15 906 (38.02)	5623 (30.56)	1.20 (1.15–1.26)
Opposite sex	14 937 (35.70)	6359 (34.56)
Unknown	840 (2.01)	3211 (17.45)
Hospitalizations for mania
At least one	224 (0.54)	197 (1.07)	0.50 (0.41–0.61)
At least two	114 (0.27)	120 (0.65)	0.42 (0.32–0.54)

^†Expressed as difference of mean values (95% CI)
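As a worked check on the last rows of Table 1, the unadjusted odds ratio for ‘at least one’ hospitalization can be recomputed from the counts in the table. The sketch below uses a Woolf (log-based Wald) confidence interval, which is an assumption: the interval in the table may have been obtained with a different procedure, so small differences in the last decimal are to be expected. The function name `odds_ratio_wald` is hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio (a/b) / (c/d) with a Woolf log-based confidence interval."""
    odds_ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Counts taken from Table 1 ('At least one' hospitalization).
participants_total, participants_hosp = 41838, 224
nonparticipants_total, nonparticipants_hosp = 18398, 197

or_hosp, ci_low, ci_high = odds_ratio_wald(
    participants_hosp,
    participants_total - participants_hosp,
    nonparticipants_hosp,
    nonparticipants_total - nonparticipants_hosp,
)
```

The point estimate comes out close to the 0.50 shown in the table, that is, the odds of a hospitalization history among participants are about half those among non-participants.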

Characteristics of twins negative and positive for a lifetime manic episode (DSM-IV-like algorithm) are reported in Table 2. Twins positive for mania tended to be male, younger, unmarried and with lower self-reported health, and were about six times more likely to be positive for a depressive lifetime episode at the interview.

Table 2.

Characteristics of 41 838 participating twins positive or negative for lifetime mania according to the SALT telephone interview (DSM-IV-like algorithm)

	Positive for mania 1045 n (%)	Negative for mania 38 762 n (%)	OR (95% CI)
Female sex	528 (50.53)	20 845 (53.78)	0.88 (0.77–0.99)
Age at interview (mean ± SD)	54.75 ± 8.46 years	59.95 ± 10.65 years	–5.20 years (−5.85– −4.55)^†
Twin zygosity
Monozygotic	253 (24.21)	9436 (24.34)	0.99 (0.86–1.15)
Dizygotic
Same sex	368 (35.22)	14 704 (37.93)	0.89 (0.77–1.03)
Opposite sex	392 (37.51)	13 919 (35.91)
Unknown	32 (3.06)	703 (1.81)
Marital status
Single	152 (14.55)	3666 (9.46)	1.63 (1.36–1.95)
Partner	162 (15.50)	4410 (11.38)	1.43 (1.20–1.70)
Married	495 (47.37)	24 271 (62.62)	0.54 (0.47–0.61)
Separated	18 (1.72)	197 (0.51)	3.43 (1.98–5.59)
Divorced	157 (15.02)	2787 (7.19)	2.28 (1.91–2.73)
Widow	60 (5.74)	3419 (8.82)	0.63 (0.48–0.82)
Unknown	1 (0.10)	12 (0.03)
Living arrangement
House (villa)	488 (46.70)	22 391 (57.77)	0.64 (0.57–0.73)
Town-house	75 (7.18)	3041 (7.85)	0.91 (0.71–1.15)
Flat	469 (44.88)	13 065 (33.71)	1.60 (1.41–1.82)
Other	12 (1.15)	252 (0.65)	1.78 (0.90–3.17)
Unknown	1 (0.10)	13 (0.03)
Self reported health
Excellent	241 (23.06)	12 549 (32.37)	0.63 (0.54–0.73)
Good	306 (29.28)	14 170 (36.56)	0.72 (0.63–0.82)
Average	272 (26.03)	8089 (20.87)	1.33 (1.16–1.54)
Not good	165 (15.79)	3026 (7.81)	2.21 (1.86–2.63)
Bad	59 (5.65)	841 (2.17)	2.70 (2.02–3.55)
Unknown	2 (0.19)	87 (0.22)
Positive for major depression^‡	589 (56.36)	6973 (17.99)	5.89 (5.19–6.69)
Unknown	12 (1.15)	516 (1.33)

^†Expressed as difference of mean values (95% CI); ^‡assessed at the SALT interview with a DSM-IV-like algorithm

Of twins who endorsed both the mania and the depression questions, about 43% of those with a diagnosis of manic episode (1.10% of respondents) were negative for lifetime depression: history of hospitalization among them was only 3.20%, against 10.00% for those positive to both mania and depression at the interview. Among 63 individuals who reported being on medications for mania, only 24 (38.10%) had a history of hospitalization for BD or SAD.

To test the validity of lifetime prevalence estimates of mania via telephone interview we used two criteria. The first required ≥1 hospitalization with a discharge diagnosis consistent with the presence of a manic episode and the second required ≥2 hospitalizations. We then conducted ROC analysis using the criterion of ‘at least one hospitalization’ for eight cut-points, from the least (criterion ‘A’ only) to the most restrictive (criterion ‘A’ plus all seven positive ‘B’ items). The most accurate overall cut-point (i.e. closest to the upper-left corner in the graph of the ROC curve) was the least restrictive (criterion ‘A’ only; sensitivity 39.00% and specificity 96.61%), while the DSM-IV-like cut-point showed a lower level of total accuracy (sensitivity 36.50% and specificity 97.55%), but was more specific. The area under the curve (AUC) was 0.68.

The prevalence of twins having at least one lifetime hospitalization for BD or SAD is 0.54% among those who took part in the SALT study and 0.70% overall, while 3.57% is the prevalence of those who responded positively to the stem question: this cut-point showed a negative predictive power (NPP) of 99.68% and a positive predictive power (PPP) of 5.49%, indicating that only 5–6 subjects out of 100 who are diagnosed using the stem question alone were ever hospitalized. The κ statistic was 0.09. The DSM-IV-like cut-point had a marginally better PPP (6.99%) and an equivalent very high NPP (99.67%), while κ was 0.11.

Another ROC curve was computed to test the criterion of ‘≥ 2 hospitalizations.’ As expected, the sensitivity of the algorithm increased and its better performance is mirrored in an AUC of 0.76. The most accurate cut-point was again the first, requiring just the stem question.

Adopting a more selective criterion, the prevalence of twins with a history of hospitalization decreased to 0.27% among SALT participants, and to 0.39% for all subjects in the STR. The first cut-point had a NPP of 99.88% and a PPP of 3.94%: only 4 out of 100 who were considered positive by virtue of the answers to the telephone interview had been hospitalized at least twice for a condition consistent with the presence of a manic episode. κ for this comparison was 0.07.

Predictors of agreement between mania at the interview (DSM-IV-like algorithm) and history of hospitalization (at least one) were identified by comparing true versus false positives in a logistic regression model: lifetime depression at the interview (adjusted odds ratio [OR] = 3.23), absence of a spouse or partner (1.92), low self-reported health (1.91), female sex (1.41) and older age at interview (1.30, for each 10-year increase in age) predicted convergent diagnoses. Variables tested but not included in the final model comprise zygosity, housing, tobacco, coffee and alcohol consumption. Model discrimination between true and false positives was acceptable (C-statistic = 0.73), and the null hypothesis of the Hosmer-Lemeshow χ² goodness-of-fit test, that there is no difference between the observed and the predicted values, was not rejected (H-L χ² = 5.83, p = 0.67) [13].

History of hospitalization for BD or SAD was present for 0.70% of all living twins, while lifetime prevalence of mania at the interview (DSM-IV-like algorithm) was 2.63%.

The incidence rate (IR) for first hospitalization for BD or SAD was 2.08/10 000 year⁻¹ (95% CI = 1.88–2.29) over a 32-year period (1969–2000). The IR for males was 1.39/10 000 year⁻¹ (95% CI = 1.17–1.66), and almost twice as high for females: 2.69/10 000 year⁻¹ (95% CI = 2.39–3.03).
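An incidence rate of this kind is the number of first events divided by the person-time at risk, scaled to 10 000 person-years. The sketch below is only an illustration: the event count and person-years are hypothetical (the report does not list them), the function name `incidence_rate` is ours, and the log-based Poisson interval is one common approximation rather than necessarily the method used for the figures above.

```python
import math
from typing import Tuple


def incidence_rate(events: int, person_years: float,
                   per: float = 10_000, z: float = 1.96) -> Tuple[float, float, float]:
    """Rate per `per` person-years with an approximate 95% CI.

    The interval assumes a Poisson count of events, so the standard error
    of the log rate is 1 / sqrt(events).
    """
    rate = events / person_years * per
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical inputs for illustration only.
rate, rate_low, rate_high = incidence_rate(events=400, person_years=2_000_000)
```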

Discussion

Validity and predictors of agreement

We compared two possible definitions of the gold standard or criterion, against eight possible definitions of mania based on different cut-offs at the interview, for a total of 16 possible combinations. As shown in Figure 1 and Table 3, the validation analysis encompassed sensitivity and specificity, negative and positive predictive values and κ statistics, allowing an appreciation of the dynamics of the assessment of mania in a community sample.

Figure 1.

Prevalence of positive items for mania. Sensitivity, specificity, negative and positive predictive power at each cut-point for two different definitions of the criterion. A vertical line is drawn at the DSM-IV-like cut-point.

Table 3.

Prevalence, ROC analysis, NPP, PPP and κ statistics comparing the eight cut-points and the two criteria of ‘at least one’ (primary) and ‘at least two’ (secondary) lifetime hospitalizations for mania^†

Cut-point	Prevalence	Sensitivity	Specificity	NPP	PPP	κ
Primary criterion (prevalence 0.54%)
I	3.57	39.00	96.61	99.68	5.49	0.088
	(3.39–3.75)	(32.20–46.13)	(96.42–96.78)	(99.62–99.74)	(4.36–6.80)	(0.003)
II	3.47	38.00	96.71	99.68	5.50	0.088
	(3.29–3.65)	(31.25–45.11)	(96.52–96.88)	(99.62–99.73)	(4.36–6.84)	(0.003)
III	3.20	38.00	96.97	99.68	5.96	0.095
	(3.03–3.37)	(31.25–45.11)	(96.80–97.14)	(99.62–99.73)	(4.72–7.40)	(0.003)
(DSM-IV-like cut-point)
IV	2.63	36.50	97.55	99.67	6.99	0.110
	(2.47–2.79)	(29.82–43.58)	(97.39–97.70)	(99.61–99.73)	(5.52–8.70)	(0.004)
V	1.97	33.50	98.18	99.66	8.52	0.129
	(1.83–2.11)	(27.00–40.50)	(98.05–98.31)	(99.60–99.71)	(6.67–10.70)	(0.004)
VI	1.24	29.00	98.90	99.64	11.74	0.161
	(1.13–1.35)	(22.82–35.82)	(98.79–99.00)	(99.57–99.70)	(9.04–14.91)	(0.005)
VII	0.73	24.50	99.39	99.62	16.96	0.196
	(0.65–0.81)	(18.71–31.06)	(99.31–99.47)	(99.55–99.68)	(12.81–21.79)	(0.005)
VIII	0.30	18.00	99.79	99.59	29.75	0.221
	(0.25–0.35)	(12.94–24.04)	(99.73–99.83)	(99.52–99.65)	(21.79–38.74)	(0.005)
Secondary criterion (prevalence 0.27%)
I	3.57	54.37	96.56	99.88	3.94	0.069
	(3.39–3.75)	(44.26–64.22)	(96.38–96.74)	(99.84–99.91)	(2.99–5.08)	(0.003)
II	3.47	52.43	96.66	99.87	3.91	0.068
	(3.29–3.65)	(42.35–62.36)	(96.48–96.83)	(99.83–99.91)	(2.95–5.07)	(0.003)
III	3.20	52.43	96.92	99.87	4.24	0.074
	(3.03–3.37)	(42.35–62.36)	(96.75–97.09)	(99.83–99.91)	(3.20–5.49)	(0.003)
(DSM-IV-like cut-point)
IV	2.63	51.46	97.50	99.87	5.07	0.088
	(2.47–2.79)	(41.40–61.42)	(97.34–97.65)	(99.83–99.90)	(3.82–6.58)	(0.003)
V	1.97	48.54	98.15	99.86	6.36	0.108
	(1.83–2.11)	(38.58–58.60)	(98.01–98.28)	(99.82–99.90)	(4.76–8.30)	(0.003)
VI	1.24	39.81	98.86	99.84	8.30	0.134
	(1.13–1.35)	(30.29–49.92)	(98.75–98.96)	(99.80–99.88)	(6.02–11.09)	(0.004)
VII	0.73	33.98	99.36	99.83	12.11	0.175
	(0.65–0.81)	(24.94–43.97)	(99.28–99.44)	(99.78–99.87)	(8.58–16.44)	(0.004)
VIII	0.30	24.27	99.76	99.80	20.66	0.221
	(0.25–0.35)	(16.36–33.71)	(99.70–99.80)	(99.75–99.84)	(13.84–28.97)	(0.005)

^†Values expressed as percentage (95% CI); κ statistics (SE); ROC, receiver operating characteristic; NPP, negative predictive power; PPP, positive predictive power

With the ROC analysis we have looked at all the possible combinations between the two most plausible inpatient criteria and the eight cut-points corresponding to the number of positive items. In both ROC curves, using ‘≥ 1’ or ‘≥ 2’ hospitalizations, the most accurate overall cut-point was just with a positive A-criterion. Because of low PPP and high false positive rate, the corresponding interview-based algorithm would not be optimal for future studies of mania that may require more specific diagnoses. In such a case a diagnostic procedure with positive ‘B’ items is warranted, even if it implies reducing the number of eligible subjects for the study.
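The notion of the ‘most accurate overall cut-point’ can be made concrete with the sensitivity and specificity values reported in Table 3 for the primary criterion. The sketch below picks the cut-point closest to the upper-left corner of the ROC plot and computes a trapezoidal area under the curve. Both choices are assumptions made for illustration (the Euclidean distance rule and the trapezoidal rule are common conventions, but the report does not state which were used), and the function names are hypothetical.

```python
import math
from typing import List, Tuple

# (cut-point, sensitivity, specificity) for the primary criterion, from Table 3.
PRIMARY: List[Tuple[str, float, float]] = [
    ("I", 0.3900, 0.9661),
    ("II", 0.3800, 0.9671),
    ("III", 0.3800, 0.9697),
    ("IV", 0.3650, 0.9755),
    ("V", 0.3350, 0.9818),
    ("VI", 0.2900, 0.9890),
    ("VII", 0.2450, 0.9939),
    ("VIII", 0.1800, 0.9979),
]


def closest_to_corner(rows: List[Tuple[str, float, float]]) -> str:
    """Cut-point whose ROC point lies nearest to (FPR = 0, TPR = 1)."""
    return min(rows, key=lambda r: math.hypot(1 - r[2], 1 - r[1]))[0]


def trapezoidal_auc(rows: List[Tuple[str, float, float]]) -> float:
    """Area under the empirical ROC curve, anchored at (0, 0) and (1, 1)."""
    points = sorted([(0.0, 0.0), (1.0, 1.0)] + [(1 - spec, sens) for _, sens, spec in rows])
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


best_cut_point = closest_to_corner(PRIMARY)
auc = trapezoidal_auc(PRIMARY)
```

Under these conventions the first cut-point is selected and the area comes out near the AUC reported for the primary criterion. Most of that area lies under the long segment joining the first cut-point to the (1, 1) corner, which is why the curve can look acceptable even though sensitivity never exceeds 39%.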

Using the second criterion (‘≥ 2 hospitalizations’) means reducing criterion prevalence and the test appears to have a better sensitivity [19]: likely, hospitalization is a specific but not highly sensitive measure, probably fairly sensitive for BD type-I (i.e. mania) but less for type-II (i.e. hypomania). The second criterion, being more restrictive, is more specific but less sensitive than the first criterion for ‘true’ (i.e. clinically diagnosable) mania. Indeed, if we consider the triple truth-criterion-test relation, there is a trade-off between a more sensitive truth-criterion relation and a more sensitive criterion-test relation.

Reasons for poor agreement between the survey-based diagnosis of mania and the criterion can be identified in the low prevalence or base-rate of the illness [15] and of the corresponding hospitalization criterion, further decreased in typical survey samples by non-response bias, since subjects with a history of BD or SAD are less likely to participate in a survey. Additional explanations for the difficulty of diagnosing mania accurately in the community are the employment of lay interviewers, inadequate patient recollection, illness denial and poor insight: the structured nature of interviews performed by non-clinicians does not allow further expert exploration of patient answers.

In our study, the PPP for a DSM-IV mania diagnosis (7.0%) was much lower than in the NCS study (29.2%) [1]; the reasons can be found in criterion choice, interview modality and sample size. On the one hand, in the small NCS clinical reappraisal study the criterion used was the gold standard of a clinically diagnosed condition; also, a more sophisticated approach to question formulation and interpretation at the interview was used [20], whereas we employed a basic set of CIDI questions [4]. In addition, the study by Kessler et al., like previous surveys of mania [21], adopted face-to-face interviews [2], whereas our data are based on telephone interviews, allowing cost-efficient screening of a very large number of individuals. It has been shown that agreement between telephone and face-to-face interviews for the assessment of major mood disorders is good, and minor disagreements are counterbalanced by economic and logistic advantages [22]. On the other hand, the small sample size of the NCS clinical reappraisal yields imprecise estimates owing to sampling variability, whereas our STR-based estimates are calculated on a very large study population. Cross-national differences in the prevalence of lifetime mania between the Swedish and the U.S. populations could in part account for the different results.

Predictors of false positive status for lifetime mania include characteristics that can be seen as related to well-being or enhanced psychomotor activity, such as no lifetime depression at the interview, presence of a spouse or partner, high self-reported health, male gender, and younger age. Subjects who are positive for lifetime mania but not for lifetime depression at the interview could include a few cases with a history of unipolar mania [23].

Prevalence and incidence

Lifetime prevalence of DSM-IV mania diagnosed via telephone interview was 2.63%. Because of non-response, this estimate is calculated on the population of participants, which has 0.54% prevalence of hospitalization as shown in 1], compared to the previously reported 1.6% [2]. The earlier Epidemiologic Catchment Area (ECA) study estimated a 0.8% lifetime prevalence of manic episodes [21]. High false positive BD rates in community surveys based on the CIDI questionnaire have also been reported in a Dutch study where only 11 of 49 subjects were reconfirmed as having BD type-I when reinterviewed by trained clinicians with the SCID [24].

A recent study that employed the Mood Disorder Questionnaire (MDQ) estimated the lifetime prevalence of BD to be as high as 3.7% in US adults [25]. The limitations of the MDQ study have been debated [26]. The authors considered the 3.7% estimate to be conservative, acknowledging that limitations of the study were likely to underestimate the prevalence of BD [25]. In the MDQ community-based validation study, Hirschfeld and collaborators [27] reported sensitivity and specificity of the screening instrument to be 28.1% and 97.2%. Even a specificity that is very close to 100% can yield high false positive rates with low base-rates, as is the case for the lifetime prevalence of manic episodes in the community.

As Baldessarini and colleagues pointed out, ‘along with sensitivity and specificity, the prevalence rate of the target diagnosis in the test population is of critical importance in estimating a test's predictive power or utility’ [15]. Disregarding base-rates in evaluating the results of diagnostic tests seems common [28]. Focusing solely on sensitivity and specificity of a test can induce the reader to mistake inflated prevalence estimates for accurate estimates [29].
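The dependence of predictive power on the base-rate follows directly from Bayes' theorem and can be illustrated with figures quoted in this section. The sketch below takes the MDQ sensitivity and specificity (28.1% and 97.2%) and evaluates the positive predictive value at the three lifetime prevalence estimates mentioned above (0.8%, 1.6% and 3.7%). Pairing the MDQ accuracy figures with those base-rates is an assumption made purely for illustration, and the function name `positive_predictive_value` is hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a test-positive subject truly has the condition."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


MDQ_SENSITIVITY = 0.281  # quoted in the text [27]
MDQ_SPECIFICITY = 0.972  # quoted in the text [27]

# Lifetime prevalence estimates quoted in the text: ECA, NCS and MDQ studies.
base_rates = [0.008, 0.016, 0.037]

ppv_by_base_rate = {
    rate: positive_predictive_value(MDQ_SENSITIVITY, MDQ_SPECIFICITY, rate)
    for rate in base_rates
}
```

Even with specificity above 97%, most test-positive subjects are false positives at each of these base-rates, and the proportion of false positives grows as the base-rate falls.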

Finally, the IR of first hospitalization for BD or SAD was 1.39/10 000 year⁻¹ for males (95% CI = 1.17–1.66), and almost twice as high for females: 2.69/10 000 year⁻¹ (95% CI = 2.39–3.03). These results are consistent with the existing literature on incidence rates of BD [30].

Criterion and participation bias

To our knowledge, this is the first study comparing the performance of a telephone screening interview using hospitalization history as a criterion. The choice of the hospitalization criterion has strengths and weaknesses. Among the strengths there is the unique opportunity offered by the Swedish network of registries and by the universal health coverage in the country, which makes hospital discharge records useful. Indeed, as far as access to care is maximized, hospitalization is a reasonably accurate measure of the lifetime prevalence of manic episodes. The hospitalization criterion allows estimating participation bias, an occurrence that is usually difficult to assess in psychiatric survey studies.

One weakness is the necessity to assume as a criterion an entity that is less prevalent than the illness itself. The low base-rate of mania is a major source of inaccuracy [15]. Subjects who were still alive in 1998 and who had a history of hospitalization limited to the period preceding 1969 are negative to our criterion for mania. Other phenomena that could account for an underestimation of lifetime manic episodes through the hospitalization criterion are the closing of some psychiatric hospitals in Sweden, and that in the scarcely populated northern areas of the country there are often large distances from the closest medical facility. Furthermore, only part of those individuals who reported being on medications for mania had a history of hospitalization for a psychiatric condition consistent with the presence of a manic episode: this may suggest that the hospitalization criterion has limited accuracy. Sources of measurement inaccuracy in terms of either a manic episode classified otherwise or a non-manic patient (e.g. schizophrenia or agitated depression) classified as bipolar or schizoaffective at hospital discharge may encompass limited reliability of ICD diagnostic categorizations as appraised by hospital psychiatrists, or an error in transcribing the corresponding codes in the registry system.

Hospitalization for a psychiatric condition that included a manic episode was twofold higher in non-responders, as those affected by a current or a past psychiatric condition are less predisposed to participate in surveys. This finding is consistent with previous studies [31].

Regarding generalizability, the Swedish twin population is not distinguishable from singletons in terms of incidence of treated psychotic and affective illness [32]. Our conclusions thus appear to be fairly applicable to all national residents.

Conclusions

The present analyses of the STR data of mania suggest caution in interpreting results of surveys investigating lifetime prevalence of mania or BD and emphasize limitations of current methods used for population screening. In reading and understanding results from such studies special attention should be given to the possibly high rate of false positives for a lifetime manic episode, which may overestimate the true prevalence of this condition.

Our ascertainment of non-response among those with and without a history of hospitalization for a psychiatric condition that included a manic episode provides an OR estimate of 0.5 that could be used as a correction factor in analysing or interpreting surveys of lifetime mania or BD where no valid direct estimate of this important source of bias is available.

Footnotes

Acknowledgements

Presented in part at the 156th Annual Meeting of the American Psychiatric Association, May 17–22, 2003, San Francisco, CA, and at the Fifth International Conference on Bipolar Disorder, 12–14 June, 2003, Pittsburgh, PA, US.

This report is based on data collected in the Screening Across the Lifespan Twin (SALT) study. PFS & NLP were supported by NIH grants NS-031483 and CA085739. NLP and SALT data collection also received support from NIH grant AG-08724 and a grant from the Swedish Scientific Council. FS was supported by a grant from the Swedish Foundation for International Cooperation in Research and Higher Education (STINT Institutional Grant Program, exchange Harvard Biostatistics-Stockholm Biostatistics Research Group, Karolinska Institutet; IG2001-038) and by the PhD program of the University of Pisa (Dept. of Psychiatry, School of Medicine). The authors have no conflicts of interest in relation to material presented in the manuscript.

The authors are especially grateful to Drs Deborah Blacker and Stefano Pini for their insightful suggestions and to Dr Ronald Kessler for his critique of the manuscript.
