Abstract
Suicidal ideation, suicide attempt and completed suicide are widely considered outcomes of depressive illness. Indeed, an extensive body of research illustrates that meeting the criteria for depression significantly increases the likelihood of being suicidal and is one of the strongest predictors of suicidal thoughts and behaviours [1–4]. Many depression assessment tools, possibly as a consequence of research supporting the connection between suicidality (i.e. suicidal ideation, suicide attempt and completion) and depression, also incorporate items assessing suicidality. These include the Hamilton Rating Scale for Depression [5], Beck Depression Inventory [6], and the Composite International Diagnostic Interview [7]. Furthermore, the DSM-IV lists suicidal thoughts and behaviour as one of nine item criteria for major depressive episodes [8]. Thus, as it stands, this DSM criterion has the potential to obscure the empirical differentiation of mood disorders and suicidality. To date, few epidemiological studies have investigated the relationship between suicidality and depression within unselected community-based samples, and thus currently there is little information about the underlying construct(s).
Further, while many studies show a variation in prevalence of depression (and anxiety) across age [9], there is scant research focusing on the effects of age and gender on suicidal thoughts and behaviour beyond prevalence. In particular, no population-level investigation of the relationship of suicidality with other related constructs such as depression could be identified in the literature, nor could any study considering construct-level age and gender-based variation of this relationship. Prevalence data indicate that suicidal thoughts and behaviours occur more commonly in the late teens and early adulthood and abate steadily with increasing age [10–13]. Basic prevalence rates also indicate that more female subjects report having thoughts of suicide or associated behaviours [11, 13–15]. The body of evidence identifying differences in symptoms of depression and anxiety across the lifespan may be related to age-related variation in the underlying constructs [16], the dissimilarities in prevalence statistics described here, and varying risk factors across age and gender for suicidal behaviour. It also suggests that age and gender may affect the underlying experience of suicidality.
Christensen et al. presented findings to suggest that depression manifests differently across the lifespan [16]. To some extent, this could explain differences in endorsement rates for various depressive symptoms between age groups, and may, at least partially, account for observed differences in depression prevalence in different age groups. The present study attempts to clarify whether the decreasing prevalence of suicidal behaviour over the lifespan and differences across gender are also due to age and gender-based disparities in the manifestation of suicidality. It will assist in determining whether or not the underlying experience of suicidality is the same throughout life for male and female individuals and if suicidality and depression co-vary differently across gender and age. This objective is achieved by investigating whether there is variation across groups in the underlying constructs of depression and suicidality.
One of the most appropriate methods for investigating the issues outlined in the foregoing paragraph is confirmatory factor analysis (CFA). This methodology is often used to assess the validity of scales and questionnaires, but until now CFA has not been used to explore the nature of the relationship between depression and suicidality constructs, or the stability of these latent structures across gender and age groups. CFA requires that a theoretically driven factor model is specified, such as the implied two-factor model representing depression and suicidality. In addition to estimating the strengths of the association between the observed symptoms and their putative latent dimensions and between dimensions themselves, CFA yields information regarding how well and how parsimoniously the proposed model fits the data [17, 18].
Binary and ordinal variables should not be treated as if they were continuous scales in factor analyses. This may result in factors that reflect the differences in endorsement patterns of response categories [19–21]. Such variables may be conceptualized as resulting from the imposition of cut-offs or thresholds on an underlying normally distributed construct. Estimated correlations between these constructs are referred to as tetrachoric correlations (for binary variables) or polychoric correlations (for multiple categories) [19]. MPlus [20] is a structural equation modelling program particularly suited to CFA with binary and categorical data [21, 22].
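As a minimal sketch of the threshold idea described above, the following Python snippet converts the proportions observed in each ordered response category into cut-points on a standard normal latent variable. The function name and the proportions are hypothetical illustrations, not PATH data, and this is not the estimation routine used by MPlus.

```python
from statistics import NormalDist


def thresholds(category_proportions):
    """Return the k-1 thresholds implied by k ordered category proportions.

    Each threshold is the standard normal quantile of the cumulative
    proportion of responses falling in or below that category.
    """
    std_normal = NormalDist()  # latent construct assumed N(0, 1)
    cuts = []
    cumulative = 0.0
    # The last category needs no threshold: it is everything above the final cut.
    for proportion in category_proportions[:-1]:
        cumulative += proportion
        cuts.append(std_normal.inv_cdf(cumulative))
    return cuts


# Hypothetical binary item: 90% answer "No", 10% answer "Yes".
binary_item = thresholds([0.90, 0.10])

# Hypothetical four-category ordinal item with rarely endorsed upper categories.
ordinal_item = thresholds([0.90, 0.07, 0.02, 0.01])

print(binary_item)
print(ordinal_item)
```

A rarely endorsed item yields a high threshold, which is why endorsement rates can differ between groups without implying a different latent construct.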
The single-sample CFA model testing procedure, as described here, may be extended to accommodate the simultaneous analysis of multiple groups. This allows the formal comparison of models from two or more groups (e.g. multiple age cohorts). Evaluation of measurement invariance involves conducting a series of analyses to examine whether there is significant deterioration in model fit if model parameters are constrained to be equal across groups [23]. When this does not result in significant deterioration in model fit, it can be concluded that the instrument behaves the same way in each group and thus assesses the same construct [18, 24].
In gaining an understanding of the relationship between the depression and suicidality constructs, the two objectives of the present study will be realized. First, two measurement models of the Goldberg depression scale [25] and the suicidality subscale of the Psychiatric Symptom Frequency Scale [26] will be compared to determine whether a single- or two-factor model provides the most adequate fit. Second, the better fitting model will be investigated for any significant variation in model parameters across different gender and age groups.
Methods
Participants
The sample was drawn from the first wave of the Personality and Total Health (PATH) Through Life Project, a longitudinal community survey concerned with the health and well-being of people within three age groups residing in Canberra or the neighbouring town of Queanbeyan. Individuals were selected at random from the Australian Electoral Roll. Participants numbered 1163 men and 1241 women aged 20–24, 1192 men and 1338 women aged 40–44, and 1319 men and 1232 women aged 60–64.
Procedure
Full details of survey methodology for Wave 1 participant assessment have previously been reported [11, 27]. In brief, data collection for the first wave of the PATH survey commenced in 1999. Response rates for the 20–24 years, 40–44 years and 60–64 years age groups were 58.6%, 64.6% and 58.3%, respectively. Participants agreeing to take part in the project were assessed in their home or the Centre for Mental Health Research. The majority of the interview was self-completed on a Hewlett-Packard 620LX palmtop personal computer using Surveycraft software for computer-assisted personal interviewing (SPSS, Chicago, IL, USA).
Instruments
The Goldberg depression and anxiety scales consist of nine items for each scale, which measure presence or absence of depressive and anxiety symptoms [25]. This investigation utilizes the depression scale items. Items are coded 0 (No) and 1 (Yes) in response to questions about symptoms over the past month. Higher scores indicate greater probability of clinically important depressive symptoms.
The Psychiatric Symptom Frequency Scale–suicidality subscale consists of the following items concerning suicidal ideation and behaviour (0=No, 1=Yes) ‘in the last year have you felt life was hardly worth living’, ‘in the last year have you thought you would be better off dead’, ‘in the last year have you ever thought about taking your own life’ [26]. Participants endorsing the third item were also asked ‘in the last year have you made any plans to take your own life’ and ‘in the last year have you ever attempted to take your own life’. Because the last three items were endorsed too infrequently to conduct some of the analyses and because the contingent method of enquiry induces structural correlations between the items, they were combined into a single ordered polychotomous variable representing serious suicidal symptoms (1=no ideation; 2=ideation only; 3=ideation and plan; 4=ideation, with or without plan, and an attempt). The third and fourth categories were coded in an effort to both acknowledge increased seriousness of a suicide attempt over suicide plans, and to represent all individuals who attempt suicide regardless of their suicide plan status.
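The coding rule for the combined variable can be sketched as a small Python function. The function and argument names are hypothetical; the category values follow the description above, with plan and attempt only meaningful when ideation was endorsed because of the contingent enquiry.

```python
def serious_suicidality(ideation: bool, plan: bool = False, attempt: bool = False) -> int:
    """Combine the three contingent items into one ordered category (1-4).

    1 = no ideation
    2 = ideation only
    3 = ideation and plan
    4 = ideation, with or without plan, and an attempt
    """
    if not ideation:
        # Plan and attempt items were not asked, so they cannot be endorsed.
        return 1
    if attempt:
        # An attempt takes precedence regardless of plan status.
        return 4
    if plan:
        return 3
    return 2


print(serious_suicidality(ideation=False))                           # 1
print(serious_suicidality(ideation=True))                            # 2
print(serious_suicidality(ideation=True, plan=True))                 # 3
print(serious_suicidality(ideation=True, plan=False, attempt=True))  # 4
```

Checking the attempt before the plan is what places every respondent who attempted suicide in the top category, whether or not a plan was reported.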
Data analysis
Descriptive analyses investigating item endorsement for gender and age were obtained via cross-tabulation using SPSS version 14.0 (SPSS, Chicago, IL, USA). Adjusted residual estimates were used to examine whether the observed frequency in an age or gender group was significantly higher or lower than would be predicted if there was no association between group membership and item endorsement. Adjusted residuals are derived by dividing the residual (i.e. the observed frequency minus the expected frequency) by an estimate of its standard error. The resultant value is expressed in standard deviation units and indicates variation above or below the mean [28].
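The calculation can be sketched in plain Python as follows. This uses the usual textbook form of the adjusted residual for a contingency table; it is an illustration and not the SPSS implementation, and the counts are hypothetical rather than PATH data.

```python
import math


def adjusted_residuals(table):
    """Adjusted (standardized) residuals for a two-way table of counts.

    For each cell: (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)),
    where the expected count assumes no association between rows and columns.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            std_error = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((observed - expected) / std_error)
        result.append(out_row)
    return result


# Hypothetical counts: rows = two groups, columns = item not endorsed / endorsed.
counts = [[70, 30], [80, 20]]
for row in adjusted_residuals(counts):
    print([round(value, 3) for value in row])
```

Absolute values above roughly 1.96 indicate a cell whose frequency differs from expectation at the 5% level.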
Structural equation modelling was conducted with MPlus v4.1 [20] and involved two phases of CFA. The mean and variance adjusted weighted least squares (WLSMV) estimator was used because it is appropriate for categorical data [29]. CFA assesses whether an a priori theoretical structure obtains an acceptable fit to sample data. The indices used to gauge model fit in CFA were the χ2 test, the comparative fit index (CFI), the Tucker–Lewis index (TLI; also known as the non-normed fit index) and the root mean square error of approximation (RMSEA). Conventionally, if the CFI and TLI are >0.90 and the RMSEA is <0.08, the model is said to have satisfactory fit to the data [30].
The first phase of CFA involved testing single- and two-factor models to determine which provided the most parsimonious fit to the data. The base model consisted of one latent variable that represented a combined depression–suicidality construct. The alternative model separated depression from suicidality, proposing two distinct, although correlated, constructs. Models are represented as per Figures 1 and 2. Formal comparison of models requires that they be nested such that the simpler model is a restricted form of the more complex one. This can be achieved for the current models by setting the correlation between the depression and suicidality factors in Figure 2 to 1. This effectively equates the factors, producing a single-factor model.
Figure 1. Single-factor model combining depression with suicidality showing factor loadings at phase 1 (italics=suicidality items).
Figure 2. Dual-factor model in which depression and suicidality are individual although correlated constructs showing factor loadings at phase 1.

Utilizing the model with the most favourable fit, the second phase of the CFA involved multi-group analysis using the general approach prescribed by Millsap and Yun-Tien [31]. Tests of measurement invariance were undertaken separately for gender and then age groups; age×gender tests were not possible due to the low endorsement rates, and thus very small numbers of positive responses, in some cells. Multi-group analysis allows the estimation and comparison of multiple models based on two or more different samples. The null hypothesis states that model parameters (consisting of both factor loadings and thresholds) from each group do not differ, hence demonstrating that the same item structure applies across the subpopulations sampled [18]. This would be evident when, for instance, the model parameters are constrained to equality across the male and female samples and the appropriate tests show no significant deterioration in model fit. This demonstrates invariance in factor structure. Factorial invariance means that the latent variable is equivalent or comparable across groups in which the factor loadings (factor-item regressions), and sometimes, the unique means, are equal [24].
Generally, multi-group analysis testing factorial invariance consists of four main steps. The first step utilizes CFA to develop well-fitting models for each group. This establishes the presence of configural invariance, and requires an equal number of latent variables and the same number of factor loadings across groups. Effectively, this is the baseline model, which is compared to a series of nested, progressively more constrained, models (invariance hypothesis) [23]. Second, determining the presence of weak factorial invariance necessitates that factor loadings are constrained across groups, but factor variances and covariances are allowed to vary. Third, strong factorial invariance requires that, in addition to factor loadings, mean intercepts or thresholds are held constant across groups. Finally, strict factorial invariance entails the additional constraint of the unique variances (error terms). This forces the nested model to make the specific and random error component for each variable the same across groups. In other words, differences can be expressed only at a latent variable level [24]. When the nested model (possessing greater across-group equivalence constraints than the baseline model, involving factor loadings, thresholds, and random errors) shows no significant deterioration in fit (e.g. assessed by goodness-of-fit indices), the invariance hypothesis is retained.
The use of the WLSMV estimator appropriate for categorical data precludes comparison of nested models using the χ2 likelihood ratio test (direct difference of χ2). Muthén and Muthén developed a procedure (DIFFTEST) that calculates a χ2 test of statistical significance for the deterioration in fit that occurs when imposing the constraints in nested models fitted using WLSMV [20]. χ2 test statistics are sensitive to sample size, and often inconsequential differences between a baseline model and an alternative model are flagged as significant in large samples [32, 33]. Therefore, it is essential to utilize additional goodness-of-fit indices (CFI, TLI, and RMSEA) to determine whether a constrained model is appreciably degraded in comparison to the baseline model. A streamlined approach was adopted for the multi-group analysis in which strict factorial invariance was assessed at the outset, so that only an absence of invariance would necessitate further testing for strong and weak factorial invariance.
Results
Descriptive statistics
Previous investigations have found the factor loadings for the Goldberg item ‘early waking’ to be uniformly low across samples [34], thus it was decided that this item should be omitted from the depression factor model in all analyses. To examine the current sample item endorsements across gender and age groups, the percentages of positive responses to the Goldberg depression scale items and suicidality subscale items of the Psychiatric Symptom Frequency Scale are presented in Tables 1 and 2.
Table 1. Endorsement rates† for PATH participants (n=7440)
†Goldberg depression and Psychiatric Symptom Frequency Scale suicidality items.
∗p<0.05, ∗∗p<0.01, ∗∗∗p<0.001.
Table 2. Endorsement rates† vs age in PATH sample (n=7440)
†Goldberg depression and Psychiatric Symptom Frequency Scale suicidality items.
All tests of difference in endorsement rates between age groups statistically significant (p<0.001).
Pearson χ2 analyses utilizing adjusted residuals ascertained significant group differences in item endorsement between groups, paralleling current prevalence trends. Table 1 indicates that a significantly greater proportion of women endorsed items indicating depressive symptoms on the Goldberg depression scale, except ‘lost interest’ and ‘lost weight’. Similarly, a significantly greater proportion of women responded positively to the suicidality subscale items ‘felt life was not worth living’ and ‘thought that you would be better off dead’. Table 2 also shows that the proportion of positive responses to depression and suicidality items was significantly lower in the older age groups.
Phase 1: comparison of single- and two-factor measurement models for entire sample
The single-factor model had poor fit to the data (χ2(21)=2620.23, p<0.0001, TLI=0.944, CFI=0.934, RMSEA=0.129). The alternate model fitted the data better (χ2(29)=984.19, p<0.0001, TLI=0.985, CFI=0.976, RMSEA=0.067). Most notable was the decrease in RMSEA to a value consistent with an acceptable fit. The χ2 DIFFTEST indicated a significant disparity between the single- and two-factor models (χ2(1)=650.77, p<0.0001). Factor loadings for the single- and two-factor models are presented in Figures 1 and 2. The depression factor had a moderately strong correlation with the suicidality factor (0.67). Fit indices for the two-factor model were uniformly more favourable than for the single-factor model, and indicated that the two-factor model should be retained for further analyses.
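As a rough check on the reported fit, the sketch below recomputes RMSEA from the χ2 values and degrees of freedom given above, then applies the conventional cut-offs described in the Data analysis section. It assumes the commonly used point-estimate formula and n=7440 (the figure given in the table captions); the exact computation MPlus applies under WLSMV may differ in detail, and the function names are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def satisfactory_fit(cfi: float, tli: float, rmsea_value: float) -> bool:
    """Conventional rule of thumb: CFI and TLI above 0.90, RMSEA below 0.08."""
    return cfi > 0.90 and tli > 0.90 and rmsea_value < 0.08


N = 7440  # assumption: sample size taken from the table captions

single_factor = rmsea(2620.23, 21, N)
two_factor = rmsea(984.19, 29, N)

print(round(single_factor, 3))  # 0.129
print(round(two_factor, 3))     # 0.067

print(satisfactory_fit(cfi=0.934, tli=0.944, rmsea_value=single_factor))  # False
print(satisfactory_fit(cfi=0.976, tli=0.985, rmsea_value=two_factor))     # True
```

Under these assumptions the recomputed values agree with the reported RMSEA for both models, and only the two-factor model meets all three conventional criteria.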
Phase 2: multi-group analysis
Multiple group models, using the two-factor model from the previous analysis, were fitted to the PATH data to determine whether the measurement structure was invariant across gender. The baseline model consisted of the two gender groups with no constraints on factor loadings (although it was necessary to constrain thresholds to achieve identification). In the subsequent models, additional parameters were constrained (i.e. factor loadings, and unique variances for each variable). The χ2 DIFFTEST for gender was small but significant (χ2(11)=48.88, p<0.001), but this can be principally attributed to the large sample size.
Differences between the unconstrained and constrained model parameters were negligible, and the fit indices indicated that both models fitted the data well (unconstrained: CFI=0.976, TLI=0.985, RMSEA=0.066; constrained: CFI=0.981, TLI=0.988, RMSEA=0.059). These results imply that strict measurement invariance for the two-factor model applies across genders. The same procedure was used to investigate the applicability of the model structure and parameters across the three age groups. Once again, while the χ2 DIFFTEST was significant (χ2(23)=207.09, p<0.001), fit indices indicated good fit and negligible differences between unconstrained (CFI=0.977, TLI=0.985, RMSEA=0.064) and constrained models (CFI=0.977, TLI=0.986, RMSEA=0.061). Also supporting these findings are the relatively small differences between each age and gender group in the unconstrained model parameters. The largest difference in standardized loadings between men and women was 0.057 (0.550 for men vs 0.493 for women, for the item ‘Have you lost weight (due to poor appetite)?’). For age, the greatest difference between groups was 0.129 (0.541 for those aged 20–24 years vs 0.670 for those aged 40–44 years, for the item ‘Have you tended to feel worse in the morning?’). The results show that, overall, strict measurement invariance applies across both gender and age groups.
Joint distribution of depression and suicidality factor scores
Examination of the bivariate distribution of depression and suicidality factor scores was utilized to assess the distribution of depression at different levels of suicidality, and similarly, of suicidality for different levels of depression. Critically, although most people with high suicidality (highest quartile) were also highly depressed (in the highest quartile), more than one-quarter were not, including 8% who had depression scores below the median for the sample.
Discussion
Using data collected by the PATH Through Life Project, this investigation presents new evidence to support suicidality being considered a construct that is distinguishable from depression, although substantially correlated with it. This structure is stable across gender and age.
Significantly, the first of these findings brings into question the conceptualization and promulgation of suicidality simply as a symptom of severe depression, and suggests that suicidality might usefully be viewed as a separate syndrome. Despite research indicating the extent to which suicidality and depression co-exist [35–37], the present study shows that people can experience suicidal thoughts and behaviours independent of depression, just as people can also experience depression without anxiety or suicidal symptoms.
The literature on anxiety and depression provides a precedent for considering strongly correlated constructs as distinct syndromes. This is especially so when there is a need to treat them as clinically discrete from each other, despite often presenting comorbidly. The finding that the relationship between suicidality and depression (0.67) is weaker than that between depression and anxiety (0.86) [16] serves to strengthen the argument for regarding suicidality as a separable construct, although one related to depression.
Further, this study demonstrates the stability of these constructs across age and gender, implying that the makeup of the suicidality construct and its relation to depression remain constant for different age and gender groups. At first glance this may seem incongruous given the item endorsement statistics shown in Tables 1 and 2, especially when juxtaposed with the large body of literature reporting significant differences in factors predicting suicidality for young, middle-aged and older age groups and for gender groups [10, 11, 13, 38]. Similar item trends are found for anxiety and depression [39–42]. Nevertheless, one should appreciate that changes in item endorsement rates (i.e. the prevalence of the symptom) can occur while relationships between underlying constructs remain constant. This concept can be illustrated by imagining age group item means (e.g. for suicidality) being located at different positions along a dimension, yet all lying on the same dimension (i.e. the suicidality construct). Importantly, in contrast with anxiety and depression [16], the present study indicates that the underlying relationship of suicidality with depression is stable across the lifespan and gender. This suggests that the decrease in non-fatal suicidal behaviour in older people is not due to change in the association between these two constructs. Theories seeking to explain age group differences in depression and anxiety include decreased emotional responsiveness [43, 44], increased emotional control [43, 45–47] and psychological immunization [48, 49]. Others maintain the view that the prevalence of syndromes varies across the lifespan and gender as a reflection of the differential distribution of risk factors [9]. Although developed in response to the age-related decrease in the prevalence of depression and anxiety, the theories may also adequately capture issues resulting in the decline of suicidal behaviour across the lifespan.
This investigation has two main implications for suicidology and mental health. First, it highlights the need for suicidality to be reconceptualized as a separate syndrome. This would be facilitated by the recognition of suicidality as distinguishable from depression in the DSM-IV and ICD classification systems. The need to establish levels of suicidality is also significant in the clinical setting. The present study supports the practice of assessing individuals for suicidal symptoms independently from depression, much as health professionals already do in ascertaining symptoms of anxiety in contrast to depressive symptomatology. Second, while different methods may need to be used to identify the presence of suicidality, the underlying experience of suicidality and its relationship with depression appear to remain the same for people in early, mid and late adulthood.
The current results also have ramifications for inventories with ‘gated’ or contingent enquiry structures, where items concerning suicidality are presented only upon positive responses to items enquiring about depression. With the exception of the second National Survey of Mental Health and Wellbeing that utilized the CIDI 3, it is perhaps unfortunate that recent national surveys have not enquired about suicidality independent of depression [50]. Modifications to questionnaires so that all respondents encounter items concerning suicidal symptoms regardless of their responses to depression would be highly informative and should be considered in future work.
The current findings of these analyses are strengthened by the fact that participants were derived from a random sample of the general community, and the sample size is relatively large. There are a number of limitations, however, that need to be acknowledged. First, there are differences in the symptom-reporting timeframe for the Goldberg depression scale (previous month) [25] and the suicidality subscale of the Psychiatric Symptom Frequency Scale (previous 12 months) [26]. Further, because information disclosed by participants was retrospective and self-reported, it is possible that suicidal symptoms were under-reported due to the protracted timeframe that participants were asked to review. Because the data do not extend into old age (i.e. ≥70 years) caution should be taken to avoid generalizing beyond the current study sample age ranges. Cohorts also have narrow bands, sampling only those who were 20–24, 40–44 or 60–64 years of age. Finally, although the overall sample was sizable, there were some cells with insufficient numbers that prevented multi-group analyses being conducted for age×gender categories. It is also recommended that further research should be conducted using different inventories assessing suicidal and depressive symptomatology to validate the current findings.
There were two key outcomes of this investigation. First, suicidality and depression appear to be distinct entities. If subsequent investigations replicate these results, this would suggest the need for suicidality to be identified as an individual syndrome in diagnostic and classification manuals. Second, the relationship between the latent suicidality and depression factors was found to be invariant across early, mid and later adulthood (i.e. 20s, 40s and 60s). This suggests that variations in correlates or risk factors across age and gender do not represent variation at the construct level for suicidality.
Acknowledgements
We wish to thank Trish Jacomb, Karen Maxwell and the PATH interviewers for their assistance with the study. Funding was provided by National Health and Medical Research Council Grants 179805 and 79839, a grant from the Alcohol-Related Medical Research Grant Scheme of the Australian Brewers’ Foundation and a grant from the Australian Rotary Health Research Fund. Associate Professor Kaarin Anstey was supported by National Health and Medical Research Council Fellowship Grant (366756). Dr Kate Fairweather-Schmidt was partially supported by an AFFIRM scholarship. We would also like to acknowledge Professor Tony Jorm, Professor Helen Christensen and Professor Bryan Rodgers, who are also chief investigators of the PATH Through Life Project.
