Abstract

This issue of the Canadian Journal of Psychiatry (CJP) contains a series of articles on the topic of suicide in Canada. The studies employ a variety of investigative strategies, but together they draw out several shared methodological themes and concerns. Rahme et al. 1 describe correlates of suicide attempts in 2 Montreal hospitals, but perhaps the most important contribution of their study is its examination of the validity of diagnostic codes, in this case codes for intentional self-harm from the 10th edition of the International Classification of Diseases (ICD-10). While suicide is among the top 10 causes of death in Canada, 2 completed suicide is not, in absolute terms, a common event. The latest national estimate from Statistics Canada is 11.3/100,000 person-years. 3 Because the event is uncommon, research into suicide and suicide attempts must rely on large, existing data sets such as the National Mortality Database, Hospital Discharge Abstracts, and data from the National Ambulatory Care Reporting System. ICD-10 coding (and to a diminishing extent, ICD-9 coding, in some provinces) is widely used in suicide research.
The validity of data coding is important because of the need to avoid the type of bias that arises from inaccurate measurement. When the measurement involves a categorical outcome (such as whether or not a death is due to suicide), the issue is one of classification, and the resulting bias is called misclassification bias. Problems with the accuracy of measurements used in research and surveillance translate into biased estimates such as underestimation (in the case of low sensitivity) or overestimation (in the case of low specificity) of a suicide-related rate. The parameters that quantify measurement error, especially sensitivity and specificity, are prominent in this month’s articles. Sensitivity (Se) is the probability that a measure will detect an outcome that has actually occurred. The complementary probability (1 – Se) is the false-negative rate. This “rate” is actually a proportion and not a true rate (true rates should have person-time units, as in a suicide-specific mortality rate of 11.3/100,000 person-years, noted above), but for ease of language, such proportions are referred to as rates in this commentary. Specificity is the probability that a measure correctly classifies, as negative, a person who has not had the outcome under study. Its complement is the false-positive rate. When a study undertakes assessment of the rate of suicide-related behaviours 4 or a prediction that a repeat suicide attempt may occur, 5 these frequencies must be calculated using a counted number of outcomes (in a specified population during 1 year) in the numerator and the population at risk (as person-time, usually estimated using the mid-year population) in the denominator. When a measure is insensitive, it has a high false-negative rate, so some people who have attempted or died by suicide will not appear in the numerator, and the rate will be biased downwards. On the other hand, if it is nonspecific, there will be false positives in the numerator and the rate will be overestimated.
Parameters such as sensitivity and specificity connect the concepts of measurement error to the systematic distortion of estimates known as misclassification bias.
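The link between these parameters and the direction of bias can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the function name and all numeric values are hypothetical, and the outcome is treated as a simple proportion in a fixed population rather than a true person-time rate.

```python
def observed_proportion(true_prop, sensitivity, specificity):
    """Expected proportion classified as positive by an imperfect measure.

    True positives are detected with probability `sensitivity`; people
    without the outcome are wrongly counted with probability
    (1 - specificity).
    """
    true_positives = sensitivity * true_prop
    false_positives = (1.0 - specificity) * (1.0 - true_prop)
    return true_positives + false_positives


# Hypothetical true frequency: 200 per 100,000 (not taken from any study).
TRUE_PROP = 200 / 100_000

# Low sensitivity with perfect specificity: the estimate is biased downwards.
low_se = observed_proportion(TRUE_PROP, sensitivity=0.50, specificity=1.0)

# Perfect sensitivity with slightly imperfect specificity: biased upwards.
low_sp = observed_proportion(TRUE_PROP, sensitivity=1.0, specificity=0.999)

print(f"true:            {TRUE_PROP * 100_000:.1f} per 100,000")
print(f"low sensitivity: {low_se * 100_000:.1f} per 100,000")
print(f"low specificity: {low_sp * 100_000:.1f} per 100,000")
```

Because the outcome is uncommon, even a very small false-positive rate acts on a very large group of people without the outcome, which is why a modest loss of specificity can inflate the estimate noticeably in this illustration.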
Rahme et al. 1 validate their set of ICD-10 codes against a detailed chart review. They report that ICD-10 codes for “intentional self-harm” are only about 46% sensitive. This means that only about half of the patients with intentional self-harm are coded correctly (as true positives) in the database they used. This substantial degree of misclassification will lead to an underestimation of the rate of intentional self-harm. This degree of misclassification is also very relevant to studies that seek to identify determinants of self-harm. When studies compare groups (e.g., comparing patients exposed or not exposed to various risk factors, as seen in a study of repeat suicide attempts also published in this issue 5 ), misclassification of the outcome variable will often result in a dilution of observed differential risk (e.g., odds ratios that are closer to their “null” value of 1.0 than they should be). This could result in an opportunity for prevention being missed as a result of an important determinant seeming (misleadingly) to be weakly or nonsignificantly associated with these outcomes.
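The dilution of an odds ratio can be sketched numerically. In the example below, the 46% sensitivity is the figure reported by Rahme et al.; the specificity, the risks in the 2 groups, and the function names are hypothetical and chosen only for illustration. The sketch also assumes that misclassification is nondifferential, that is, that sensitivity and specificity are the same in the exposed and unexposed groups.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion coded as positive, given imperfect coding."""
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def odds_ratio(risk_exposed, risk_unexposed):
    """Odds ratio comparing two groups, computed from their risks."""
    odds_exposed = risk_exposed / (1.0 - risk_exposed)
    odds_unexposed = risk_unexposed / (1.0 - risk_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical true risks of the outcome in each group.
TRUE_RISK_EXPOSED = 0.10
TRUE_RISK_UNEXPOSED = 0.05

SENSITIVITY = 0.46   # reported by Rahme et al. for ICD-10 self-harm codes
SPECIFICITY = 0.98   # hypothetical value, assumed equal in both groups

true_or = odds_ratio(TRUE_RISK_EXPOSED, TRUE_RISK_UNEXPOSED)
observed_or = odds_ratio(
    observed_risk(TRUE_RISK_EXPOSED, SENSITIVITY, SPECIFICITY),
    observed_risk(TRUE_RISK_UNEXPOSED, SENSITIVITY, SPECIFICITY),
)

print(f"true odds ratio:     {true_or:.2f}")
print(f"observed odds ratio: {observed_or:.2f}")
```

Under these assumed values the odds ratio moves from roughly 2.1 to roughly 1.6, closer to the null value of 1.0. If misclassification differed between the groups, the bias could run in either direction, so this sketch should not be read as a general rule.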
The results of these studies suggest a path forward: record linkage. Rahme et al. 1 use manual chart reviews to validate ICD-10 codes in an administrative database containing hospital discharge abstracts. In the future, it is likely that data from electronic health records will be combined with data from administrative sources to produce ever more accurate sources of data. Indeed, increasing linkage of data sets is already gaining momentum, as is apparent in articles published in this issue. The study by Newton et al. 4 examines trends in suicide-related behaviours using data from the Ambulatory Care Classification System database. However, to measure income (which is not recorded in this system), they used the postal code conversion file 6 to link place of residence to socioeconomic data deriving ultimately from the census. Of course, this kind of record linkage is “ecological” in the sense that average income within a geographical area is linked to individual-level outcomes (suicide-related behaviour). The authors of this article acknowledge that many attempts to link provincial trends in suicide-related rates to potential determinants will be ecological by necessity. Indeed, their main finding is that emergency department visits for suicide-related behaviours declined in the first half of the decade and then stopped declining around the time when a new suicide prevention strategy (an ecological rather than individual-level intervention) was introduced. Nevertheless, they speculate that the strategy had a positive impact because they suspect that the rates might have increased again in its absence. 4
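The ecological character of this kind of linkage can be made concrete with a small sketch. Everything in it is hypothetical: the postal codes, area identifiers, and income quintiles are placeholders, and the lookup tables merely stand in for a postal code conversion file and census-derived area data. It is not the procedure used by Newton et al.

```python
# Hypothetical individual-level records (e.g., emergency department visits).
visits = [
    {"visit_id": 1, "postal_code": "X1X 1X1"},
    {"visit_id": 2, "postal_code": "X2X 2X2"},
    {"visit_id": 3, "postal_code": "X9X 9X9"},  # absent from the lookup table
]

# Hypothetical stand-in for a postal code conversion file:
# postal code -> census geographic area.
postal_to_area = {"X1X 1X1": "area_A", "X2X 2X2": "area_B"}

# Hypothetical census-derived data: area -> neighbourhood income quintile.
area_income_quintile = {"area_A": 1, "area_B": 4}


def link_income(visits, postal_to_area, area_income_quintile):
    """Attach an area-level income quintile to each individual record.

    The linked value describes the area, not the person, which is what
    makes the linkage ecological. Unmatched records receive None.
    """
    linked = []
    for visit in visits:
        area = postal_to_area.get(visit["postal_code"])
        quintile = area_income_quintile.get(area)
        linked.append({**visit, "area": area, "income_quintile": quintile})
    return linked


linked_visits = link_income(visits, postal_to_area, area_income_quintile)
for record in linked_visits:
    print(record)
```

Every person living in the same area receives the same income value, and records that cannot be matched are left without one. Both features are sources of measurement error that an analysis of linked data would need to consider.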
The article by Puyat et al. 7 in this issue also makes use of record linkage to address its aim of estimating the proportion of Canadians with major depression who receive minimally adequate treatment. Table 1 in this article lists 5 different data sources that were linked in support of the analysis. Of special note is the use of Vital Statistics Deaths and its role in studies of not only suicide but also suicide attempts and suicide-related behaviours. In the Wang et al. 5 study, some of the patients identified as being at high risk of suicide but who were without evidence of repeated suicide attempts may have died by suicide in the interim; increasingly extensive record linkage in the “big data” future will help to clarify such issues.
Skinner et al. 8 address a worrisome and very important aspect of misclassification bias. They note that as suicide rates have declined, rates of death due to unintentional and undetermined poisonings have increased significantly. They point out that distinguishing suicide from unintentional poisoning is challenging and that, in the absence of clear evidence of suicidal intent, indeterminate codes (e.g., ICD-10 codes that are not explicit in stating intentionality) are likely to be used. For the reasons described above, such misclassification could seriously distort estimates of suicide rates. The danger is that a sense of complacency could arise as rates decline when in reality no decline, or a smaller decline, is actually occurring. As noted in the article, this phenomenon appears to have unfolded in both the United States and South Korea. This type of problem is not unique to suicide research. Uninformative and indeterminate codes are often referred to as “garbage codes,” and the Global Burden of Disease Study uses ensemble modelling (implemented using an approach they call CODEm), drawing on other data to redistribute these garbage codes into the right mix of meaningful categories. 9
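A simple sensitivity analysis shows how an apparent decline can depend on the handling of indeterminate codes. The sketch below is not CODEm and not the method of Skinner et al.; it reallocates an assumed fraction of undetermined deaths to suicide. The counts, the fraction, and the function names are hypothetical, and the population is assumed constant across the 2 periods so that counts can stand in for rates.

```python
def adjusted_suicide_count(coded_suicides, undetermined, fraction_suicide):
    """Coded suicides plus an assumed share of undetermined deaths."""
    if not 0.0 <= fraction_suicide <= 1.0:
        raise ValueError("fraction_suicide must be between 0 and 1")
    return coded_suicides + fraction_suicide * undetermined


# Hypothetical death counts in two periods (not real data).
periods = {
    "earlier": {"suicide": 1000, "undetermined": 100},
    "later": {"suicide": 900, "undetermined": 300},
}


def percent_change(fraction_suicide):
    """Percent change in adjusted suicides from the earlier to later period."""
    earlier = adjusted_suicide_count(
        periods["earlier"]["suicide"],
        periods["earlier"]["undetermined"],
        fraction_suicide,
    )
    later = adjusted_suicide_count(
        periods["later"]["suicide"],
        periods["later"]["undetermined"],
        fraction_suicide,
    )
    return 100.0 * (later - earlier) / earlier


for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"assumed fraction {fraction:.2f}: {percent_change(fraction):+.1f}%")
```

With these invented counts, the coded data alone suggest a 10% decline, but the decline disappears if half of the undetermined deaths are assumed to be suicides. The true fraction is unknown, which is why such an analysis can only bound the problem rather than resolve it.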
Accurate information about suicide is difficult to come by. Although measurement problems remain prominent, these studies show movement toward better data availability and, ultimately, a more evidence-informed clinical and public health response to the problem. Canadian research efforts to address these problems are moving forward, and the CJP is proud to highlight this progress in this series of studies.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
