Abstract
Keywords
Introduction
Neuroleptic malignant syndrome (NMS) is a rare and potentially fatal complication of treatment with antipsychotics [Strawn et al. 2007], which is characterised by four domains of signs and symptoms: rigidity, fever, dysfunction of the autonomic nervous system and alterations in consciousness. Researchers and clinicians face the difficulty of distinguishing its signs and symptoms not only from the mental disorder being treated but also from common side effects of psychotropic medication. Moreover, scientific identification of cases is made difficult by its nature as a diagnosis of exclusion with many conditions, common and rare, in the differential. Despite some evidence for a broad consensus on NMS [Strawn et al. 2007, 2008], much remains controversial and obscure about the aetiology, pathophysiology and treatment [Picard et al. 2008; Margetić and Aukst Margetić, 2010] of this condition which is ‘heterogeneous in onset, presentation, progression and outcome’ [Strawn et al. 2007].
Diagnosis remains similarly controversial, several criteria having been proposed with different conceptual commitments, forms and functions. Gurrera and colleagues’ meta-analysis of incidence studies found five published diagnostic criteria, two modifications of previously published criteria, and a further four idiosyncratic and four undisclosed sets distributed among 26 eligible studies published between 1960 and 2003 [Gurrera et al. 2007]. Although they found no association between stringency of criteria and estimated incidence, this is likely to be due to the lack of power afforded by typical case numbers. Adityanjee and colleagues reviewed the existing criteria in 1999, discussing or mentioning some 17 sets, or modifications of sets, of criteria [Adityanjee et al. 1999]. They divided these into six groups:
Levenson [Levenson 1985, 1986];
Addonizio and colleagues [Addonizio et al. 1986];
Pope and colleagues [Pope et al. 1986; Keck et al. 1989];
Adityanjee and colleagues [Adityanjee et al. 1988];
Friedman and colleagues [Friedman et al. 1988];
Caroff and colleagues [Lazarus et al. 1989; Caroff et al. 1991; Caroff and Mann, 1993], classifying Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) criteria for NMS [American Psychiatric Association, 1994] as a modified version of these.
The authors of this review further proposed a more stringent set of research diagnostic criteria.
Two attempts to quantify the agreement among sets of criteria have been published. Caroff and colleagues [Caroff et al. 2000], reviewing the aetiological role of atypical antipsychotics in NMS, employed three sets: those of Levenson [Levenson, 1985], DSM-IV [American Psychiatric Association, 1994] and their own [Caroff and Mann, 1993]. Their primary aim, however, was to identify atypical or partial forms of NMS, rather than to compare the sets themselves. Gurrera and colleagues calculated κ and intraclass correlation (ICC) coefficients for three different sets [Gurrera et al. 1992]: those of Levenson [Levenson, 1985], Addonizio [Addonizio et al. 1986] and Pope [Pope et al. 1986], as well as their own modifications of the Levenson criteria to allow closer comparison with the ‘probable’ diagnosis allowed by the Pope criteria. They reported ‘only modest’ diagnostic agreement.
In this study, using information from a large case register resource and as a preliminary step in a larger programme of work investigating NMS, we attempted to quantify the level of agreement over a wider range of criteria sets – the six presented in the appendix of Adityanjee and colleagues [Adityanjee et al. 1999] and those of DSM-IV [American Psychiatric Association, 2000].
Methods
Source data
We derived NMS cases from the South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) Case Register by a systematic record review. Full details of this data resource have been provided elsewhere in an open-access publication [Stewart et al. 2009]. Briefly, the Case Register Interactive Search (CRIS) programme allows researchers to access the electronic clinical records of SLAM, robustly de-identified (including masking of identifiers in free text) and secured, providing the capability for free-text searching and database assembly for export into standard tools for analysis. SLAM is the largest unit mental healthcare provider in Europe, providing comprehensive services to a geographic catchment of approximately 1.2 million in four London boroughs (Croydon, Lambeth, Lewisham and Southwark), in addition to several specialist national services. In common with other mental healthcare providers in the British National Health Service, mental health trusts provide nearly 100% of secondary mental healthcare to their geographical catchments in a model which is free at the point of delivery. Since 2006, SLAM has used a single electronic clinical record system throughout all its services, the Patient Journey System, which was developed within SLAM to support the recording and communication of all clinical information and the capture of administrative data; legacy data from earlier electronic records systems were imported during its implementation. Therefore, full but anonymized clinical information on every person who has received any service from SLAM since at least 2006 is captured in the case register. When the NMS case group was assembled, CRIS contained records on over 150,000 people. CRIS was approved as an anonymized dataset for secondary analyses by the Oxfordshire Research Ethics Committee C (reference 08/H0606/71).
Ascertainment of potential neuroleptic malignant syndrome cases
All records in CRIS on 28 February 2010 were searched for the text strings ‘NMS’, ‘neuroleptic malignant syndrome’, and variants of these (including misspellings). The automatic search and the subsequent manual reviews were confined to the free-text fields containing all case notes and those containing correspondence (e.g. letters to general practitioners, admission and discharge summaries). Entries in these fields covering the 7 days before and after the first mention of NMS were extracted and prepared for manual review.
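The search-and-window step described above can be illustrated with a minimal Python sketch. It is not the CRIS implementation: the regular expression, the function name `review_window` and the example entries are all hypothetical, and the actual list of variants and misspellings searched for is not given in the text.

```python
import re
from datetime import date, timedelta

# Hypothetical pattern: tolerates some spelling variants of the full term.
# The variants actually used in the study are not listed in the text.
NMS_PATTERN = re.compile(
    r"\bNMS\b|\bneurolept\w*\s+malign\w*\s+syndrom\w*", re.IGNORECASE
)

def review_window(entries, days=7):
    """Return the (date, text) entries within `days` of the first NMS mention.

    `entries` is a list of (datetime.date, str) tuples for one record.
    An empty list is returned when no entry mentions NMS.
    """
    mention_dates = [d for d, text in entries if NMS_PATTERN.search(text)]
    if not mention_dates:
        return []
    first = min(mention_dates)
    start, end = first - timedelta(days=days), first + timedelta(days=days)
    return [(d, text) for d, text in sorted(entries) if start <= d <= end]

# Invented free-text entries, for illustration only.
notes = [
    (date(2008, 3, 1), "Routine review."),
    (date(2008, 3, 10), "Pyrexia and rigidity, ?NMS. CK requested."),
    (date(2008, 3, 15), "Transferred to medical ward."),
    (date(2008, 4, 1), "Discharged."),
]
for entry_date, text in review_window(notes):
    print(entry_date.isoformat(), text)
```

A pattern of this kind errs towards over-inclusion, which suits a search whose output is then screened manually.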
Inclusion and exclusion criteria
Records were prescreened and included for review if there was clear evidence that NMS was considered a possible diagnosis in the open-text records by the recording clinician, and relevant action was taken on the grounds of this. These cases are termed ‘suspected NMS’ in this article. Relevant action in this context could include any one or more of the following: requesting laboratory investigations on the basis of this clinical suspicion, stopping medication or transferring the person to a general medical facility. To maximise sensitivity, the subsequent outcome or recorded diagnosis following these actions was not applied as an exclusion criterion; that is, records were included for manual review even if the episode was subsequently judged not to have been one of NMS, or NMS was thought unlikely, provided that NMS was considered as a possible diagnosis initially and action was taken on the basis of this consideration. This prescreening was carried out by two authors (C-KC and SH) who reviewed all records returned by the search strategy.
Of the cases of suspected NMS identified following this procedure, all were then reviewed by three psychiatrists (SH, RS and WL). Initially, a randomly selected 30 cases were reviewed by all three raters independently to establish agreement over criteria and coding, followed by each rater separately reviewing a third of the remainder. Any remaining records which did not meet the above inclusion criteria were excluded. Through this review process, the suspected NMS cases were coded using a standard form which enquired about all the symptoms, signs and investigations specified in seven sets of diagnostic criteria: DSM-IV [American Psychiatric Association, 2000] and the six sets given in the appendix of Adityanjee and colleagues [Adityanjee et al. 1999]: those of Levenson [Levenson, 1985], Addonizio and colleagues [Addonizio et al. 1986], Pope and colleagues [Pope et al. 1986] (subsequently modified [Keck et al. 1989]), Adityanjee and colleagues [Adityanjee et al. 1988], Caroff and colleagues [Caroff et al. 1991] (subsequently modified [Caroff and Mann, 1993; Lazarus et al. 1989]), and later research criteria suggested by Adityanjee and colleagues [Adityanjee et al. 1999]. The criteria proposed by Sachdev were considered but could not be operationalised in the context of a case note review because of their emphasis on symptom grading [Sachdev, 2005].
Records that met at least one of the above diagnostic criteria were termed ‘diagnosable NMS’. Demographic data (age, sex and year of the suspected NMS) were also collected for each record. The Pope recommendations also allow an alternative category of ‘probable NMS’ which is designed for retrospective definition when documentation is inadequate; this was treated as a separate outcome [Pope et al. 1986].
Statistical analysis
Chance-corrected agreement between sets of criteria was calculated as κ indices, both pairwise and across all sets combined. In addition, the presence, absence or ‘no mention’ of specific symptoms/signs was compared between the group who fulfilled any diagnosis (n = 43) and the group who did not (n = 140) using χ2 tests. All statistical analyses were carried out using Stata Special Edition V.10.0 software.
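To make the pairwise κ index concrete, the following is a minimal Python sketch of Cohen's κ for two sets of criteria applied to the same cases. It is an illustration only, not the Stata code used in the study; the function name `cohen_kappa`, the set labels and the ten verdicts per set are hypothetical.

```python
import math
from itertools import combinations

def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 verdicts."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Agreement expected by chance if the two sets were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return math.nan  # kappa is undefined when no variation is possible
    return (observed - expected) / (1 - expected)

# Invented verdicts (1 = criteria met) for ten cases under three sets.
verdicts = {
    "set_A": [1, 1, 1, 0, 0, 1, 0, 0, 1, 0],
    "set_B": [1, 1, 0, 0, 0, 1, 0, 1, 1, 0],
    "set_C": [1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
}
for first, second in combinations(verdicts, 2):
    kappa = cohen_kappa(verdicts[first], verdicts[second])
    print(f"{first} vs {second}: kappa = {kappa:.2f}")
```

In this invented example, set_A and set_B agree on 8 of 10 cases while chance alone would give 5 of 10, so κ = (0.8 − 0.5)/(1 − 0.5) = 0.6.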
Results
Of 485 cases returned by the initial text search, 302 were excluded as clearly irrelevant or failing to meet inclusion criteria, leaving 183 cases for which NMS had been considered clinically, resulting in further investigation or other action. Of these, 43 could be judged from the available information to meet at least one set of the six diagnostic criteria (Table 1) which, together with the overlapping 46 ‘probable’ cases according to the criteria of Pope and colleagues [Pope et al. 1986], gave a wider group of 73 potential cases. No case record returned more than one episode of suspected or diagnosable NMS. The mean age of all 183 suspected cases was 43.2 years (SD = 18.0) and 121 (66.1%) were men. The mean age of the 43 identified cases was 45.7 years (SD = 18.0) and 30 (79.7%) were men. The first suspected NMS case was identified in 2001 but the subject did not meet any of the six criteria. The majority of cases were identified from 2006 to 2008 (details not shown).
Table 1. Characteristics of suspected neuroleptic malignant syndrome cases meeting specific diagnostic criteria.
No case met the research diagnostic criteria of Adityanjee and colleagues [Adityanjee et al. 1999], but 11 cases met the clinical criteria of NMS suggested by Adityanjee and colleagues [Adityanjee et al. 1988]. Only one case met all six sets of the remaining criteria (Table 2). For the six core sets of criteria, the combined level of agreement (κ) was 0.35 [95% confidence interval (CI) 0.31–0.39]. For the seven wider sets of criteria (i.e. when the ‘probable’ cases identified by the retrospective criteria of Pope and colleagues were also included) the combined κ dropped to 0.27 (95% CI 0.23–0.31). Pairwise agreements, shown in Table 3, varied widely, with the highest κ statistics for Levenson with Addonizio and colleagues (the most inclusive sets), and for Pope and colleagues with DSM-IV.
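The text does not state which estimator produced the combined κ. One common choice for summarising agreement across more than two sets is Fleiss' κ, sketched below in Python as an assumption rather than as the method actually used; the function name `fleiss_kappa` and the example verdicts are hypothetical.

```python
import math

def fleiss_kappa(ratings):
    """Fleiss' kappa for binary verdicts.

    `ratings` holds one list per case, each containing one 0/1 verdict
    per criteria set (every case must be rated by the same number of sets).
    """
    n_cases = len(ratings)
    if n_cases == 0:
        raise ValueError("at least one case is required")
    n_sets = len(ratings[0])
    if n_sets < 2 or any(len(row) != n_sets for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of verdicts")
    total_positive = 0
    agreement_sum = 0.0
    for row in ratings:
        positive = sum(row)
        negative = n_sets - positive
        total_positive += positive
        # Proportion of agreeing pairs of sets for this case.
        agreement_sum += (positive**2 + negative**2 - n_sets) / (n_sets * (n_sets - 1))
    mean_agreement = agreement_sum / n_cases
    p_positive = total_positive / (n_cases * n_sets)
    chance_agreement = p_positive**2 + (1 - p_positive) ** 2
    if chance_agreement == 1.0:
        return math.nan  # undefined when every verdict is identical
    return (mean_agreement - chance_agreement) / (1 - chance_agreement)

# Invented verdicts for four cases under three sets of criteria.
example = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [1, 0, 0]]
print(round(fleiss_kappa(example), 3))
```

Here the mean per-case agreement is 2/3 and chance agreement is 1/2, giving κ = 1/3.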
Table 2. Number of cases meeting each number of sets of criteria (N = 183).*
*For Pope’s criteria, probable cases were considered as non-neuroleptic malignant syndrome.
Table 3. Pairwise agreements between sets of criteria (N = 43).
p < 0.05.
Addo., [Addonizio et al. 1986]; Adit., [Adityanjee et al. 1988] clinical criteria; Caro., [Caroff et al. 1991]; CI, 95% confidence interval; DSM, Diagnostic and Statistical Manual of Mental Disorders [American Psychiatric Association, 1994]; Lev., [Levenson, 1985]; Pope, Pope’s definite cases [Pope et al. 1986]; Pope(p), the probable cases of Pope’s criteria within the 43 cases of any confirmed neuroleptic malignant syndrome diagnosis [Pope et al. 1986].
Table 4 summarises the frequencies of the principal categories of symptoms and signs used in NMS diagnostic criteria for all suspected NMS cases. All differed significantly between those with and those without diagnosable NMS. Pyrexia, defined on the basis of ‘pyrexia’ or ‘fever’ as a term being stated or a recorded temperature of 37°C or higher, was present in all cases with diagnosable NMS, and extrapyramidal symptoms (EPS) and autonomic symptoms were present in over 90%. However, these features (along with all other features) were also present in appreciable proportions (12–49%) of cases with suspected NMS who did not fulfil diagnostic criteria.
Table 4. Distribution of main symptoms, signs and investigations among suspected neuroleptic malignant syndrome (NMS) cases (N = 183).
CK, creatine kinase; EPS, extrapyramidal syndrome.
Further analyses were carried out of the six mutually exclusive symptoms and signs given in Table 4: pyrexia/fever; rigidity; any EPS (excluding rigidity); any autonomic symptom; any altered consciousness; and elevated creatine kinase (CK). These revealed 0 (0%), 0 (0%), 1 (2.3%), 4 (9.3%), 13 (30.2%), 14 (32.6%) and 11 (25.6%) subjects with 0–6 of these six symptoms and signs respectively among the 43 diagnosed NMS cases [i.e. implying the following sensitivity statistics for ascending cutoffs (1+, 2+, 3+, 4+, 5+, 6): 0, 2.3, 11.6, 41.8, 74.4, 100]. Of the 140 subjects not meeting any NMS diagnostic criteria, 26 (18.6%), 27 (19.3%), 35 (25.0%), 37 (26.4%), 15 (10.7%), 0 (0.0%) and 0 (0.0%) were found with 0–6 of these six symptoms and signs respectively (i.e. specificity statistics for respective cutoffs of 81.4, 62.1, 37.1, 10.7, 0, 0), representing a significant group difference (χ2 = 118.8; degrees of freedom: 6; p < 0.01). Positive predictive values for cutoffs derived from these groups were 0, 2.8, 12.6, 59.0, 100 and 100 respectively.
Of the 140 subjects not meeting any NMS diagnostic criteria, 30 met the probable NMS criteria defined by Pope and colleagues. Of these, 0 (0.0%), 0 (0.0%), 12 (40.0%), 10 (33.3%), 8 (26.7%), 0 (0.0%) and 0 (0.0%) had 0–6 of the six symptoms and signs described above (cutoff sensitivities for probable NMS in this group: 0, 40.0, 73.3, 100, 100, 100). If they were reconsidered as NMS cases, the positive predictive values for cutoffs derived from the 0–6 symptoms and signs were 0, 34.3, 61.3, 100, 100, 100, and the specificities were 76.4, 51.9, 31.0, 6.5, 0, 0, respectively. Probable NMS defined by Pope and colleagues was also significantly associated with the six mutually exclusive symptoms and signs among subjects who did not meet any of the six criteria (χ2 = 27.6; degrees of freedom: 4; p < 0.01).
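The two group comparisons above can be checked from the reported counts with a short Python sketch of the Pearson χ2 statistic. This is an illustration under stated assumptions, not the authors' Stata analysis: the function name `chi_square` is hypothetical, and the counts for non-probable subjects are obtained here by subtraction because they are not tabulated in the text.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows. Columns whose total is zero are dropped
    first, since they contribute nothing to the statistic.
    """
    columns = [col for col in zip(*table) if sum(col) > 0]
    row_totals = [sum(col[i] for col in columns) for i in range(len(table))]
    grand_total = sum(row_totals)
    statistic = 0.0
    for col in columns:
        col_total = sum(col)
        for observed, row_total in zip(col, row_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(columns) - 1)
    return statistic, dof

# Subjects with 0, 1, ..., 6 of the six features, as reported in the text.
diagnosable = [0, 0, 1, 4, 13, 14, 11]        # n = 43
not_diagnosable = [26, 27, 35, 37, 15, 0, 0]  # n = 140

# Within the 140 subjects meeting no criteria: Pope 'probable' versus the rest.
probable = [0, 0, 12, 10, 8, 0, 0]            # n = 30
# Assumption: remainder derived by subtraction, not taken from the text.
not_probable = [a - b for a, b in zip(not_diagnosable, probable)]

stat_all, dof_all = chi_square([diagnosable, not_diagnosable])
stat_probable, dof_probable = chi_square([probable, not_probable])
print(f"diagnosable vs not: chi2 = {stat_all:.1f}, df = {dof_all}")
print(f"probable vs not:    chi2 = {stat_probable:.1f}, df = {dof_probable}")
```

Run on these counts, the sketch gives χ2 = 118.8 with 6 degrees of freedom and χ2 = 27.6 with 4 degrees of freedom, matching the values reported above. The second comparison has 4 degrees of freedom because no subject in that subgroup had five or six features, leaving five informative columns.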
Discussion
We applied text string searching to a research database, derived from the electronic clinical records of a large mental health service provider and containing information on over 150,000 cases. We identified 183 instances of suspected NMS, that is, instances in which the possibility of NMS had been taken seriously and acted upon by the clinical team. After clinicians reviewed these case notes, 43 were found to meet at least one of six possible diagnostic criteria for NMS. Agreement between these criteria was relatively low and only one case met all six criteria. The pair of sets which showed the best agreement (those of Pope and colleagues and DSM-IV) shared only about 75% of the same caseness. Pyrexia, EPS and autonomic symptoms were the symptoms most frequently present in patients with diagnosable NMS but these and other core symptoms existed in appreciable proportions of the remaining suspected cases.
NMS is a rare but serious and potentially life-threatening adverse reaction to antipsychotic medication and other drug classes. Rather than a specific disorder, NMS could be considered as a spectrum of complications ranging from slight rigidity and fever to severe rigidity with grossly increased CK and rhabdomyolysis. NMS can even be taken as one end of the spectrum of extrapyramidal effects. It receives a justifiably high prominence in psychiatric and general medical education given the potential adverse outcomes and the need for prompt referral and treatment. However, its diagnosis remains controversial and is based on a clustering of symptoms, signs and investigations, with no conclusive diagnostic test.
The advantages of the study include the relatively large number of potential cases; although these generated a relatively small diagnosable sample, it is still one of the largest case groups to date for this rare condition. The SLAM BRC Case Register is a novel resource and particularly suited for this type of study given the large source sample, access to source free text from the full clinical record, and the facility to search for informative text strings. We adopted a relatively broad approach in the search strategy, with several filters, which should have minimised the risk of missed cases, particularly because most cases are likely to have the term recorded in more than one record field (i.e. the terminology of interest is likely to appear at some point in the record). Thus, false negatives might exist but they should be fairly limited in number.
While a case register sourced from routine clinical records is an advantage in terms of generalisability, it has limitations in the quality of the information available which was naturally recorded for clinical rather than research purposes. In particular, it was often difficult to find important negative statements regarding key features, especially for EPS. Variation in prevalence of diagnoses might reflect differences in underlying prevalence of the disorder, but might also reveal the comprehensiveness with which cardinal features were recorded in the case records. For example, the striking absence of diagnosable cases in the final year of our analysis (2009) would certainly deserve further attention if sustained, but whether this reflected changes in symptom and investigation recording would need to be established. It is also crucial to bear in mind that only the mental health records were contained in the data resource and that general medical notes from other providers were not available for review. However, the nature of the syndrome is such that nearly all patients received active management during the course of their illness. Furthermore, in most cases mental health records were maintained during and after periods of care on general medical units, so relatively little information was lost.
Since Gurrera and colleagues [Gurrera et al. 1992] compared the three main sets of diagnostic criteria for NMS, three new sets have been published: those of Caroff and colleagues [Caroff et al. 1991], DSM-IV [American Psychiatric Association, 1994] and those of Adityanjee and colleagues [Adityanjee et al. 1999], who proposed research diagnostic criteria. Gurrera and colleagues [Gurrera et al. 1992] found ‘only modest agreement’ among the criteria of Levenson [Levenson, 1985], Addonizio and colleagues [Addonizio et al. 1986] and Pope and colleagues [Pope et al. 1986]. Our comparison, also based on a retrospective review of medical notes, likewise found only modest, and if anything rather more modest, agreement. Gurrera and colleagues [Gurrera et al. 1992] derived κ and ICC statistics of between 0.41 and 0.65, and specifically modified the criteria of Levenson and Addonizio and colleagues, so as to conform to the ‘probable’ category allowed by Pope and colleagues. Their lowest ICC of 0.41 applied to a three-way comparison of the unmodified versions and Pope’s probable category, while the highest ICC applied to a three-way comparison of the modified versions and Pope’s probable category. Our study, while broadly in line with the conclusions of Gurrera and colleagues, showed some differences [Gurrera et al. 1992]. In particular, our measures of agreement were generally lower for overall and pairwise comparisons. Gurrera and colleagues reported κ values of 0.51 between the criteria of Levenson and those of Addonizio and colleagues, 0.60 between those of Pope and colleagues and those of Addonizio and colleagues, and 0.48 between those of Pope and colleagues and those of Levenson. In comparison, we found κ statistics for these comparisons of 0.51, 0.24 and 0.26 respectively.
Subsequent to the completion of the study reported here, Delphi consensus criteria for NMS were published [Gurrera et al. 2011]. However, we believe that these criteria would have little utility for retrospective analyses such as those carried out here because, like those of Sachdev [Sachdev, 2005], they assume relatively specific sets of information are recorded in clinical records and are potentially better suited to prospective, more specific studies. Also of note is that Delphi methodology simply reflects the agreement of experts on the basis of the best evidence available. When evidence is limited, further empirical research is required.
When considering previous research, it is important to remember that our study is different in terms of case finding and statistical techniques. Gurrera and colleagues performed a retrospective analysis of medical notes derived exclusively from inpatients who had been referred to a consultation service over a period of 6 years. These authors found 64 patients in whom NMS was a differential diagnosis at the time of referral (corresponding to our definition of ‘suspected NMS’). However, they excluded clinical findings when an alternative ‘non-NMS cause’ was possible, which produced a group of 45 patients, with 65 possible episodes of NMS. While the stringency of their inclusion criteria was designed to reduce the number of false positives, ours was designed, within the constraints of a retrospective case review, to provide a more naturalistic picture of what clinicians consider as possible episodes of NMS.
Difficulties in diagnosing NMS should not inhibit research into this condition, which already presents challenges because of its rarity. Disagreements between the criteria and the relatively large number of people with at least some degree of symptomatology who do not fulfil diagnostic criteria suggest that it may be better to consider NMS as a spectrum of complications, possibly with clinical and subclinical states. Of potential relevance here is the improved profile of positive predictive values for a number of NMS-relevant symptoms when the broader probable criteria of Pope and colleagues [Pope et al. 1986] were included with the other criteria, although inevitably the analysis of screening properties of symptom counts involves a circular argument (analysing numbers of symptoms rather than the criteria based around those symptoms). A considerable degree of disagreement may lie simply in controversy over semantics between authors with respect to symptoms essential for diagnosis and how the symptoms are defined for practical purposes in research, rather than actual features of the disorder. Therefore, ways of characterising the disorder more robustly and with less ambiguity are needed. However, it should also be borne in mind that the core symptoms of NMS are, when present in isolation, relatively nonspecific and may have alternative aetiologies. Higher vigilance, and possibly the increased availability of rapid chemical pathology resources for mental health teams, may mean that NMS as seen in clinical services is more often detected at prediagnostic stages. It is possible that research diagnostic criteria and approaches may have to change to reflect this. Further evaluation is clearly required and the relatively holistic approach to case ascertainment used here should be helpful to others who are working in this field.
Footnotes
The development of SLAM BRC Case Register was funded by two Capital Awards from the UK National Institute for Health Research (NIHR) and is further supported through the BRC Nucleus funded jointly by the Guy’s and St Thomas’ Trustees and South London and Maudsley Special Trustees. C-KC and RS are funded by the NIHR Specialist Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Foundation Trust and Institute of Psychiatry, King’s College London. SH is funded by an NIHR Academic Clinical Fellowship. WL is funded by the UK Medical Research Council.
The authors declare no conflict of interest in preparing this article.
