Abstract
Purpose
Several risk stratification scores for predicting stroke-associated pneumonia have been derived. We aimed to evaluate the performance and clinical usefulness of such scores for predicting stroke-associated pneumonia.
Method
A systematic literature review was undertaken in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement, with application of the Quality Assessment of Diagnostic Accuracy-2 tool. Published studies of hospitalised adults with ischaemic stroke, intracerebral haemorrhage, or both, which derived and validated an integer-based clinical risk score, or externally validated an existing score to predict occurrence of stroke-associated pneumonia, were considered and independently screened for inclusion by two reviewers.
Findings
We identified nine scores, from eight derivation cohorts. Age was a component of all scores, and the NIHSS score of all except one. Six scores were internally validated and five scores were externally validated. The A2DS2 score (Age, Atrial fibrillation, Dysphagia, Severity [NIHSS], Sex) was the most extensively externally validated, in eight independent cohorts. Performance measures were reported for eight scores. Discrimination tended to be more variable in the external validation cohorts (C statistic 0.67–0.83) than the derivation cohorts (C statistic 0.74–0.85).
Discussion
Overall, discrimination and calibration were similar between the different scores. No study evaluated influence on clinical decision making or prognosis.
Conclusion
The clinical prediction scores varied in their simplicity of use and were comparable in performance. Utility of such scores for preventive intervention trials and in clinical practice remains uncertain and requires further study.
Introduction
Stroke-associated pneumonia (SAP) is a common and serious complication after acute stroke, associated with increased length of hospital stay, mortality and worse outcomes in survivors.1–6 A recent systematic review reported that SAP occurs in 14.3% of patients, although the frequency varies widely depending on the definition of SAP and patient characteristics.7 Several features of SAP, such as varied clinical manifestation,8 the uncertain role of blood biomarkers9 and the absence of definitive diagnostic criteria, make it challenging to diagnose in clinical practice. As a first step, the recently convened Pneumonia In Stroke ConsEnsuS (PISCES) group proposed operational diagnostic criteria for SAP based on the Centers for Disease Control and Prevention (CDC) criteria.9
Numerous baseline clinical factors such as age, dysphagia, severity of stroke, low conscious level, and type and location of stroke may predispose individuals to SAP.10–13 Predictive risk models derived using these routinely available variables may help in identifying patients at an increased risk of pneumonia for targeted preventive measures and may also provide opportunities for novel interventions for monitoring or therapy. However, clinical prediction scores have several potential weaknesses, such as differences in derivation, inconsistent external validation and complexity, which make choice of score and application to clinical practice challenging.14,15 We therefore undertook a systematic review to identify scores used in predicting risk of SAP, with the aim of evaluating performance, usability and utility for clinical practice and research.
Methods
A systematic literature review was undertaken in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. 16
Data sources and searches
Searches were undertaken in MEDLINE (1946 to 15 September 2015) and EMBASE (1947 to 15 September 2015) using pre-defined search criteria and terms (Online Table I). Hand searching of reference lists for additional eligible articles was also carried out, and the PISCES group was invited to provide any other potentially eligible articles.
Study selection
Published studies (English and Non-English) of hospitalised adults with ischaemic stroke, intracerebral haemorrhage (ICH), or both, which derived and validated an integer-based clinical risk score, or externally validated an existing score to predict occurrence of pneumonia after stroke, were independently screened for eligibility by two reviewers (AKK and CJS), using the study title and abstract (Online Table II). Lead or corresponding authors of studies under consideration were contacted by e-mail to resolve any issues relating to assessment of eligibility or data extraction. Discrepancies relating to eligibility or data extraction were resolved by discussion between the same two study investigators.
Data extraction
Data were independently extracted by two reviewers (AKK and CJS) and included study design, clinical environment, country, stroke subtype (ischaemic or ICH), mean age, mean National Institutes of Health Stroke Scale (NIHSS) score, components of score and weighting, measures of discrimination and calibration, co-morbidities, criteria used in diagnosis of pneumonia and proportion of patients diagnosed with pneumonia.
Assessment of quality: Risk of bias and applicability
Quality was assessed in terms of risk of bias and concerns regarding applicability, using the Quality Assessment of Diagnostic Accuracy (QUADAS)-2 tool.17 In brief, judgements of applicability and risk of bias are made across four domains using relevant signalling questions: patient selection, index risk score, reference standard (diagnosis of SAP) and flow and timing. The QUADAS-2 tool was applied to each score within the identified validation cohorts by two reviewers (AKK and AV) independently.
Risk score performance
For the discriminative ability of scores, we extracted information on the area under the receiver operating characteristic curve (AUROC) or C-statistic, their 95% confidence intervals, and the
Clinical usefulness
We noted the complexity of application and use at the bedside, whether prediction scores incorporated categories of risk-stratification (usability), and whether scores had been used to evaluate clinical management or clinician behaviours (utility). We also evaluated the generalisability of each prediction model by determining whether it had been externally validated in an independent patient population, either in the original or subsequent publication.
Findings
Search results
The electronic search yielded 2493 publications. After screening, exclusion of duplicates and applying eligibility criteria, 46 full texts and abstracts were reviewed (Figure 1). No additional articles were identified through hand-searching of major stroke journals or by the PISCES group. Twelve fully published studies were finally considered eligible for inclusion.11,18–28
Figure 1. Flow diagram of systematic search methodology.
Clinical risk scores for predicting SAP
Table 1. Components of clinical risk scores for predicting stroke-associated pneumonia.
ISAN: Independence Prestroke, Sex, Age, National Institutes of Health Stroke Scale; A2DS2: Age, Atrial Fibrillation, Dysphagia, Sex, Severity; AIS-APS: Acute Ischaemic Stroke-Associated Pneumonia Score; PANTHERIS: Preventive Antibacterial Therapy in Acute Ischaemic Stroke; VHA: Veterans Health Administration; PNA: Pneumonia Prediction; ICH-APS: Intracerebral Haemorrhage-Associated Pneumonia Score; NIHSS: National Institutes of Health Stroke Scale; GCS: Glasgow Coma Scale; OCSP: Oxfordshire Community Stroke Project; mRS: modified Rankin Scale; COPD: chronic obstructive pulmonary disease; WBC: white blood cell.
Study and participant characteristics
Characteristics of the derivation and validation cohorts are summarised in Online Table III. Median age was 71 years (range 61–76 years) and median NIHSS was 5 (range 4–13). All studies adequately described selection of the study sample. With the exception of one prospective study,26 all were retrospective evaluations of existing prospective cohorts. Of the 14 separate derivation or validation cohorts, 8 (57%) were multicentre or national stroke registries and 6 (43%) were single-centre hospital-based stroke registries. The majority of the 14 studies (80%) evaluated only acute ischaemic stroke. Definition and ascertainment of risk factors for model derivation varied and were often limited by availability of data, particularly in existing national registries. For example, dysphagia assessment was not described in several studies,21,22,25 and pre-stroke disability was described in different ways.21,22,27 Some studies did not record pre-existing disability.18,25 The diagnostic approach to pneumonia varied between cohorts; clinician-reported diagnosis of pneumonia (36%) and the CDC criteria for pneumonia (36%) were the most commonly used approaches. The other methods included ad hoc objective criteria (14%) and Chinese Consensus criteria (14%).
Quality assessment
Overall, risk of bias and concerns regarding applicability were judged as generally low (Online Table IV). In some validation cohorts, risk of bias was judged as high based on patient selection (exclusions based on incomplete baseline data27 or a selected high-risk cohort20), reference standard (non-standardised criteria for diagnosis of SAP11,19,27,28) and flow and timing (verification bias, related to differences in applying the same reference standard by the study group11,19,27,28).
Performance and validation of the risk scores
Table 2. Performance of the clinical risk scores for predicting SAP in the derivation and internal validation cohorts.
SAP: Stroke-Associated Pneumonia; ISAN: Independence Prestroke, Sex, Age, National Institutes of Health Stroke Scale; A2DS2: Age, Atrial Fibrillation, Dysphagia, Sex, Severity; AIS-APS: Acute Ischaemic Stroke-Associated Pneumonia Score; PANTHERIS: Preventive Antibacterial Therapy in Acute Ischaemic Stroke; VHA: Veterans Health Administration; PNA: Pneumonia Prediction; ICH-APS: Intracerebral Haemorrhage-Associated Pneumonia Score; SSNAP: Sentinel Stroke National Audit Programme; BSR: Berlin Stroke Register; CNSR: Chinese National Stroke Registry; Berlin NICU: Berlin Neurological Intensive Care Unit; CDC: Centers for Disease Control and Prevention; NR: Not Reported; CI: Confidence Interval.
Table 3. Performance of the clinical risk scores for predicting SAP in the external validation cohorts.
SAP: Stroke-Associated Pneumonia; ISAN: Independence Prestroke, Sex, Age, National Institutes of Health Stroke Scale; A2DS2: Age, Atrial Fibrillation, Dysphagia, Sex, Severity; AIS-APS: Acute Ischaemic Stroke-Associated Pneumonia Score; VHA: Veterans Health Administration; NWGSR: North West Germany Stroke Register; SSNAP: Sentinel Stroke National Audit Programme; CNSR: Chinese National Stroke Registry; CICAS: Chinese Intracranial Atherosclerosis Study; HNSR: Henan Province Stroke Registry; WCH: Wuhan Central Hospital; CDC: Centers for Disease Control and Prevention; NR: Not Reported; CI: Confidence Interval.
Six of the risk scores were validated internally through split samples (Table 2). All reported the C-statistic for the internal validation cohort, which ranged from 0.73 to 0.88, with five models reporting a calibration metric. Five of the nine scores were validated externally (Table 3), with C-statistic ranging from 0.68 to 0.83. The A2DS2 score19 has been evaluated most extensively, in the largest derivation sample (n = 15,335), and in eight separate external validation cohorts. The A2DS2 score performed consistently across these cohorts (C statistic 0.73 to 0.84), with good calibration.
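To make the discrimination measure above concrete, the following is a minimal sketch of how a C-statistic can be computed for an integer risk score and a binary pneumonia outcome. It is not the method of any included study: the function name and the example data are hypothetical, and the pairwise definition (ties counted as half-concordant) is one common convention.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (C) statistic for a binary outcome.

    Proportion of (case, non-case) pairs in which the case has the
    higher score; tied scores count as half-concordant.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score, non_case_score in product(cases, non_cases):
        if case_score > non_case_score:
            concordant += 1.0
        elif case_score == non_case_score:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical risk scores and outcomes (1 = pneumonia); not study data.
example_scores = [1, 2, 3, 5, 6, 8]
example_outcomes = [0, 0, 1, 0, 1, 1]
print(round(c_statistic(example_scores, example_outcomes), 3))
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect separation of patients with and without pneumonia, which is the scale on which the reported ranges should be read.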
Clinical usefulness
The risk scores varied in their complexity and ease of use (Table 1), although most scores incorporated clinical variables readily available at baseline. Two scores require admission laboratory variables20,21 and one of the scores developed exclusively for ICH requires quantitative measurement of haematoma volume. 22 Several of the scores were stratified into integer-based risk categories11,19,21,22,27 (e.g. low, moderate, high risk), facilitating usability by clinicians. The role of implementing the models as prediction rules in terms of risk stratification, decision making or improved patient outcomes was not evaluated for any of the scores.
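As an illustration of how an integer-based score is applied at the bedside, the sketch below sums points for the A2DS2 components. The point values are an assumption taken from the commonly cited A2DS2 derivation publication rather than from this review (age ≥75 years 1 point; atrial fibrillation 1; dysphagia 2; male sex 1; NIHSS 0–4, 5–15 and ≥16 scoring 0, 3 and 5), and should be checked against that source before any use. The function names are hypothetical, and the default cut-off of ≥4 is simply the example threshold discussed later in this review.

```python
def a2ds2_score(age_years, atrial_fibrillation, dysphagia, male, nihss):
    """Sum the integer points of the A2DS2 score (range 0 to 10).

    Point values are assumed from the A2DS2 derivation publication,
    not from this review; verify before any real use.
    """
    points = 0
    points += 1 if age_years >= 75 else 0       # Age
    points += 1 if atrial_fibrillation else 0   # Atrial fibrillation
    points += 2 if dysphagia else 0             # Dysphagia
    points += 1 if male else 0                  # Sex
    if nihss >= 16:                             # Severity (NIHSS)
        points += 5
    elif nihss >= 5:
        points += 3
    return points


def flagged_high_risk(score, cutoff=4):
    """Apply an illustrative cut-off (score >= cutoff) to flag higher risk."""
    return score >= cutoff


# Hypothetical patient: 80-year-old woman, atrial fibrillation,
# no dysphagia, NIHSS 6.
score = a2ds2_score(80, True, False, False, 6)
print(score, flagged_high_risk(score))
```

Scores that need only variables of this kind can be calculated at admission, whereas those requiring laboratory values or haematoma volume depend on results that may arrive later.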
Discussion
An ideal risk score for predicting SAP would incorporate variables readily available at stroke presentation, be quick to apply, provide meaningful risk categories with performance acceptable to the particular application (and to clinicians) and have impact on clinical decision making and clinical outcomes. In this systematic review, we identified nine clinical risk scores for predicting SAP and assessed their performance metrics, clinical usability and utility. We sought to identify whether any of the scores could be applied for use in clinical care or research.
The scores varied considerably in their complexity, component variables, derivation cohort characteristics, approach to defining SAP, ease of application, consistency of external validation, and performance evaluation. Substantial heterogeneity between the studies was therefore anticipated and precluded meta-analyses. As previously acknowledged, 27 the prevalence of SAP varied between the cohorts, most likely related to underlying differences in patient characteristics and definitions used, 1 potentially contributing to outcome reporting bias. Several of the scores were derived from relatively small single centre cohorts18,20,25 limiting their generalisability. As all of the scores were derived using retrospective analyses of registry-based studies, model-building was limited by the baseline characteristics recorded in the different cohorts. Therefore, potentially important baseline characteristics (e.g. smoking, medication, chronic lung disease), medications (e.g. statin therapy or beta-blockade), laboratory variables (e.g. leukocyte count or C-reactive protein [CRP]) or interventions (e.g. mechanical intervention, type of swallow screen), which may have influenced SAP risk, were not available in the majority of the derivation cohorts.
For the studies reporting performance metrics, the discriminative ability and calibration of the scores ranged from moderate to good. However, several of the scores have not yet undergone external validation to our knowledge.20,22,25 Some of the scores performed similarly in the external validation and derivation cohorts,19,21,27 despite differences in patient characteristics, supporting generalisability. Importantly, the majority of the validation studies were unable to compare the performance of more than one score concurrently due to limitations imposed by data routinely collected in the registry-based cohorts. One study compared four scores concurrently21 and found no material difference in the performance metrics of the four scores (Pneumonia score, VHA score, AIS-APS, A2DS2) tested. Most scores were derived only in ischaemic stroke cohorts, although two scores with comparable performance were available for ICH. The ISAN and A2DS2 were evaluated in both ischaemic stroke and ICH, and performance metrics tended to be superior in ischaemic stroke compared with ICH, most likely due to ceiling effects.27 The only scores derived exclusively for ICH (ICH-APS A and B) are less practical to apply, requiring baseline imaging parameters, and have not been externally validated to date.22 Considering the high rate of early neurological deterioration and competing risk of death after ICH, the ISAN, A2DS2 and ICH-APS scores each performed better, and comparably, in sensitivity analyses stratifying for survival beyond 48–72 h after ICH.22,27
The role of clinical risk scores for predicting SAP in clinical care or research remains uncertain. None of the studies investigated utility in terms of clinician behaviours (for example, the time taken to administer the risk scores) or impact analysis on clinical outcomes. The current levels of sensitivity and specificity for given cut-offs on the scores19,21,22,24 may be unacceptable to clinicians, although this may depend on the particular application of the score. For example, for a cut-off of ≥4 on the A2DS2 score, sensitivity is 91% but specificity is 57%.19 This means that only 9% of actual SAP cases are not identified as high risk (false-negative rate), yet 43% of the patients who do not get SAP are incorrectly identified as being at high risk (false-positive rate). For a safe, inexpensive and well-tolerated intervention to prevent SAP (e.g. enhanced monitoring or an oral hygiene protocol), this extent of exposure to unnecessary interventions may be acceptable. However, for more expensive and complex preventive interventions that have adverse effects and are challenging to administer, such low specificity may make clinical trials impractical and more difficult to justify.
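The arithmetic behind the false-negative and false-positive rates quoted above can be sketched as follows. The function names are hypothetical. The positive predictive value calculation additionally assumes, purely for illustration, that the pooled SAP frequency of 14.3% cited in the Introduction applies to the same population as the A2DS2 sensitivity and specificity, which the source does not establish.

```python
def error_rates(sensitivity, specificity):
    """Return false-negative and false-positive rates as proportions."""
    return {
        "false_negative_rate": 1.0 - sensitivity,
        "false_positive_rate": 1.0 - specificity,
    }


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Proportion of patients flagged as high risk who develop the outcome."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# A2DS2 cut-off of >=4 as quoted in the text: sensitivity 91%, specificity 57%.
rates = error_rates(0.91, 0.57)
print({name: round(value, 2) for name, value in rates.items()})

# Illustrative assumption: prevalence of 14.3% taken from the Introduction.
ppv = positive_predictive_value(0.91, 0.57, 0.143)
print(round(ppv, 2))
```

Under that illustrative prevalence, only about a quarter of patients flagged as high risk would go on to develop SAP, which is why the acceptability of the cut-off depends on the cost and safety of the intervention being targeted.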
Further large, multi-centre prospective studies of consecutive patients, with adjudicated diagnosis of SAP using standardised and validated criteria are required to evaluate comparative performance and utility of the available scores. Refining the existing scores, including the addition of laboratory biomarkers such as CRP to improve performance, 26 warrants further consideration. Finally, evaluating clinical utility of the scores is an essential step to determine effects on clinician behaviours, impact on clinical decision making, clinical outcomes and feasibility of implementation.
Conclusion
We identified several clinical risk scores for predicting SAP which varied in their simplicity and consistency of validation. When recorded, performance metrics were comparable between scores, and no single score consistently performed better than others. However, interpretation was limited by heterogeneity and some risk of bias. The utility of risk scores for predicting SAP remains uncertain and requires further study in prospective cohorts with standardised criteria for definition of SAP.
Supplemental Material
Supplemental material, sj-pdf-1-eso-10.1177_2396987316651759 for Clinical risk scores for predicting stroke-associated pneumonia: A systematic review by Amit K Kishore, Andy Vail, Benjamin D Bray, Angel Chamorro, Mario Di Napoli, Lalit Kalra, Peter Langhorne, Joan Montaner, Christine Roffe, Anthony G Rudd, Pippa J Tyrrell, Diederik van de Beek, Mark Woodhead, Andreas Meisel and Craig J Smith in European Stroke Journal
Footnotes
Acknowledgements
We are very grateful to Valerie Haigh for her assistance with the literature searches.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: CJS, BB, PJT and AM have all co-authored derivation/validation studies of the ISAN and A2DS2 scores included in this systematic review. AM is a co-inventor and co-owner of a patent on anti-infective agents and immunomodulators used for preventative therapy following an acute cerebrovascular accident, which has been filed with the European Patent Office (PCT/EP03/02246). AKK, AV, AC, MDN, LK, PL, JM, CR, AGR, DvdB and MW declare no conflicts of interest or disclosures.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical approval
Not applicable
Informed consent
Not applicable
Trial Registration
Not applicable
Guarantor
AKK and CJS.
Contributorship
CJS conceived the study. AKK researched literature and drew up the protocol. AKK, CJS and AV were involved in data analysis. AKK, CJS, AV, BB, PJT, AM, AC, MDN, LK, PL, JM, CR, AGR and DvdB were all involved in production of the manuscript.
