Abstract
Purpose
To provide an overview of various sepsis International Classification of Diseases (ICD) coding methods and their diagnostic accuracy.
Methods
We undertook a systematic scoping review of studies published between 1991 and 2020 (search terms: sepsis, coding, and epidemiology) that reported the accuracy of a sepsis ICD coding method. Studies were grouped by ICD coding method, number of diagnostic accuracy parameters, ICD version, reference standard, design, country, setting, type of dataset and sepsis definition. ICD coding methods were categorised as explicit or implicit, with the explicit methods further divided into wide and narrow groups. Descriptive statistics were used to present data.
Results
We analysed 17 studies, of which 16 (94.1%) used retrospective medical chart review as the reference standard for clinical sepsis, and eight (47.1%) used hospital administrative data to identify sepsis. There were 53 assessments of various ICD coding methods, with 32 (60.4%) of them being explicit and 21 (39.6%) implicit methods. The coding methods had a median sensitivity of <75% but a median specificity of >85%. However, a wide variation was noted in the diagnostic accuracy parameters of all ICD coding methods. Most of the studies showed high methodological quality.
Conclusion
None of the current ICD coding methods is optimal for identifying sepsis.
Introduction
Sepsis is life-threatening organ dysfunction caused by a dysregulated host response to infection. 1 It is a leading cause of death, with sepsis-related mortality accounting for nearly one-fifth (11 million) of all deaths globally.2,3 Accurate quantification of sepsis estimates is important for public health policymakers, researchers, and healthcare systems to develop and implement strategies to improve sepsis patient outcomes. However, it remains a challenge due to the absence of a “gold standard” diagnostic test and changing clinical definitions. 4
Various methods have been used to study sepsis epidemiology, namely, clinical diagnosis through prospective or retrospective cohort studies; analysis of administrative data, such as the World Health Organisation (WHO) International Classification of Diseases (ICD) codes 5 associated with hospital discharge data, insurance claims data, and death certification data; and interrogation of Electronic Health Records.2,6,7 Among these, ICD coding is the most practical method of conducting longitudinal sepsis surveillance at the national or regional level. 8
When estimating sepsis incidence using ICD coding, there are two broad approaches. The first approach is to count only episodes where a specific sepsis code is recorded, referred to as the explicit criteria. The second approach is to count episodes where both a code for infection and organ dysfunction are recorded, referred to as the implicit criteria since the presence of infection and organ dysfunction implies sepsis. However, both these methods have limitations. For the implicit criteria, the temporal and causal relationships between infection and organ dysfunction are assumed rather than proven. 8 A recent study examined the accuracy of explicit and implicit criteria for identifying sepsis through retrospective clinical diagnosis. The study found that the explicit criteria undercounted sepsis while the implicit criteria significantly overcounted it. 6 Examples of these methods are the research by Martin and colleagues who estimated sepsis based on only six codes indicating septicaemia, bacteraemia, and fungemia,9–11 whereas Angus and colleagues 7 used a broader set of 122 ICD-9 codes including 109 infection codes and 13 organ dysfunction codes. However, there is considerable variation in how these methods are used, with a systematic review conducted in 2015 reporting 38 ICD code-based case definitions for sepsis with differing numbers and types of ICD codes. 12
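The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration only: the code sets below are small placeholders chosen for the example (they are assumptions, not the Martin or Angus code lists), and the function name `classify_episode` is hypothetical.

```python
# Illustrative placeholder code sets in ICD-10-CM style.
# These are assumptions for the sketch, NOT the published Martin or Angus lists.
EXPLICIT_SEPSIS_CODES = {"A41.9", "R65.20", "R65.21"}
INFECTION_CODES = {"J18.9", "N39.0"}
ORGAN_DYSFUNCTION_CODES = {"N17.9", "J96.00"}


def classify_episode(codes):
    """Return which criteria, if any, a set of discharge codes satisfies."""
    codes = set(codes)
    # Explicit criteria: at least one sepsis-specific code is recorded.
    explicit = bool(codes & EXPLICIT_SEPSIS_CODES)
    # Implicit criteria: an infection code AND an organ dysfunction code on
    # the same episode; timing and causality are assumed, not checked.
    implicit = bool(codes & INFECTION_CODES) and bool(codes & ORGAN_DYSFUNCTION_CODES)
    return {"explicit": explicit, "implicit": implicit}


print(classify_episode(["J18.9", "N17.9"]))
```

An episode coded with pneumonia and acute kidney failure but no sepsis-specific code is counted only by the implicit criteria in this sketch, which illustrates how the two approaches can diverge on the same record.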
There is no consensus on the optimal ICD coding method to identify patients with sepsis in administrative data sets. This is further complicated by the considerable heterogeneity in the type of administrative dataset, reference standards, and sepsis definition, as well as the lack of well-defined comparators in studies on the diagnostic accuracy of the sepsis ICD coding methods. To address these issues, we conducted a scoping review to provide a broad overview of the available ICD coding methods used to study sepsis epidemiology and their relative diagnostic accuracy.
Methods
A systematic scoping review was undertaken in accordance with the guidance for conducting scoping reviews.13,14 A study protocol was prepared outlining the scope and objective of this scoping review. Due to the scoping nature of this review, it was not registered on PROSPERO.
Eligibility Criteria
All studies were included that reported the accuracy of ICD-9 or ICD-10 codes for sepsis, severe sepsis or septic shock compared to a reference standard and were published between 1991 and 2020. For diagnostic accuracy parameters, studies were included provided at least one of the following was reported: sensitivity, specificity, positive or negative predictive value. Studies were excluded if there was no sepsis focus, they did not report ICD coding validation, they used older sepsis definitions to identify sepsis cases, or their full text was not available in English.
Information Sources
A systematic literature search was conducted in MEDLINE and EMBASE (using the OVID interface), Scopus, and the Cochrane Database of Systematic Reviews.
Search Strategy and Study Selection
We used a modification of a search strategy from a similar systematic review. 12 The search strategy was based on the following three topics: sepsis or septic shock, ICD coding, and epidemiology (Table S1). The literature search was limited to publications that followed the first consensus definition of sepsis in 1991, 15 with an end date of 2020. EndNote20 16 was used to manage search results, store full-text publications and facilitate the screening process.
Sepsis Definition
To be consistent with the current concept that sepsis is life-threatening organ dysfunction due to infection (Sepsis-3), 1 we included only studies that used the presence of infection and organ dysfunction to define clinical sepsis as the reference standards. For studies that used older SIRS-based definitions, 15 assessments of only severe sepsis (SIRS due to infection with organ dysfunction) and septic shock were included.
Data Collection and Charting
Two investigators independently extracted data from the included studies using a pre-specified data table. The following parameters were extracted: year, authors, country, study design, setting, data years, type of administrative database, subjects/participants, sample size, reference/gold standard for diagnosis of sepsis, ICD coding method, ICD code version, number of ICD codes, sepsis severity, and sepsis definition used. They also extracted the number of diagnostic accuracy measures used and the reported accuracy of the coding methods. Any discrepancies in the extracted data were resolved by discussion wherever possible or referred to a third investigator for adjudication.
Quality Assessment of Evidence
The Joanna Briggs Institute (JBI) appraisal tool for diagnostic accuracy studies 17 was used to evaluate methodological quality. This tool consists of 10 questions in four domains (patient selection, index test [ICD coding method], reference standard, and flow and timing), with ‘yes’, ‘no’ and ‘unclear’ as answer options. Based on the answers, reviewers could rate the risk of bias in any of the four domains, and applicability concerns in three domains (patient selection, index test [ICD coding method], and reference standard), as ‘high’, ‘low’ or ‘uncertain’. When it was uncertain whether a study fulfilled a question, it was marked as unclear. Review Manager (RevMan 5.4) 18 was used to create the methodological quality summary and the risk of bias and applicability concerns graphs.
Synthesis of Results
Studies were grouped according to attributes such as type of ICD coding method, number of diagnostic accuracy parameters, ICD version used, reference standard, design, country, setting, type of dataset and sepsis definition. For the ICD coding methods, studies were grouped into explicit or implicit methods. Studies with explicit methods were further divided into narrow explicit methods when only sepsis, severe sepsis, septic shock, or septicaemia codes were used or wide explicit methods when a wide list of organism-specific sepsis codes was used.
The following diagnostic accuracy parameters were collected: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive and negative likelihood ratios (LR), and Area Under the Receiver Operating Characteristic curve (AUROC).
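For readers less familiar with these measures, the sketch below shows how most of them follow from a 2x2 table comparing an ICD coding method (index test) with the reference standard. The function name `diagnostic_accuracy` and the example counts are hypothetical and are not taken from any included study. AUROC is omitted because it requires a graded score rather than a single 2x2 table.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute accuracy measures from a 2x2 table (index test vs reference standard).

    tp: coded as sepsis, sepsis on reference standard
    fp: coded as sepsis, no sepsis on reference standard
    fn: not coded as sepsis, sepsis on reference standard
    tn: not coded as sepsis, no sepsis on reference standard
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Likelihood ratios; guard against division by zero at the extremes.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts: 100 true sepsis episodes among 1000 admissions.
example = diagnostic_accuracy(tp=30, fp=10, fn=70, tn=890)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the method is highly specific but misses most true cases, the pattern this review reports for many ICD coding methods.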
Statistical Analysis
Descriptive statistics were used to present the characteristics of the included studies. Categorical variables are presented as counts and percentages, and continuous variables as the median (interquartile range; IQR). The diagnostic accuracy of the sepsis ICD coding methods was evaluated using SPSS statistical software (IBM Corp. Released 2020. IBM SPSS Statistics for Windows, Version 28.0. Armonk, NY: IBM Corp) and is presented as median, mean and 95% confidence interval (95% CI). No pooled or weighted analysis was done due to the expected wide variation in the studies.
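As an illustration of this kind of unweighted summary, the sketch below computes the median, mean and a 95% CI of the mean across a set of assessment-level values. It is an assumption-laden example, not the SPSS procedure used here: the function name `summarise` is hypothetical, the input values are invented, and the interval uses a normal approximation (z = 1.96), whereas statistical packages typically use a t-distribution quantile for small samples.

```python
import math
import statistics


def summarise(values, z=1.96):
    """Unweighted median, mean and approximate 95% CI of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    return {
        "median": median,
        "mean": mean,
        "ci_low": mean - z * sem,
        "ci_high": mean + z * sem,
    }


# Invented specificity values (%) from four hypothetical assessments.
result = summarise([90.0, 99.0, 99.5, 100.0])
print(result)
```

Because such an interval is symmetric around the mean and not bounded by 0% and 100%, its upper limit can exceed 100% when values cluster near the ceiling. This is one possible reason for upper confidence limits above 100% in summaries of this type.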
The results are reported as per the PRISMA extension for conducting scoping reviews. 19
Results
Search Results
A total of 1235 articles were identified. Of these, 175 were considered for full-text review, and 17 studies met the eligibility criteria and were included in the review (Figure 1).

Figure 1. PRISMA flow diagram.
Characteristics of the Included Studies
The detailed characteristics of the individual studies are shown in Table 1, and key demographic characteristics are shown in Table 2. Eleven (64.7%) studies were from the USA, two (11.8%) from Denmark and four (23.5%) from other high-income countries. The studies were published between 1998 and 2018 and reported data collected between 1994 and 2014. One (5.9%) study was conducted in ICU, four (23.5%) in the Emergency Department, and 13 (76.5%) in other hospital settings. Of the 17 studies, 16 (94.1%) included adult patients and one (5.9%) included paediatric patients. Hospital administrative data was the most commonly used dataset, in eight (47.1%) studies, followed by population data and EHR in three (17.6%) studies each. Hospital claims or insurance data was used in two (11.8%) studies, and an ICU database was used in one (5.9%) study. Retrospective medical chart review was the reference standard in 16 (94.1%) studies, while one (5.9%) study used data from patients enrolled in a sepsis clinical trial. Only three (17.6%) studies used the Sepsis-3 definition. ICD-9 and ICD-10 codes were assessed in 11 (64.7%) and six (35.3%) studies, respectively. In terms of disease severity, 12 (70.6%) studies included patients with sepsis according to the Sepsis-3 or equivalent definition. Three (17.6%) studies included patients with septic shock, while two (11.8%) included patients with mixed sepsis severity.
Table 1. Detailed Characteristics of Studies Included in Scoping Review.
AM: Australian Modification; BC: Blood Culture; CAS: Community-Acquired Sepsis; CIHI: Canadian Institute for Health Information; CM: Clinical Modification; CMS: Centers for Medicare & Medicaid Services; HADR: Hospital admission/discharge register; GM: German Modification; ICD: International Classification of Diseases; ICU: Intensive Care Unit; ED: Emergency Department; LA: Lactate; OD: Organ Dysfunction; NPV: Negative Predictive Value; NS: Not specified; PC: Prospective Cohort; PPV: Positive Predictive Value; RC: Retrospective Cohort; Sn: Sensitivity; Sp: Specificity; SS: Septic shock. *Use of vasopressor (dopamine, norepinephrine, vasopressin, phenylephrine, or epinephrine) or lactate ≥4.0 mmol/L within ±1 day of culture.
Table 2. Demographic Characteristics of ICD Coding Validation Studies.
CT: Clinical Trial; ED: Emergency department; EHR: Electronic Health Records; ICD: International Classification of Diseases; ICU: Intensive Care Unit; NS: Not specified; OD: Organ Dysfunction; NPV: Negative Predictive Value; PPV: Positive Predictive Value.
* One each in Australia, Canada, Germany and Japan; **Included Area Under the curve of the Receiver Operating Characteristic (AUROC) and Likelihood ratio.
The sepsis ICD coding methods are shown in Figure 2. There were 53 assessments of various sepsis ICD coding methods in the 17 included studies. Of these, 32 (60.4%) were of explicit methods with a pooled sample size of 1,148,028 and all 17 studies reported them. Within the explicit assessments, 22 (68.8%) were of narrow explicit methods in 13 studies, with seven assessments (six studies) of the Martin method. The remaining 10 (31.2%) were wide explicit methods in five studies. Of these, five included only sepsis codes, while the other six included additional organ dysfunction codes. Depending on the method, the number of ICD codes ranged from one to three and 16 to 57 for the narrow and wide explicit methods, respectively.

Figure 2. Studies assessing sepsis ICD coding methods compared to reference standard.
Implicit methods were evaluated in 21 assessments, accounting for 39.6% of all assessments. These evaluations were conducted in 12 studies with a combined sample size of 1,145,231. The Angus method was assessed 16 times, accounting for 76.2% of all implicit method assessments, and was reported in all 12 studies. The Dombrovskiy method was evaluated three times (14.3%) in two studies, while the Wang and CMS methods were each evaluated once (4.8%) in one study.
Diagnostic Accuracy
Sensitivity and PPV were the most commonly reported diagnostic accuracy parameters, each in 13 (76.5%) studies, followed by specificity in nine (52.9%) and NPV in eight (47.1%) studies. Other parameters, such as AUROC and LR, were reported in three (17.6%) studies. Six (35.3%) studies reported one diagnostic accuracy parameter, nine (52.9%) reported two to four, and two (11.8%) reported more than four. The median number of diagnostic accuracy parameters reported per study was three.
Narrow explicit methods had a median (mean; 95% CI) sensitivity of 25.6% (32.1%; 21.1-43.0%), specificity of 99.3% (97.2%; 94.3-100.1%), and positive and negative predictive values of 81.2% (83.5%; 75.1-91.8%) and 95.7% (93.5%; 89.2-97.8%), respectively. Wide explicit methods had a median (mean; 95% CI) sensitivity of 25.4% (31.7%; 18.8-44.6%), specificity of 99.6% (98.0%; 95.2-100.7%), and positive and negative predictive values of 78.4% (73.8%; 56.6-91.9%) and 91.4% (87.2%; 77.6-96.7%), respectively. The Martin method had a median (mean; 95% CI) sensitivity of 51.2% (51.7%; 23.2-80.1%), specificity of 96.5% (97.0%; 94.0-100.0%), and positive and negative predictive values of 93.9% (89.9%; 78.2-101.6%) and 67.8% (73.3%; 59.8-86.8%), respectively. Among implicit methods, the Angus method had a median (mean; 95% CI) sensitivity of 50.3% (51.2%; 41.0-61.4%), specificity of 88.2% (87.2%; 68.9-105.6%), and positive and negative predictive values of 53.9% (52.6%; 36.9-68.4%) and 91.1% (87.4%; 77.8-97.0%), respectively (Figure 3).

Figure 3. Diagnostic accuracy of sepsis ICD coding methods.
Quality Assessment
Overall methodological quality of the included studies was good. Of the four domains, the risk of bias was low for reference standard and for flow and timing (except for one [5.9%] study in each domain with uncertain risk of bias), and for patient selection (except for three [17.6%] studies, of which one had a high risk of bias and two had uncertain risk of bias). However, the risk of bias for the sepsis ICD coding methods used as index tests was unclear in 14 (82.4%) studies. Applicability concerns were low for all three domains in all studies except for three (17.6%) studies for index test and one (5.9%) for patient selection (Figure S1).
Discussion
In this review, we provide a summary of various ICD coding methods used for sepsis case identification. The majority of the studies were conducted in the United States, and most of them were retrospective cohorts involving adult patients. Retrospective medical chart review was the usual reference standard, and hospital administrative data the most commonly used dataset. However, only a few studies applied the Sepsis-3 definition and ICD-10 codes. The included studies varied significantly in terms of sample size, databases, sepsis definitions, reference standards, and ICD coding methods. Narrow explicit and Angus were the two most commonly reported ICD coding methods. About two-thirds of studies reported two or more diagnostic accuracy parameters, with sensitivity and positive predictive value being the most common. The diagnostic accuracy parameters showed wide variation except for the specificity of narrow and wide explicit methods and the Martin method. Overall, all ICD coding methods were highly specific, with a median of more than 85%, but none showed a desirable median sensitivity of >75%. Positive and negative predictive values, which are measures of the clinical validity of a diagnostic tool, 20 also varied considerably among the ICD coding methods.
No studies were conducted in upper-middle or low-income countries where the burden of sepsis is considered to be highest. 21 In two-thirds of the studies analysed, two or more diagnostic accuracy parameters were reported, which is higher than a previous review of sepsis ICD coding methods. 12 As noted in a previous review, 12 the varying number of ICD codes used and their version, various reference standards, and multiple settings and datasets can explain the variation in diagnostic accuracy parameters. Although all studies were conducted in high-income countries with good implementation of EHR, 22 only a few studies in our review used it. Most ICD coding methods had high specificity, with a median of over 85%, but lacked sensitivity. Only eight assessments reported a sensitivity of more than 75%, similar to a previous review. 12 The sensitivity of the narrow explicit ICD coding method was less than 60%, which is consistent with a previous US study. 6
Strengths & Limitations
There are many strengths of this scoping review. It is the first comprehensive study to review the diagnostic accuracy of different sepsis ICD coding methods after the introduction of the Sepsis-3 definition in 2016. We only included studies that used the Sepsis-3 definition or severe sepsis or septic shock definitions from the older sepsis definition, which aligns with the current sepsis definition. Our review covers a range of sepsis ICD coding methods, including wide and narrow explicit definitions, which have not been previously reported. The majority of studies reported two or more diagnostic accuracy parameters, indicating the reliability of the included studies. However, limitations include the limited head-to-head comparative studies and considerable heterogeneity among the included studies, which limits the generalisability of the findings. In addition, the considerable variation in the diagnostic accuracy parameters makes it difficult to make meaningful comparisons of the different sepsis ICD coding methods.
Implications and Future Directions
Our review found that only a small proportion of ICD coding assessments displayed a sensitivity of more than 75%, indicating that sepsis is largely undercounted in hospital administrative data. This finding has significant implications for clinicians, health policymakers, and researchers, as ICD coding is one of the commonly used methods for sepsis surveillance, allocating resources and services, and shaping future research. Therefore, it is crucial to make a concerted effort to identify better ICD coding methods or improve existing ones to obtain more accurate sepsis estimates. A modified version of the ICD codes used in the GBD study 2 was used in a retrospective analysis of an administrative dataset in Australia to report sepsis epidemiology. 23 It would be worthwhile to evaluate the diagnostic accuracy of that method in a well-conducted clinical study. Recently, machine learning methods have been evaluated for automated ICD coding with reasonable success.24–27 Applying such methods to the current sepsis ICD coding methods, to improve their diagnostic accuracy by addressing inherent limitations in the coding process, is worth including in future research.
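The link between low sensitivity and undercounting can be made concrete with a small worked example. The sketch below assumes a coded case count, a PPV and a sensitivity that all describe the same coding method in the same population; the function name `estimated_true_cases` is hypothetical and the count of 1000 coded episodes is invented. The PPV (81.2%) and sensitivity (25.6%) are the median values reported above for narrow explicit methods, used purely for illustration, since medians drawn from different studies need not be mutually consistent.

```python
def estimated_true_cases(coded_cases, ppv, sensitivity):
    """Estimate the true number of cases behind a coded case count.

    True positives are the coded cases that really are sepsis (coded * PPV).
    Dividing by sensitivity scales up for the true cases the codes missed.
    """
    true_positives = coded_cases * ppv
    return true_positives / sensitivity


# 1000 coded episodes (invented), PPV 81.2%, sensitivity 25.6%.
estimate = estimated_true_cases(coded_cases=1000, ppv=0.812, sensitivity=0.256)
print(f"Estimated true cases: {estimate:.0f}")
```

Under these assumptions, the true number of sepsis episodes would be roughly three times the coded count, which shows why a highly specific but insensitive coding method can substantially understate sepsis burden.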
Conclusions
This scoping review showed that there is no ICD coding method which can accurately measure sepsis burden. Due to significant heterogeneity and the lack of direct comparative studies, it is difficult to draw any firm conclusion about the relative diagnostic accuracy of different ICD coding methods. Further research is needed to improve current ICD coding methods or identify new ones. Additionally, it is crucial to verify the accuracy of sepsis ICD coding through prospective clinical studies.
Supplemental Material
sj-docx-1-jic-10.1177_08850666231192371 - Supplemental material for Accuracy of International Classification of Disease Coding Methods to Estimate Sepsis Epidemiology: A Scoping Review
Supplemental material, sj-docx-1-jic-10.1177_08850666231192371 for Accuracy of International Classification of Disease Coding Methods to Estimate Sepsis Epidemiology: A Scoping Review by Ashwani Kumar, Naomi Hammond, Sarah Grattan, Simon Finfer and Anthony Delaney in Journal of Intensive Care Medicine
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
