The applications of Australian-coded ICD-10 and ICD-10-AM data in research: A scoping review of the literature

Abstract

Background: Australia uses the International Classification of Diseases (ICD-10) for mortality coding and its Australian Modification, ICD-10-AM, for morbidity coding. The ICD underpins surveillance (population health, mortality), health planning and research (clinical, epidemiological and others). ICD-10-AM also supports activity-based funding, thereby propelling realignment of the foci of clinical coding and, potentially, coded data’s research utility.

Objective: To conduct a scoping review of the literature exploring the use of ICD-10 and ICD-10-AM Australian-coded data in research. Research questions addressed herein: (1) What were the applications of ICD-10(-AM) Australian-coded data in published peer-reviewed research, 2012–2022? (2) What were the purposes of ICD-10(-AM) coded data within this context, as classified per a taxonomy of data use framework?

Method: Following systematic Medline, Scopus and Cumulative Index to Nursing and Allied Health Literature database searches, a scoping literature review was conducted using PRISMA Extension for Scoping Reviews guidelines. References of a random 5% sample of within-scope articles were searched manually. Results were summarised using descriptive analyses.

Results: Multi-stage screening of 2103 imported articles produced 636, including 25 from the references, for extraction and analysis; 54% were published 2019–2022; 50% within the largest five categories were published post-2019; 22% fell within the “Mental health and behavioural” category; 60.3% relied upon an ICD-10 modification. Articles were grouped by: research foci; relevant ICD chapter; themes per the taxonomy; purposes of the coded data. Observational study designs predominated: descriptive (50.6%) and cohort (34.6%).

Conclusion: Researchers’ use of coded data is extensive, robust and growing. Increasing demand is foreshadowed for ICD-10(-AM) coded data, and HIM-Coders’ and Clinical Coders’ expert advice to medical researchers.

Keywords

international classification of diseases; ICD-10; data quality; health information management; public health surveillance; ICD codes; ICD-10-AM; Australian coded data; clinical research; mortality

Introduction

Health classification has been used for disease surveillance since the mid-17th century, well before the establishment of the International Classification of Diseases (ICD) in the late 19th century (Bowker and Star, 2000). Created to “coordinate [standardised] information and resources about mortality and morbidity globally,” the ICD has been revised approximately every decade, initially by the League of Nations and, since 1948, the World Health Organization (WHO) (Bowker and Star, 2000: 21). The ICD classification remains a core component of 21st century international, national and local health and medical bureaucracies and health information systems (Bowker and Star, 2000). Translated into over 43 languages, it is used in over 177 countries (World Health Organization (WHO), n.d.).

Australia adopted the tenth revision of the ICD (ICD-10) in 1997 for mortality (Medical Certificate of Cause of Death) coding. Similar to several other countries, in 1998 Australia’s National Centre for Classification in Health developed a country-specific modification (the ICD-10-AM (Australian Modification)) for morbidity coding and an Australian procedural classification (The Australian Classification of Health Interventions (ACHI)) (Roberts et al., 2002). The ICD-10-AM replaced the ICD-9-CM and was designed to accurately reflect the Australian clinical environment (Roberts et al., 1998, 2002). It was implemented gradually across all states and territories during 1998–1999 (Innes et al., 2000). ICD-10 and ICD-10-AM are therefore deeply embedded in the Australian health landscape. They underpin a vast cache of coded data for use in research.

In Australia, the application of the ICD extends across the whole of the health system. Consistent with practice in many other countries, the classification also functions extensively beyond its early purposes for mortality and population health surveillance, specifically: as a classification standard for mortality and morbidity data analysis, and research in the clinical, epidemiology, healthcare safety and quality and health administration arenas; as a clinical diagnostic tool; in compilation of risk prediction algorithms and for measuring the social determinants of health (Deschepper et al., 2019; Hay et al., 2017; Moriyama et al., 2011).

In addition to these original purposes, Australia’s ICD coded data are used to describe and contribute towards the pricing of public hospital services (Independent Hospital and Aged Care Pricing Authority [IHACPA], 2022). Since the second half of the 20th century, ICD data have contributed substantially to the undergirding, in Australia and several other countries, of hospital casemix- and other activity-based funding models in the public and private healthcare sectors, based upon patients’ clinical profiles and resource utilisation (Byron and McCathie, 1998; Wiley, 1992). The ICD coded data produced in the latter context are depended upon by provider organisations and health insurance funds.

The ICD classification has inherent limitations. For example, Liu et al.’s (2022) systematic review of Canadian ICD-10-based coding of sepsis, which covered studies published from inception of the databases until September 2021, demonstrated extensive under-reporting of sepsis in administrative databases, with a pooled sensitivity of only 35%. The authors were optimistic that the future introduction of ICD-11, which has substantially more codes, may help to ameliorate this problem. In Australia, the construct and utility of ICD-10 and ICD-10-AM coded data are constrained by the classification’s internal standards and jurisdictional advisories, which permit code allocation only to diagnoses, per Australian Coding Standard 0002, for “conditions that are significant in terms of treatment required, investigations needed and resources used in each episode of care” as opposed to describing “the complete disease status of the inpatient population” (Independent Hospital Pricing Authority, 2019: 4). Variability and shifts in coding errata, and in editions and versions of the classification, complicate data collection, interpretation and analysis, both for identifying longitudinal statistical trends and for use in research studies.

In the past two and a half decades, the application in Australia of ICD coded data for funding and reimbursement purposes has dominated the health classification discourse and the performance of clinical coding in the hospital sector. It has also spurred the introduction of a professional sub-specialty of clinical coding auditing that is primarily focused on revenue assurance (i.e. the legal optimisation or, in the case of health insurance funds, the legal minimisation, of reimbursement). Australia’s activity-based funding environment has informed and, arguably, driven applications and interpretations of the classification. Anecdotally, it appears to have underpinned a shift in hospital-based coding practice from an emphasis on clinical coding for surveillance to a narrower focus. This, alongside the complexities brought by changes to editions and versions of the classification, appears to contribute to the limitations of reliable, coded data for research. The complexities were demonstrated by Ryan et al. (2021) in their multi-hospital study of ICD-10-AM stroke coding, which revealed substantial gaps in the quality of the coded data. For instance, these researchers found that when compared with diagnoses recorded by clinicians in the Australian Stroke Clinical Registry, the sensitivity of hospital coding ranged from 50.8% to 86.7% for different stroke types, and 1 in 10 stroke/transient ischaemic attack diagnoses had not been coded at hospital level (Ryan et al., 2021). These findings prompted the current investigation of the research-only applications of Australia’s ICD-10 and ICD-10-AM coded data, as the first step in a wider exploration of the effects of the financial imperative on coding practice and the ICD-coded data available for research.

Research questions

The purpose of this study was to undertake and present a scoping review of the literature exploring the use of ICD-10 and ICD-10-AM Australian-coded data in published research. It sought to address the following research questions:

(1) What were the applications of ICD-10 and ICD-10-AM Australian-coded data in peer-reviewed research published in 2012–2022?

(2) What were the purposes of ICD-10 and ICD-10-AM coded data within this research context, as classified according to a pre-existing, modified taxonomy of data use framework (Riley et al., 2022)?

(3) What was the extent of expert health classificatory involvement in informing the researchers (authors of the published research)?

(4) What was the extent of the researchers’ knowledge and understanding of the classification, codes and coded data?

The current article reports on the scoping review and the findings in relation to the first two research questions.

Method

Study design

A scoping review of the literature was conducted drawing upon the five-stage framework for scoping studies as constructed by Arksey and O’Malley (2005) and further informed by other authorities including the best practice guidelines from the Joanna Briggs Institute evidence synthesis methodology (Peters et al., 2020) and PRISMA-ScR (PRISMA Extension for Scoping Reviews) checklist (Tricco et al., 2018). The intent in adopting a scoping review was to optimise the known strengths of this method in mapping the literature within the area of interest (Arksey and O’Malley, 2005) and synthesising previously unexplored research evidence (Mays et al., 2001; Peters et al., 2015). A pre-existing, modified Taxonomy of Data Use Framework (Riley et al., 2022) was used to systematically frame and examine the within-scope articles and their author-researchers’ uses of the ICD-10 and ICD-10-AM classifications.

Search strategy

Following development of the study aim and research questions, a protocol (non-registered) was established. The research team collectively established search terms in context of the study purpose and research questions. These were mapped according to the population, context and concept elements suggested by Peters et al. (2020) (see Box 1). Articles were sourced from the Medline, Scopus and Cumulative Index to Nursing and Allied Health Literature (CINAHL) electronic databases. Following advice from a specialist librarian, a systematic, independent search of the electronic databases was undertaken (MR) using the agreed search terms. The search terms were applied to the keywords, title and abstract of each article discovered in the search.

Box 1.

Key search terms.

Concept 1 – Population	Concept 2 – Context	Concept 3 – Concept
International Classification of Disease* OR ICD-10 OR ICD 10 OR ICD-10-AM	Australia OR Australian Capital Territory OR ACT OR New South Wales OR NSW OR Northern Territory OR NT OR Queensland OR QLD OR South Australia OR SA OR Tasmania OR TAS OR Victoria OR VIC OR Western Australia OR WA	purpose OR role OR report*
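The three concept columns in Box 1 can be read as one boolean search string. The following is a minimal sketch of that composition in Python. It assumes that the three concepts were combined with AND, which the text does not state explicitly, and the helper names `or_group` and `build_query` are hypothetical. Field restrictions (keywords, title, abstract) and truncation syntax differ between Medline, Scopus and CINAHL and are not modelled here.

```python
# Terms copied from Box 1; the helper functions are illustrative only.
POPULATION = ["International Classification of Disease*", "ICD-10", "ICD 10", "ICD-10-AM"]
CONTEXT = [
    "Australia", "Australian Capital Territory", "ACT", "New South Wales", "NSW",
    "Northern Territory", "NT", "Queensland", "QLD", "South Australia", "SA",
    "Tasmania", "TAS", "Victoria", "VIC", "Western Australia", "WA",
]
CONCEPT = ["purpose", "role", "report*"]


def or_group(terms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Combine the concept groups with AND (an assumption, see the lead-in)."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(POPULATION, CONTEXT, CONCEPT)
print(query)
```

Keeping each concept as a separate list makes it easy to check that every state and territory abbreviation from Box 1 is present before the string is pasted into a database interface.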

Inclusion and exclusion criteria

The research team collaboratively developed the inclusion and exclusion criteria. The 11-year period, 2012–2022, was selected to represent the most recently published research use of the focal classifications. Only peer-reviewed research papers published in English were included. The criteria applied during the screening processes are shown in Box 2.

Box 2.

Inclusion and exclusion criteria.

Inclusion criteria

• Research studies (undertaken anywhere in the world) that used Australian-coded ICD-10 or ICD-10-AM data
• Mortality studies that used Australian-coded ICD-10 data
• Concordance studies and other analyses of the ICD-10 or ICD-10-AM classification
• Coded (ICD-10) data from Australian hospital Emergency Departments
• Coded (ICD-10) data from disease registries

Exclusion criteria

• Studies that did not use ICD-10 or ICD-10-AM coded data, for example, the International Classification of Diseases-Oncology (ICD-O) and the International Classification of Functioning, Disability and Health (ICF)
• Studies undertaken outside Australia and absent Australian ICD-10 or ICD-10-AM coded data
• Studies that used the ICD-10 or ICD-10-AM classification only for clinical diagnosis
• Studies reporting research only on health funding or reimbursement
• Reviews and meta-analyses
• Protocols
• Editorials
• Letters to the editor; short communications; bulletins
• Reports; other monographs
• The grey literature, including white papers
• Conference abstracts and conference full papers

Eligibility screening

Step 1. In the pre-screening process, the within-scope articles identified in the database search were imported (MR) into Covidence, an online reference manager. Three hundred and fifty-five duplicates were removed using the Covidence tool; a further seven were manually identified by reviewers during the various screening stages.

Step 2. Title and abstract screening was undertaken by two researchers (JL and MR), who separately and sequentially applied the predetermined inclusion and exclusion criteria to each within-scope article. The outcomes of the searches were compared. Inter-researcher discrepancies were reviewed by a third, independent reviewer (KR), who made a final decision.

Step 3. In light of the broad scope and usage of ICD-10 and ICD-10-AM throughout the Australian health care system, and to enhance the robustness of the review, it was decided to include a 5% random sample of the within-scope articles’ reference lists, where those references met the criteria of English language, peer review and publication within the 2012–2022 study timeframe. A manual reference search was undertaken (MR) and the abstracts of all potential titles were examined (MR). If the terms “ICD-10” and “Australia” or any Australian state or territory were found in the article, a URL link was transcribed to an Excel spreadsheet. Each article derived from the reference sample was then cross-referenced against the existing within-scope articles to ascertain whether it had been produced by the original search. Eligible articles were manually added to the within-scope pool as determined in the initial database searches. The inclusion and exclusion criteria were applied (MR, JL). Verification, via consensus, was undertaken independently by another member of the research team (KR).
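Step 3 was performed manually, but its two mechanical parts (drawing a 5% random sample and checking a reference for the required terms) can be illustrated in code. The sketch below is not the authors' procedure. The function names, the fixed seed, the rounding of the sample size and the case-insensitive substring matching are all assumptions made for illustration, and the list of place names omits the abbreviations in Box 1 because short strings such as “SA” or “NT” would match inside unrelated words.

```python
import random

# Place names taken from Box 1; abbreviations are deliberately left out (assumption).
PLACE_TERMS = [
    "Australia", "Australian Capital Territory", "New South Wales",
    "Northern Territory", "Queensland", "South Australia", "Tasmania",
    "Victoria", "Western Australia",
]


def sample_articles(article_ids, fraction=0.05, seed=0):
    """Draw a simple random sample without replacement (at least one article)."""
    ids = list(article_ids)
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def mentions_required_terms(text):
    """True when the text mentions ICD-10 and Australia or a state or territory."""
    lowered = text.lower()
    has_icd = "icd-10" in lowered
    has_place = any(term.lower() in lowered for term in PLACE_TERMS)
    return has_icd and has_place


sampled = sample_articles(range(100))
print(len(sampled))
print(mentions_required_terms("ICD-10-AM coded admissions in Queensland"))
```

A reference that passes this check would still need the language, peer-review and publication-year criteria applied, followed by the full inclusion and exclusion screening described above.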

Step 4. A full-text review was undertaken of each published article that was determined in Steps 2 and 3 to have met the inclusion criteria. This process was undertaken independently and sequentially by two researchers (JL and MR). The outcomes of their searches were compared. Discrepancy resolution was undertaken including discussion and then independent assessment by a third reviewer (KR).

Data extraction

Step 5. An extraction template facilitated the charting of results. This was created (SG) to include collaboratively determined (MR, JL, SG, SR and KR) requisite data items. The following bibliometric and research details were extracted from each within-scope, full-text article: study identification number; title; name of lead author; affiliation/organisation of lead author; contact details of lead author; (first) year of publication (online or journal issue); title of journal; identification of Health Information Manager (HIM) or Clinical Coder (CC) in the author line-up, text or acknowledgements; data source (state and/or country); study aims; study setting; study design; study participants; focus of the coded condition; study timeframe; data source; ICD-10 version; ICD-10 or ICD-10-AM codes; other classifications represented in the study; purpose of the ICD use; purpose category; comments regarding the authors’ demonstrated “coding knowledge” beyond code abstraction and other comments deemed to be relevant.

Step 6. Subsequent to the full-text review, data extraction was undertaken on each within-scope article. This involved four reviewers (MR, JL, SG and SR) working sequentially, in independent pairs. During extraction, the purposes for which the ICD-10 or ICD-10-AM coded data were used in each study were categorised. Up to three purposes per article were recorded using the modified version of the Taxonomy of Data Use developed by Riley et al. (2022) (Supplemental Table S1). A consensus process was undertaken systematically, whereby each article and the reviewers’ comments were further reviewed, amended and verified (KR).

Data analysis

Descriptive analyses, supported by IBM SPSS Version 28.0 (IBM Corp., 2021), were utilised to summarise the results. The articles were grouped according to the broad ICD-10(-AM) chapter titles (per the classification’s Tabular List) to determine the foci of the codes used by the authors of the reported research studies. Articles that reported studies with more than one disease focus were grouped into either a “Multi-morbidity” or “Mortality” category. For example, a study that focused on burns associated with gastrointestinal disease was classified as “Multi-morbidity” as it could fit equally under either of two ICD chapter titles, specifically “Diseases of the Digestive System” and “Injury, Poisoning and Certain other Consequences of External Causes.” Studies that focused on multiple causes of death were classified under “Mortality.” In a situation where more than one condition could be classified into the same disease chapter, this took precedence over the “Multi-morbidity” and “Mortality” categories (i.e. stroke and myocardial infarction would be classified as “Diseases of the Circulatory System”). The following hierarchy was devised (MR) following discussion and agreement amongst the research team. It was used to summarise the ICD purpose-categories and ensure that each study was counted only once:

(1) Studies assigned one purpose-category were allocated to that purpose-category.

(2) Studies assigned two purpose-categories, where one of those categories was “5a Research-observational-codes to select patients,” were allocated to the additional purpose-category, given that all studies required the selection of codes.

(3) Studies assigned two or more purpose-categories, excluding the scenario in (2) above, were either listed by the combination of those categories if there was a sufficient number of cases (at least five), or allocated to a new category labelled “Multiple.”

(4) Studies assigned two purpose-categories, where one of the categories was “Other,” were allocated to “Multiple” or assigned their own category subject to a sufficient number of cases (at least five).
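The allocation hierarchy above amounts to a small decision procedure, and one possible reading of it is sketched below in Python. This is an interpretation rather than the research team's own tooling: the function names are hypothetical, combinations are labelled by joining the category names with “ + ”, and the rule for combinations that include “Other” is treated as a special case of the general combination rule. Each study is assumed to carry at least one purpose-category.

```python
from collections import Counter

SELECT = "5a Research-observational-codes to select patients"
MIN_CASES = 5  # threshold for a combination to keep its own category


def provisional_key(purposes):
    """Reduce a study's recorded purposes to a tuple of categories."""
    cats = tuple(sorted(set(purposes)))
    if len(cats) == 2 and SELECT in cats:
        # Case selection is common to all studies, so keep only the other category.
        return tuple(c for c in cats if c != SELECT)
    return cats  # a single category, or a combination of categories


def allocate(studies):
    """Map each study id to exactly one purpose-category label."""
    keys = {sid: provisional_key(p) for sid, p in studies.items()}
    combo_counts = Counter(k for k in keys.values() if len(k) > 1)
    allocation = {}
    for sid, key in keys.items():
        if len(key) == 1:
            allocation[sid] = key[0]
        elif combo_counts[key] >= MIN_CASES:
            allocation[sid] = " + ".join(key)  # frequent combination keeps its own label
        else:
            allocation[sid] = "Multiple"
    return allocation


# Hypothetical example data: category names "A" to "E" are placeholders.
example = {"s1": ["A"], "s2": [SELECT, "B"], "s3": ["B", "C"]}
example.update({f"d{i}": ["D", "E"] for i in range(5)})
result = allocate(example)
print(result["s2"], result["s3"], result["d0"])
```

Counting combinations before assigning labels is what ensures each study is counted only once while rare combinations collapse into “Multiple.”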

In the analyses of characteristics of the uses of the ICD classification, articles were grouped for presentation according to the main categories of overall study purpose that arose thematically from the review of the reported research studies. The categories were determined by consensus amongst the full research team (see Supplemental material).
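The chapter-grouping rule described earlier in this section (same-chapter conditions take precedence, then “Mortality” for multiple causes of death, otherwise “Multi-morbidity”) can be expressed compactly. The sketch below is illustrative only. The function name and the boolean flag are hypothetical, and the input is assumed to be the non-empty list of ICD chapter titles to which a study's focal conditions belong.

```python
def focus_category(chapters, multiple_causes_of_death=False):
    """Return the single focus category under which a study is counted."""
    unique = set(chapters)
    if len(unique) == 1:
        # All conditions share one chapter: the chapter takes precedence.
        return next(iter(unique))
    if multiple_causes_of_death:
        return "Mortality"
    return "Multi-morbidity"


circulatory = "Diseases of the circulatory system"
# Stroke and myocardial infarction both sit in the circulatory chapter.
print(focus_category([circulatory, circulatory]))
# Burns associated with gastrointestinal disease span two chapters.
print(focus_category([
    "Diseases of the digestive system",
    "Injury, poisoning and certain other consequences of external causes",
]))
```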

Results

Article selection

The screening of the titles and abstracts of 2103 articles imported from the three selected databases (Scopus, Medline and CINAHL) resulted in 611 articles being accepted for inclusion from the searches. An additional 25 papers were identified through the manual review of the 5% random sample of the reference lists of the within-scope articles, resulting in 636 articles for extraction (Figure 1).

Figure 1.

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart of scoping review (Covidence generated).

Characteristics of studies reported

Table 1 summarises the key characteristics of the studies reported in the within-scope articles. Over 54% of the eligible articles were published from 2019 onwards. The states with the largest populations, New South Wales and Victoria, collectively generated the largest proportion, almost 47%, of the studies. Of the few studies reported in the articles that involved international collaboration, the most commonly associated countries, aside from New Zealand, were the United Kingdom, the United States of America (USA), Canada and several European countries.

Table 1.

Summary of key characteristics of studies reported in within-scope articles.

Key characteristics	Articles (N = 636)	%
Year of first publication
2022	100	15.7
2021	76	11.9
2020	85	13.4
2019	62	9.7
2018	52	8.2
2017	43	6.8
2016	45	7.1
2015	48	7.5
2014	48	7.5
2013	43	6.8
2012	34	5.3
State/Territory
Australian Capital Territory	3	0.5
Northern Territory	11	1.7
New South Wales	154	24.2
Queensland	70	11.0
South Australia	26	4.1
Tasmania	6	0.9
Victoria	143	22.5
Western Australia	104	16.4
Australia-wide	79	12.4
Combined-multiple states and territories	40	6.2
Country
Australia only	601	94.4
Australia and New Zealand (NZ)	18	2.8
Other countries, except NZ	17	2.7
Study design
Descriptive study	322	50.6
Cohort study	220	34.6
Cross-sectional study	25	3.9
Case Control study	16	2.5
Case Review/Series	11	1.7
Ecological study	6	0.9
Before-and-After study	6	0.9
Time-Stratified Case-crossover study	6	0.9
Other^a	24	3.8

RCT: randomised controlled trial.

^a Includes RCT (n = 2), mixed methods (n = 2), qualitative (n = 1), time series (n = 1), diagnostic accuracy (n = 1), quasi experimental (n = 2), predictive modelling (n = 7), mapping studies (n = 2), plus eight studies of indeterminate design.

Most of the study designs utilised by the researcher-authors of the within-scope articles were observational, predominantly comprising descriptive (50.6%) and cohort (34.6%) study designs (Table 1). Authors whom the research team recognised as “health information managers” (HIMs) (i.e. possessing a professional degree in health information management), or knew to be “clinical coders” (CCs), appeared in 24 (3.8%) of the within-scope articles. The author-researchers acknowledged HIM, CC, or health information service involvement, either by name or in general, anonymous terms, in 36 (5.7%) of the articles.

Each within-scope study was classified into themes based on the Riley et al. (2022) modified Taxonomy of Data Use and agreed by consensus amongst the research team. These themes were mortality studies only; studies of ICD-10 or ICD-10-AM coding quality; medical research studies; environmental and public health studies; patient safety, including risk prediction; clinical or practice guidelines; quality of care; health economics and administrative and other study purposes. For ease of presentation, the following selected variables are provided (Supplemental Table S2): First author and year of publication; state/territory and country of study cases/participants; foci of the coded data; ICD version used in the study and primary purpose of the ICD-code(s). Table 2 provides the frequency of each of the major themes of study purpose and shows medical research as the primary purpose in 35% of the studies.

Table 2.

Study purposes classified according to major themes based on the modified “taxonomy of data use.”

Theme	Number	%
Medical research studies	224	35.2
Environmental and public health studies	106	16.7
Administrative and other study purposes	84	13.2
ICD ascertainment/coding quality	74	11.6
Patient safety, including risk prediction	45	7.1
Mortality only studies	37	5.8
Quality of care	34	5.3
Health economics	21	3.3
Clinical or practice guidelines	11	1.7
Total	636	100.0

ICD: International Classification of Diseases.

Foci of the ICD codes

Table 3 highlights the foci of ICD-codes used in the research studies reported in the within-scope articles. Over 60% (n = 405/636) of the studies fell into 5 of the 22 categories (i.e. the “top five”): “Injury, poisoning and certain other consequences of external causes”; “Diseases of the circulatory system”; “Mental and behavioural disorders”; “Multi-morbidity”; and “Neoplasms.” Studies within the “injury” category focused on conditions such as fractures, adverse drug events and injuries resultant upon trauma, while studies within the circulatory categories predominantly focused on stroke and cardiovascular diseases. Articles that reported studies classified within the mental and behavioural disorders category often focused on multiple mental health conditions or the impact of alcohol and drugs on mental health, rather than on singular conditions. The “Multi-morbidity” category included studies that examined reasons for hospital admissions, hospital acquired complications, reasons for frailty or determination of the relationship between two conditions (e.g. depression and cardiovascular disease). Fifty percent (n = 203/405) of the studies that fell within these “top five” categories were published after 2019 and 22% (n = 89/405, 14% of articles retrieved from the 11-year study period) were categorised within the “Mental health and behavioural” category.

Table 3.

Foci of studies using ICD-coded data.

Categories	Number	%
Injury, poisoning and certain other consequences of external causes	107	16.8
Diseases of the circulatory system	93	14.6
Mental and behavioural disorders	89	14.0
Multi-morbidity	84	13.2
Neoplasms	32	5.0
Diseases of the nervous system	29	4.6
Diseases of the digestive system	28	4.4
Certain infectious and parasitic diseases	26	4.1
Diseases of the respiratory system	25	3.9
Endocrine, nutritional and metabolic diseases	25	3.9
External causes of morbidity and mortality	18	2.8
Mortality	17	2.7
Diseases of the musculoskeletal system and connective tissue	13	2.0
Pregnancy, childbirth and the puerperium	11	1.7
Diseases of the genitourinary system	8	1.3
Diseases of the eye and adnexa	6	0.9
Diseases of the skin and subcutaneous tissue	6	0.9
Certain conditions originating in the perinatal period	5	0.8
Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified	5	0.8
Congenital malformations, deformations and chromosomal abnormalities	4	0.6
Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism	3	0.5
Factors influencing health status and contact with health services	2	0.3
Total	636	100.0

ICD: International Classification of Diseases.
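The “top five” proportions quoted in the text follow directly from the counts in Table 3. As a worked check, the short Python sketch below recomputes them. The variable names are illustrative and the counts are copied from the table.

```python
TOTAL_ARTICLES = 636

# Counts for the five largest categories, as listed in Table 3.
top_five = {
    "Injury, poisoning and certain other consequences of external causes": 107,
    "Diseases of the circulatory system": 93,
    "Mental and behavioural disorders": 89,
    "Multi-morbidity": 84,
    "Neoplasms": 32,
}

top_five_total = sum(top_five.values())
share_of_all = 100 * top_five_total / TOTAL_ARTICLES

mental = top_five["Mental and behavioural disorders"]
mental_share_top_five = 100 * mental / top_five_total
mental_share_all = 100 * mental / TOTAL_ARTICLES

print(top_five_total, round(share_of_all, 1))
print(round(mental_share_top_five), round(mental_share_all))
```

The same pattern (count divided by 636) reproduces the percentage column for any other row of the table.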

Use of ICD versions reported by the study authors

Table 4 summarises the ICD-10 version as described by the authors of the within-scope articles. Most reported having used data coded according to a modification of the ICD-10 (60.3%, n = 322/534). The Australian Modification (i.e. ICD-10-AM or the non-existent “ICD-9-AM”) was specified in 99% (n = 319/322) of these articles. The remaining 1% of articles reported studies that used data coded according to a Clinical Modification (ICD-10-CM or ICD-9-CM). The ICD tenth revision, without specification of modification, was the only descriptor provided in 42% (n = 225/534) of the articles. Twenty-one percent (n = 48/225) of these research publications that used ICD-10 described the use of an additional revision (ICD-8 or ICD-9). “ICD” was the only descriptor provided in seven articles (1%, n = 7/636). Almost 13% of authors (n = 82/636) used a combination of ICD-10 or ICD-10-AM with specialised terminologies or classifications (e.g. the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT), Diagnostic and Statistical Manual of Mental Disorders (DSM)). These studies were not reported in any of the other ICD category breakdowns. The classifications were not within scope, but are reported here to provide context in relation to their usage alongside the ICD data. The ACHI, International Classification of Diseases-Oncology (ICD-O), SNOMED CT and Australian Refined-Diagnosis Related Groups (AR-DRG) were the most commonly used specialised classification systems and terminologies. Eighty-four percent (n = 533/636) of all articles included the ICD codes used in the research studies.

Table 4.

ICD versions used and presence of ICD codes.

ICD Version (as described in article)	Number of articles	%
ICD Not otherwise specified	7	1.1
ICD Revision specified only
ICD-10	177	27.8
ICD-10, ICD-9	39	6.1
ICD-10, ICD-9, ICD-8	9	1.4
ICD with modification specified
ICD-10, ICD-8, ICD-9-CM	1	0.2
ICD-10, “ICD-9-AM”^a	2	0.3
ICD-10, ICD-9-CM	2	0.3
ICD-10, ICD-10-AM	9	1.4
ICD-10, ICD-10-AM, ICD-9-CM	2	0.3
ICD-10, ICD-10-AM, ICD-10-CA	2	0.3
ICD-10, ICD-10-CM	1	0.2
ICD-10-AM	254	39.9
ICD-10-AM, ICD-9	11	1.7
ICD-10-AM, “ICD-9-AM”^a	1	0.2
ICD-10-AM, “ICD-9-AM”^a, ICD-9-CM	2	0.3
ICD-10-AM, ICD-9-CM	27	4.2
ICD-10-AM, ICD-10-CM	1	0.2
ICD-10-AM, other^b	1	0.2
ICD-10-CM	6	0.9
Specialised classification used with any ICD-10 or ICD-10-AM^b	82	12.9
TOTAL	636	100
ICD codes provided
Yes	533	83.8
No	103	16.2

ACHI: Australian Classification of Health Interventions; AR-DRG: Australian Refined-Diagnosis Related Groups; ICD: International Classification of Diseases; SNOMED CT: Systematized Nomenclature of Medicine-Clinical Terms.

^a As stated by the authors of the relevant articles.

^b Other specified classifications include ACHI, SNOMED CT, ICD-O, ICD-9-BPA, AR-DRG and more.

Purposes of the ICD coded data

Table 5 shows the purposes for which the ICD codes were used within each study. ICD codes were used solely for identifying cases in 44.2% of the within-scope studies. Selection of cases and the corresponding classification of outcome variables such as co-morbidities or mortality were demonstrated in a further 10% of the studies. “Monitoring and surveillance” for public health purposes was the second most frequent use of ICD codes (14.3% of the published studies) and included studies that investigated toxic exposures or evaluated preventive public health measures. The quality of coded data, in terms of both case ascertainment and/or accuracy, was the focus in 11.3% of the published studies.

Table 5.

ICD purpose-categories from within-scope studies.

Purpose category	Number	%
Conduct medical research	281	44.2
  To select cases for observational studies	269	42.3
  Data linkage for further variables	8	1.3
  To select cases for qualitative studies	3	0.5
  To select cases for experimental studies	1	0.2
Protect and enhance public health	91	14.3
  Public health monitoring and surveillance	79	12.4
  Public health – reporting of toxic exposures	9	1.4
  Public health – evaluate preventive measures	3	0.5
(ICD) Data quality – case ascertainment and/or accuracy/quality of coded data	72	11.3
To select cases and code outcomes and/or co-morbidities and/or mortality	64	10.1
Code outcomes and/or co-morbidities and/or mortality	48	7.5
Improve patient safety	31	4.9
  Patient safety – treatment and interventions	5	0.8
  Patient safety – risk prediction	26	4.1
Multiple	22	3.5
  Monitoring and surveillance and quality/ascertainment of codes	6	0.9
  Public health monitoring and surveillance and code cause of death	8	1.3
  Other multiple	8	1.3
Select cases and calculation of Charlson Comorbidity Index	8	1.3
Codes to inform practice guidelines or clinical decision making	5	0.8
Health economics	5	0.8
Administrative and other	3	0.5
Quality of care	3	0.5
Other	3	0.5
Total	636	100.0

ICD: International Classification of Diseases.
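The percentages in Table 5 follow directly from the counts, and the arithmetic can be checked with a short script. The sketch below is illustrative only and is not part of the review's method (the authors report using IBM SPSS Statistics). It takes the top-level categories to be those whose counts sum to the reported total of 636; the shortened category labels and the names `PURPOSE_COUNTS`, `percentage_shares`, `TOTAL` and `SHARES` are hypothetical.

```python
# Illustrative check of Table 5: recompute each top-level category's share.
# Counts are transcribed from Table 5; labels are shortened for readability
# (the shortened labels and all names here are assumptions, not the authors').
PURPOSE_COUNTS = {
    "Conduct medical research": 281,
    "Protect and enhance public health": 91,
    "Data quality (case ascertainment and/or accuracy)": 72,
    "Select cases and code outcomes/co-morbidities/mortality": 64,
    "Code outcomes/co-morbidities/mortality": 48,
    "Improve patient safety": 31,
    "Multiple": 22,
    "Select cases and calculate Charlson Comorbidity Index": 8,
    "Inform practice guidelines or clinical decision making": 5,
    "Health economics": 5,
    "Administrative and other": 3,
    "Quality of care": 3,
    "Other": 3,
}


def percentage_shares(counts):
    """Return each category's share of the total as a percentage (1 decimal place)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


TOTAL = sum(PURPOSE_COUNTS.values())
SHARES = percentage_shares(PURPOSE_COUNTS)

if __name__ == "__main__":
    for name, share in SHARES.items():
        print(f"{name}: {PURPOSE_COUNTS[name]} ({share}%)")
    print(f"Total: {TOTAL}")
```

Working from the counts rather than the rounded percentages avoids accumulating rounding error when categories are combined.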

Discussion

This scoping review sought to identify the research applications of Australian-coded ICD-10 and ICD-10-AM data published in the peer-reviewed literature in 2012–2022, and to classify the purposes of these data according to a modified taxonomy of data use framework. Over half of the within-scope articles were published within the last 4 years of the 11-year study timeframe. This trend is consistent with the findings of Bornmann et al.’s (2021) bibliographic data-based investigation, which revealed a very substantial and increasing growth in scientific publications within the same timeframe as the current study.

Publications by authors in the states with the highest populations, New South Wales (NSW) and Victoria, accounted for the largest shares of the articles, but these shares were lower, particularly in the case of NSW, than the states’ respective proportions of the national population. In contrast, 16.4% of the articles reported studies by authors in Western Australia (WA), which accounted for 11% of the nation’s population at the end of the study period (Australian Bureau of Statistics, 2023); this was possibly due in part to WA’s long-standing health data linkage infrastructure. The inter-state proportional differences could also be accounted for by the fact that almost 19% of the articles had authors from multiple states.

Application of the modified taxonomy of data use framework revealed that just under half of the studies (44.2%) used ICD codes solely for identification of cases/subjects. The finding that over 60% of the articles reported research underpinned or informed by classification to categories of injury and poisonings, circulatory or mental health diagnoses, multiple comorbidities or neoplasms cannot necessarily be considered to reflect contemporary topical issues in Australian healthcare. Rather, it should be interpreted with caution because descriptive and cohort studies dominated the reported study designs. This imbalance of study types demonstrated a bias that is likely due to the amenability of ICD coded data to these types of medical and epidemiological research when compared, for instance, with randomised controlled trials (RCTs), which use individual subject- and control-level data and are highly prevalent in medical research. It is useful to consider Zhao et al.’s (2022) findings, which revealed substantial increases in the volume of published clinical research studies over the past three decades. They found that, collectively, cohort, cross-sectional and case–control studies constituted 49% of all clinical study types. They also reported an 18% growth rate by 2020 of cohort and case–control studies, and that by 2018 the number of cross-sectional studies had surpassed the number of RCTs.

The absence of any reference to a modification in 42% of the articles is expected for mortality studies, which draw on ICD-10 itself, but for morbidity studies it suggests the relevant authors’ lack of familiarity with the classification. The ICD versions and other classifications reported by the authors included some non-existent “classifications” or “versions,” or modifications that were highly improbable. For example, three discrete studies undertaken by hospital-based clinician–authors reported on ICD-10-CM data. During the study timeframe, Australian hospitals used ICD-10-AM whereas hospitals in the USA used ICD-9-CM and ICD-10-CM; furthermore, the research team members’ professional knowledge and consultations with senior HIM-Coders in those institutions indicated that the authors would have been provided with ICD-10-AM coded data. These discrepancies suggest that some researchers had insufficient understanding of the classification and, potentially, of nuances of the coded data that underpinned or informed their research.

The findings that 14% of the articles retrieved from the 11-year study period fell within the “Mental health and behavioural” category, and that 39% of this group were published between 2020 and 2022, possibly reflect the Australian Government’s mental health policy and national reform priorities during the past decade (Australian Government, Department of Health, 2021; Australian Government, Department of Health, National Mental Health Commission, 2017).

Strengths and limitations

The strengths of the review included the number of articles retrieved and analysed, and the inclusion of the 5% random sample of articles from the reference lists. One possible limitation was the exclusion from scope of articles that reported research on health service funding or reimbursement; they may have contributed to a more comprehensive picture of the research uses of Australian-coded ICD-10 and ICD-10-AM data. Inevitably, some potentially eligible articles were missed by the search owing to the absence of search terms, such as the name of the classification, from the title, keywords or abstract. Incorrect or confused reporting by authors of the names of the classifications also created a dilemma. Another potential limitation related to the identification of HIMs’ or CCs’ involvement in the reported research studies. The health information management and clinical coding workforces are relatively small, and all of the research team members have very extensive, Australia-wide professional networks; however, it is possible that some HIM or CC authors may not have been identified during data extraction.

Conclusion

The current review has enriched our understanding of the applications and importance of coded data in the medical and wider health research environments. The findings demonstrate the diverse utility of Australian-coded ICD-10 and ICD-10-AM data in peer-reviewed research studies. Medical and other health researchers’ usage of coded data is extensive and robust. The increasing volume in the past decade of published research that has relied upon clinical codes points to a corresponding, escalating demand for accurate and timely ICD-10 and ICD-10-AM data. This demand will be driven further by the increasing number of cohort and cross-sectional study types in the medical literature which, together with the expanding milieu of health data linkage, foreshadows substantial increases in researchers’ requirements for coded data. This combination of factors will inevitably drive a corresponding need by researchers for informed advice from experienced HIM-Coders and CCs on the applications and interpretation of the classification, the codes, and the coded data, to ensure and enhance research credibility and replicability. An examination of these aspects, in the context of the findings of the current scoping review, will be reported in a future publication.

Supplemental Material

Supplemental material, sj-docx-1-him-10.1177_18333583231198592 for The applications of Australian-coded ICD-10 and ICD-10-AM data in research: A scoping review of the literature by Merilyn Riley, Jenn Lee, Sally Richardson, Stephanie Gjorgioski and Kerin Robinson in Health Information Management Journal

Footnotes

Acknowledgements

The authors thank Hannah Buttery, Librarian at La Trobe University Library, Melbourne, for expert advice during the database searches.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Merilyn Riley, BAppSc(MRA), BTh, GradDipEpi&Biost

Jenn Lee, BN, GradDipAppSc(HthInfoMgt), MAdEd, CHIA

Sally Richardson, BHSc, MHthInfoMgt, GCertHlthEc

Stephanie Gjorgioski, BHSc(MedClass), BHthInfoMgt, BHSc(Hons), GradCertHECTL, CHIM

Kerin Robinson, BAppSc(MRA), BHA, MHP, PhD, CHIM

Supplemental material

Supplemental material for this article is available online.

References

Arksey and O’Malley (2005) Scoping studies: Towards a methodological framework. International Journal of Social Research Methodology 8(1): 19–32.

Australian Bureau of Statistics (ABS) (2023) States and territories. Annual population change. Year ending 31 December 2022. Canberra, ACT: ABS. Available at: https://www.abs.gov.au/statistics/people/population/national-state-and-territory-population/latest-release#states-and-territories (accessed 31 July 2023).

Australian Government, Department of Health, National Mental Health Commission (2017) The fifth National Mental Health and Suicide Prevention Plan. Canberra, ACT: National Mental Health Commission.

Australian Government, Department of Health, National Mental Health Strategy (2021) Prevention, compassion, care: National Mental Health and Suicide Prevention Plan. Canberra, ACT: DoH.

Bornmann, Haunschild and Mutz (2021) Growth rates of modern science: A latent, piecewise growth curve approach to model publication numbers from established and new literature bases. Humanities and Social Sciences Communications 8(1): 224.

Bowker and Star (2000) Sorting things out: Classification and its consequences. Boston, MA: Massachusetts Institute of Technology.

Byron and McCathie (1998) Casemix: The allied health response. Medical Journal of Australia 169(S1): S46–S47.

Deschepper, Eeckloo, Vogelaers, et al. (2019) A hospital wide predictive model for unplanned readmission using hierarchical ICD data. Computer Methods and Programs in Biomedicine 173: 177–183.

Hay, Abajobir, Abate, et al. (2017) Global, regional, and national disability-adjusted life-years (DALYs) for 333 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2016: A systematic analysis for the Global Burden of Disease Study 2016. The Lancet 390(10100): 1260–1344.

IBM Corp (2021) IBM SPSS Statistics for Windows, Version 28.0. Armonk, NY: IBM Corp.

Independent Hospital Pricing Authority (IHPA) (2019) Australian Coding Standards, 11th edn. Sydney, NSW: IHPA.

Independent Hospital and Aged Care Pricing Authority (IHACPA) (2022) Pricing framework for Australian public hospital services 2023–24. Consultation Report. Sydney, NSW: IHACPA.

Innes, Peasley and Roberts (2000) Ten down under: Implementing ICD-10 in Australia. Journal of AHIMA 71(1): 52–56.

Liu, Hadzi-Tosev, Liu, et al. (2022) Accuracy of International Classification of Diseases, 10th Revision codes for identifying sepsis: A systematic review and meta-analysis. Critical Care Explorations 4(11): e0788.

Mays, Roberts and Popay (2001) Synthesising research evidence. In: Fulop, Allen, Clarke and Black (eds) Studying the Organisation and Delivery of Health Services: Research Methods. London, UK: Routledge, pp. 188–220.

Moriyama, Loy and Robb-Smith (2011) In: Rosenberg and Hoyert (eds) History of the Statistical Classification of Diseases and Causes of Death. Hyattsville, MD: National Center for Health Statistics (USA). DHHS publication No. (PHS) 2011-1125.

Peters MDJ, Godfrey, Khalil, et al. (2015) Guidance for conducting systematic scoping reviews. International Journal of Evidence-Based Healthcare 13(3): 141–146.

Peters MDJ, Godfrey, McInerney, et al. (2020) Chapter 11: Scoping Reviews (2020 version). In: Aromataris and Munn (eds) JBI Manual for Evidence Synthesis. Adelaide, SA: JBI, pp. 407–451.

Riley, Robinson, Kilkenny, et al. (2022) The suitability of government health information assets for secondary use in research: A fit-for-purpose analysis. Health Information Management Journal. Epub ahead of print 26 April 2022. DOI: 10.1177/18333583221078377.

Roberts, Innes and Walker (1998) Introducing ICD-10-AM in Australian hospitals. Medical Journal of Australia 169(8 Suppl): S32–S35.

Roberts, Robinson and Williamson (2002) Health information policy. In: Gardner and Barraclough (eds) Health policy in Australia, 2nd edn. South Melbourne, VIC: Oxford University Press, pp. 100–121.

Ryan, Riley, Cadilhac, et al. (2021) Factors associated with stroke coding quality: A comparison of registry and administrative data. Journal of Stroke and Cerebrovascular Diseases 30(2): 105469.

Tricco, Lillie, Zarin, et al. (2018) PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and explanation. Annals of Internal Medicine 169(7): 467–473.

Wiley (1992) Hospital financing reform and case-mix measurement: An international review. Health Care Financing Review 13(4): 119–133.

World Health Organization (WHO) (n.d.) Importance of ICD. Available at: https://www.who.int/standards/classifications/frequently-asked-questions/importance-of-icd (accessed 14 June 2023).

Zhao, Jiang, Yin, et al. (2022) Changing trends in clinical research literature on PubMed databases from 1991 to 2020. European Journal of Medical Research 27: 95.
