Abstract
We sought to evaluate whether case ascertainment using administrative health data would be a feasible way to identify peripheral artery disease (PAD) patients from the community. Subjects’ ankle–brachial index (ABI) scores from two previous prospective observational studies were linked with International Classification of Diseases (ICD) and Canadian Classification of Interventions (CCI) codes from three administrative databases from April 2002 to March 2012, including the Alberta Inpatient Hospital Database (ICD-10-CA/CCI), Ambulatory Care Database (ICD-10-CA/CCI), and the Practitioner Payments Database (ICD-9-CM). We calculated diagnostic statistics for putative case definitions of PAD consisting of individual codes or sets of codes, using an ABI score ⩽ 0.90 as the gold standard. Multivariable logistic regression was performed to investigate additional predictive factors for PAD. Different combinations of diagnostic codes and predictive factors were explored to find the best algorithms for identifying a PAD study cohort. A total of 1459 patients were included in our analysis. The average age was 63.5 years, 66% were male, and the prevalence of PAD was 8.1%. The highest sensitivity of 34.7% was obtained using the algorithm of at least one ICD diagnostic or procedure code, with specificity 91.9%, positive predictive value (PPV) 27.5% and negative predictive value (NPV) 94.1%. The algorithm achieving the highest PPV of 65% was age ⩾ 70 years and at least one code within 443.9 (ICD-9-CM), I73.9, I79.2 (ICD-10-CA/CCI), or all procedure codes, validated with ABI < 1.0 (sensitivity 5.56%, specificity 99.4% and NPV 84.6%). In conclusion, ascertaining PAD using administrative data was insensitive compared with the ABI, limiting the use of administrative data in the community setting.
Background
Lower extremity peripheral artery disease (PAD) is an atherosclerotic disease that is often overlooked clinically 1 and is understudied relative to other important medical conditions. 2 Ascertainment of PAD involves measurement of the ankle–brachial index (ABI), which is diagnostic of PAD when less than or equal to 0.90. 3 Despite guideline recommendations to use the resting ABI to establish a PAD diagnosis in patients with exertional leg symptoms, non-healing wounds, age 65 and older, or age 50 with a history of smoking or diabetes, 3 many subjects with PAD are not identified by routine care. 1 Population-based studies evaluating PAD epidemiology using the ABI to ascertain the true prevalence of PAD are expensive and time consuming, and typically limited by small numbers of cases as well as cross-sectional design, precluding analyses of trends over time or estimates of incidence.4–10
Collected as part of routine care, administrative data utilize standard International Classification of Diseases (ICD) codes and procedure codes that describe a patient’s diagnosis and medical care, and are an inexpensive and powerful tool for epidemiology and outcomes research. An important criterion for use of administrative data to study a specific disease state is an adequately accurate case definition, as diagnostic algorithms based on administrative data have variable sensitivity, specificity, and positive and negative predictive values. 11 Often, a single ICD code is inadequate to accurately identify a disease, especially one of low prevalence, and derivation of an algorithm from multiple data points such as age, gender or related disease and procedure codes can provide better case definitions with improved diagnostic statistics. 12
Administrative data have been used to study various forms of PAD, but typically ascertain forms of PAD requiring revascularization,13,14 such as critical limb ischemia.15,16 While validated case definitions for critical limb ischemia exist, 17 most patients with PAD do not have critical limb ischemia, do not require revascularization, and may be asymptomatic 18 ; therefore, relying on hospitalizations or revascularization procedures to ascertain PAD is not sensitive, or biases towards advanced disease or severe symptoms. Fan et al. evaluated the validity of ICD-9 PAD codes as well as a multivariable combination of diagnostic and procedural codes to ascertain PAD compared with the ABI in 22,712 subjects attending the vascular laboratory at the Mayo Clinic between 1998 and 2008 and 4420 subjects from the community. 19 They found that ascertainment of PAD based on ICD-9 diagnostic codes only was limited by low sensitivity (38.7%, 95% confidence interval (CI) 27.6–50.6%) when applied to the community sample, but that incorporation of procedure codes using a multivariable model improved sensitivity (68%, 95% CI 56.2–78.3%) when compared to the gold standard of chart review in the community sample. 19 A key limitation of this study was that ABI was not used for diagnosis in the community sample, which was the final validation set, and the authors observed that: ‘additional work is needed to assess the performance of algorithms for identifying PAD cases and controls at other institutions’. 19
We sought to evaluate whether administrative health data can accurately ascertain PAD in the community when compared to the gold standard ABI, based on ICD diagnostic codes, procedure codes, or a combination of codes and common clinical factors in a multivariable model. 20
Methods
We conducted a validation study of administrative data, including ICD and Canadian Classification of Interventions (CCI) codes, to identify a PAD cohort using the ABI as the gold standard. This study was approved by The Health Research Ethics Board of the University of Alberta.
Study population
We combined data from two prospective cohorts identified in Alberta, Canada, to obtain our dataset. EpiPAD was a prospective study of lower extremity PAD in ambulatory health settings. 21 Patients 50 years of age or older were consecutively screened in community pharmacies (urban and rural), family medicine clinics (urban) and a community cardiology clinic (urban) in Alberta during 2008, with 361 undergoing ABI measurement. Patients were excluded from the study if they: (1) had dementia; (2) had undergone recent major surgery; (3) had a prior lower extremity amputation; (4) were wheelchair bound; (5) had open ulcers or sores on their lower extremities; (6) were medically unstable; or (7) were unable to communicate in English. The ABI study was a prospective observational inception cohort study of ABI as a predictor of outcomes in coronary artery disease. 22 Adults with suspected coronary disease who had been referred for a coronary angiogram were consecutively sampled from the Cardiac Catheterization Lab waiting room and the cardiology inpatient wards at the Mazankowski Alberta Heart Institute and the Royal Alexandra Hospital in Edmonton, Alberta. Patients were excluded from that study if they: (1) were heart transplant patients; (2) were being assessed for pulmonary hypertension, valve disease or congenital heart disease; (3) had open ulcers or sores on their lower extremities; (4) were emergency cases/medically unstable; or (5) were unable to communicate in English. Both outpatients and inpatients waiting for their angiogram were included in the study and had their ABI measured prior to their catheterization procedure. Data collection began in March 2010 and concluded in August 2012 (n=1100).
Reference standard – ABI scores
ABI ⩽ 0.90 was used as the gold standard for ascertainment of PAD. 23 ABI was assessed by a trained research assistant on each patient.21,22 After the patient rested in a supine position for at least 5 minutes, manual non-simultaneous systolic blood pressure was measured at the brachial, posterior tibial, and dorsalis pedis arteries bilaterally using an L150 Summit Doppler (Wallach Surgical, Trumbull, CT, USA) with an 8-MHz vascular probe.21,22 The ABI was calculated as the ratio of the highest systolic pressure of either the dorsalis pedis or posterior tibial arteries in each leg to the highest systolic pressure in either of the brachial arteries. 22
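The ABI calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the pressure values are hypothetical, and the rule of classifying a patient by the lower of the two leg ABIs is an assumption, since the text does not state how the two legs were combined.

```python
def ankle_brachial_index(ankle_pressures, brachial_pressures):
    """ABI for one leg.

    ankle_pressures: systolic pressures (mmHg) at the dorsalis pedis and
        posterior tibial arteries of that leg.
    brachial_pressures: systolic pressures (mmHg) at both brachial arteries.
    """
    return max(ankle_pressures) / max(brachial_pressures)


def has_pad(abi_right, abi_left, threshold=0.90):
    # Assumption: the patient is classified by the lower of the two leg ABIs.
    return min(abi_right, abi_left) <= threshold


# Hypothetical measurements (mmHg), not taken from the study.
brachial = [130, 126]            # right arm, left arm
right_leg = [118, 124]           # dorsalis pedis, posterior tibial
left_leg = [98, 104]

abi_right = ankle_brachial_index(right_leg, brachial)
abi_left = ankle_brachial_index(left_leg, brachial)
pad = has_pad(abi_right, abi_left)
```

Using the higher of the two ankle pressures in each leg makes the index more conservative: a leg is flagged only if both measured ankle arteries yield a low ratio.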
Administrative data
Three administrative databases were used in this study. The Inpatient Discharge Abstract Database (DAD) and Ambulatory Care Data (ACD) include the diagnosis and procedure codes for patients discharged from any inpatient bed in Alberta. 24 Since 2002, the International Classification of Diseases, Tenth Revision, Canada/Canadian Classification of Health Interventions (ICD-10-CA/CCI) coding has been used, with up to 25 diagnosis codes listed per patient for DAD and 10 diagnosis codes per patient for ACD, which allows practitioners to input more diagnostic or procedure codes in the system.24,25 This change in coding did not affect our codes, since all of our patients were enrolled after 2002. The Practitioner Payments Database (PPD) includes the fee-for-service claims from 1994 to the present with up to three ICD codes.24,25 The PPD uses only the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) coding, and therefore we combined ICD-9-CM codes with ICD-10-CA/CCI codes. As inpatients are more likely to have more severe disease and comorbidities, using multiple databases decreases potential selection bias in future research studies that apply our derived PAD case definition, since both inpatients and patients in the community are represented. The three administrative databases were linked with the ABI and EpiPAD cohorts using personal health numbers as the unique identifier. We selected ICD codes related to atherosclerosis, cardiovascular disease, diabetes, peripheral artery disease and procedure codes related to the same diseases or conditions for validation (Table 1). Putative case definitions for PAD using individual or combinations of codes, as well as PAD codes combined with other demographic and comorbidity variables ascertained by subjects’ responses and chart information, were evaluated.
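To make the linkage step concrete, the following minimal Python sketch pools codes from three sources per patient and attaches them to the cohort ABI values. The record layout, the identifiers and the ABI values are hypothetical; only the idea of joining on the personal health number comes from the text, and the study's actual linkage procedure is not described in this level of detail.

```python
from collections import defaultdict

# Hypothetical extracts: (personal_health_number, code) pairs per database.
dad_records = [("PHN001", "I73.9")]                       # inpatient, ICD-10-CA/CCI
acd_records = [("PHN003", "I79.2")]                       # ambulatory, ICD-10-CA/CCI
ppd_records = [("PHN001", "443.9"), ("PHN004", "443.9")]  # claims, ICD-9-CM

# Hypothetical cohort data: one ABI value per enrolled patient.
abi_by_phn = {"PHN001": 0.82, "PHN003": 0.95, "PHN004": 0.88, "PHN005": 1.05}


def collect_codes(*sources):
    """Pool all codes recorded for each patient across the given sources."""
    codes = defaultdict(set)
    for source in sources:
        for phn, code in source:
            codes[phn].add(code)
    return codes


codes_by_phn = collect_codes(dad_records, acd_records, ppd_records)

# Link on the personal health number; patients without any code keep an empty set.
linked = {
    phn: {"abi": abi, "codes": set(codes_by_phn.get(phn, set()))}
    for phn, abi in abi_by_phn.items()
}
```

Keeping patients with no matching codes in the linked table matters for validation, because they supply the true negatives and false negatives.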
Baseline characteristics of study subjects (total n=1459).
PAD, peripheral artery disease; ABI, ankle–brachial index.
Statistical analysis
Descriptive statistics were used for demographic variables. Multivariable logistic regression was performed to develop the optimal algorithm. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were calculated for each ICD code related to a PAD diagnosis with corresponding 95% CIs. The performance of these codes within the databases informed development of potential algorithms. Once potential algorithms were developed, sensitivity, specificity, PPV, NPV, LR+ and LR− were calculated for each. The ABI is accepted in the literature as a dichotomous measure for PAD where an ABI score of ⩽ 0.90 is considered a diagnosis of PAD. 3 To potentially enhance sensitivity, we repeated the sensitivity, specificity, PPV, NPV, LR+ and LR− calculations using an ABI < 1.0 as the gold standard, which is interpreted as either abnormal or borderline abnormal. 3 We performed similar analyses using an ABI < 0.4 as our reference standard to examine if critical limb ischemia (CLI), as the severe form of PAD, could yield a better performance.
All statistical analyses were performed with the statistical software STATA (version 13.1; StataCorp LP, College Station, TX, USA).
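For readers who want to reproduce this kind of calculation outside Stata, the diagnostic statistics can be sketched in Python from a 2×2 table. The counts below are an assumption: they were chosen so that they approximately reproduce the percentages reported for the most sensitive case definition and are not taken from the study's tables. The Wilson interval is likewise an illustrative choice; the text does not say which interval method was used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% CI for a proportion (Wilson score; an illustrative choice)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_stats(tp, fp, fn, tn):
    """Accuracy measures of a case definition against a reference standard.

    Assumes every margin of the 2x2 table is non-zero.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }


# Hypothetical counts (code positive/negative vs ABI <= 0.90), total 1459.
stats = diagnostic_stats(tp=41, fp=108, fn=77, tn=1233)
sens_ci = wilson_interval(41, 41 + 77)
```

Sensitivity and specificity are conditioned on the reference standard, whereas PPV and NPV are conditioned on the code result, so the latter two shift with disease prevalence.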
Results
A total of 1459 patients were included in our analyses. Two patients were excluded due to incomplete ABI data. The average age of the study subjects was 63.5 years, and 66% of them were male. There were 23 patients who underwent an endovascular procedure, and six patients who had evidence of CLI. Other standard cardiovascular risk factors, including smoking (58%, 95% CI 56–61%), diabetes (31%, 95% CI 28–33%), hypertension (73%, 95% CI 70–75%), and hyperlipidemia (76%, 95% CI 74–79%) were common. The ABI-based prevalence of PAD in the dataset was 8.1% (Table 1). The PAD-related ICD codes and their associated numbers of cases identified are listed in Table 2.
International Classification of Diseases (ICD) and Canadian Classification of Interventions (CCI) codes for validation with ankle–brachial index scores.
Six of the patients were identified with two procedure codes.
ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; ICD-10-CA, International Classification of Diseases, Tenth Revision, Canada; NEC, not elsewhere classified.
Multivariable logistic regression demonstrated that standard risk factors for PAD were associated with PAD in our dataset. Patients who were elderly (odds ratio (OR) 1.08, 95% CI 1.05–1.11), had diabetes (OR 2.12, 95% CI 1.28–3.52) or were smokers (OR 2.45, 95% CI 1.40–4.31) had significantly higher odds of having PAD (Table 3).
Results of multivariable logistic regression of ankle–brachial index ⩽ 0.90.
CI, confidence interval.
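As a reminder of how odds ratios such as those above relate to a fitted logistic model, the sketch below converts a regression coefficient and its standard error into an odds ratio with a Wald-type 95% CI. The coefficient and standard error shown are hypothetical values picked to land near the reported diabetes estimate; the fitted coefficients themselves are not given in the text.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald-type CI from a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical inputs: a coefficient equal to log(2.12) and an assumed SE.
beta_diabetes = math.log(2.12)
se_diabetes = 0.258

or_est, ci_low, ci_high = odds_ratio_with_ci(beta_diabetes, se_diabetes)
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which matches the shape of the intervals reported above.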
The validity of ICD-9-CM and ICD-10-CA/CCI for each code is shown in Tables 4 and 5. The sensitivity for each single ICD code was low, ranging between 0.85% and 30.5%. The prevalence of PAD based on ICD codes was lower than the ABI-based prevalence, at only 1% when ascertained by 443.9 (ICD-9-CM). The most specific ICD code in the ICD-9-CM coding system identifying PAD patients was 443.9, with sensitivity 10.1% (95% CI 4.7–18.3%), specificity 99.4% (95% CI 98.7–99.8%), PPV 64.3% (95% CI 35.1–87.2%), NPV 91.6% (95% CI 89.7–93.3%), LR+ 17.8 (95% CI 6.09–51.9), and LR− 0.90 (95% CI 0.84–0.97). For the ICD-10-CA/CCI codes, the code identifying PAD patients with the highest PPV was the procedure code KG.57.^^, with sensitivity 3.39% (95% CI 0.93–8.45%), specificity 99.7% (95% CI 99.2–99.8%), PPV 50% (95% CI 15.7–84.3%), NPV 92.1% (95% CI 90.6–93.5%), LR+ 11.4 (95% CI 2.88–44.9), and LR− 0.97 (95% CI 0.94–1.00).
Validity of ICD-9-CM codes compared with ankle–brachial index ⩽ 0.90.
Diagnostic codes or procedure codes with no cases are not shown.
95% confidence intervals are shown in square brackets.
ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; PPV, positive predictive value; NPV, negative predictive value; LR+, positive likelihood ratio; LR−, negative likelihood ratio.
Validity of ICD-10-CA/CCI codes compared with ankle–brachial index ⩽ 0.90.
I70, I70.9, I73 not shown since there were no cases.
Combination: at least one of the above ICD-10 codes.
95% confidence intervals are shown in square brackets.
ICD-10-CA/CCI, International Classification of Diseases, Tenth Revision, Canada/Canadian Classification of Interventions; PPV, positive predictive value; NPV, negative predictive value; LR+, positive likelihood ratio; LR−, negative likelihood ratio.
For putative case definitions for PAD, including those that combine codes with additional comorbidity or demographic data, the highest sensitivity of 34.7% (95% CI 26.2–44.1%) was obtained by the algorithm of at least one ICD PAD diagnostic code, with specificity 91.9% (95% CI 90.4–93.3%), PPV 27.5% (95% CI 20.5–35.4%) and NPV 94.1% (95% CI 92.7–95.3%; Table 6). The case definition achieving the highest PPV of 65% (95% CI 40.8–84.6%) was age ⩾ 70 years and at least one code within 443.9 (ICD-9-CM), I73.9, I79.2 (ICD-10-CA/CCI) or any procedure code, validated with the alternative gold standard of ABI < 1.00, which is interpreted as abnormal or borderline abnormal, with sensitivity 5.56% (95% CI 2.99–9.31%), specificity 99.4% (95% CI 98.8–99.8%) and NPV 84.6% (95% CI 82.7–86.5%).
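The link between these figures can be checked with Bayes' rule: given sensitivity, specificity and prevalence, PPV and NPV follow directly. The short Python sketch below is illustrative only; the function name is hypothetical, and the inputs are the rounded values reported in the text, so the outputs agree with the reported PPV and NPV only up to rounding.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values reported for the most sensitive case definition.
ppv, npv = predictive_values(sensitivity=0.347, specificity=0.919, prevalence=0.081)
```

At a prevalence near 8%, even a specificity above 90% leaves false positives outnumbering true positives, which is why the PPV stays well below 50% for this definition.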
Combinatorial case definition of peripheral artery disease using administrative data.
Definition 1: At least one of the codes listed in the proposed codes list.
Definition 2: At least one of ICD 443.9, I73.9, I79.2 or any procedure code.
Definition 3: Definition 2 and age ⩾ 65.
Definition 4: Definition 2 and age ⩾ 70.
Definition 5: Definition 4 validated with ankle–brachial index < 1.0 as the gold standard.
Definition 6: Definition 4 validated with ankle–brachial index < 0.4 as the gold standard.
95% confidence intervals are shown in square brackets.
ICD, International Classification of Diseases; PPV, positive predictive value; NPV, negative predictive value; LR+, positive likelihood ratio; LR−, negative likelihood ratio.
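The case definitions listed above can be expressed as simple predicates. The Python sketch below is an illustration under stated assumptions: the patient record layout is hypothetical, the proposed code list for Definition 1 is passed in because it is not reproduced here, and `procedure_codes` is assumed to contain only the PAD-related procedure codes selected for validation. Definitions 5 and 6 apply the same rule as Definition 4 and differ only in the reference standard.

```python
RESTRICTED_DIAGNOSIS_CODES = {"443.9", "I73.9", "I79.2"}


def definition_1(patient, proposed_codes):
    """At least one code from the proposed list (supplied by the caller)."""
    all_codes = patient["diagnosis_codes"] | patient["procedure_codes"]
    return bool(all_codes & proposed_codes)


def definition_2(patient):
    """At least one of 443.9, I73.9, I79.2, or any procedure code."""
    has_diagnosis = bool(patient["diagnosis_codes"] & RESTRICTED_DIAGNOSIS_CODES)
    return has_diagnosis or bool(patient["procedure_codes"])


def definition_3(patient):
    return definition_2(patient) and patient["age"] >= 65


def definition_4(patient):
    return definition_2(patient) and patient["age"] >= 70


def reference_positive(abi, standard="pad"):
    """Reference standards used in the text: ABI <= 0.90, < 1.0, or < 0.4."""
    if standard == "pad":
        return abi <= 0.90
    if standard == "borderline":
        return abi < 1.0
    if standard == "cli":
        return abi < 0.4
    raise ValueError(f"unknown standard: {standard}")


# Hypothetical patient, not taken from the study.
patient = {"age": 72, "diagnosis_codes": {"I73.9"}, "procedure_codes": set(), "abi": 0.95}
flagged = definition_4(patient)
```

For this hypothetical patient, the code-based rule is positive while the ABI is 0.95, so the patient counts as a false positive against ABI ⩽ 0.90 but as a true positive against ABI < 1.0. This illustrates how relaxing the reference standard can raise PPV without changing the rule itself.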
Discussion
We found that using administrative data to identify PAD patients was specific but not sensitive in the community. The highest sensitivity of only 34.7% was obtained by ascertaining PAD using at least one ICD diagnostic or procedure code, with specificity 91.9%. The most specific algorithm with the highest PPV of 65% was age ⩾ 70 years and at least one code within 443.9 (ICD-9-CM), I73.9, I79.2 (ICD-10-CA/CCI) or any procedure code, but this result was achieved by relaxing the ABI criterion to ABI < 1.0 (sensitivity 5.56%, specificity 99.4% and NPV 84.6%). Our data support that many true cases of PAD, defined as ABI ⩽ 0.90, which were asymptomatic or minimally symptomatic, are not yet identified in the community, and thus do not appear within the administrative data within the healthcare system. There are at least two key implications of these findings: (1) PAD remains under-diagnosed in the community, and (2) administrative data are likely too insensitive to be reliably used to ascertain the complete burden of PAD in epidemiological research, since most cases would not be identifiable using administrative data alone.
Fan et al. 19 found that ascertainment of PAD based on ICD-9-CM codes had 68–85.5% sensitivity and 82.6–87.6% specificity in their Mayo Clinic dataset, but also found low sensitivity (38.7%, 95% CI 27.6–50.6%) when their ascertainment algorithm was applied to the community sample using chart review as the gold standard. Our study, in which the ABI was used as the gold standard for all subjects, confirmed a relatively low sensitivity of 34.7% (95% CI 26.2–44.1%) for ascertaining PAD using administrative data. The better performance of administrative data in the Mayo Clinic subset used in the Fan paper might be explained by differences in clinical practice within and outside the Mayo Clinic, including different rates of use of screening ABI tests, and different PAD prevalence within and outside the Mayo Clinic. However, both our data and the data from Fan et al. support that administrative data are an insensitive approach to ascertain PAD in the community.
Limitations
Our study has limitations. The inclusion of the participants in the study was not random, but they were consecutively sampled. Also, a large portion of our participants were referred for a coronary angiogram, and so it is possible our sample differs from the community population due to referral bias. However, since the prevalence of PAD we found in our sample is similar to that expected based on demographic and comorbidity data, we believe our sample likely does accurately reflect our community. Moreover, it is likely that any bias in our sample would be towards having more cases of PAD, rather than fewer, and so the low sensitivity for ICD codes for ascertaining PAD in our sample would likely also be seen in a true population-based sample. Also, some selection bias may have existed, as patients who had a prior lower extremity amputation were excluded from the EpiPAD dataset. We were unable to include patients with non-compressible vessels in our PAD definition as we did not have data on the toe–brachial index, which may have led us to underestimate the accuracy of the coding. It is possible that some ABI values were measured after remote lower extremity revascularization outside Alberta, which might result in some misclassification bias. In addition, our study evaluated data from the province of Alberta, which has a single-payer universal healthcare system, and our findings may not be completely generalizable to other regions with different healthcare providers.
Strengths
Our study also has several strengths. First, we used the ABI to ascertain PAD in all subjects. To our knowledge, this is the first study to use the ABI to validate the diagnostic accuracy of administrative data using subjects not recruited from a vascular laboratory. Second, not only did we investigate the validity of each proposed diagnostic code, but we also used multivariable logistic regression to create an algorithm from the administrative dataset to improve diagnostic accuracy. Third, we explored the reference standard of ABI < 1.0, which enabled us to achieve the highest PPV of 65%. We also examined the CLI standard of ABI < 0.4 to explore whether the sensitivity and specificity of the coding differ in subjects with more severe forms of PAD. However, the validity was still low: against this standard, the case definition achieving the best result was age ⩾ 70 plus at least one of ICD 443.9, I73.9, I79.2 or any procedure code (sensitivity 33.3%, specificity 98.8%, PPV 10%, NPV 99.7%, LR+ 26.9, LR− 0.675). Finally, since our data come from Alberta, with a single-payer universal healthcare system, our findings may be generalizable to those regions with similar healthcare systems.
Conclusion
In conclusion, administrative data are specific but not sensitive for the ascertainment of PAD when compared to ABI ⩽ 0.90, suggesting that PAD is under-diagnosed in the community and that administrative data cannot be used reliably to identify all forms of PAD in the community. While restrictive case definitions can increase PPV to more accurately identify true cases, this approach is very insensitive.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: MS McMurtry is funded by the Heart and Stroke Foundation of Canada.
