Abstract
Administrative data have been used to identify patients with various diseases, yet no prior study has determined the utility of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM)-based codes to identify patients with critical limb ischemia (CLI). CLI cases (n=126), adjudicated by a vascular specialist, were carefully defined and enrolled in a hospital registry. Controls were frequency matched to cases on age, sex and admission date in a 2:1 ratio. ICD-9-CM codes for all patients were extracted. Algorithms were developed using frequency distributions of these codes, risk factors and procedures prevalent in CLI. The sensitivity of each algorithm was calculated, and the algorithms were then applied within the hospital system to identify CLI patients not included in the registry. Sensitivity ranged from 0.29 to 0.92. An algorithm requiring either a CLI diagnosis code or a CLI procedure code exhibited the best overall performance (sensitivity of 0.92). Each algorithm had differing CLI identification characteristics based on patient location. Administrative data can be used to identify CLI patients within a health system. The algorithms, developed from these data, can serve as a tool to facilitate clinical care, research, quality improvement, and population surveillance.
Introduction
Critical limb ischemia (CLI), the most severe manifestation of lower extremity peripheral artery disease (PAD), is associated with high rates of cardiovascular ischemic events, amputation, and death, and a very high health economic cost.1–4 Nevertheless, major gaps exist in our understanding of the distribution of this disease across the general population and within health systems.5 Currently, the population incidence and prevalence of CLI in North America remain incompletely defined. Population-based estimates of CLI incidence and prevalence have been extrapolated from ischemic amputation rates4,5 and percutaneous and operative revascularization rates, yet these rates only reflect part of the disease burden. In addition, little is known about the distribution of CLI patients across a health system’s different sites of care. The study of CLI is challenging without a uniform methodology for patient identification.
Administrative data may serve as a reliable source from which to address these CLI knowledge gaps and challenges. Administrative data contain longitudinal information that describes a patient’s diagnostic history and medical care and are increasingly being used to identify patients with specific diseases. Administrative data have been used to identify patients with PAD, and to derive population-based ischemic amputation rates.4,5 Although administrative data have limitations, including variable coding accuracy and thus possible underreporting of chronic conditions, these data sources have the advantages of being timely, relatively inexpensive, and available in most health systems. The usefulness of this method, however, varies widely by disease state.6–10
No prior studies have evaluated the utility of administrative data as a clinical tool to facilitate CLI identification. Identification of a health system’s CLI population may be useful in facilitating improved clinical care, could aid subject recruitment for clinical research, serve as the base for health system quality improvement, and could facilitate population-based surveillance. In light of this opportunity, the authors developed a methodology to link International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes to physician-diagnosed CLI cases, from which algorithms were derived, validated, and applied to a health system to identify (‘enumerate’) CLI patients. The term ‘enumeration’ was selected to describe the effort to detect (or list) all of the patients (elements) within the larger set of health system CLI patients, as this term is commonly used in mathematics and theoretical computer science. The authors hypothesized that administrative data-based algorithms could be used to accurately identify CLI patients across several care settings within a health system.
Methods
Study sample
We utilized data from the FReedom from Ischemic Events – New Dimensions for Survival (FRIENDS) registry,11 a single tertiary care hospital CLI registry that was designed to collect health service research data for consecutively enrolled patients with severe PAD, including CLI and acute limb ischemia (ALI). From February 2007 to December 2009, 200 patients with limb ischemia (126 CLI, 74 ALI) were enrolled. Severe PAD was defined and adjudicated by the admitting vascular specialist according to the American College of Cardiology/American Heart Association (ACC/AHA)1 and Trans-Atlantic Inter-Society Consensus for the Management of Peripheral Arterial Disease (TASC II) guidelines.3 ALI was clinically defined as any sudden decrease in limb perfusion causing a potential threat to limb viability of less than or equal to 14 days in duration. CLI was defined, per these clinical care guidelines, as Rutherford class 4, 5 or 6 (presence of symptoms for more than 2 weeks with ischemic rest pain, minor tissue loss, or major tissue loss). An objective limb perfusion measurement, such as measurement of ankle or toe pressures, was encouraged, but not required for study entry. All cases were initially ascertained by a board-certified and hospital-credentialed vascular surgeon, and medical record documentation was required immediately on admission (not retrospectively) to establish the presence of ALI or CLI. The cases were then additionally reviewed by two vascular medicine specialist clinical investigators (ATH and HHK) and confirmed to be appropriate for registry enrollment. Thus, enrollment in the registry served as the gold standard to define severe PAD (Figure 1).

Figure 1. Study design: algorithm derivation, performance evaluation and application.
Data on past cardiovascular history, atherosclerosis risk factors, etiology of limb ischemia, PAD revascularization treatment modality, and use of medications were abstracted from the medical record. Coronary artery disease (CAD) was deemed to be present if the patient had documented stable angina, previous myocardial infarction (MI), a history of percutaneous coronary intervention or a history of coronary artery bypass graft surgery. Cerebrovascular disease (CVD) was present if the patient had a documented history of transient ischemic attack (TIA), a history of ischemic or hemorrhagic stroke, or a history of carotid stenting or carotid endarterectomy. A history of PAD was deemed to be present if a patient had prior documented asymptomatic PAD (ankle–brachial index <0.9), intermittent claudication, or a prior episode of ALI or CLI. Risk factor documentation included diabetes, hypertension, hypercholesterolemia, current smoking and obesity. A family history of premature CAD was defined by chart documentation of a family history of premature CAD or any MI in a first-degree relative <55 years (men) or <65 (women). Major amputation was defined as amputation at the trans-metatarsal level or higher. Minor amputation was defined as amputation of the distal foot or toes. 12 All data were collected via use of a custom-designed, standardized case report form.
As the focus of this work was on CLI, the sample was restricted to the 126 patients with known vascular specialist diagnosis of CLI. This group was termed the ‘derivation population’. A frequency-matched sample of 252 patients (2:1 ratio) with the same age and sex distribution as the cases, who were admitted to the tertiary hospital during the same time period, and who did not have a known diagnosis of PAD according to their electronic medical record problem list, were selected to serve as controls (Figures 1 and 2). Descriptive statistics were calculated and data that were not normally distributed were analyzed by non-parametric methods. The demographic characteristics of patients with and without an index pedal pressure measurement, as an objective measure of limb ischemia, were compared. The mean pedal pressure was 24 mmHg (SD 19), n=24; mean ankle–brachial index (ABI) was 0.18 (SD 0.14), n=27.
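The frequency matching of controls described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure actually used in the study: the 10-year age bands, the dictionary keys (`id`, `age`, `sex`), and the function names are all assumptions, and the candidate pool is assumed to have been restricted beforehand to patients admitted during the same period with no PAD on their problem list.

```python
import random
from collections import defaultdict


def age_band(age, width=10):
    """Lower bound of the age band; the 10-year width is an assumption."""
    return (age // width) * width


def frequency_match(cases, candidates, ratio=2, seed=0):
    """Draw `ratio` controls per case within each (age band, sex) stratum.

    cases, candidates: lists of dicts with hypothetical keys 'id', 'age', 'sex'.
    """
    rng = random.Random(seed)

    # Group the candidate controls by stratum.
    pool = defaultdict(list)
    for person in candidates:
        pool[(age_band(person["age"]), person["sex"])].append(person)

    # Count how many controls each stratum needs.
    needed = defaultdict(int)
    for case in cases:
        needed[(age_band(case["age"]), case["sex"])] += ratio

    controls = []
    for stratum, n in needed.items():
        available = pool[stratum]
        if len(available) < n:
            raise ValueError(f"not enough candidates in stratum {stratum}")
        # Sample without replacement so no control is selected twice.
        controls.extend(rng.sample(available, n))
    return controls
```

Because matching is done on the stratum distribution rather than on individual pairs, the resulting control group has the same age-band and sex distribution as the cases, which is what frequency matching requires.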

Figure 2. Description of source population. *Enumeration cohort: larger population of critical limb ischemia (CLI) patients in the single urban tertiary hospital, including those not captured in the CLI Registry.
Definition of populations
The ‘derivation’ population is made up of the 126 CLI patients who were enrolled in the registry and who were matched by age, sex and hospital admission period to a non-CLI patient group in order to create the algorithms. The ‘enumeration’ population was derived by applying these algorithms and encompasses all additional potential CLI patients who had hospital encounters in the same tertiary hospital during the registry enrollment period, but who were not enrolled in the registry.
Algorithm derivation, performance, and enumeration
Algorithm derivation
Figure 1 displays the methodology used for CLI identification. ICD-9-CM codes for inpatient and outpatient encounters for each individual, including principal diagnosis ICD-9-CM code, ICD-9-CM codes for secondary diagnoses, and principal procedure ICD-9-CM code occurring during the study period, were extracted from the electronic health record database by a health information specialist. We did not utilize current procedural terminology (CPT) codes since we wanted to obtain internationally applicable algorithms. For both cases and controls, the authors responsible for collection and analysis of administrative data were blinded to the chart review. EPIC (EPIC Systems Corp., Verona, WI, USA) was the electronic health record used both for chart review and abstraction of ICD-9-CM codes.
Administrative data for CLI cases were evaluated for the presence of one or more ICD-9-CM codes that are commonly used when care is provided to patients with CLI within the following categories defined a priori: (a) standard CLI and general PAD diagnostic codes ([440.22 rest pain, 440.23 ulceration, 440.24 gangrene], and 443.9 peripheral vascular disease unspecified); (b) lower extremity arterial revascularization procedures (39.25 aorto-iliac-femoral bypass, 39.29 other peripheral vascular bypass, 39.5 other repair of vessels, 38.18 lower extremity endarterectomy); (c) limb amputation (84.xx); or (d) diabetes (250.xx). The ICD-9-CM codes were then grouped to define three broad categories, and the frequency of each code for the registry-enrolled CLI patients tabulated (Table 1). Category A included codes pertaining to CLI signs and symptoms and non-specific PAD, and was termed ‘CLI clinical codes’. Category B included codes for leg revascularization procedures and amputation, and was termed ‘CLI procedure codes’. Category C comprised codes for diabetes and was termed ‘diabetes codes’.
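The a priori code categories described above can be expressed as a small lookup. The following Python sketch is illustrative only (the study's analyses were run in SAS): it assumes codes are stored as dotted strings and that diagnosis and procedure codes arrive in separate lists, since ICD-9-CM uses separate code sets for each. The function name `categorize` and the single-letter labels are hypothetical. Whether 39.5 should also match its four-digit subcodes is not specified in the text, so it is matched exactly here.

```python
# Category A: CLI clinical codes (diagnosis codes).
CLI_CLINICAL = {"440.22", "440.23", "440.24", "443.9"}
# Category B: revascularization procedure codes, matched exactly (assumption).
CLI_PROCEDURE_EXACT = {"39.25", "39.29", "39.5", "38.18"}


def categorize(diagnosis_codes, procedure_codes):
    """Return the set of categories ('A', 'B', 'C') present for one patient."""
    categories = set()

    # Category A: any CLI or non-specific PAD diagnosis code.
    if CLI_CLINICAL & set(diagnosis_codes):
        categories.add("A")

    # Category B: a listed revascularization code or any amputation code (84.xx).
    if any(code in CLI_PROCEDURE_EXACT or code.startswith("84.")
           for code in procedure_codes):
        categories.add("B")

    # Category C: any diabetes diagnosis code (250.xx).
    if any(code == "250" or code.startswith("250.") for code in diagnosis_codes):
        categories.add("C")

    return categories
```

For example, a patient with diagnosis codes 440.23 and 250.00 and procedure code 84.15 would fall into all three categories.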
Table 1. Categories of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes used in the development of algorithms in the derivation population.
Based on codes in any position.
CLI procedure codes: code frequencies are based on the primary procedure code only. For example, of the 126 FRIENDS participants, 22 had a procedure code of 84.xx.
Based on combinations of these three categories, five algorithms for identifying CLI patients were developed (Table 2). Owing to a high prevalence of diabetes,13–15 revascularization and amputations in CLI patients,16–18 several combinations of categories that defined CLI procedures, diabetes, or both were developed in order to assess the impact of these combinations on algorithm-based CLI detection characteristics. Algorithm 1 was defined by the CLI clinical codes only. Algorithm 2 required the presence of both CLI clinical and procedure codes. Algorithm 3 was defined by CLI clinical codes and diabetes codes. Algorithm 4 required CLI clinical codes, CLI procedure codes and diabetes codes. Algorithm 5 was defined by CLI clinical codes or CLI procedure codes.
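The five algorithms are Boolean combinations of the three category flags, which can be written down directly. The sketch below is a hedged illustration in Python; the names `ALGORITHMS` and `positive_algorithms` are hypothetical, and the three inputs are assumed to be precomputed flags for one patient (any CLI clinical code, any CLI procedure code, any diabetes code).

```python
# Each rule takes (clinical, procedure, diabetes) flags for one patient.
ALGORITHMS = {
    1: lambda a, b, c: a,              # CLI clinical codes only
    2: lambda a, b, c: a and b,        # clinical AND procedure
    3: lambda a, b, c: a and c,        # clinical AND diabetes
    4: lambda a, b, c: a and b and c,  # clinical AND procedure AND diabetes
    5: lambda a, b, c: a or b,         # clinical OR procedure
}


def positive_algorithms(has_clinical, has_procedure, has_diabetes):
    """Return the numbers of the algorithms that flag this patient."""
    return [
        number
        for number, rule in ALGORITHMS.items()
        if rule(has_clinical, has_procedure, has_diabetes)
    ]
```

A patient with only a procedure code is flagged by algorithm 5 alone, which is why that algorithm can capture cases that lack a documented CLI diagnosis code.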
Table 2. Sensitivity of administrative data-derived algorithms for CLI.
Algorithms use ICD-9-CM diagnosis and procedure codes.
Kappa statistics were used to quantify agreement between the registry (derivation cohort) and the claims data (administrative cohort). Higher values indicate better agreement.
CLI clinical codes: 440.22, 440.23, 440.24, 443.9.
CLI procedure codes: 84.xx, 39.25, 39.29, 39.5, 38.18.
Diabetes codes: 250.xx.
CLI, critical limb ischemia.
Algorithm performance
Controls were frequency matched to cases in a 2:1 ratio based on age, sex and time of admission to the hospital. The sensitivity of each algorithm was calculated (Table 2). Sensitivity was defined as the proportion of patients known to have CLI who were correctly identified via an administrative data-based algorithm (true positives divided by all registry-confirmed cases). We calculated kappa statistics (typically used to quantify inter-rater agreement) to quantify agreement between the registry (derivation cohort) and the claims data (administrative cohort). Unlike sensitivity, these statistics incorporate the control sample data (Table 2).
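Both performance measures can be computed from a 2×2 table of registry status against algorithm result. The following Python sketch uses the standard formulas for sensitivity and Cohen's kappa; the function names are illustrative and any counts passed in are the caller's own. It is not the SAS code used in the study.

```python
def sensitivity(tp, fn):
    """Proportion of registry-confirmed cases flagged by the algorithm."""
    return tp / (tp + fn)


def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for agreement between registry status and algorithm result.

    tp, fn: cases flagged / not flagged by the algorithm
    fp, tn: controls flagged / not flagged by the algorithm
    """
    n = tp + fn + fp + tn
    # Observed agreement: both sources say "CLI" or both say "not CLI".
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (observed - expected) / (1 - expected)
```

Sensitivity uses only the case row of the table, whereas kappa depends on all four cells, which is why kappa reflects the control sample and sensitivity does not.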
Algorithm enumeration
Subsequently, the algorithms were applied to the same hospital system to determine the number of additional CLI cases that had received care during the same time period as registry enrollment, but that were not identified for inclusion in the registry. This population was termed the ‘enumeration population’ (Figure 2). A descriptive analysis of care sites was performed.
All analyses were performed using SAS software (version 9.2; SAS Institute, Inc., Cary, NC, USA). This study was approved by the Institutional Review Boards of the University of Minnesota and Allina Hospitals and Clinics, and was performed in accordance with their regulations.
Results
Derivation population
Baseline characteristics of the derivation population are described in Table 3. The mean age (SD) of cases was 75 (12) years and 63% were male. Diabetes was present in 48% of the patients and 79% had a prior history of PAD. All patients in this cohort had classic Rutherford classification symptoms: ischemic rest pain (28%), foot ulceration or gangrene (37%), or both rest pain and ulceration (36%). During the index hospitalization at time of registry enrollment, 78% underwent a revascularization procedure, and 26% underwent a primary amputation. The median number of ICD-9-CM codes per individual patient was 16 (range: 2–44), compared to 11 (range: 2–29) among controls. Individuals with available index hospitalization pedal pressures were similar in all key demographic characteristics to individuals without immediate pedal pressure measurements (Table 3).
Table 3. Characteristics of the derivation population: patients with vascular specialist-diagnosed CLI (A), and those with (B) and without (C) objective pedal pressures.
Values are n (%) or mean ± SD.
Patients with CAD had one or more of MI, PCI, CABG or unstable angina.
Patients with CVD had one or more of stroke, TIA or carotid artery stenting/endarterectomy.
ABI information was present in 27 patients.
Minor amputation: amputation of the distal foot or toes. Number in parentheses is percent of amputees that underwent minor amputation.
Major amputation: amputation at the transmetatarsal level or higher. Number in parentheses is percent of amputees that underwent major amputation.
ABI, ankle–brachial index; BMI, body mass index; CABG, coronary artery bypass graft; CAD, coronary artery disease; CLI, critical limb ischemia; CVD, cerebrovascular disease; MI, myocardial infarction; PAD, peripheral artery disease; PCI, percutaneous coronary intervention; TIA, transient ischemic attack.
Performance
Controls and cases were frequency matched in a 2:1 ratio. The mean age (SD) of controls was 77 (12) years and 63% were male. The most common ICD-9-CM codes among controls were 428 (86%), 401.9 (84%), 272 (77%), and 402 (70%). The algorithms were developed using ICD-9-CM codes outlined in Tables 1 and 2. Algorithms were applied to cases and controls. The sensitivity of each algorithm identifying patients with CLI is shown in Table 2. Compared to individual codes which had sensitivity ranging from 0.10 to 0.56, algorithm 1 (0.75), algorithm 2 (0.59), and algorithm 5 (0.92) had higher sensitivity in identifying CLI patients (Tables 1 and 2). The kappa statistics mirror the sensitivity values, with both algorithm 1 and algorithm 5 indicating very good agreement between the registry and the administrative data.
Enumeration
To determine how many additional CLI patients had hospital encounters during the registry enrollment period but were not enrolled in the registry (enumeration population), the algorithms were applied to the entire hospital patient population over the same period as the registry. Demographic characteristics of patients identified by each algorithm are presented in Table 4. For every algorithm, patients identified in the enumeration population were younger than those in the derivation population. The percentage of patients with diabetes depended on whether diabetes codes were used in developing the algorithms (16% to 100%). CAD prevalence ranged from 2% to 33%, compared to 57% in the derivation population.
Table 4. Characteristics of enumeration population by algorithm (includes derivation population).
Values are n (%) or mean ± SD.
Based on initial healthcare encounter included in the data set.
In order to emulate the early detection goals for CLI clinical practice, we classified patients’ care site according to their first medical encounter for CLI during the study period. The algorithms exhibited differential ability to identify CLI patients. Algorithms 5 and 1 identified the greatest number of patients (2944 and 1477, respectively), with algorithms 2, 3, and 4 identifying a substantially lower number of patients. The majority of CLI patients identified by algorithm 1 were first evaluated within the hospital imaging centers (68%); algorithm 2 identified patients dually at both the imaging and procedure-based sites of care (33% and 45%, respectively); algorithm 3 preferentially identified patients in the wound clinic (68%); patients identified by algorithm 4 were most commonly provided their first episode of care in the inpatient setting; and algorithm 5 identified both inpatients with CLI and patients who underwent imaging procedures (48% and 31%, respectively) (Table 5).
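The care-site classification described above amounts to taking each patient's earliest qualifying encounter and tallying the sites. A minimal Python sketch follows, assuming encounters are available as `(patient_id, date, site)` tuples with ISO-formatted date strings and have already been restricted to patients flagged by a given algorithm; the function name and the tuple layout are hypothetical.

```python
from collections import Counter


def first_site_distribution(encounters):
    """Tally patients by the care site of their earliest encounter.

    encounters: iterable of (patient_id, date, site); dates must be comparable
    (ISO 'YYYY-MM-DD' strings or date objects).
    Returns {site: (count, rounded percent of patients)}.
    """
    first = {}
    for patient_id, date, site in encounters:
        # Keep only the earliest encounter seen so far for each patient.
        if patient_id not in first or date < first[patient_id][0]:
            first[patient_id] = (date, site)

    counts = Counter(site for _, site in first.values())
    total = sum(counts.values())
    return {site: (n, round(100 * n / total)) for site, n in counts.items()}
```

Each patient contributes exactly once, so the percentages describe where patients were first seen rather than where most encounters occurred.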
Table 5. Distribution of first medical encounter by care site for each algorithm: enumeration population (includes derivation population).
Values are n (%).
Most common of the other sites include: hospital cardiovascular diagnostic services area, surgical services area, and clinic laboratory.
Discussion
This study demonstrates that administrative data, consolidated into specific algorithms based on diagnoses, procedures, and diabetes, can identify CLI patients, and do so with a level of accuracy that is similar or superior to that reported for other disease states. The application of these five distinct algorithms was associated with variable sensitivity for CLI. Each algorithm identified patients who received a first episode of CLI care in different locations in this health system. The use of these claims-based algorithms, in contrast to simple use of one or two codes, provides a diagnostic advantage, and also offers confidence that this approach can be utilized across many clinical care settings.
Previous studies have examined the utility of administrative data to identify patients with diverse medical conditions, including diabetes,8,9,20 hypertension,7,20 PAD20 and other diseases.8,10,19,20 The present study is the first to show that administrative claims data are useful and accurate in identifying patients with CLI to facilitate clinical care, clinical research, health system quality improvement, and population-based surveillance. Prior reports of the use of administrative data to identify various patient populations have shown sensitivities of 67% (urinary tract infections), 36–69% (heart failure), 76% (acute myocardial infarction), 83% (diabetes mellitus), 65% (hypertension), and 29% (PAD).20–22 Hebert et al.9 applied a similar Medicare data-based approach to identify diabetic patients using different algorithms, and reported sensitivity values ranging from 5% to 79%. The sensitivity of our algorithms ranged from 29% to 92%.
The algorithms showed varied sensitivities for CLI identification. Algorithms 1 and 5, which required only CLI diagnoses, or CLI diagnoses or procedure codes, respectively, displayed the highest sensitivity (0.75 and 0.92, respectively). This high sensitivity is a logical finding given that the majority of patients with CLI would be expected to have documentation of rest pain, non-healing wounds, or gangrene. On the other hand, while some CLI patients may not have prior chart documentation of the clinical ICD-9-CM codes, they may require an endovascular procedure, bypass surgery or progress to amputation before identification of the disease. When both CLI diagnoses and diabetes codes were required (algorithm 3), sensitivity fell to 0.33 because, although diabetes was highly prevalent, it was present in only 48% of patients in the derivation population, consistent with prior reports of diabetes prevalence in CLI (44–50%).23,24 The additional requirement of a CLI procedure code (revascularization or amputation) on top of a CLI diagnosis and diabetes (algorithm 4) further reduced the sensitivity, as would be expected (0.29). Finally, when both CLI diagnoses and procedures were required (algorithm 2), the sensitivity was also reduced, but only modestly (0.59). This reflects the high proportion of CLI patients (78%) who underwent a revascularization procedure.
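The ordering of these sensitivities follows from how the algorithms nest. As a brief worked check (the notation is introduced here for illustration and is not the authors'): let A, B and C be the sets of registry cases carrying a CLI clinical code, a CLI procedure code and a diabetes code, respectively, and let S_k be the set of cases flagged by algorithm k. Every case flagged by a stricter algorithm is also flagged by the looser one that contains it, so the sensitivities must be ordered in the same way, and the reported values are consistent with this.

```latex
\[
S_1 = A,\quad S_2 = A \cap B,\quad S_3 = A \cap C,\quad
S_4 = A \cap B \cap C,\quad S_5 = A \cup B
\]
\[
S_4 \subseteq S_2 \subseteq S_1 \subseteq S_5,
\qquad
S_4 \subseteq S_3 \subseteq S_1
\]
\[
\mathrm{Se}_k = \frac{|S_k|}{126}
\;\Longrightarrow\;
\mathrm{Se}_4 \le \mathrm{Se}_2 \le \mathrm{Se}_1 \le \mathrm{Se}_5,
\qquad
\mathrm{Se}_4 \le \mathrm{Se}_3 \le \mathrm{Se}_1
\]
\[
0.29 \le 0.59 \le 0.75 \le 0.92,
\qquad
0.29 \le 0.33 \le 0.75
\]
```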
Application of this approach within this hospital setting demonstrated that an effort to recruit sequential subjects into a severe PAD registry is associated with limitations. The enumeration population is not identical to the derivation population, as the larger enumeration cohort consisted of individuals who were younger, and had less co-morbidity (atrial fibrillation, hypertension and diabetes) than those actively enrolled in the CLI registry. The use of these algorithms to unmask the larger contemporaneous CLI population offers insight into why the larger CLI cohort was not enrolled into the prospective registry. It is known that women are commonly under-enrolled in PAD clinical investigations. 25 Similarly, it is known that most PAD registries are not representative of real-world PAD ‘at risk’ populations and do not accurately report procedural outcomes. 5 Current efforts to create new PAD outcomes registries, based on surgical or cardiac catheterization laboratory recruitment motifs, 26 are also unlikely to fully represent the PAD or CLI populations within a health system. Our algorithms permit a wider population to be identified and enrolled in such studies.
This study highlights the utility of administrative claims-based algorithms in identifying CLI patients in different clinical care locations within a health system. It is not surprising that algorithm 5 identified patients who were more likely to be an inpatient or to undergo invasive procedures and imaging, since this algorithm was created with codes that utilize clinical signs and symptoms of CLI, or invasive procedures used to treat CLI. Algorithm 3 on the other hand, identified patients first seen in the wound clinic, as this algorithm utilized diabetes codes and such CLI patients with non-healing wounds are known to more likely be provided care at wound clinic care sites. The flexibility and adaptability of these algorithms can be of use to vascular and primary care clinicians in identifying patients to participate in CLI quality improvement initiatives or by clinical investigators who seek to define a representative population for enrollment in registries or clinical trials.
For many years, administrative data have been used to identify patients with specific disease conditions, and this approach often has been utilized to improve clinical care. For example, the Health Care Financing Administration (HCFA) launched the Health Care Quality Improvement Initiative (HCQII) in 1992 as a means of using administrative data to identify patients for national Medicare quality improvement projects. 27 Weiner 28 also showed that administrative data can be used to measure the quality of office-based care provided to elderly patients with diabetes, and to help support clinical quality improvement. Furthermore, administrative data have been used to identify at-risk patients within the Medicare population or in specific health systems, comparing quality of care metrics with guideline recommendations in order to promote quality improvement initiatives and monitor improvement over time.29,30 The present study demonstrates that CLI patients who are less likely to be included in a prospective registry can be identified using administrative data. Increased identification of these patients can lead to improvement in care and clinical outcomes by stimulating targeted quality improvement strategies. 31
This study has some limitations. First, prior research studies have noted the limited accuracy of administrative data to identify patients with some disease conditions. However, administrative data-based approaches still retain an important role in research. ICD-9-CM codes are widely used in all modern health systems, serve as the core of most electronic medical records, and can identify large numbers of patients with relative ease and low cost. Second, all patients included in this study (both registry and non-registry patients) were evaluated from a single health system, and the algorithms were created based on the clinical characteristics of this particular population. Hence, generalization of algorithm accuracy to other clinical populations must be prospectively tested and adapted as needed to optimize performance. Third, specificity could not be reliably determined since very few controls had PAD and thus all algorithms had specificity >0.99. Fourth, only a few individuals in this study had objective pedal pressure measurements. However, those without pedal pressures were similar to those who had pedal pressure measurements. Despite these limitations, our study has several key strengths including the prospective validation of cases by vascular specialists and the selection of controls by careful chart review. Finally, this study is the first to link the ICD-9-CM codes used in administrative databases to physician-diagnosed CLI cases to verify the accuracy of identifying CLI patients.
Conclusion
Algorithms derived using administrative data can be used to identify patients with CLI across various care settings within a health system. The patients identified by this approach will include a much wider population of individuals with CLI than might be recruited using any individual approach (e.g. recruitment from a single vascular practice; from a surgical or endovascular laboratory). Methods designed to improve the early identification of CLI patients, as provided by this approach, could facilitate creation of new severe PAD cohorts that could help close the CLI natural history knowledge gaps that currently exist. Health system-based CLI cohort identification may be a key factor for future efforts to facilitate clinical research, and to support health system quality improvement initiatives for CLI patients.
Footnotes
Declaration of conflicting interest
ATH reports research grants from Abbott Vascular, AstraZeneca, ViroMed, and Pluristem, and consulting relationships with Merck and Novartis. The other authors report no conflicts of interest.
Funding
This investigator-initiated study was supported in part by a grant to the University of Minnesota from Aastrom Biosciences (Ann Arbor, MI, USA).
