A scoping review of electronic phenotyping methodologies used to identify peripheral artery disease in observational studies

Abstract

Billing data including International Classification of Diseases (ICD) codes are increasingly used to identify cohorts of patients with peripheral artery disease (PAD) in electronic health records (EHRs) and administrative claims databases (ACDs). However, the validity of common PAD phenotyping approaches is a central challenge to the utilization of EHR and ACD data. We present a scoping review of contemporary PAD observational studies to describe the electronic phenotyping strategies employed in PAD identification and propose recommendations for improvement. We searched two databases, MEDLINE and Web of Science, identifying a total of 748 articles that underwent title and abstract review. Of these articles, 163 met the criteria for full-text review, with 84 articles ultimately included in the study. We demonstrate that 19.0% of eligible studies utilized ICD, Ninth Revision (ICD-9) codes, 11.9% utilized ICD, Tenth Revision (ICD-10) codes, and 69.0% of studies utilized a combination of ICD-9 and ICD-10 codes in their electronic phenotyping methodology. Of the included studies, 76.2% utilized a single-code query approach for electronic phenotyping despite low diagnostic yield, and 21.4% utilized rule-based methods. Only five studies utilized logistic regression modeling, despite the demonstrated effectiveness of this method. The current study demonstrates high utilization of unreliable electronic phenotyping methods such as single-code-based queries, which severely limits research quality. Improvements in electronic phenotyping methods are necessary to leverage data from EHRs and ACDs for high-quality research.

Keywords

administrative data, electronic phenotyping, ICD codes, peripheral artery disease (PAD), quality improvement

Introduction

Peripheral artery disease (PAD) is a common disease affecting over 230 million people globally with significant morbidity, mortality, and decreased quality of life associated with disease progression.¹ Despite its widespread prevalence and risk for adverse outcomes, it remains underappreciated compared with other atherosclerotic disease processes such as coronary artery disease and cerebrovascular disease.²,³ To address the systematic understudying of PAD, the American Heart Association has published PAD-related gaps in research, clinical practice, and implementation to encourage cross-collaboration between researchers, clinicians, and government agencies to increase the awareness and understanding of this disease.³

One way to enhance research in PAD is to leverage the investigative potential of administrative claims databases (ACDs) and electronic health records (EHRs). These data sources provide low-cost and readily available clinical information on small or large populations, which has resulted in their widespread application in studies of epidemiology, quality improvement, pharmacovigilance, clinical effectiveness, and clinical trial recruitment.⁴ However, because data are coded into EHRs and ACDs in heterogeneous, incomplete, and complex ways, it can be exceptionally challenging to accurately identify cohorts of patients with PAD via a process called electronic phenotyping.

Traditionally, the identification of phenotypes of interest in EHRs has relied on rule-based approaches where clinical experts use structured data such as laboratory values, imaging reports, and medication data to create inclusion and exclusion criteria often based on consensus guidelines. Though a multimodal rule-based approach is achievable within an EHR, most ACDs do not routinely collect granular clinical information, and phenotyping relies solely on billing data such as International Classification of Diseases (ICD) and Current Procedural Terminology (CPT) codes.

PAD-associated procedural codes have demonstrated high diagnostic accuracy, yet PAD-associated diagnosis codes have repeatedly demonstrated inadequate sensitivity to detect PAD phenotypes in both ACDs and EHRs.⁵⁻⁹ Additionally, there are hundreds of PAD-associated ICD diagnosis codes (9th and 10th Revisions, ICD-9 and ICD-10, respectively) in use in the United States, with little consensus on which codes should be utilized for reliable electronic phenotyping. This uncertainty is compounded by the observation that many contemporary studies utilize single-code queries, attributing a PAD diagnosis based on the presence of a single PAD-related code for PAD cohort selection. However, the literature demonstrates that rule-based electronic phenotyping approaches that combine data show superior performance to single-code queries.¹⁰

Overall, this presents a concerning quality problem for PAD research that utilizes ICD codes to identify PAD cohorts, as PAD cohorts must reflect patients with genuine disease diagnoses for meaningful conclusions and population generalizations to be made. To date, there has been no attempt to map the landscape of PAD electronic phenotyping methods employed in observational studies. To better characterize the extent of this informatics problem, the current review aims to describe the electronic phenotyping strategies most commonly used to identify PAD cohorts in observational research studies that utilize ICD codes and offer possible solutions.

Methods

Study selection

Study selection followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines.¹¹ For this review, studies were collected from MEDLINE and Web of Science. A search strategy was constructed and refined in collaboration with an institutional librarian. The search terms included combinations of key words and MeSH terms related to PAD, ICD, and observational cohort studies, combined using Boolean operators. The full search strategy is detailed in Supplemental Table S1. Articles were eligible for inclusion if they were observational studies that identified a cohort of patients with PAD utilizing ICD codes. Only studies that extracted a cohort specifically for PAD without including additional comorbid conditions were included. To focus on cohorts built with ICD codes, articles utilizing predominantly patient-level clinical information or only non-ICD billing codes (e.g., studies that utilized only CPT codes) were excluded. Articles were included if they were published between January 1, 2010 and April 30, 2024, to span the US transition from the ICD-9 to the ICD-10 coding system that occurred in 2015. Only studies conducted in the US were included given the country-specific differences in coding systems. Articles that were not written in English were excluded.

Titles and abstracts of articles from MEDLINE were reviewed by two independent authors (AAS and AB). If disagreements arose, the two authors discussed the article until consensus was achieved. An additional search of Web of Science was conducted by AAS, and no studies were added from this search. Articles that satisfied the inclusion criteria underwent independent full-text review by two authors from among AB, JC, DM, NK, MM, and BBa. If disagreements arose, a third independent reviewer (AAS) resolved them. For articles that underwent full-text review, data extraction in the domains of title, authors, data source, ICD coding system, and electronic phenotyping method was conducted.

The electronic phenotyping methods were categorized as ‘single-code’ or ‘rule-based.’ The ‘single-code’ method was defined as utilization of designated PAD ICD codes as diagnostic criteria and ascertainment of PAD status based on the presence of only one of these codes. The ‘rule-based’ method was defined as utilization of a set of diagnostic rules, including the presence of designated PAD ICD codes in addition to other diagnostic criteria defined by the study authors, including non-ICD billing codes, patient-level clinical information, and expert evaluation. A subset of ‘rule-based’ methodologies that operationalized regression modeling was further identified, and modified Standards for Reporting Diagnostic Accuracy (STARD) criteria in the domains of study design, eligibility criteria, test methods, analysis, and results were employed to evaluate the construction of these regression models based on diagnostic accuracy.¹²
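To make the distinction between the two categories concrete, the following is a minimal Python sketch, not the method of any reviewed study. The `Patient` structure, the code prefixes (limited to the atherosclerosis families 440.x and I70.x), and the procedure identifier `REVASC_X` are all illustrative assumptions; an actual study would need a validated, study-specific code list.

```python
from dataclasses import dataclass, field

# Assumption: illustrative PAD code prefixes only (atherosclerosis families);
# this is not a validated PAD code list.
PAD_PREFIXES = ("440.", "I70.")


@dataclass
class Patient:
    """Hypothetical container for one patient's billing codes."""
    patient_id: str
    diagnosis_codes: list[str] = field(default_factory=list)
    procedure_codes: list[str] = field(default_factory=list)


def is_pad_code(code: str) -> bool:
    """True if the diagnosis code belongs to one of the designated families."""
    return code.startswith(PAD_PREFIXES)


def single_code_phenotype(p: Patient) -> bool:
    """'Single-code': positive if at least one designated PAD code is present."""
    return any(is_pad_code(c) for c in p.diagnosis_codes)


def rule_based_phenotype(p: Patient, pad_procedures: set[str]) -> bool:
    """'Rule-based' (one possible rule): a PAD diagnosis code AND an
    additional criterion, here a PAD-related procedure code."""
    has_dx = single_code_phenotype(p)
    has_proc = any(c in pad_procedures for c in p.procedure_codes)
    return has_dx and has_proc


# Usage with hypothetical patients and a hypothetical procedure identifier.
a = Patient("a", diagnosis_codes=["440.21"])
b = Patient("b", diagnosis_codes=["I70.211"], procedure_codes=["REVASC_X"])
```

Under these assumptions, both patients are positive under the single-code definition, whereas only patient `b` satisfies the stricter rule.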

Results

The initial search strategy yielded 748 studies for review. After title and abstract screening, 163 studies were selected for full manuscript review. A total of 68 studies were excluded because of incorrect study methodology, including no use of ICD codes, a non-observational design, or a non-PAD cohort. An additional 11 studies were excluded because the full text could not be retrieved. Ultimately, 84 studies were included in the comprehensive review (Figure 1). Overall, 19.0% (n = 16) of eligible studies used ICD-9 codes; 11.9% (n = 10) used ICD-10 codes; and 69.0% (n = 58) used a combination of both ICD-9 and ICD-10 codes to build their PAD cohorts. The most common codes were those related to atherosclerosis (ICD-9 440.x, ICD-10 I70.x); however, there was wide variability in the specific codes chosen to identify PAD phenotypes. The most frequently used data sources were claims data from the Centers for Medicare and Medicaid Services (n = 28) and the Healthcare Cost and Utilization Project’s National Inpatient Sample (n = 19) (Figures 2–4).

Figure 1.

Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flow diagram for article inclusion.

Figure 2.

Top 30 most common ICD-9 codes by proportion of studies using ICD-9 code.

Figure 3.

Top 30 most common ICD-10 codes by proportion of studies using ICD-10 code.

Figure 4.

Frequency of database use across studies.

Next, we evaluated electronic phenotyping methods utilized in each study – a summary of electronic phenotyping methods is provided in Figure 5. Overall, 76.2% of studies (n = 64) employed a single-code query, where having at least one PAD-related ICD-9 or ICD-10 code equated to a positive PAD phenotype. The next most common diagnostic methodology comprising 21.4% of studies (n = 18) was a rule-based methodology that combined multiple elements of structured and/or unstructured data to identify PAD phenotypes. Of these rule-based studies, 27.8% (n = 5) utilized a logistic regression model as their electronic phenotyping method. Two studies did not specify an electronic phenotyping methodology. Study details are summarized in Table S2.

Figure 5.

Overview of electronic phenotyping methods.

For all studies utilizing a rule-based methodology, rules were constructed based on the presence of PAD-related ICD diagnosis codes in addition to other structured data found in the individual study’s chosen database. Every study operationalized a different set of rules, but common elements included the presence of more than one PAD-related ICD diagnosis code, specification of ICD code position, additions of PAD-related procedure code data such as CPT codes, the presence of ankle–brachial index (ABI) data and/or imaging data demonstrating PAD, and visitation with a vascular expert (Table 1).

Table 1.

Rules operationalized in studies utilizing rule-based methodology.

First author and ref. no.	Definition of rules employed
Butala²¹	An inpatient endovascular or surgical revascularization in the CMS MedPAR database based on ICD-9 or ICD-10 codes with a primary discharge diagnosis of CLTI; OR an outpatient endovascular procedure in the CMS Carrier and Institutional Outpatient files using CPT codes and a CLTI diagnosis within the preceding year of the procedure.
Arya¹⁶	ICD-9 diagnosis code for PAD and any one of three criteria: two ABIs in 14 months, two visits to a vascular surgeon or clinic in 14 months, or any PAD procedure code.
Hess²²	ICD-9 discharge diagnosis code for PAD and at least one ICD-9 procedure code or CPT code for a peripheral endovascular or surgical revascularization procedure.
Zahner²³	One of the CLTI codes in the primary position, one of the CLTI codes in any position plus a PAD code in the primary position, the patient had a peripheral endovascular revascularization procedure, the patient had a surgical peripheral revascularization procedure, the patient had a revascularization NOS code plus PAD in the primary position or CLTI in any position, or the patient had a major amputation plus a PAD code in the primary position or a CLTI code in any position.
Sussman¹⁴	At least one claim indicating a diagnosis of PAD (ICD-9-CM 440.2x, 440.3x, 443.9x, 444.2x, 444.81, 445.0x) any time before or on the index date. At least one medical claim listing a CPT code for a PAD-related procedure (date of which was designated the index procedure date) between July 1, 2005 and December 31, 2008, which included bypass surgery, endarterectomy, atherectomy, noncoronary angioplasty without stent placement, noncoronary angioplasty with stent placement, and thrombectomy (if possible, only codes indicating that the procedure was conducted on a lower extremity or an artery of a lower extremity were used; however, some codes did not provide that level of specificity). No claims with a diagnosis of lupus (ICD-9-CM 710.0x) or vasculitis (ICD-9-CM 447.6x) at any time during the study period. No claims for PAD-related procedures at any time before the index date in the data window. Index procedure not based on a revision-procedure code.
Kwong¹⁷	At least two separate claims with an ICD-10 diagnosis code for CLTI recorded either in the Carrier file or in the inpatient or outpatient files.
Fanaroff¹³	ICD-9 or ICD-10 diagnosis codes consistent with CLTI OR ICD code indicating PAD AND ICD indicating lower-extremity wound, infection, or gangrene on the same inpatient or outpatient encounter. Identified patients were also required to have undergone lower-extremity arterial testing (ABI, duplex ultrasound, magnetic resonance angiography, computed tomography angiography, invasive angiography) or endovascular revascularization in the 6 months before or after the episode by which they qualified for a CLTI diagnosis.
Wiseman²⁴	Inpatients with a diagnosis for PAD using ICD-9 codes AND underwent an open or endovascular lower-extremity revascularization procedure.
Agarwal²⁵	Principal diagnosis for admission corresponded to CLTI or the principal diagnosis for admission corresponded to PAD along with secondary diagnoses of ulcers, osteomyelitis, and so on, or the patient underwent a revascularization procedure or major amputation procedure during the hospitalization.
Chase²⁶	Symptomatic PAD was defined as having evidence of intermittent claudication and/or acute limb ischemia requiring medical intervention identified by records of at least one of the following: (a) primary symptomatic PAD ICD-9-CM diagnosis of 440.2x (atherosclerosis of native arteries of the extremities), 440.4 (chronic total occlusion of artery of the extremities), 443.9 (peripheral vascular disease, unspecified), or 785.4 (gangrene) with a pharmacy claim of cilostazol or pentoxifylline within 90 days before or after diagnosis; (b) hospitalization with a discharge DRG code of 299 (peripheral vascular disorders with MCC), 300 (peripheral vascular disorders with CC), and 301 (peripheral vascular disorders without CC/MCC); (c) a primary ICD-9-CM diagnosis on the same day as a record of a CPT or ICD-9-CM procedure code for lower-extremity amputation, open (surgical procedures), endovascular (angioplasty, PRV with or without stent replacement), and other symptomatic PAD interventional procedures.
Armstrong²⁷	A minimum of two claims with an ICD-9 diagnosis of PAD. At least one of those claims had to have a diagnosis of atherosclerosis of either native vessels with rest pain, ulceration or gangrene, or bypass graft indicative of a clinical diagnosis of CLTI. In addition, patients had to have 6 months of continuous medical and pharmacy enrolment prior to diagnosis of CLTI.
Itoga¹⁸	Two or more ICD-9 codes for PAD in the inpatient or outpatient claims records ≥ 2 months apart.
Raja²⁸	Underwent femoropopliteal PVI identified by ICD-10 and CPT claims codes.
Weissler²⁹	Logistic regression – Weissler model
Pohlman³⁰	Logistic regression – Weissler model
Weissler¹⁵	Logistic regression – Weissler model
Arruda-Olson³¹	Logistic regression – Fan model
Arruda-Olson³²	Logistic regression – Fan model

ABI, ankle–brachial index; CC, complication or comorbidity; CLTI, chronic limb-threatening ischemia; CM, Clinical Modification; CMS, Centers for Medicare and Medicaid Services; CPT, Current Procedural Terminology; DRG, diagnosis-related group; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Classification of Diseases, Tenth Revision; MCC, major complications/comorbidities; NOS, not otherwise specified; PAD, peripheral artery disease; PRV, peripheral revascularization; PVI, peripheral vascular intervention.
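As an illustration of how one of the tabulated rules might be operationalized, the sketch below encodes the structure of the Arya rule in Table 1: a PAD diagnosis code plus any one of three supporting criteria. It is a simplified reading of the table entry, not the authors' implementation. The function names are hypothetical, and approximating "14 months" as 426 days is an assumption.

```python
from datetime import date

# Assumption: "14 months" approximated as 426 days (about 14 x 30.4 days).
WINDOW_DAYS = 426


def two_within(dates: list[date], max_days: int) -> bool:
    """True if any two events fall within max_days of each other.

    After sorting, the closest pair is always adjacent, so checking
    consecutive events is sufficient.
    """
    ordered = sorted(dates)
    return any((b - a).days <= max_days for a, b in zip(ordered, ordered[1:]))


def arya_style_rule(
    has_pad_dx: bool,
    abi_dates: list[date],
    vascular_visit_dates: list[date],
    has_pad_procedure: bool,
) -> bool:
    """PAD diagnosis code AND any one of: two ABIs within the window,
    two vascular visits within the window, or any PAD procedure code."""
    if not has_pad_dx:
        return False
    return (
        two_within(abi_dates, WINDOW_DAYS)
        or two_within(vascular_visit_dates, WINDOW_DAYS)
        or has_pad_procedure
    )
```

Separating the window check (`two_within`) from the rule itself keeps the same helper reusable for other temporal criteria in the table.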

Efforts to validate the selected electronic phenotyping methods varied considerably, with most studies not undertaking validation of their chosen approach. Fanaroff et al. conducted a study using a single-code approach to identify patients who had undergone PAD-related major or minor amputations.¹³ They then performed a sensitivity analysis, applying a rule-based methodology to define a ‘stricter’ cohort of PAD-related major amputations, and found no significant difference between this ‘stricter’ cohort and their primary cohort. Sussman et al. operationalized a rule-based phenotyping method to identify patients with PAD, then validated this cohort with a manual chart adjudication of EHRs.¹⁴ Eighteen of the studies referenced another source as the basis for their approach to selecting ICD codes in a single-code method or establishing rules in a rule-based method; however, these studies did not explicitly state whether the referenced method had been validated. Two studies utilizing a rule-based method explicitly noted that their chosen rule-based algorithm had been adapted from a previously published and validated algorithm. The five studies utilizing a logistic regression utilized models that had been previously published and validated (Table 2).

Table 2.

Studies addressing construct validity of patient identification methods.

First author and ref. no.	Data source	EHR / ACD / Both	ICD system	PAD ascertainment	Validation effort
Karim³³	HCUP NIS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Altin³⁴	HCUP NIS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Chaturvedi³⁵	HCUP NASS	ACD	ICD-10	Single code	Cited ICD codes previously used in literature; validation not explicit
Ochoa Chaar³⁶	HCUP NRD	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Bidare³⁷	CMS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Butala³⁸	CMS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Majmundar³⁹	HCUP NRD	ACD	ICD-10	Single code	Cited ICD codes previously used in literature; validation not explicit
Medhekar⁴⁰	SPARCS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Khoury⁴¹	HCUP NRD	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Secemsky⁴²	CMS	ACD	ICD-10	Single code	Cited ICD codes previously used in literature; validation not explicit
Bali⁴³	Commercial claims data and CMS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Goodney⁴⁴	CMS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Doshi⁴⁵	HCUP NIS	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Duval⁴⁶	Blue Cross and Blue Shield Minnesota	ACD	ICD-9	Single code	Cited ICD codes previously used in literature; validation not explicit
Zahner²³	HCUP NIS	ACD	ICD-9	Rule based	Cited ICD codes previously used in literature; validation not explicit
Wiseman²⁴	CMS	ACD	ICD-9	Rule based	Cited ICD codes previously used in literature; validation not explicit
Raja²⁸	CMS	ACD	ICD-10	Rule based	Cited ICD codes previously used in literature; validation not explicit
Kwong¹⁷	CMS	ACD	ICD-10	Rule based	Cited ICD codes previously used in literature; validation not explicit
Fanaroff¹³	CMS	ACD	Both	Rule based	Adapted from a validated algorithm
Arya¹⁶	Veterans Health Administration	EHR/ACD	ICD-9	Rule based	Adapted from a validated algorithm
Arruda-Olson³¹	REP	EHR/ACD	ICD-9	Rule based	Validated regression model
Weissler²⁹	DUHS	EHR	Both	Rule based	Validated regression model
Arruda-Olson³²	REP	EHR/ACD	ICD-9	Rule based	Validated regression model
Weissler¹⁵	DEDUCE	EHR	Both	Rule based	Validated regression model
Pohlman³⁰	DUHS	EHR	Both	Rule based	Validated regression model
Fanaroff⁴⁷	CMS	ACD	Both	Single code	A sensitivity analysis of a ‘stricter’ cohort (major lower-extremity amputation plus a diagnosis of PAD) compared to the primary cohort
Sussman¹⁴	Fallon Community Health Plan, Reliant Medical Group	EHR/ACD	ICD-9	Rule based	Random selection of 300 patients underwent manual chart adjudication to confirm PAD diagnosis

ACD, administrative claims databases; CMS, Centers for Medicare and Medicaid Services; DEDUCE, Duke Enterprise Data Unified Content Explorer; DUHS, Duke University Health System; EHR, electronic health records; HCUP, Healthcare Cost and Utilization Project; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Classification of Diseases, Tenth Revision; NASS, Nationwide Ambulatory Surgery Sample; NIS, Nationwide Inpatient Sample; NRD, Nationwide Readmissions Database; PAD, peripheral artery disease; REP, Rochester Epidemiology Project; SPARCS, Statewide Planning and Research Cooperative System.

Examining the studies utilizing a logistic regression model as their rule-based method, two models served as the basis for all five studies: a model published by Fan et al.⁶ and a model published by Weissler et al.¹⁵ Modified STARD criteria were utilized to evaluate the original model construction studies for these two models, with the results detailed in Table S3. Both models utilized retrospective patient data extracted from large institutional databases with clearly defined inclusion and exclusion criteria for patients with PAD included in model-building cohorts. The Fan model utilized ABI measurements as the test reference standard, and the Weissler model utilized ABI measurements, a history of prior revascularization, or evidence of lower-extremity amputation for an indication of PAD as the test reference standard. Both models clearly reported standard performance metrics of diagnostic accuracy, including area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value, and negative predictive value. In summary, both models closely adhered to the STARD criteria.
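For readers less familiar with the diagnostic accuracy metrics named above, the following Python sketch computes sensitivity, specificity, positive predictive value, and negative predictive value from a 2 × 2 comparison of a phenotyping algorithm against a reference standard. The counts in the usage line are hypothetical and are not drawn from the Fan or Weissler studies.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard accuracy metrics from true/false positive/negative counts.

    tp, fn: reference-standard PAD cases flagged / missed by the algorithm.
    fp, tn: reference-standard non-cases flagged / not flagged.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # cases correctly identified
        "specificity": _ratio(tn, tn + fp),  # non-cases correctly excluded
        "ppv": _ratio(tp, tp + fp),          # flagged patients who have PAD
        "npv": _ratio(tn, tn + fn),          # unflagged patients without PAD
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_accuracy(tp=80, fp=10, fn=20, tn=90)
```

Note that the predictive values depend on disease prevalence in the validation cohort, so they may not transfer between, for example, a vascular laboratory referral population and a general claims population.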

Discussion

This study provides an overview of electronic phenotyping methods used to identify patients with PAD in contemporary observational research. Most studies (69%) utilized both ICD-9 and ICD-10 codes in their electronic phenotyping methods, with fewer studies utilizing ICD-9 (19%) or ICD-10 (11.9%) coding systems in isolation. This distribution is expected, as our study population purposefully spanned the ICD-9 to ICD-10 transition, and datasets utilized in observational studies often predate the time of publication, contributing to a data lag.

Our study confirms that most contemporary PAD observational studies utilize single-code queries to identify cohorts of patients with PAD. Several quality issues arise with this observation. First, cohorts identified using single-code queries have been shown to have a low probability of true occurrence of the disease of interest. Additionally, PAD codes specifically have been shown to have poor diagnostic accuracy, with the sensitivities of individual codes as low as 0.85% and with the highest performing codes reaching sensitivities of only 30.5%.⁵ Lastly, our study demonstrates there was little consensus between studies regarding the specific PAD codes utilized to identify PAD cohorts.

The second most common electronic phenotyping strategy employed in our review was rule-based methods that combined ICD codes with other structured health data, with 18 studies utilizing this method. Evidence suggests rule-based methodologies offer improved diagnostic accuracy over single-code queries for PAD phenotyping.¹⁰ We demonstrate that each study operationalized a different set of rules for PAD-phenotyping; however, there were similarities between the elements of inclusion. Several studies augmented ICD diagnosis codes with procedural codes, as well as clinical data such as ABIs, imaging, and/or consultations with vascular specialists. For example, Arya et al.¹⁶ established a rule requiring the presence of ICD diagnosis codes for PAD along with any one of three criteria: two ABI measurements within 14 months, two visits to a vascular surgeon or clinic within 14 months, or any PAD procedure code. Kwong et al.¹⁷ required at least two ICD diagnosis codes for PAD, whereas Itoga et al.¹⁸ added a temporal element, requiring two ICD diagnosis codes spaced at least 2 months apart.

Several considerations arise regarding utilizing a rule-based methodology for PAD phenotyping. First, with rule-based approaches, similar issues arise as with single code-based queries, as there appears to be little consensus among studies regarding the specific ICD codes used to identify PAD. Though researchers have begun work to validate ICD diagnosis codes for other disease pathologies such as pulmonary embolism, this work is less robust for PAD.¹⁹ Additionally, it is important to note that requisite health data for the construction of robust rules differs between EHRs and ACDs. Though granular clinical data, such as ABI testing, radiology reports, and clinic notes from vascular specialists, can be extracted from an EHR, ACDs primarily contain structured data and rely on corresponding billing claims to indicate abnormal ABIs, imaging findings, or visits to a vascular specialist. To this end, robust rule-based phenotyping methodologies are more feasible in EHRs compared to ACDs.

On the other hand, five studies in our review utilized validated regression models as a rule-based phenotyping method, with evidence that regression modeling as an alternative to code-based queries has potential in PAD phenotyping. All studies were based on two models: a model published by Fan et al. and a model published by Weissler et al. The Fan model uses a total of 13 ICD-9 and CPT codes as model covariates to identify patients with PAD in administrative databases. The model performs with high accuracy in identifying PAD in patients who had been referred for vascular laboratory evaluation (sensitivity 85.5%, specificity 82.6%) compared to standard ABI testing. The Weissler group constructed a model-based algorithm utilizing ICD-9 codes, ICD-10 codes, and various administrative flags identifying PAD-related encounters, revascularization procedures, and relevant imaging.¹⁵ At a classification threshold of 45% probability of PAD phenotype, the regression model performed with a sensitivity of 75.3% and specificity of 81.7%. However, both models utilize patient cohorts from before or during the 2015 to 2016 ICD-9 to ICD-10 transition, and this timing may limit applicability to more contemporary cohorts given natural challenges in implementation and adoption of the new system during this transition period.²⁰
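The role of a classification threshold, such as the 45% cut-off reported for the Weissler model, can be illustrated with a generic logistic regression sketch in Python. The covariate names, intercept, and coefficients below are hypothetical placeholders chosen for illustration; they are not the published Fan or Weissler model parameters.

```python
import math

# Hypothetical parameters for illustration only; these are NOT the
# published Fan or Weissler coefficients.
INTERCEPT = -2.0
COEFFICIENTS = {
    "pad_dx_code_count": 0.8,       # number of PAD diagnosis codes
    "revascularization_flag": 1.5,  # 1 if a revascularization claim exists
    "pad_imaging_flag": 0.7,        # 1 if PAD-relevant imaging was billed
}
THRESHOLD = 0.45  # classification threshold on predicted probability


def pad_probability(features: dict[str, float]) -> float:
    """Predicted probability of the PAD phenotype from a logistic model."""
    z = INTERCEPT + sum(
        coef * features.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def classify(features: dict[str, float], threshold: float = THRESHOLD) -> bool:
    """Label a patient as PAD-positive when probability meets the threshold."""
    return pad_probability(features) >= threshold
```

Lowering the threshold would generally raise sensitivity at the expense of specificity, which is why reported performance figures are tied to a stated threshold.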

As demonstrated in our results, few studies attempted to validate their chosen electronic phenotyping method outside of citation that their chosen method had been previously operationalized in the literature. One clear benefit of logistic regression models is that model validation is often an integral component to model construction, which confers increased reliability in utilizing these methods. Additionally, machine learning methods can integrate multiple data sources and identify patterns, which may improve the reliability of detecting PAD within ACDs where EHR data are not readily available. On the other hand, regression models can be time intensive and complicated to construct. Despite internal validity, these models may not hold up to external validation when using outside data sources.

Limitations

The study has limitations that should be addressed. It is possible that our search did not capture all qualifying articles in the literature. However, our search was conducted in close consultation with librarians with expertise in conducting review searches, and we believe our search is a broad representation of available studies. Additionally, the study search was likely narrowed by focusing on isolated PAD cohorts instead of including studies with multiple comorbidities. However, the goal of the current study was to provide a more specific understanding of electronic phenotyping methods used to identify PAD, so focusing on PAD created a more homogeneous cohort aimed at minimizing confounding with other comorbidities. Additionally, though our review conducts a quality assessment of the studies of diagnostic accuracy using the STARD criteria, we did not evaluate study quality for all the reviewed articles beyond identification of electronic phenotyping methods and construct validation efforts, as this was not the primary study goal.

Recommendations

To improve the quality of PAD observational research, the authors recommend explicit discussion of the validity of the electronic phenotyping method used to identify patients within studies and citation of prior studies that have validated the same chosen method. Given the poor diagnostic accuracy detailed in the literature and the availability of viable alternatives, the authors recommend against utilization of single-code-based queries to identify patients with PAD. Rule-based methods are better supported in the literature; however, these methods may be best suited for researchers utilizing EHR data as opposed to administrative claims data, as the most robust rules seem to utilize patient-level clinical data. When utilizing administrative claims data, it is important to recognize the limitations in being able to validate a PAD cohort. To temper this limitation, researchers can consider techniques to improve specificity, such as utilizing multiple diagnosis codes instead of a single-code query, adding a temporal element (e.g., two codes at least 60 days apart), or combining diagnosis codes with procedural codes.
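The claims-based techniques listed above can be sketched in Python as follows. This is a minimal illustration rather than a validated algorithm: the function names are hypothetical, and combining the temporal criterion and the procedure criterion with a logical OR is an assumption made for the example, not a recommendation from the reviewed literature.

```python
from datetime import date

MIN_GAP_DAYS = 60  # temporal element: two codes at least 60 days apart


def has_two_codes_apart(dx_dates: list[date], min_gap_days: int = MIN_GAP_DAYS) -> bool:
    """True if at least two PAD diagnosis claims are >= min_gap_days apart.

    The earliest and latest claims form the widest-spaced pair, so it is
    enough to compare those two dates.
    """
    if len(dx_dates) < 2:
        return False
    return (max(dx_dates) - min(dx_dates)).days >= min_gap_days


def claims_phenotype(dx_dates: list[date], has_pad_procedure: bool) -> bool:
    """Assumed combination: positive if two PAD diagnosis claims are
    sufficiently separated in time, OR if any PAD diagnosis claim is
    paired with a PAD-related procedure code."""
    dx_plus_procedure = len(dx_dates) >= 1 and has_pad_procedure
    return has_two_codes_apart(dx_dates) or dx_plus_procedure
```

A patient with a single PAD diagnosis claim and no procedure code is negative under this sketch, which is the case a single-code query would have counted as PAD.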

Furthermore, regression models may be one solution to the outlined informatics challenge, though more contemporary models accounting for the predominance of current ICD-10 coding practices may be warranted. To this end, the authors propose the construction of a contemporary algorithm to identify patients with PAD from administrative databases. Given the challenges of identifying PAD in administrative databases that lack the comprehensive clinical data found in EHRs, the authors propose utilizing claims data linked to individual-level EHR data for model construction and validation. To our knowledge, this technique has not previously been utilized in algorithm construction, and it would allow direct comparison of model performance utilizing administrative claims data against diagnostic standards such as noninvasive vascular studies (ABIs, toe–brachial indices, pulse volume recordings) and cross-sectional imaging.

Conclusions

Robust PAD research requires the utilization of a wide variety of data sources. However, the current study demonstrates high utilization of unreliable electronic phenotyping methods such as single-code-based queries, which may undermine the validity of research studies that rely on these approaches. For epidemiological or observational comparative effectiveness studies to have a meaningful clinical impact, their electronic phenotyping approaches should be validated and reported transparently.

Supplemental Material

Supplemental material, sj-docx-1-vmj-10.1177_1358863X251328671 for A scoping review of electronic phenotyping methodologies used to identify peripheral artery disease in observational studies by Abena Appah-Sampong, Ascharya Balaji, Jack H Casey, Navya Kotturu, Danielle Montano, Mohit Manchella, Bassil Bacare, James J Fitzgibbon, Patrick Heindel, Tanujit Dey, Behnood Bikdeli and Mohamad A Hussain in Vascular Medicine

Footnotes

Declaration of conflicting interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Dr Bikdeli is supported by a Career Development Award from the American Heart Association and VIVA Physicians (#938814). Dr Bikdeli was supported by the Scott Schoen and Nancy Adams IGNITE Award and is supported by the Mary Ann Tynan Research Scientist award from the Mary Horrigan Connors Center for Women’s Health and Gender Biology at Brigham and Women’s Hospital, and the Heart and Vascular Center Junior Faculty Award from Brigham and Women’s Hospital. Dr Bikdeli reports that he is a member of the Medical Advisory Board for the North American Thrombosis Forum, and serves on the Data Safety and Monitoring Board of the NAIL-IT trial funded by the National Heart, Lung, and Blood Institute, and Translational Sciences. Dr Hussain is supported by a Brigham and Women’s Hospital Heart and Vascular Center Faculty Award and Brigham and Women’s Osteen Award. Dr Hussain is a consultant for Humacyte, Inc. Dr Hussain reports research funding from Vascular Therapies (site principal investigator [PI] of ACCESS-2 Trial), Humacyte, Inc. (site PI of V012 Trial), and VenoStent (site PI of SAVE-FistulaS Trial). The remaining authors have no conflicting interests.

Funding

This work was supported by the American Heart Association Research Supplement to Promote Diversity ‘Validation of the MAGNIFY-PAD Tool Identify Peripheral Artery Disease in Electronic Health Databases’ (grant no. 23DIVSUP1069428).

ORCID iDs

Abena Appah-Sampong

Behnood Bikdeli

Mohamad A Hussain

Supplemental material

Supplemental material for this article is available online.
