Abstract
Background:
Carotid artery stenting (CAS) carries important perioperative risks. Outcome prediction tools may help guide clinical decision-making but remain limited. We developed machine learning (ML) algorithms that predict 30-day outcomes following transfemoral CAS.
Methods:
The National Surgical Quality Improvement Program (NSQIP) targeted vascular database was used to identify patients who underwent transfemoral CAS between 2011 and 2021. Input features included 36 preoperative demographic/clinical variables. The primary outcome was a 30-day major adverse cardiovascular event (MACE; composite of stroke, myocardial infarction [MI], or death). The secondary outcomes were 30-day stroke, MI, death, carotid-related morbidity, other morbidity, non-home discharge, and unplanned readmission. Our data were split into training (70%) and test (30%) sets. Using 10-fold cross-validation, we trained six ML models using preoperative features with logistic regression as the baseline comparator. The primary model evaluation metric was area under the receiver operating characteristic curve (AUROC). Model robustness was evaluated with calibration plot and Brier score. Variable importance scores were calculated to determine the top 10 predictive features. Performance was assessed on subgroups based on age, sex, race, ethnicity, symptom status, stent type, and urgency.
Results:
Overall, 2093 patients underwent CAS during the study period. Thirty-day MACE occurred in 130 (6.2%) patients. The best-performing prediction model for 30-day MACE was XGBoost, achieving an AUROC (95% CI) of 0.93 (0.92–0.94). In comparison, logistic regression had an AUROC (95% CI) of 0.67 (0.65–0.68), and existing tools in the literature demonstrate AUROCs ranging from 0.58 to 0.74. For secondary outcomes, XGBoost achieved AUROCs between 0.86 and 0.97. The calibration plot showed good agreement between predicted and observed event probabilities with a Brier score of 0.02. The top three predictive features in our algorithm were (1) symptomatic carotid stenosis, (2) age, and (3) American Society of Anesthesiologists classification. Model performance remained robust on all subgroup analyses of specific demographic and clinical populations.
Conclusions:
Our ML models accurately predict 30-day outcomes following transfemoral CAS using preoperative data. They have the potential for important utility in guiding risk-mitigation strategies for patients being considered for CAS to improve outcomes.
Clinical Impact
Transfemoral carotid artery stenting (CAS) carries important perioperative risks. Outcome prediction tools may help guide clinical decision-making but remain limited. Using data from the National Surgical Quality Improvement Program (NSQIP) targeted vascular database, we developed machine learning (ML) models that accurately predict 30-day outcomes following transfemoral CAS using preoperative data, outperforming logistic regression and existing tools in the literature. The models were well-calibrated and remained robust across demographic and clinical subpopulations. These ML algorithms have the potential for important utility in guiding risk-mitigation strategies for patients being considered for transfemoral CAS to improve outcomes.
Introduction
Carotid artery stenosis accounts for approximately one-third of global ischemic strokes and significantly impacts morbidity and mortality. 1 Traditionally, moderate-severe carotid artery stenosis has been treated with surgical carotid endarterectomy (CEA). 2 In recent decades, transfemoral carotid artery stenting (CAS) has emerged as a less invasive alternative. 3 The Carotid Revascularization Endarterectomy versus Stenting Trial (CREST) found no significant difference in the primary outcome of stroke, myocardial infarction, or death between CAS and CEA. 4 However, CAS was associated with a higher risk of stroke, while CEA was associated with a higher risk of myocardial infarction. 4 Despite this, CAS procedures have increased by 72% over the past decade based on population-level data. 5 Although considered minimally invasive, transfemoral CAS carries significant perioperative risks, including a 30-day major adverse cardiovascular event (MACE) rate exceeding 9% in high-risk patients. High risk, as defined by the Centers for Medicare and Medicaid Services (CMS), includes age ≥80 years, New York Heart Association congestive heart failure (CHF) class III/IV, left ventricular ejection fraction <30%, unstable angina within 30 days prior to intervention, myocardial infarction (MI) within 30 days prior to intervention, restenosis, previous radical neck dissection, contralateral carotid occlusion, prior neck radiation, contralateral laryngeal nerve injury/palsy, or high anatomic lesion. 6 The Society for Vascular Surgery (SVS) recommends CEA for low-risk patients 7 and reserves CAS for those with high-risk anatomical or physiological features. 8 Therefore, accurate risk assessment is crucial for guiding clinical decisions.
Currently, there are no widely adopted tools available to predict adverse events following CAS. A systematic review analyzing 37 studies that assessed outcome prediction models for carotid revascularization highlighted significant methodological shortcomings, incomplete reporting, and insufficient predictive accuracy. 9 For instance, many existing models do not report how missing data were managed and/or demonstrate suboptimal discriminatory ability, with area under the receiver operating characteristic curve (AUROC) values ranging from 0.58 to 0.74. 9 Moreover, these tools rely on traditional modeling techniques that necessitate manual input of clinical variables, which limits their practical use in busy clinical environments. 10 Notably, the SVS Vascular Quality Initiative (VQI) Cardiac Risk Index (CRI) offers risk assessments for CEA but does not include CAS. 11 Consequently, there is a pressing need to develop more effective risk prediction tools specifically tailored for patients undergoing CAS.
Machine learning (ML) is advancing rapidly, enabling computers to learn from data and accurately predict outcomes. 12 Through sophisticated analytics, ML can model complex relationships between inputs (e.g., patient characteristics) and outputs (e.g., clinical outcomes), driven by the vast availability of electronic data and enhanced computational capabilities. 12 ML techniques excel over traditional statistical methods in capturing intricate, multicollinear relationships among variables and outcomes in health care data.13,14 Previously, ML has successfully leveraged the American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database to develop algorithms predicting peri-operative complications. 15 Using data from more than 2900 procedures, the authors achieved AUROC values ranging from 0.85 to 0.88. 15 Given the heterogeneity of this cohort, there is potential to enhance predictive accuracy by tailoring ML algorithms specifically for patients undergoing CAS. In this study, we applied ML to the ACS NSQIP database to predict 30-day MACE following transfemoral CAS using preoperative variables.
Materials and Methods
Design
We conducted a multicenter retrospective ML-based prognostic study reported according to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD + AI) statement. 16
Dataset
The ACS NSQIP database contains demographic, clinical, and 30-day outcomes data on surgical patients across over 700 hospitals in approximately 15 countries worldwide. 17 The information is prospectively collected from electronic health records by trained clinical reviewers and regularly audited by ACS for accuracy. 18 Targeted NSQIP registries for vascular operations contain additional procedure-specific variables and outcomes. 19 This study was exempt from institutional ethics board review and informed consent was not required as the data came from a large, deidentified registry.
Cohort
All patients who underwent transfemoral CAS between 2011 and 2021 in the ACS NSQIP targeted CAS database were included. This information was merged with the main ACS NSQIP database using unique case identification numbers to obtain a complete set of generic and procedure-specific variables and outcomes. 20 Patients with unreported presenting symptom status or stent type, and those treated for carotid aneurysm, dissection, or malignancy, were excluded.
Features
Thirty-six preoperative variables were used as input features for the ML models, as determined based on the availability of the variables in the NSQIP database and their important impact on CAS outcomes based on the literature. 21–23 Given the unique advantage of ML techniques in handling a large number of input features, all available preoperative variables in the NSQIP database were used to maximize predictive performance. There were no significant confounding effects between variables. Demographic variables included age, sex, body mass index, race, ethnicity, and origin status. Comorbidities included hypertension, diabetes, smoking status, CHF, chronic obstructive pulmonary disease (COPD), end-stage renal disease requiring dialysis, functional status, and physiologic high-risk factor [defined as at least one of (1) CHF class III/IV, (2) left ventricular ejection fraction <30%, (3) unstable angina within 30 days prior to intervention, or (4) MI within 30 days prior to intervention]. Medications included antiplatelets, statins, and beta-blockers. Preoperative laboratory investigations included serum sodium, blood urea nitrogen (BUN), serum creatinine, albumin, white blood cell count, hematocrit, platelet count, international normalized ratio (INR), and partial thromboplastin time (PTT). Anatomic characteristics included ipsilateral/contralateral carotid stenosis percentage and anatomic high-risk factor [defined as at least one of (1) previous ipsilateral carotid endarterectomy or stent, (2) previous radical neck dissection, (3) contralateral carotid occlusion, (4) prior neck radiation, (5) contralateral laryngeal nerve injury/palsy, or (6) high anatomic lesion (cervical vertebrae 2 or higher)].
Other pre-procedural characteristics recorded were symptom status [asymptomatic or symptomatic (history of stroke, transient ischemic attack [TIA], or amaurosis fugax within 180 days prior to CAS)], stent type (single straight, single straight with cerebral protection device [CPD], single tapered, single tapered with CPD, multiple stents, or multiple stents with CPD), urgency [elective, urgent, or emergent], American Society of Anesthesiologists (ASA) classification, and specialty of the primary physician performing the procedure. A complete list of features and definitions can be found in Supplemental Table 1.
Outcomes
The primary outcome was a 30-day MACE, defined as a composite of stroke, MI, or death. Stroke was defined as motor, sensory, or cognitive dysfunction that persists for 24 hours in the setting of a suspected ischemic or hemorrhagic stroke in the ipsilateral or contralateral cerebral hemisphere. MI was defined as electrocardiogram changes indicative of acute MI (ST elevation >1 mm in two or more contiguous leads, new left bundle branch block, or new q-wave in two or more contiguous leads), new elevation in troponin greater than 3 times the regular upper level of the reference range in the setting of suspected myocardial ischemia, or physician/advanced provider diagnosis of MI. Death was defined as all-cause mortality. This composite outcome was chosen because it is frequently reported as a primary outcome in landmark clinical trials including CREST.4,24
Secondary outcomes were 30-day stroke, MI, death, carotid-related morbidity, other morbidity, non-home discharge, and unplanned readmission. Carotid-related morbidity was defined as a composite of distal embolization causing ipsilateral cerebral infarcts demonstrated on Doppler ultrasound, computed tomography angiography (CTA), magnetic resonance angiography (MRA), or angiogram, acute occlusion or thrombosis of the ipsilateral carotid artery demonstrated on Doppler ultrasound, CTA, MRA, or angiogram, TIA (neurologic dysfunction lasting <24 hours without evidence of cerebral infarction), puncture site bleeding/pseudoaneurysm or embolization of arterial closure device, re-stenosis >50% on postoperative Doppler ultrasound, CTA, MRA, or angiogram, or repeat carotid revascularization (endarterectomy or stent). Other morbidity was defined as a composite of surgical site infection (SSI), pneumonia, unplanned reintubation, pulmonary embolism (PE), failure to wean from ventilator (cumulative time of ventilator-assisted respirations >48 hours), acute kidney injury (AKI; a rise in creatinine of >2 mg/dL from preoperative value or requirement of dialysis in a patient who did not require dialysis preoperatively), urinary tract infection (UTI), cardiac arrest, bleeding requiring blood transfusion within 72 hours of intervention, deep vein thrombosis (DVT) requiring therapy, Clostridium difficile infection, sepsis, or septic shock. Non-home discharge was defined as discharge to rehabilitation, skilled care, or other facility.
Model Development
Six ML models were trained to predict 30-day primary and secondary outcomes following CAS: Extreme Gradient Boosting (XGBoost); random forest; Naïve Bayes classifier; radial basis function (RBF) support vector machine (SVM); single-layer perceptron artificial neural network (ANN) with a single hidden layer, sigmoid activation function, and cross-entropy loss function; and logistic regression. These were selected because they have demonstrated the best performance for predicting surgical outcomes in prior work. 25–27 Logistic regression was the baseline comparator to assess relative model performance because it is the most common modeling technique used in traditional risk predictors. 28
Our data were randomly split into training (70%) and test (30%) sets. 29 Unique patient identification numbers were used to ensure that the training and testing populations were separated. Ten-fold cross-validation and grid search were performed on the training set to find optimal model hyperparameters. 30,31 Preliminary analysis of our data demonstrated that the primary outcome was uncommon, occurring in 130/2093 (6.2%) patients. To improve class balance, Random Over-Sampling Examples (ROSE) was applied to the training data. 32 ROSE employs smoothed bootstrapping to draw new samples from the feature space around the minority class and is a commonly used method to support predictive modeling of rare events. 32 The models were then evaluated on test set data and ranked based on the primary discriminatory metric of AUROC. The best-performing model was XGBoost, which had the following optimized hyperparameters: number of rounds = 200, maximum tree depth = 3, learning rate = 0.3, gamma = 0, column sample by tree = 0.6, minimum child weight = 1, subsample = 1. The process for selecting these hyperparameters through grid search and cross-validation is detailed in Supplemental Table 2. Once the best-performing model for the primary outcome was identified, we trained the same model to predict secondary outcomes. Parameter space was optimized for XGBoost given previous literature demonstrating the superiority of this model for using structured data to predict binary outcomes. 33–35
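The order of operations described above (split by patient first, then rebalance only the training data) can be sketched as follows. This is a minimal Python illustration and not the study's R code: the function names, the fixed seed, and the noise scale `h` are assumptions, and the oversampling step is a simplified stand-in for the kernel-based resampling that ROSE performs.

```python
import random

def split_by_patient(patient_ids, train_frac=0.7, seed=0):
    """Randomly assign unique patient IDs to disjoint training and test sets."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = int(round(train_frac * len(ids)))
    return set(ids[:cut]), set(ids[cut:])

def smoothed_oversample(rows, labels, h=0.1, seed=0):
    """Simplified, ROSE-like oversampling (assumes numeric features, 0/1 labels).

    Draws minority-class (label 1) rows with replacement and adds Gaussian
    noise with standard deviation h to each feature until both classes have
    the same number of rows. The input lists are not modified.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == 1]
    n_majority = len(labels) - len(minority)
    new_rows, new_labels = list(rows), list(labels)
    if not minority:
        return new_rows, new_labels  # nothing to resample from
    for _ in range(max(0, n_majority - len(minority))):
        base = rng.choice(minority)
        new_rows.append([x + rng.gauss(0.0, h) for x in base])
        new_labels.append(1)
    return new_rows, new_labels

# Toy usage: 100 hypothetical patient IDs, then a tiny training set
# with 4 non-events and 1 event (one numeric feature per row).
train_ids, test_ids = split_by_patient(range(100))
toy_rows = [[60.0], [65.0], [70.0], [75.0], [80.0]]
toy_labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = smoothed_oversample(toy_rows, toy_labels)
```

Applying the resampling only to the training rows keeps the test set at its observed event rate, which matches the description of ROSE being applied to training data.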
Statistical Analysis
Baseline demographic and clinical characteristics for patients with versus without 30-day MACE were summarized as mean (standard deviation) or number (proportion). Differences between groups were assessed using independent t-test (continuous variables) or chi-square test (categorical variables). Statistical significance was adjusted using Bonferroni correction to account for multiple comparisons. The p-values for categorical variables with multiple categories, such as race, were determined based on multi-cell chi-square tests.
The primary metric for assessing model performance was AUROC (95% CI), a validated discriminatory metric that considers both sensitivity and specificity. 36 Secondary performance metrics were accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Confidence intervals for AUROC, accuracy, sensitivity, specificity, PPV, and NPV were calculated using Newcombe’s Wald method. 37,38 To further assess model performance, we plotted a calibration curve and calculated the Brier score, a measurement of the agreement between predicted and observed event probabilities. 39 In the final model, feature importance was determined by ranking the top 10 predictors based on the variable importance score (gain), a measure of the relative impact of individual covariates in contributing to an overall prediction. 40 Feature importance was determined separately for the overall cohort, symptomatic patients, and asymptomatic patients. This was done to understand the distinction between features that may be more pronounced in one group versus the other, as factors that influence CAS outcomes may be different in symptomatic versus asymptomatic patients. To assess model robustness across demographic/clinical subpopulations, we performed a subgroup analysis of predictive performance based on age, sex, race, ethnicity, symptom status, stent type with or without CPD, and urgency.
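For readers less familiar with these metrics, the sketch below shows one way to compute AUROC, the Brier score, and the threshold-based metrics from predicted probabilities. It is an illustrative Python example using only the standard library; the function names, the 0.5 threshold, and the toy data are assumptions and do not reproduce the study's pROC-based analysis or its confidence intervals.

```python
def auroc(y_true, y_prob):
    """AUROC as the probability that a random event is scored above a
    random non-event, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")

def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a given cut-off."""
    pairs = [(y, 1 if p >= threshold else 0) for y, p in zip(y_true, y_prob)]
    tp = sum(1 for y, yhat in pairs if y == 1 and yhat == 1)
    tn = sum(1 for y, yhat in pairs if y == 0 and yhat == 0)
    fp = sum(1 for y, yhat in pairs if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in pairs if y == 1 and yhat == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }

# Toy data: two non-events and two events with hypothetical probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_metrics = threshold_metrics(y_true, y_prob)
```

AUROC and the Brier score use the predicted probabilities directly, whereas accuracy, sensitivity, specificity, PPV, and NPV depend on the chosen probability cut-off.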
Based on a validated sample size calculator for clinical prediction models, to achieve a minimum AUROC of 0.9 with an outcome rate of ~6% and 36 input features, the minimum sample size required is 2,006 patients with 121 events.41,42 Our cohort of 2093 patients with 130 primary events satisfied this sample size requirement. The variables of interest displayed <5% missing data; hence, complete-case analysis was applied whereby only non-missing covariates for each patient were considered. 43 This is a valid analytical method for datasets with minimal missing data (<5%) and reflects predictive modeling of real-world data, which inherently includes missing information.44,45 For ML algorithms that require an input value for missing data (e.g., neural network and logistic regression), the mean value of the variable was used, which has no significant impact on model performance for small amounts of missing data. 46 All analyses were performed in R version 4.3.0 47 with the following packages: caret, 48 xgboost, 49 ranger, 50 naivebayes, 51 e1071, 52 nnet, 53 and pROC. 54
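The mean-value substitution mentioned above for models that cannot accept missing inputs can be sketched as below. This is a hypothetical Python illustration, not the study's R code; the function name and the use of `None` to mark a missing value are assumptions.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the
    same column. Returns the filled rows and the column means used."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values has no mean; leave it as None.
        means.append(sum(observed) / len(observed) if observed else None)
    filled = [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
              for r in rows]
    return filled, means

# Toy usage: two numeric features with one missing value each.
toy_rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
filled_rows, column_means = impute_column_means(toy_rows)
```

One design option, which the text does not specify, is to compute the column means on the training rows only and reuse them for the test rows so that the two sets stay separated.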
Results
Patients and Events
From an initial cohort of 2212 patients who underwent transfemoral CAS in the NSQIP targeted vascular database between 2011 and 2021, we excluded 119 patients for the following reasons: undocumented symptom status (n=59) or stent type (n=43) and treatment for carotid aneurysm (n=5), carotid dissection (n=2), or malignancy (n=10). Overall, we included 2093 patients. The primary outcome of 30-day MACE occurred in 130 (6.2%) patients. The 30-day secondary outcomes occurred in the following distribution: 65 (3.1%) patients had a stroke, 49 (2.3%) patients had a MI, 36 (1.7%) patients died, 171 (8.2%) patients had a carotid-related morbidity (distal embolization [n=24], acute occlusion or thrombosis [n=22], TIA [n=23], puncture site bleeding/pseudoaneurysm or embolization of arterial closure device [n=31], restenosis [n=18], repeat carotid revascularization [n=83]), 145 (6.9%) patients had other morbidity (SSI [n=5], pneumonia [n=30], unplanned reintubation [n=37], PE [n=3], failure to wean from ventilator [n=20], AKI [n=14], UTI [n=20], cardiac arrest [n=6], bleeding requiring transfusion [n=61], DVT [n=6], Clostridium difficile infection [n=4], sepsis [n=6], septic shock [n=9]), 222 (10.6%) patients had a non-home discharge, and 162 (7.7%) patients had an unplanned readmission.
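The cohort arithmetic reported above can be checked with a few lines of Python. The counts are taken directly from the text; the variable names are illustrative.

```python
# Cohort flow: initial cohort minus documented exclusions.
initial_cohort = 2212
exclusions = {
    "undocumented symptom status": 59,
    "undocumented stent type": 43,
    "carotid aneurysm": 5,
    "carotid dissection": 2,
    "malignancy": 10,
}
excluded = sum(exclusions.values())
included = initial_cohort - excluded

# 30-day outcome counts as reported in the text.
outcome_counts = {
    "MACE": 130,
    "stroke": 65,
    "MI": 49,
    "death": 36,
    "carotid-related morbidity": 171,
    "other morbidity": 145,
    "non-home discharge": 222,
    "unplanned readmission": 162,
}

# Event rates as percentages of the included cohort, to one decimal place.
outcome_rates = {name: round(100 * n / included, 1)
                 for name, n in outcome_counts.items()}
```

The component counts for stroke, MI, and death sum to 150, which exceeds the 130 patients with MACE; this is expected for a composite outcome because a single patient can experience more than one component event.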
Preoperative Demographic and Clinical Characteristics
Compared to patients without a primary outcome, those who developed MACE at 30 days were older (72.9 [SD 10.1] vs. 69.8 [SD 9.8] years, p<0.001) and more likely to reside in nursing homes (2.3% vs. 0.9%, p<0.001) or be transferred from another hospital (22.3% vs. 10.5%, p<0.001). There were no significant differences in comorbidities between groups, although patients with 30-day MACE were less likely to receive antiplatelets (89.2% vs. 93.7%, p=0.004). Notable differences in laboratory investigations included higher mean creatinine (118.48 [SD 41.7] vs. 100.8 [SD 22.9] µmol/L, p=0.002) and BUN (22.7 [SD 13.7] vs. 20.0 [SD 9.4] mmol/L, p=0.002) in patients with an event. A greater proportion of patients with a primary outcome had symptomatic carotid stenosis (59.2% vs. 48.2%, p=0.005), required urgent/emergent intervention (40.8% vs. 24.2%, p<0.001), and had an ASA class above 3 (39.2% vs. 23.3%, p=0.006) (Table 1).
Table 1. Preoperative Demographic and Clinical Characteristics of Patients Undergoing Carotid Artery Stenting With and Without Major Adverse Cardiovascular Events at 30 Days.
Values are reported as number (%) unless otherwise indicated.
Abbreviations: MACE, major adverse cardiovascular event; BMI, body mass index; BUN, blood urea nitrogen; INR, international normalized ratio; PTT, partial thromboplastin time; ASA, American Society of Anesthesiologists; SD, standard deviation.
Hispanic ethnicity is reported independently from race because the US Census Bureau classifies Hispanic status as an ethnicity rather than a race, and they note that Hispanic people can be of any race (https://www.census.gov/topics/population/race/about.html).
At least one of the following: (1) New York Heart Association congestive heart failure class III/IV, (2) left ventricular ejection fraction <30%, (3) unstable angina within 30 days prior to intervention, or (4) myocardial infarction within 30 days prior to intervention.
At least one of the following: (1) previous ipsilateral carotid endarterectomy or stent, (2) previous radical neck dissection, (3) contralateral carotid occlusion, (4) prior neck radiation, (5) contralateral laryngeal nerve injury/palsy, or (6) high anatomic lesion (cervical vertebrae 2 or higher).
Includes cardiac surgery and general surgery.
Model Performance
Of the six ML models evaluated on test set data for predicting 30-day MACE following CAS, XGBoost had the best performance with an AUROC (95% CI) of 0.93 (0.92–0.94) compared to random forest [0.92 (0.91–0.93)], Naïve Bayes [0.84 (0.83–0.86)], RBF SVM [0.83 (0.82–0.84)], MLP ANN [0.73 (0.71–0.74)], and logistic regression [0.67 (0.65–0.68)]. The other performance metrics of XGBoost were the following: accuracy 0.86 (95% CI 0.84–0.87), sensitivity 0.84, specificity 0.88, PPV 0.88, and NPV 0.83 (Table 2).
Table 2. Model Performance on Test Set Data for Predicting 30-Day Major Adverse Cardiovascular Events Following Carotid Artery Stenting Using Preoperative Features.
Abbreviations: XGBoost, Extreme Gradient Boosting; AUROC, area under the receiver operating characteristic curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; RBF SVM, radial basis function support vector machine; ANN, artificial neural network.
For 30-day secondary outcomes, XGBoost attained the following AUROCs (95% CI): stroke [0.89 (0.88–0.90)], MI [0.93 (0.92–0.94)], death [0.94 (0.93–0.95)], carotid-related morbidity [0.86 (0.85–0.87)], other morbidity [0.92 (0.91–0.93)], non-home discharge [0.97 (0.96–0.98)], and unplanned readmission [0.86 (0.85–0.88)] (Table 3).
Table 3. XGBoost Performance on Test Set Data for Predicting 30-Day Secondary Outcomes Following Carotid Artery Stenting Using Preoperative Features.
Abbreviations: XGBoost, Extreme Gradient Boosting; AUROC, area under the receiver operating characteristic curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.
The ROC curve of the XGBoost model is reported in Figure 1. The calibration plot demonstrated good agreement between predicted and observed event probabilities with a Brier score of 0.02 (Figure 2). The top 10 predictors of increased risk of 30-day MACE following CAS in the XGBoost model were the following: (1) symptomatic carotid stenosis, (2) older age, (3) higher ASA class, (4) higher preoperative creatinine, (5) transferred from another hospital, (6) urgency (urgent/emergent procedure), (7) stent type (single straight/tapered stent without a cerebral protection device or multiple stents), (8) preoperative CHF, (9) lack of preoperative antiplatelet, and (10) primary specialty of the proceduralist (interventional radiology, neurosurgery, or other non-vascular surgery specialty) (Figure 3). On subgroup analysis based on symptom status, 9/10 of the most important features were the same for symptomatic and asymptomatic patients, with the top 3 predictors being age, ASA class, and preoperative creatinine for both groups. Transfer from another hospital was a top 10 predictor for 30-day MACE in patients with symptomatic carotid stenosis, but not in patients with asymptomatic carotid stenosis. Preoperative BUN was a top 10 predictor for 30-day MACE in patients with asymptomatic carotid stenosis, but not in patients with symptomatic carotid stenosis (Supplemental Figure 1).

Figure 1. Receiver operating characteristic curve for predicting 30-day major adverse cardiovascular events following carotid artery stenting using Extreme Gradient Boosting (XGBoost) model. AUROC, area under the receiver operating characteristic curve; CI, confidence interval.

Figure 2. Calibration plot with Brier score for predicting 30-day major adverse cardiovascular events following carotid artery stenting using Extreme Gradient Boosting (XGBoost) model.

Figure 3. Variable importance scores (gain) for the top 10 predictors of 30-day major adverse cardiovascular events following carotid artery stenting in the Extreme Gradient Boosting (XGBoost) model. ASA, American Society of Anesthesiologists; CHF, congestive heart failure.
Subgroup Analysis
The XGBoost model performance for predicting 30-day MACE remained excellent on all subgroup analyses of demographic and clinical subpopulations, with AUROCs ranging from 0.92 to 0.94 and no significant differences between majority and minority groups (Supplemental Figures 2–8).
Discussion
Summary of Findings
In this study, we leveraged data from the ACS NSQIP targeted vascular files between 2011 and 2021 consisting of 2,093 patients who underwent transfemoral CAS to develop ML models that accurately predict 30-day MACE with an AUROC of 0.93. Additionally, our algorithms predicted 30-day stroke, MI, death, carotid-related morbidity, other morbidity, non-home discharge, and unplanned readmission with AUROCs ranging from 0.86 to 0.97. Several significant findings emerged from our analysis. First, patients who develop 30-day MACE following CAS constitute a high-risk population with predictive factors at the preoperative stage, including older age and higher creatinine with a greater proportion having symptomatic carotid stenosis and requiring urgent/emergent intervention. Second, we trained six ML models to predict 30-day MACE using preoperative features and showed that XGBoost achieved the best performance. Our model was well-calibrated and remained robust on subgroup analyses based on age, sex, race, ethnicity, symptom status, stent type, and urgency. Finally, we identified the top 10 predictors of 30-day MACE in our ML models. These features offer clinicians valuable insights into the factors influencing risk predictions, thereby guiding patient selection and preoperative optimization. Overall, we have developed a robust ML-based risk assessment tool that can help guide clinical decision-making to improve outcomes and reduce costs from complications, reinterventions, and readmissions associated with CAS.
Comparison to Existing Literature
Volkers et al (2018) conducted a systematic review encompassing 37 studies that developed 46 prediction models for patients undergoing carotid revascularization. 9 The majority of these models were for CEA (74%), with a minority focused on CAS (26%). 9 Most studies utilized traditional statistical methods like logistic regression or Cox proportional hazards analysis, achieving AUROC values ranging from 0.58 to 0.74, 9 while we achieved AUROCs >0.90 using ML methods. Most (54%) models did not discuss how missing data was handled, 9 while we used complete-case analysis due to a small amount of missing data. Furthermore, none of the existing models predicted readmission, 9 whereas our models included secondary outcomes such as non-home discharge and unplanned readmission, which impact patient outcomes and healthcare costs. 55 Compared to current CAS risk prediction tools, our ML algorithms exhibit methodological strength and better performance on more clinically relevant outcomes. Additionally, our models fill a gap by providing accurate risk predictions for a procedure that has often not been included in existing tools such as the SVS VQI CRI. 11
Bonde et al (2021) trained ML algorithms using data from over 2900 procedures in the ACS NSQIP database to predict peri-operative complications, achieving AUROC values between 0.85 and 0.88. 15 Considering the distinct characteristics and vascular comorbidities of CAS patients, generic surgical risk prediction tools may have limitations. 56 By developing tailored ML algorithms for CAS, we surpassed an AUROC of 0.90 and included specific outcomes such as distal embolization, acute occlusion/thrombosis, restenosis, and repeat revascularization, crucial for vascular surgeons and interventionalists. Our study underscores the importance of procedure-specific ML models in enhancing performance and clinical relevance. This effort complements previous work on predicting outcomes for CEA using ML. 57
Explanation of Findings
There are several explanations for our findings. First, patients who develop adverse events following CAS represent a high-risk group, which is corroborated by previous literature. 58 Aggressive medical management including antiplatelet therapy is a Grade 1A recommendation by SVS guidelines, 59 yet patients who developed MACE in our cohort were less likely to receive antiplatelets. This underscores a critical opportunity to improve patient care by understanding their surgical risk and medically optimizing them prior to CAS. According to SVS guidelines, patients undergoing CAS should receive dual antiplatelet therapy perioperatively (Grade 1C), an embolic protection device should be used during the procedure to reduce the risk of cerebral embolization (Grade 1B), and no specific recommendation has been made regarding the type of anesthesia for CAS. 59 Given the increased risk of perioperative strokes without the use of cerebral protection devices, it is critical to employ these devices intraoperatively when feasible. 59 Second, our ML models demonstrated performance superior to existing tools for several reasons. Compared to traditional logistic regression, advanced ML techniques can better model complex, non-linear relationships. 60 This is particularly important in health care data, where patient outcomes can be influenced by many factors. 61 Our top-performing algorithm was XGBoost, which has unique advantages over other ML approaches including relatively fewer issues with overfitting and faster computing while maintaining precision. 35,62,63 Furthermore, XGBoost is well-suited to structured data, likely explaining its better performance compared to more complex algorithms such as neural networks on our dataset. 64 It is important to note that in some cases, carefully constructed logistic regression models can achieve equivalent or superior performance compared to ML models. 65 Third, our XGBoost model performance remained robust across demographic/clinical subpopulations. This is an important finding given that algorithm bias against underrepresented populations is a significant issue in ML studies. 66 We were likely able to avoid these biases due to the excellent capture of sociodemographic data by ACS NSQIP, a multi-national database that includes diverse patient populations. 67,68 Fourth, a small proportion of CAS (<5%) was performed for ipsilateral carotid stenosis percentage <50% or occlusions. The reasons for these interventions are unclear from our dataset but may be related to patient preference, poor adherence to guideline-directed therapy, or coding errors. 7,8
Implications
Our ML models can guide clinical decision-making in several ways. Preoperatively, a patient predicted to be at heightened risk of adverse events should be further assessed in terms of modifiable and non-modifiable factors. Patients with significant non-modifiable risks may benefit from medical management alone. 69 Conversely, individuals with low predicted risk may be considered for CEA based on SVS guidelines. 7 Those with modifiable risks, such as cardiovascular comorbidities, should be further evaluated and optimized with consideration of referral to cardiologists or internal medicine specialists.70,71 At the postoperative stage, patients at high risk of 30-day MACE may benefit from close monitoring in the intensive care unit. 72 Additionally, patients at high risk of non-home discharge or readmission should receive early support from allied health professionals to facilitate safe discharge planning. 73 These peri-operative decisions guided by our ML models have the potential to improve outcomes and reduce costs related to adverse events.
The programming code for our ML models is publicly available on GitHub (https://shorturl.at/AEiV2). These tools can be used by clinicians involved in the peri-operative management of patients being considered for CAS. On a broader scale, our models can be implemented by the >700 centers worldwide that participate in ACS NSQIP. Their utility may also extend beyond NSQIP sites, as the input features are commonly captured variables for routine vascular care. 74 A distinct advantage of our ML models lies in their capacity to provide automated risk predictions, thereby enhancing practicality in busy clinical environments compared to traditional risk predictors that generally require manual input of variables. 75 Specifically, our ML algorithms can autonomously extract a patient’s NSQIP information to provide predictions of procedural risk. Predictive performance declined significantly for all models when the number of input features was reduced. We advocate for dedicated health care data analytics teams at the institutional level, as their benefits have been previously demonstrated and these experts can facilitate model implementation using our code. 76
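The automated workflow described above, in which registry fields are mapped to model inputs without manual entry, might look roughly like the following sketch. It does not reproduce the published code: the field names, the three-feature list (the actual models use 36 preoperative variables), and the placeholder model are all hypothetical, and a trained model would be loaded in place of the stub.

```python
from typing import Callable, List, Mapping, Sequence

# Hypothetical field names, loosely based on the top three predictors
# reported in the abstract; the real models use 36 preoperative variables.
FEATURE_ORDER = ("symptomatic", "age", "asa_class")


def build_feature_vector(
    record: Mapping[str, float],
    feature_order: Sequence[str] = FEATURE_ORDER,
) -> List[float]:
    """Pull model inputs from a registry record in a fixed order,
    failing loudly if any required field is absent."""
    missing = [name for name in feature_order if name not in record]
    if missing:
        raise KeyError(f"record is missing required features: {missing}")
    return [float(record[name]) for name in feature_order]


def predict_risk(
    record: Mapping[str, float],
    model: Callable[[List[float]], float],
) -> float:
    """Return the model's predicted probability for one patient record."""
    risk = model(build_feature_vector(record))
    if not 0.0 <= risk <= 1.0:
        raise ValueError(f"model returned a non-probability value: {risk}")
    return risk


def placeholder_model(features: List[float]) -> float:
    """Stand-in for a trained model; returns a constant, NOT a clinical estimate."""
    return 0.5


# Invented example record (not patient data).
example_record = {"symptomatic": 1, "age": 72, "asa_class": 3}
example_risk = predict_risk(example_record, placeholder_model)
```

Keeping feature extraction separate from the model call means the same extraction step can be checked against a site's data dictionary before any trained model is attached.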
Limitations
Our study has several limitations. First, our models were developed using ACS NSQIP data. Hospitals that participate in ACS NSQIP tend to be larger with more resources, which may limit the generalizability of our models. 77 Notably, the top 10 predictive features in our models are generally accessible across hospital settings in the work-up of patients with carotid stenosis, and therefore, future models that limit their input features to those that are common to obtain while maintaining predictive performance may increase generalizability. Future investigations are needed to evaluate whether model performance remains robust at institutions not enrolled in ACS NSQIP. Additionally, prospective validation whereby our ML models are tested for predictive performance and/or impact on outcomes in a prospectively recruited cohort of patients, rather than a clinical registry, would further demonstrate the clinical utility of the models. Second, the sample size was lower than expected over a 10-year period likely because ACS NSQIP is primarily a surgical database, and procedures performed by interventional radiologists or other non-surgical specialists may be under-captured. Additional investigation of CAS performed by non-surgical specialists may increase the sample size for analysis. Third, the ACS NSQIP database captures 30-day outcomes. Evaluation of ML algorithms on other data sources with longer follow-up may improve our understanding of long-term risk. Fourth, although preoperative medications were captured, postoperative antiplatelet therapy use was not available in our dataset. Additionally, patients who were not taking antiplatelets may be on anticoagulants. However, information on anticoagulants was not available within our dataset. Future model training on datasets that capture more detailed medication information would be prudent. 
Furthermore, some anatomic variables including the location, thickness, and distribution of the stenosis were not available in our dataset. Future models that incorporate these input features may improve model predictive performance. Moreover, the NSQIP dataset does not clarify the specific imaging modality by which the anatomic characteristics were determined. Given that there may be greater diagnostic accuracy achieved with some imaging modalities, this information would be helpful for future studies. Fifth, our models are limited to patients undergoing transfemoral CAS. An ML model for predicting CEA outcomes has been previously described, 57 and work is ongoing to develop predictive algorithms for transcarotid artery revascularization (TCAR). In particular, a combined ML model for transfemoral CAS, CEA, and TCAR that predicts risk for the various treatments may provide additional information to clinicians regarding the optimal treatment approach based on short-, mid-, and long-term outcomes.
Conclusions
In this study, we leveraged the ACS NSQIP targeted vascular database to develop robust ML models that preoperatively predict 30-day MACE following CAS with excellent performance (AUROC 0.93). Our models also predicted stroke, MI, death, carotid-related morbidity, other morbidity, non-home discharge, and readmission with AUROCs of 0.86–0.97. Given that our ML algorithms perform better than existing tools and logistic regression, they have the potential for important utility in the peri-operative management of patients being considered for transfemoral CAS to mitigate adverse outcomes. Prospective validation of our prediction models is warranted.
Supplemental Material
sj-docx-1-jet-10.1177_15266028251333670 – Supplemental material for Predicting Outcomes Following Carotid Artery Stenting Using Machine Learning
Supplemental material, sj-docx-1-jet-10.1177_15266028251333670 for Predicting Outcomes Following Carotid Artery Stenting Using Machine Learning by Ben Li, Badr Aljabri, Derek Beaton, Mohamad A. Hussain, Douglas S. Lee, Duminda N. Wijeysundera, Ori D. Rotstein, Charles de Mestral, Muhammad Mamdani, Graham Roche-Nagle and Mohammed Al-Omran in Journal of Endovascular Therapy
Footnotes
Acknowledgements
The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) and the hospitals participating in the ACS NSQIP are the source of the data used herein; they have not verified, and are not responsible for, the statistical validity of the data analysis or the conclusions derived by the authors.
Code Availability
Data Availability
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded partially by the Canadian Institutes of Health Research, Ontario Ministry of Health, PSI Foundation, and University of Toronto Schwartz Reisman Institute for Technology and Society (Dr. Li). Dr. Hussain is funded by a Brigham and Women’s Hospital Heart and Vascular Center Faculty Award. The funding sources did not play a role in the design or conduct of the research.
Ethical Approval and Informed Consent
This study was exempt from institutional ethics board review and informed consent was not required as the data came from a large, deidentified registry (ACS NSQIP).
Supplemental Material
Supplemental material for this article is available online.
References