
Abstract

Background

Patients with pulmonary embolism (PE) combined with heart failure (HF) have been reported to have high short-term mortality. However, few studies have developed tools to predict 30-day mortality for these patients in the intensive care unit (ICU). This study aimed to construct and validate a machine learning (ML) model to predict 30-day mortality for PE patients with HF in the ICU.

Methods

We enrolled patients with PE combined with HF from the Medical Information Mart for Intensive Care (MIMIC) database and developed six ML models after feature selection. The eICU Collaborative Research Database (eICU-CRD) was used for external validation. The area under the curve (AUC), calibration curves, decision curve analysis (DCA), net reclassification improvement (NRI), and integrated discrimination improvement (IDI) were used to evaluate prediction performance. Shapley additive explanation (SHAP) was used to enhance the interpretability of the models.

Results

A total of 472 PE patients with HF were included. We developed six ML models using the 13 selected features. In internal validation, the Support Vector Machine (SVM) model performed best, with an AUC of 0.835, superior calibration, and a wider range of risk thresholds (0% to 90%) over which it provided clinical benefit; it also outperformed traditional mortality risk scoring systems, as evaluated by NRI and IDI. The SVM model remained reliable in external validation. SHAP was used to explain the model. Moreover, an online application was developed for further clinical use.

Conclusion

This study developed a potential tool for identifying short-term mortality risk to guide clinical decision-making for PE patients with HF in the ICU. The SHAP method also helps clinicians better understand the model.

Keywords

machine learning, intensive care unit, pulmonary embolism, heart failure, prediction model, mortality

Introduction

Pulmonary embolism (PE) is known as the third most fatal acute cardiovascular syndrome worldwide, ranking just behind myocardial infarction and stroke, with a declining but still high mortality rate of 2%–7% despite advances in treatment techniques.^1,2 Heart failure (HF) has been demonstrated to be a risk factor in mortality prediction and risk stratification of pulmonary embolism.^3,4 This elevated risk is considered to be associated with impaired left ventricular function and decreased ejection fraction, increased venous stasis, and a chronic inflammatory state in the cardiovascular system.^5,6 In addition, massive PE, which is more likely to occur among inpatients with underlying cardiopulmonary disease, can lead to impaired right ventricular (RV) function, dilatation of the right ventricle, myocardial ischemia, and ultimately further deterioration of left ventricular function and cardiac output.⁷

Moreover, a retrospective single-center study found a PE incidence of 9.1% among patients with severe chronic heart failure (CHF), whose thromboembolic risk depended on disease severity.⁸ Another multi-center study described a 12.2% all-cause in-hospital mortality rate among PE patients with CHF, whereas the rate declined to 9.7% after they received an inferior vena cava filter.⁷ Thus, there is a critical need to predict the risk of short-term mortality for pulmonary embolism patients with heart failure (PEpHF) and to reduce death and disability through early detection and prevention. Several studies have identified risk factors for clinical prognosis in PE patients and HF patients separately. For PE patients, elevated heart rate, right ventricular longitudinal strain-related indices, white blood cell (WBC) count, and systemic inflammation response index (SIRI) have been identified as risk factors associated with adverse outcomes, in addition to previous surgery, cancer, immobilization, and markers of thrombo-inflammation such as D-dimer, C-reactive protein (CRP), and blood cellular indices.^9–11 For HF patients, lymphocyte to white blood cell count ratio (LWR), shock index (SI), triglycerides, N-terminal pro-B-type natriuretic peptide, and blood urea nitrogen levels have been described as risk factors for all-cause mortality.^12–14 Efforts have also been made to develop prognostic models for these groups. For example, Ding et al¹⁵ developed a nomogram-based model to predict short-term mortality for patients with PE, and Shi et al¹⁶ constructed a nomogram-based model to predict short-term mortality for patients with HF. Nevertheless, few studies have identified the risk factors of 30-day all-cause mortality for PEpHF in an ICU cohort or developed a dynamic, visual predictive tool for clinical use.

Recently, ML, as a branch of artificial intelligence, has leveraged data to fit high-dimensional and nonlinear relationships among clinical features and can thus forecast clinical outcomes.^8,17 Characterized by powerful learning capabilities and predictive effectiveness, ML is increasingly preferred by clinicians to support clinical decision-making, with greater accuracy than traditional statistical methods.¹⁸ Several ML-based studies have aimed to construct prognostic prediction models for PE patients and HF patients separately.^19,20 In addition, the pulmonary embolism severity index (PESI) and simplified pulmonary embolism severity index (sPESI) scoring systems have been used to classify the risk of adverse prognostic events for PE patients^4,21 and have proved to be excellent clinical indicators. However, it is still necessary to evaluate their efficiency in predicting the 30-day mortality of PEpHF in the ICU because of heterogeneity across patient groups.

In summary, our research aimed to develop an ML model that can accurately predict 30-day mortality for PEpHF in the ICU, using free and open critical care databases: Medical Information Mart for Intensive Care (MIMIC) IV, MIMIC III, and the eICU Collaborative Research Database (eICU-CRD). Furthermore, to meet the needs of clinical application, we also intended to develop an online tool that helps clinicians identify the at-risk group and reduce the death rate among PEpHF in the ICU. To enhance the interpretability of the ML model, Shapley additive explanation (SHAP)²² was also applied in our study.

Material and Methods

Data Source

We extracted derivation data from MIMIC-III and MIMIC-IV, which are open-source databases of medical health records for patients admitted to Beth Israel Deaconess Medical Center.²³ We also extracted data from eICU-CRD, comprising 139 367 patients admitted between 2014 and 2015, for external validation.²⁴ Data were obtained using Structured Query Language (SQL) with pgAdmin4 and PostgreSQL 9.6. We completed an online course offered by the National Institutes of Health (NIH) and were granted access to the MIMIC database (certification number: 64322113). All methods were carried out in accordance with the “Declaration of Helsinki.” This retrospective study did not use personal identifying information and thus did not require informed patient consent or Institutional Ethics Committee Board approval.

Study Population

As shown in Figure 1, patients from the MIMIC database and eICU-CRD were fully retrieved, and target PE patients with HF in the MIMIC database were identified by International Classification of Diseases (ICD)-9 or ICD-10 diagnostic codes (Table S1). Patients who met the exclusion criteria (age < 18 or > 95 years, not a first ICU admission, or ICU stay < 24 h) were excluded.

Figure 1.

Flow chart of the study design. Abbreviations: ICD, international classification of diseases; CART, classification and regression trees; SMOTE, synthetic minority over-sampling technique; ML, machine learning; ICU, intensive care unit; RF, random forest; XGboost, extreme gradient boosting; LR, logistic regression; Catboost, categorical boosting; Light GBM, light gradient boosting; SVM, support vector machine; SHAP, shapley additive explanation; AUC, area under curve.

Finally, 472 patients were included for the model development, while 534 patients selected from eICU-CRD were regarded as an independent validation set to evaluate the generalizability of the established models.

Predictor Variables

Variables extracted from the MIMIC-III and MIMIC-IV databases encompassed demographic characteristics, vital signs, laboratory tests, and co-existing diseases. The hadm_id identifier was used to extract demographic characteristics, including age, sex, weight, and ethnicity. For comorbidities, chronic pulmonary disease (COPD), renal disease, severe liver disease, peripheral vascular disease (PVD), myocardial infarct (MI), acute heart failure, and cancer-related comorbidities were extracted. The vital signs, including maximum heart rate (HR_max), minimum systolic blood pressure (SBP_min), maximum respiratory rate (RR_max), minimum peripheral oxygen saturation (Spo2_min), minimum temperature (temperature_min), and urine output (UO), were extracted and presented in a suitable format. The laboratory results were anion gap, bicarbonate, creatinine, chloride, glucose, blood urea nitrogen, potassium, partial thromboplastin time (PTT), prothrombin time (PT), hematocrit, hemoglobin, white blood cell (WBC) count, and platelet count. Ventilation, vasopressin, and anticoagulant drug treatments were also included. Furthermore, the PESI, sPESI, Charlson Comorbidity Index (CCI), Sequential Organ Failure Assessment (SOFA), and Glasgow Coma Scale (GCS) scores of each patient were calculated. The primary outcome of our study was all-cause mortality within 30 days after ICU admission. Laboratory tests and vital signs were measured within the first 24 h after ICU admission. Variables with more than 20% missing values were excluded. Categorical variables were converted to integer values (usually 0 or 1), while continuous variables retained their original values.

Imputation of Missing Value

To avoid changes in variable distributions caused by imputation, only variables with less than 20% missing values were included in further analysis. To ensure the reliability of our study, the outcome variable was not imputed. In addition, the PESI, sPESI, Charlson Comorbidity Index, and SOFA scores required no imputation because they had no missing data. The R package “caret” was used to impute missing values. The approach, based on the Classification and Regression Trees (CART) algorithm, can handle non-linear relationships and interactions, accommodates mixed data types, and is robust to outliers, which improves imputation accuracy and model interpretability under the assumption that all missing values are missing at random (MAR). CART-based imputation is well suited to MAR scenarios.
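The missing-value screening rule described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names and toy values are hypothetical, and because the text does not say how a variable with exactly 20% missing values is handled, the sketch assumes it is kept.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]


def missing_fraction(values: Column) -> float:
    """Share of entries in one variable that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(table: Dict[str, Column],
                          max_missing: float = 0.20) -> Dict[str, Column]:
    """Keep only variables whose missing share does not exceed max_missing.

    Assumption: a variable at exactly the threshold is kept.
    """
    return {name: col for name, col in table.items()
            if missing_fraction(col) <= max_missing}


# Hypothetical toy data, for illustration only.
toy = {
    "wbc_max": [11.2, None, 9.8, 14.1, 10.0],        # 20% missing: kept
    "d_dimer": [None, None, 2.1, None, 3.4],         # 60% missing: dropped
    "hemoglobin_min": [9.1, 10.4, 8.7, 11.0, 12.3],  # complete: kept
}
kept = drop_sparse_variables(toy)
print(sorted(kept))
```

Only the variables that survive this screen would then be passed to the imputation step.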

Model Development, Comparison, and Interpretation

First, we randomly divided patients into a training cohort for model construction and a validation cohort for model validation at a ratio of 3:1. Subsequently, using the training data, conventional logistic regression analysis and least absolute shrinkage and selection operator (LASSO) regression analysis were applied to select the most relevant indicators. Because positive and negative cases were imbalanced in the training cohort, we used the synthetic minority over-sampling technique (SMOTE) to reduce overfitting by synthesizing new samples and adding them to the training cohort. After variable shrinkage and case balancing, we constructed six ML models using all of the selected variables and different model training parameters: Extreme Gradient Boosting (XGboost), Logistic Regression (LR), Random Forest (RF), Categorical Boosting (Catboost), Light Gradient Boosting Machine (Light GBM), and Support Vector Machine (SVM). Moreover, PESI and sPESI are existing scoring systems for assessing PE risk. To determine whether the new model outperformed them, we also tested the predictive ability of PESI and sPESI in the training and validation cohorts and compared it with that of the ML model. Finally, we re-evaluated the new model in the eICU-CRD cohort. The area under the curve (AUC), calibration curves, net reclassification improvement (NRI), and integrated discrimination improvement (IDI) were used to evaluate the discrimination and calibration of all ML models in the training and validation cohorts. Decision curve analysis (DCA) was also used to explore whether the ML models could improve clinical decision-making. SHAP, which calculates the contribution and impact of each feature on the predicted outcome and presents it visually, was used to enhance the interpretability of our models.
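The SMOTE step can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the authors' implementation; the function names, the neighbour count k, the seed, and the toy feature vectors are all assumptions. The target of 174 synthetic cases is derived from the counts reported in the Results (90 deaths among 354 training patients, raised to 264 deaths after SMOTE).

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: List[Sequence[float]], n_synthetic: int,
          k: int = 5, seed: int = 0) -> List[List[float]]:
    """Create n_synthetic new minority-class samples.

    Each new sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority cases, for illustration only.
toy_minority = [[1.0, 2.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.5]]
new_cases = smote(toy_minority, n_synthetic=264 - 90, k=2)
print(len(new_cases))
```

Synthetic cases should be generated from the training cohort only, as described above, so that the validation cohorts keep their original class balance.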

Statistical Analysis

Baseline data and clinical outcomes in the training and validation cohorts were summarized as follows. Categorical variables were expressed as percentages and compared using the chi-square test or Fisher's exact test, as appropriate. Continuous variables were presented as the median with interquartile range and compared using the Mann-Whitney U test. Data cleaning and transformation, variable selection, model building, performance evaluation, and validation were all conducted in R software (version 4.1.4) using appropriate R packages (eg, “caret” and “fastshap”). All tests were two-sided, and P ≤ .05 was considered statistically significant.

Results

Baseline Characteristics

A total of 472 patients from the MIMIC database were included for model derivation in this study. The patients were divided into training and internal validation groups by completely randomized sampling at a ratio of 3:1. No statistically significant differences were found between the training and validation cohorts for most variables; the baseline characteristics of the patients are presented in Table 1. The baseline characteristics of the survival and non-survival groups, including the significant differences between them, can be found in Table S2. After SMOTE, the number of cases in the training group increased to 528, including 264 cases of 30-day mortality. The validation group consisted of 118 patients, including 30 positive cases and 88 negative cases. In addition, 173 patients from eICU-CRD were extracted as the external validation cohort, with 28 positive cases and 145 negative cases. The detailed baseline information of these patients can be found in Table S2.

Table 1.

Baseline Characteristics of the Patients in the Training and Validation Cohorts.

Basic characteristic	Derivation Cohort	Validation Cohort	p-value
	(N = 354)	(N = 118)
Demographics
Age	72 (63, 81)	70 (61, 81)	.268
Gender			.750
Female	186 (52.5%)	60 (50.8%)
Male	168 (47.5%)	58 (49.2%)
Race			.226
Others	136 (38.4%)	38 (32.2%)
White	218 (61.6%)	80 (67.8%)
Marital status			.450
No	212 (59.9%)	66 (55.9%)
Yes	142 (40.1%)	52 (44.1%)
Weight	80 (67, 98)	82 (72, 102)	.262
Score systems
CCI	6.0 (5.0, 9.0)	7.0 (5.0, 8.0)	.729
APS III	47 (36, 59)	49 (38, 60)	.319
GCS	15.00 (14.00, 15.00)	15.00 (14.00, 15.00)	.594
SOFA	5.0 (3.0, 9.0)	6.0 (4.0, 8.0)	.302
PESI	40.00 (30.00, 67.50)	40.00 (30.00, 67.50)	.886
sPESI	2.00 (1.25, 3.00)	2.00 (1.00, 3.00)	.580
Comorbidities
Peripheral Vascular Disease			.057
No	311 (87.9%)	111 (94.1%)
Yes	43 (12.1%)	7 (5.9%)
Cerebrovascular Disease			.178
No	306 (86.4%)	96 (81.4%)
Yes	48 (13.6%)	22 (18.6%)
Chronic Pulmonary Disease			.081
No	208 (58.8%)	80 (67.8%)
Yes	146 (41.2%)	38 (32.2%)
Sepsis			>.999
No	346 (97.7%)	116 (98.3%)
Yes	8 (2.3%)	2 (1.7%)
Diabetes			.152
No	230 (65.0%)	68 (57.6%)
Yes	124 (35.0%)	50 (42.4%)
Renal Disease			.857
No	261 (73.7%)	86 (72.9%)
Yes	93 (26.3%)	32 (27.1%)
Malignant Cancer			.526
No	291 (82.2%)	100 (84.7%)
Yes	63 (17.8%)	18 (15.3%)
Severe Liver Disease			.739
No	345 (97.5%)	116 (98.3%)
Yes	9 (2.5%)	2 (1.7%)
Metastatic solid tumor			.663
No	316 (89.3%)	107 (90.7%)
Yes	38 (10.7%)	11 (9.3%)
Myocardial infarct			.858
No	258 (72.9%)	85 (72.0%)
Yes	96 (27.1%)	33 (28.0%)
Treatment
Warfarin			.366
No	188 (53.1%)	57 (48.3%)
Yes	166 (46.9%)	61 (51.7%)
NOAC			.435
No	276 (78.0%)	96 (81.4%)
Yes	78 (22.0%)	22 (18.6%)
Heparin			.127
No	26 (7.3%)	4 (3.4%)
Yes	328 (92.7%)	114 (96.6%)
Ventilation			.231
No	75 (21.2%)	19 (16.1%)
Yes	279 (78.8%)	99 (83.9%)
Vasopressin			.499
No	323 (91.2%)	110 (93.2%)
Yes	31 (8.8%)	8 (6.8%)
Clinical Outcome			>.999
Survivor	264 (74.6%)	88 (74.6%)
Death	90 (25.4%)	30 (25.4%)

Note: Continuous variables are expressed as the median with interquartile range. The Mann-Whitney U test was used to determine the P value. Categorical variables are presented as counts with percentages. The weighted chi-square test or Fisher's exact test was used to determine the P value.

Abbreviations: SOFA, sequential organ failure assessment; GCS, Glasgow coma scale; APS III, Acute Physiology score III; NOAC, novel oral anticoagulant; CCI, Charlson Comorbidity Index; PESI, pulmonary embolism severity index; sPESI, simplified pulmonary embolism severity index.

Construction and Validation of Machine Learning Models

Following the variable screening methods displayed in Figure 2, 13 potential independent risk factors (hemoglobin_min, APS III score, WBC_max, SOFA score, CCI, race, ventilation, peripheral vascular disease, malignant cancer, metastatic solid tumor, treatment with warfarin, treatment with NOAC, and urine output < 400 mL) among PEpHF were identified as predictors for constructing six ML models, whose performance was then evaluated in the internal validation dataset. As Figure 3A shows, except for the LR and Light GBM models, the other four models all performed well in predicting 30-day mortality in the training cohort, with AUCs over 0.90. Five further metrics were evaluated for all six ML models: sensitivity, specificity, PPV, NPV, and accuracy. As presented in Figure 3C, SVM (0.886, 90.2%, 64.0%, 87.2%, 80.0%, 0.864) performed best. Correspondingly, in internal validation, SVM continued to perform well, with the highest AUC of 0.835 (Figure 3B). Furthermore, when evaluated by calibration curves, SVM performed better than the three remaining models and also had a wider range of risk thresholds (0% to 90%) over which it provided clinical benefit (Figure 3D and E, Figure S1, and Figure S2). Taken together, these results led us to select the SVM model as the best-fitting model. In Figure 3F and Table 2, we further compared the predictive efficiency of the SVM model with that of the PESI and sPESI scoring systems. Consistent with its previous performance, SVM exhibited significantly better predictive ability than the PESI and sPESI scores in both the training and validation cohorts, as evaluated by AUC, NRI, and IDI (Figure 3F and Table 2). Finally, when validated in the eICU-CRD cohort, the SVM model maintained good performance with an AUC of 0.799 (Figure S3).
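For readers less familiar with the metrics reported above, the following sketch shows how the AUC and the five threshold-based metrics can be computed from outcome labels and predicted probabilities. It is a plain Python illustration, not the authors' R code; the function names, the 0.5 cut-off, and the toy data are assumptions.

```python
from typing import Dict, List


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(labels: List[int], scores: List[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and accuracy at one cut-off.

    Assumes every cell combination used as a denominator is non-empty.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical outcomes (1 = died within 30 days) and predicted risks.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = classification_metrics(toy_labels, toy_scores)
print(round(toy_auc, 3), toy_metrics)
```

In this toy example, 14 of the 15 positive-negative pairs are ranked correctly, so the AUC is 14/15.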

Figure 2.

The process of screening important features. (A) The result of logistic regression between all variables and 30-day mortality. (B and C) The result of LASSO regression used to shrink features. (D) The dynamic process of selecting important features. Abbreviations: LR, logistic regression; LASSO, least absolute shrinkage and selection operator; OR, odds ratio; ROC, receiver operating characteristic.

Figure 3.

The performance of six ML models in the training and validation cohorts, and comparison between SVM model and PESI (sPESI). (A and B) The AUCs of machine learning models in training and validation cohorts. (C) The ring plot of the detailed performance of six machine learning models in the validation cohort. (D) The AUCs of PESI (sPESI) in training and validation cohorts. (E and F) The calibration and clinical decision curves of the SVM model in training and validation cohorts. Abbreviations: RF, random forest; XGboost, extreme gradient boosting; LR, logistic regression; Catboost, categorical boosting; Light GBM, light gradient boosting; SVM, support vector machine; SHAP, shapley additive explanation; AUC, area under curve; PESI, pulmonary embolism severity index; sPESI, simplified pulmonary embolism severity index.
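The clinical decision curves mentioned in this caption plot net benefit against the risk threshold. A minimal sketch of the standard net benefit formula follows, in plain Python rather than the authors' R code; the function names and toy data are hypothetical.

```python
from typing import List


def net_benefit(labels: List[int], probs: List[float],
                threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk is at or
    above the threshold: TP/n - FP/n * pt / (1 - pt)."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def treat_all_net_benefit(labels: List[int], threshold: float) -> float:
    """Reference strategy in which every patient is treated as high risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = died within 30 days) and predicted risks.
dca_labels = [1, 1, 1, 0, 0, 0, 0, 0]
dca_probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
for pt in (0.2, 0.5):
    print(pt, net_benefit(dca_labels, dca_probs, pt),
          treat_all_net_benefit(dca_labels, pt))
```

A model offers clinical benefit at a given threshold when its net benefit exceeds both the treat-all reference and zero, which is the net benefit of treating no one.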

Table 2.

Comparison Between the SVM Model and PESI (sPESI) Score Systems in Predictive Efficiency.

Items	Compared to PESI	P value	Compared to sPESI	P value
Training cohort
NRI(Continuous)	1.45 (1.30–1.59)	<.001	1.43 (1.28–1.58)	<.001
IDI	0.61 (0.55–0.67)	<.001	0.63 (0.57–0.69)	<.001
Validation cohort
NRI(Continuous)	1.00 (0.67–1.33)	<.001	0.94 (0.59–1.28)	<.001
IDI	0.40 (0.28–0.52)	<.001	0.40 (0.28–0.51)	<.001

Abbreviations: SVM, support vector machine; PESI, pulmonary embolism severity index; sPESI, simplified pulmonary embolism severity index; NRI, net reclassification improvement; IDI, integrated discrimination improvement.
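The two indices in Table 2 can be computed as in the following sketch, which uses the standard definitions of continuous NRI and IDI. It is a plain Python illustration, not the authors' R code. The function names and toy values are hypothetical, and the sketch assumes that the reference score (PESI or sPESI) has already been converted to a predicted probability; the text does not state how this was done.

```python
from typing import List


def continuous_nri(labels: List[int], p_old: List[float],
                   p_new: List[float]) -> float:
    """Continuous (category-free) NRI.

    Events should move up in predicted risk and non-events down:
    [P(up|event) - P(down|event)] + [P(down|non-event) - P(up|non-event)].
    """
    events = [(o, n) for y, o, n in zip(labels, p_old, p_new) if y == 1]
    nonevents = [(o, n) for y, o, n in zip(labels, p_old, p_new) if y == 0]
    up_e = sum(n > o for o, n in events) / len(events)
    down_e = sum(n < o for o, n in events) / len(events)
    up_ne = sum(n > o for o, n in nonevents) / len(nonevents)
    down_ne = sum(n < o for o, n in nonevents) / len(nonevents)
    return (up_e - down_e) + (down_ne - up_ne)


def idi(labels: List[int], p_old: List[float], p_new: List[float]) -> float:
    """IDI: gain in mean predicted risk among events minus the gain
    among non-events."""
    def mean(xs: List[float]) -> float:
        return sum(xs) / len(xs)

    old_e = [o for y, o in zip(labels, p_old) if y == 1]
    new_e = [n for y, n in zip(labels, p_new) if y == 1]
    old_ne = [o for y, o in zip(labels, p_old) if y == 0]
    new_ne = [n for y, n in zip(labels, p_new) if y == 0]
    return (mean(new_e) - mean(old_e)) - (mean(new_ne) - mean(old_ne))


# Hypothetical predicted risks from a reference score and a new model.
nri_labels = [1, 1, 0, 0, 0]
risk_old = [0.4, 0.5, 0.3, 0.4, 0.2]
risk_new = [0.8, 0.4, 0.1, 0.2, 0.3]
print(continuous_nri(nri_labels, risk_old, risk_new),
      idi(nri_labels, risk_old, risk_new))
```

Because the continuous NRI is the sum of two differences of proportions, it ranges from −2 to 2, which is why values above 1, such as those in the training cohort rows of Table 2, are possible.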

Model Interpretation and Visualization

To make the ML models explainable and interpretable, the SHAP method was used to interpret the output of the final model by calculating the contribution of each variable to the prediction; it consists of a global explanation of the model at the feature level and a local explanation at the individual level. The global explanation describes the overall functionality of the model. To characterize the significance of the features, the importance of each feature in the different models was ranked by SHAP value. Figure 4A depicts the importance of the predictor variables in the SVM model, with the use of warfarin exerting the greatest impact. The strength and direction of the relationships between the predictors and 30-day mortality in PEpHF are displayed in Figure 4B. For example, patients who did not receive warfarin or NOAC treatment were more likely to have a worse survival outcome, while urine output > 400 mL was a protective factor. In addition, the local explanation shows how a certain prediction was made for a specific individual from that individual's input data. The waterfall plot in Figure 4C shows a patient who did not die within 30 days during the ICU stay, and Figure 4D shows another patient who died within 30 days during the ICU stay. Finally, to facilitate the clinical application and promotion of the model, we built a web app based on the SVM model, available at https://fkd9fq-zhenli-li.shinyapps.io/predict, which outputs the predicted probability of all-cause death within 30 days after ICU admission (Figure 5).

Figure 4.

Global and local model explanation of SVM model by the SHAP method. (A) SHAP summary dot plot to show the strength and direction relationships of the predictors with 30-day mortality in pulmonary embolism patients with heart failure (PEpHF). (B) A summary plot of importance scores of the selected features. (C and D) Waterfall plots to analyze how a certain prediction was made for a specific individual by incorporating the individualized input data. Abbreviations: SVM, support vector machine.
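The local explanations shown in the waterfall plots rest on Shapley values. The sketch below computes exact Shapley values for a single patient by enumerating all feature coalitions. It is a toy illustration in plain Python: the risk function, its coefficients, and the patient and baseline values are invented for this example and are not the fitted SVM model, and practical SHAP software usually approximates this sum because it grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(predict: Callable[[List[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley value of each feature for one prediction.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition: Set[int]) -> float:
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features: List[float]) -> float:
    """Hypothetical risk score; coefficients are made up for illustration."""
    sofa, warfarin, low_urine_output = features
    return (0.05 * sofa - 0.2 * warfarin + 0.15 * low_urine_output
            + 0.02 * sofa * low_urine_output)


patient = [8, 0, 1]    # SOFA 8, no warfarin, urine output below threshold
reference = [5, 1, 0]  # hypothetical baseline patient
phi = shapley_values(toy_risk, patient, reference)
print(phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets a waterfall plot walk from the baseline risk to the individual risk one feature at a time.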

Figure 5.

The web app deployed based on an SVM model to predict mortality risk for clinical use. Abbreviations: SVM, support vector machine.

Discussion

The Feasibility of the ML Model Construction for PEpHF

To date, numerous studies have concentrated on constructing ML-based predictive models of short-term mortality for PE or HF groups separately,^19,20 but a gap remains in mortality prediction for PEpHF because case records are scattered and incomplete. The MIMIC databases, which make detailed clinical data more accessible to clinicians and researchers, combined with sophisticated ML algorithms, can facilitate the development of clinical prediction models.²⁵ To our knowledge, this is the first retrospective study to investigate and compare six ML models for predicting 30-day mortality of PEpHF in ICU cohorts, in which a series of risk features were also identified and analyzed. Among the six ML models, the SVM model displayed the best AUC, accompanied by good net benefit and a well-fitted calibration curve in both internal and external validation. SVM, which seeks the best hyperplane separating classes in high-dimensional space, is based on kernel functions and is widely used as a classification model.^25,26 Moreover, several studies have regarded this method as a well-performing ML algorithm in the field of medicine.^27,28 Hence, we employed the SVM algorithm to develop the final model with 13 strongly related clinical features, which can be obtained or evaluated easily within the first 24 h of admission. This makes the model promising as an early discriminating tool for 30-day mortality among PEpHF in the ICU.

The PESI includes a larger array of patient variables and has significantly greater discriminatory power.²¹ The sPESI is another risk stratification scoring system that quickly classifies PE patients as high or low risk, but it has been reported to have low specificity, with a significant proportion of intermediate-risk patients potentially being classified as low risk. Therefore, we compared the discrimination ability of the optimal SVM model with that of the existing PESI. The results showed that the SVM model exhibited clearly superior predictive performance. Consistent with our finding, previous studies have found that the PESI scoring system has deficiencies compared with ML models, such as poor prediction performance and low sensitivity and specificity for predicting the diagnosis in PE patients.^18,29 Nevertheless, although ML is powerful for high-dimensional data, it is still limited by the difficulty of providing a transparent interpretation, the so-called “black-box” problem.²² The SHAP method can explain an ML model through a global explanation that describes the overall functionality of different variables and a local explanation that displays how each individual variable's contribution combines into the final prediction. Hence, in our study, the SHAP method was used to explain the ML models and visualize individual variable predictions, to dispel clinicians' doubts about further clinical application.

The Relationship Between the Selected Features and Prognostic Outcome

For patients with confirmed PE, therapeutic anticoagulation is the cornerstone of treatment; unfractionated heparin (UFH) is recommended by the ESC/ERS guidelines, and lower-risk PE can be treated with a NOAC from diagnosis.³⁰ In addition, treatment with NOACs and warfarin acted as a protective factor for short-term mortality among PEpHF in the present study. From one perspective, patients who used warfarin or NOACs may have been at relatively lower risk at initial anticoagulation therapy, which parallels a better prognosis among PE patients. From another perspective, several studies have demonstrated that anticoagulation improves the prognosis of HF patients, which may strengthen the protective effect. In addition, the SOFA, CCI, and APS III scores all displayed relatively high predictive importance in the SVM model for predicting short-term mortality of PEpHF. Correspondingly, these scoring systems have generally been regarded as prognostic factors among ICU patients, including critically ill patients with PE and HF.^31–33

For hematological indicators, Venetz et al³⁴ demonstrated that an elevated WBC count is an independent predictor of adverse clinical outcomes and accounts for left ventricular dysfunction. Furthermore, a WBC count > 12.6 × 10⁹/L was found to be associated with a higher risk of 30-day mortality in a large cohort with acute PE. A growing number of animal and autopsy studies have revealed that the efflux of neutrophils, monocytes, and lymphocytes caused by chemokine synthesis in the right ventricle could cause cardiomyocyte injury and necrosis by releasing enzymes, chemokines, and other pro-inflammatory cytokines, which would lead to fibrosis development and irreversible RV dysfunction.^35,36 Neutrophil extracellular traps (NETs) and neutrophil-derived microparticles (NMP), both produced by neutrophils, are central to the thrombotic process, promoting platelet adhesion and the coagulation cascade. Correspondingly, WBC count played a significant role in the different ML models of our study, and its elevation may be a marker of hypercoagulability.³⁷ Moreover, Kostev et al demonstrated that anemia was associated with an odds ratio of 1.34 for in-hospital mortality in hospitalized PE patients,^38,39 and another study demonstrated that anemia may be a reliable indicator of both short- and long-term all-cause mortality as well as the rates of all-cause HF events.⁴⁰ Consistent with this, we found that decreased hemoglobin was an independent risk factor for PEpHF in the ICU.

Venous thromboembolism (VTE) can often present as the first symptom of malignancy. Cancer-associated PE is one of the most important risk factors contributing to cancer mortality.⁴¹ The Kaplan-Meier analysis of a retrospective study showed significantly higher mortality in cancer-associated PE patients than in non-cancer patients.⁴² Another study revealed that metastatic cancer was a significant prognostic factor of long-term mortality in cancer patients with PE.⁴³ Moreover, patients with HF showed higher cancer-related mortality than expected for the general population.⁴⁴ Another comorbidity, PVD, has also been found to be associated with mortality as well as a twofold increase in the prevalence of rehospitalization in patients with HF or PE.^45,46 In particular, peripheral artery disease (PAD) has been reported to be associated with a worse overall prognosis and all-cause mortality in HF patients, which highlights its importance as a comorbidity.^47,48 Additionally, among PAD patients with HF, impaired physical function may relate to skeletal muscle abnormalities or a mixture of central and peripheral hemodynamic abnormalities,⁴⁸ especially in HFpEF, which may further reduce cardiovascular fitness and negatively affect patient prognosis. Taken together, the comorbidities of cancer and PVD are reliable predictors, which is consistent with the direction of the SHAP values in our study.

Mechanical ventilation, one of the most commonly used life support strategies in the ICU, is mainly used to support patients with respiratory insufficiency due to serious illness. Ventilator-associated pneumonia (VAP), ventilator-associated lung injury (VALI), and neuromuscular asthenia, as common complications of mechanical ventilation, may also worsen the prognosis during the ICU stay. In our study, its use also emerged as an adverse factor contributing to mortality risk. A decreased UO may indicate low renal perfusion and consequent fluid overload, which in turn contributes to subsequent organ dysfunction.⁴⁹ Several studies have demonstrated that UO is another haemodynamic marker that reflects haemodynamic stability in patients with RV failure.⁵⁰ For elderly patients with HF, decreased 24 h UO often indicates severe impairment of cardiac function, especially when combined with renal insufficiency, leading to an increased risk of death.⁵¹ Furthermore, patients with severe PE are more likely to develop acute kidney injury (AKI); low urine output is associated with renal insufficiency and may indicate inadequate blood volume. From another perspective, oliguria has also been found to be associated with the occurrence of postoperative PE, which may aggravate the thrombus load through positive feedback.^52,53 Hence, it is necessary to closely observe changes in urine output. Correspondingly, we included it as a predictor, dichotomized at a threshold of 400 mL per day.

Clinical Prospect

As stated above, clinical prediction models of short-term mortality for PEpHF are still rare. To the best of our knowledge, we are the first to establish an ML model to better manage this group in clinical practice, and the model showed good predictive accuracy and overall net benefit. Our study may show significant potential to guide clinicians in making appropriate decisions on risk stratification and optimizing patient care to some extent. To better meet further clinical demand, we also developed an online web app, available at https://fkd9fq-zhenli-li.shinyapps.io/predict (Figure 5). This study has some limitations. First, a relatively limited number of patients were included because of the single-center nature of the study. Hence, to improve generalizability, further external validation in patients from other clinical centers is required to increase confidence in the results. Second, ICD codes were the only criteria for case selection, which may influence the quality of the patients included in the study and cause bias. Third, because the data come from only one database and the majority of patients were white, potential bias may exist. Finally, some variables, such as D-dimer and CRP, could not be included because of severe missingness. Nevertheless, we believe that the proposed model may contribute to further understanding of the mortality of PEpHF.

Conclusion

This study developed and validated an ML-based tool to predict 30-day mortality in PEpHF in the ICU. Among the six ML algorithms compared, the SVM model demonstrated the best predictive performance after validation, outperforming traditional scoring systems such as PESI and sPESI. It was also validated in the eICU-CRD database and showed relatively good performance. SHAP analysis further elucidated the impact of key features on the model's predictions from global and local perspectives. However, additional external validation is still needed to confirm the model and explore its applicability across broader patient populations. Accordingly, we have developed an online tool for further external validation and clinical use. In conclusion, our study highlights, to some extent, the potential of ML methods for predicting short-term mortality risk for PEpHF in the ICU, offering a potential aid to clinical decision-making.

Supplemental Material

Supplemental material, sj-docx-1-cat-10.1177_10760296241304764 and sj-xlsx-2-cat-10.1177_10760296241304764, for Interpretable Machine Learning Approach for Predicting 30-Day Mortality of Critical Ill Patients with Pulmonary Embolism and Heart Failure: A Retrospective Study by Jing Liu, Ruobei Li, Tiezhu Yao, Guang Liu, Ling Guo, Jing He, Zhengkun Guan, Shaoyan Du, Jingtao Ma and Zhenli Li in Clinical and Applied Thrombosis/Hemostasis

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Foundation of Hebei Provincial Department of Science and Technology (grant number 223777114D).

Supplemental Material

Supplemental material for this article is available online.

References

1. Zhu, Yuan, Wei, et al. Development and validation of a simple nomogram for predicting the short-term prognosis of patients with pulmonary embolism. Heart Lung. 2023;57:144-151. DOI: https://doi.org/10.1016/j.hrtlng.2022.09.010

2. Barca-Hernando, Jara-Palomares. Pulmonary embolism: A practical approach to update risk stratification and treatment decisions based on the guidelines. Expert Rev Respir Med. 2023;17(12):1151-1158. DOI: https://doi.org/10.1080/17476348.2023.2298826

3. Chowdhury, Brown, Kasarabada. Risk stratification of pulmonary embolism. Curr Opin Pulm Med. 2023;29(5):363-369. DOI: https://doi.org/10.1097/MCP.0000000000000998

4. Aujesky, Obrosky, Stone, et al. Derivation and validation of a prognostic model for pulmonary embolism. Am J Respir Crit Care Med. 2005;172(8):1041-1046. DOI: https://doi.org/10.1164/rccm.200506-862OC

5. Dean, Abraham. Venous thromboembolic disease in congestive heart failure. Congest Heart Fail. 2010;16(4):164-169. DOI: https://doi.org/10.1111/j.1751-7133.2010.00148.x

6. Chong, Lip. Viewpoint: The prothrombotic state in heart failure: A maladaptive inflammatory response? Eur J Heart Fail. 2007;9(2):124-128. DOI: https://doi.org/10.1016/j.ejheart.2006.05.009

7. Wadhwa, Gutta, Trivedi, et al. In-hospital mortality benefit of inferior vena cava filters in patients with pulmonary embolism and congestive heart failure. AJR Am J Roentgenol. 2018;211(3):672-676. DOI: https://doi.org/10.2214/AJR.17.19332

8. Darze, Latado, Guimaraes, et al. Incidence and clinical predictors of pulmonary embolism in severe heart failure patients admitted to a coronary care unit. Chest. 2005;128(4):2576-2580. DOI: https://doi.org/10.1378/chest.128.4.2576

9. Wang, Liu, et al. Prolonged elevated heart rate is association with adverse outcome in severe pulmonary embolism: A retrospective study. Int J Cardiol. 2024;417:132581. DOI: https://doi.org/10.1016/j.ijcard.2024.132581

10. Tzourtzos, Lakkas, Katsouras. Right ventricular longitudinal strain-related indices in acute pulmonary embolism. Medicina (Kaunas). 2024;60(10):1586. DOI: https://doi.org/10.3390/medicina60101586

11. Zhan, Che. A prognostic prediction model for acute pulmonary embolism. J Investig Med. 2024;72(8):930-937. DOI: https://doi.org/10.1177/10815589241283739

12. Charach, Spitzer, Zusmanovitch, Charach. Lymphocyte to white blood cell count ratio an independent risk factor for heart failure. Life (Basel). 2024;14(10):1266. DOI: https://doi.org/10.3390/life14101266

13. Guzel, Kilic. Novel predictors of prognosis in heart failure with reduced ejection fraction and the presence of spontaneous echo contrast: Shock index, modified shock index and age-adjusted shock index. Postepy Kardiol Interwencyjnej. 2024;20(3):294-301. DOI: https://doi.org/10.5114/aic.2024.141958

14. Guan, Wang, Sun. Development and validation of a nomogram model for all-cause mortality risk in patients with chronic heart failure and atrial fibrillation. BMC Geriatr. 2024;24(1):470. DOI: https://doi.org/10.1186/s12877-024-05059-1

15. Ding, Liu, Zhang, et al. Development and external validation of a nomogram for predicting short-term prognosis in patients with acute pulmonary embolism. Int J Cardiol. 2024;407:132065. DOI: https://doi.org/10.1016/j.ijcard.2024.132065

16. Shi, Zeng, Chen. A nomogram for predicting short-term mortality in ICU patients with coexisting chronic obstructive pulmonary disease and congestive heart failure. Respir Med. 2024;234:107803. DOI: https://doi.org/10.1016/j.rmed.2024.107803

17. Hong, Liu, et al. Development of a machine learning-based predictive model for lung metastasis in patients with Ewing sarcoma. Front Med (Lausanne). 2022;9:807382. DOI: https://doi.org/10.3389/fmed.2022.807382

18. Zhou, Wang, Yang, Shi. Federated-learning-based prognosis assessment model for acute pulmonary thromboembolism. BMC Med Inf Decis Mak. 2024;24(1):141. DOI: https://doi.org/10.1186/s12911-024-02543-x

19. Teodoru, Negrea, Cozgarea, Cozma, Boicean. Enhancing pulmonary embolism mortality risk stratification using machine learning: The role of the neutrophil-to-lymphocyte ratio. J Clin Med. 2024;13(5):1191. DOI: https://doi.org/10.3390/jcm13051191

20. Liu, et al. Establishment of machine learning-based tool for early detection of pulmonary embolism. Comput Methods Programs Biomed. 2024;244:107977. DOI: https://doi.org/10.1016/j.cmpb.2023.107977

21. Hariharan, Takayesu, Kabrhel. Association between the pulmonary embolism severity index (PESI) and short-term clinical deterioration. Thromb Haemost. 2011;105(04):706-711. DOI: https://doi.org/10.1160/TH10-09-0577

22. Azodi, Tang, Shiu. Opening the black box: Interpretable machine learning for geneticists. Trends Genet. 2020;36(6):442-455. DOI: https://doi.org/10.1016/j.tig.2020.03.005

23. Johnson, ABL, Pollard, Horng, Celi, Mark. MIMIC-IV (version 2.2). PhysioNet. 2023. DOI: https://doi.org/10.13026/6mm1-ek67

24. Pollard, Johnson, Raffa, et al. Research database (version 2.0). PhysioNet. DOI: https://doi.org/10.13026/C2WM1R

25. et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: A prospective multicenter cohort study. EClinicalMedicine. 2024;68:102409. DOI: https://doi.org/10.1016/j.eclinm.2023.102409

26. Alkhathlan, Saudagar, AKJ. Predicting and classifying breast cancer using machine learning. J Comput Biol. 2022;29(6):497-514. DOI: https://doi.org/10.1089/cmb.2021.0236

27. Deng, Chen, et al. Machine learning for early prediction of sepsis-associated acute brain injury. Front Med (Lausanne). 2022;9:962027. DOI: https://doi.org/10.3389/fmed.2022.962027

28. Pang, Ouyang, Liu, Tang. Establishment of ICU mortality risk prediction models with machine learning algorithm using MIMIC-IV database. Diagn (Basel). 2022;12(5):1068. DOI: https://doi.org/10.3390/diagnostics12051068

29. El-Bouri, Sanders, Lip GYH, investigators B-V. Predicting acute and long-term mortality in a cohort of pulmonary embolism patients using machine learning. Eur J Intern Med. 2023;118:42-48. DOI: https://doi.org/10.1016/j.ejim.2023.07.012

30. Zuin, Bikdeli, Ballard-Hernandez, et al. International clinical practice guideline recommendations for acute pulmonary embolism: Harmony, dissonance, and silence. J Am Coll Cardiol. 2024;84(16):1561-1577. DOI: https://doi.org/10.1016/j.jacc.2024.07.044

31. Liu, Meng, et al. Prognostic accuracy of the serum lactate level, the SOFA score and the qSOFA score for mortality among adults with sepsis. Scand J Trauma Resusc Emerg Med. 2019;27(1):51. DOI: https://doi.org/10.1186/s13049-019-0609-3

32. Weng, Yang, et al. Development and validation of a nomogram to predict the mortality risk in elderly patients with ARF. PeerJ. 2021;9:e11016. DOI: https://doi.org/10.7717/peerj.11016

33. Raj, Chandna, Doukas, et al. Combined use of Wells scores and D-dimer levels for the diagnosis of deep vein thrombosis and pulmonary embolism in COVID-19: A retrospective cohort study. Cureus. 2021;13(9):e17687. DOI: https://doi.org/10.7759/cureus.17687

34. Venetz, Labarere, Jimenez, Aujesky. White blood cell count and mortality in patients with acute pulmonary embolism. Am J Hematol. 2013;88(8):677-681. DOI: https://doi.org/10.1002/ajh.23484

35. Watts, Gellar, Obraztsova, Kline, Zagorski. Role of inflammation in right ventricular damage and repair following experimental pulmonary embolism in rats. Int J Exp Pathol. 2008;89(5):389-399. DOI: https://doi.org/10.1111/j.1365-2613.2008.00610.x

36. Begieneman, van de Goot FR, van der Bilt IA, et al. Pulmonary embolism causes endomyocarditis in the human heart. Heart. 2008;94(4):450-456. DOI: https://doi.org/10.1136/hrt.2007.118638

37. Obradovic, Dzudovic, Subotic, et al. Association of blood leukocytes and hemoglobin with hospital mortality in acute pulmonary embolism. J Clin Med. 2023;12(19):6269. DOI: https://doi.org/10.3390/jcm12196269

38. Kostev, Laduch, Scheimann, et al. Mortality rate and factors associated with in-hospital mortality in patients hospitalized with pulmonary embolism in Germany. J Thrombosis Thrombolysis. 2024;57(7):1154-1162. DOI: https://doi.org/10.1007/s11239-024-03036-4

39. Ruan, Qiu, Chen. The association between mean corpuscular hemoglobin concentration and prognosis in patients with acute pulmonary embolism: A retrospective cohort study. Clin Appl Thrombosis/Hemostasis: Official J Int Acad Clin Appl Thromb. 2022;28:10760296221103867. DOI: https://doi.org/10.1177/10760296221103867

40. Mirghani, Alshreef, Al-Teman, et al. Updates on the association between anemia and heart failure: A systematic review. Cureus. 2024;16(9):e69101. DOI: https://doi.org/10.7759/cureus.69101

41. Streiff, Holmstrom, Angelini, et al. Cancer-associated venous thromboembolic disease, version 2.2021, NCCN clinical practice guidelines in oncology. J Natl Compr Cancer Netw. 2021;19(10):1181-1201. DOI: https://doi.org/10.6004/jnccn.2021.0047

42. Nakano, Adachi, Imai, et al. Mortality, recurrent thromboembolism and major bleeding in cancer-associated and non-cancer pulmonary embolism patients treated with direct oral anticoagulants. Circ J: Official J Jpn Circ Soc. 2024;88(2):243-250. DOI: https://doi.org/10.1253/circj.CJ-20-1247

43. Nishikawa, Fujita, Morishima, et al. Prognostic effect of incidental pulmonary embolism on long-term mortality in cancer patients. Circ J: Official J Jpn Circ Soc. 2024;88(2):198-204. DOI: https://doi.org/10.1253/circj.CJ-20-1160

44. Sagastagoitia-Fornie, Barge-Caballero, et al. Cancer in patients with heart failure: Incidence, risk factors and prognostic impact. Eur J Intern Med. 2022;105:89-96. DOI: https://doi.org/10.1016/j.ejim.2022.09.008

45. Sandesara, Hammadah, Samman-Tahhan, Kelli, O’Neal. Peripheral artery disease and risk of adverse outcomes in heart failure with preserved ejection fraction. Clin Cardiol. 2017;40(9):692-696. DOI: https://doi.org/10.1002/clc.22716

46. Fabbian, Gallerani, Pala, et al. In-hospital mortality for pulmonary embolism: Relationship with chronic kidney disease and end-stage renal disease. The hospital admission and discharge database of the Emilia Romagna region of Italy. Intern Emerg Med. 2013;8(8):735-740. DOI: https://doi.org/10.1007/s11739-012-0892-8

47. Wei, Qian, Fang, Wang. The prognostic value of peripheral artery disease in heart failure: Insights from a meta-analysis. Heart Lung Circ. 2016;25(12):1195-1202. DOI: https://doi.org/10.1016/j.hlc.2016.04.002

48. Nakamura, Kunii, Yoshihisa, et al. Impact of peripheral artery disease on prognosis in hospitalized heart failure patients. Circ J. 2015;79(4):785-793. DOI: https://doi.org/10.1253/circj.CJ-14-1280

49. Shen, Cai, Chen, Yan. Fluid intake-related association between urine output and mortality in acute respiratory distress syndrome. Respir Res. 2020;21(1):24. DOI: https://doi.org/10.1186/s12931-020-1286-5

50. Gallet, Meyer, Ternacle, et al. Diuretic versus placebo in normotensive acute pulmonary embolism with right ventricular enlargement and injury: A double-blind randomised placebo controlled study. Protocol of the diper study. BMJ Open. 2015;5(5):e007466. DOI: https://doi.org/10.1136/bmjopen-2014-007466

51. Wang, Huang, Zhang, et al. A novel web-based calculator to predict 30-day all-cause in-hospital mortality for 7,202 elderly patients with heart failure in ICUs: A multicenter retrospective cohort study in the United States. Front Med (Lausanne). 2023;10:1237229. DOI: https://doi.org/10.3389/fmed.2023.1237229

52. Liang, Chen, Wang, et al. Intraoperative hypotension, oliguria and operation time are associated with pulmonary embolism after radical resection of head and neck cancers: A case control study. BMC Anesth. 2021;21(1):304. DOI: https://doi.org/10.1186/s12871-021-01521-4

53. Miao, et al. Machine learning for predicting hemodynamic deterioration of patients with intermediate-risk pulmonary embolism in intensive care unit. Shock. 2024;61(1):68-75. DOI: https://doi.org/10.1097/SHK.0000000000002261
