Abstract
Background
Heart failure (HF) is a primary contributor to morbidity and mortality among patients in intensive care units (ICUs), particularly those experiencing chronic critical illness (CCI). This study aims to develop and validate a machine learning (ML) model for predicting in-hospital mortality in CCI patients with HF.
Methods
Retrospective data from over 200 hospitals were sourced from the Medical Information Mart for Intensive Care III (MIMIC-III), MIMIC-IV, and the eICU Collaborative Research Database (eICU-CRD). Only patients diagnosed with both CCI and HF were included. The MIMIC datasets served as the derivation cohort, while the eICU-CRD dataset was used for external validation. Key predictive variables were identified through recursive feature elimination. A range of ML algorithms, including random forest, K-nearest neighbors, and support vector machine (SVM), were evaluated alongside four other models. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC). Model interpretability was enhanced through SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations.
Results
A total of 780 and 610 patients with CCI and HF were assigned to the derivation and validation cohorts, respectively. Eleven features were selected for model development. The SVM model demonstrated substantial predictive accuracy, with AUROC values of 0.781 and 0.675 in the derivation and validation cohorts. Feature importance analysis using SHAP identified Sequential Organ Failure Assessment score, oxyhemoglobin saturation, and blood pressure as key predictors.
Conclusion
The SVM model developed reliably predicts in-hospital mortality in patients with CCI and HF, offering a valuable tool for early intervention and enhanced patient management.
Introduction
Heart failure (HF), characterized by impaired cardiac function, represents a significant cardiovascular health challenge of the 21st century, with both high mortality and widespread morbidity. 1 As a leading cause of cardiovascular death globally, HF affects over 64 million individuals worldwide, including approximately 9 million in China.2,3 Furthermore, the aging population and advances in diagnostic technologies are anticipated to further escalate HF prevalence. 4 Despite therapeutic advancements, HF patients, particularly those in intensive care units (ICUs), continue to face considerable challenges, such as extended ICU stays, high healthcare costs, and increased mortality, placing considerable strain on global healthcare systems.5–7
In 1985, Girard introduced the term chronic critical illness (CCI) to describe patients requiring prolonged ICU care. 8 The prevalence of CCI is approximately 17%, with affected patients often undergoing rapid and severe clinical deterioration. These patients exhibit in-hospital mortality rates of 28% and 1-year mortality rates of 45%. 9 The costs associated with CCI care are six times higher than those for non-CCI patients. 10 The consequences of CCI extend beyond the individual patient, affecting families, healthcare systems, and societal structures. Moreover, the relationship between HF and CCI is intricate, as HF frequently leads to CCI and worsens patient outcomes. 11
The families of patients with CCI and HF often confront substantial prognostic uncertainty, which places considerable pressure on medical resources and complicates clinical decision-making. 12 Enhancing the accuracy of in-hospital mortality predictions for these patients could improve prognostic discussions, support collaborative decision-making, and optimize patient care. This emphasizes the pressing need for more reliable predictive tools for this patient population.
The potential of machine learning (ML) models to predict HF mortality is well-established, though their practical implementation is still developing.13,14 ML models excel at processing large datasets with numerous variables, making them particularly advantageous in the evaluation of complex conditions such as CCI and HF, which are characterized by multiple clinical factors that complicate prognosis. In environments with extensive data, where numerous interrelated variables must be considered, ML's integrated approach enhances clinical decision-making support.15,16 Despite these benefits, the complexity of ML models presents significant challenges regarding interpretability, a vital factor for their successful integration into medical practice. 17 To address this, methods like SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) have been developed, offering greater transparency by clarifying the contribution of individual features to model predictions.18,19
Current predictive models often overlook the specific requirements of patients with both CCI and HF, as they predominantly target general ICU populations or focus on individual conditions.20,21 Although traditional ICU and HF severity-of-illness scores and ML predictive models are well-established, they often fail to account for the persistent organ dysfunction that characterizes CCI, particularly when compounded by HF. Therefore, early identification and precise prognostication in this combined CCI and HF population is imperative to guide resource allocation, optimize interventions, and ultimately improve patient outcomes. This study seeks to develop an ML model capable of accurately forecasting in-hospital mortality in this patient cohort. The resulting model will offer valuable assistance to healthcare providers in evaluating disease severity and prognosis.
Method
Data sources and study population
In this study, primary data were drawn from the Medical Information Mart for Intensive Care (MIMIC) and the eICU Collaborative Research Database (eICU-CRD). The MIMIC datasets, including MIMIC-III “CareVue” (version 1.4, 2001–2008) and MIMIC-IV (version 2.2, 2008–2019), offer comprehensive clinical information from patients admitted to the Beth Israel Deaconess Medical Center.22,23 The eICU-CRD, which served as the validation cohort, includes de-identified records from over 200,000 ICU admissions across more than 200 US hospitals (2014–2015). 24
Participants included in this study were adults (>18 years) diagnosed with CCI and HF during their initial ICU admission. Although a standardized definition of CCI remains absent, the study defined it as an ICU stay ≥14 days accompanied by persistent organ dysfunction, indicated by a cardiovascular Sequential Organ Failure Assessment (SOFA) score ≥1 or a SOFA score ≥2 for any other organ system, assessed using the most critical measurements on day 14. These criteria align with recent studies.9,25–27 Figure 1 provides the study flowchart. Access to these databases was authorized for researcher Min He (record ID: 57369428).
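The inclusion rule above can be written as a small predicate. The sketch below is illustrative only: the `Day14Record` type, its field names, and the organ-system keys are hypothetical and do not correspond to actual MIMIC or eICU-CRD column names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Day14Record:
    """Hypothetical container for one patient's status on ICU day 14."""
    icu_stay_days: float
    sofa_subscores: Dict[str, int]  # organ system -> SOFA sub-score (0-4)


def meets_cci_definition(record: Day14Record) -> bool:
    """Apply the study's CCI criteria: ICU stay >= 14 days plus persistent
    organ dysfunction (cardiovascular SOFA >= 1 or any other organ SOFA >= 2)."""
    if record.icu_stay_days < 14:
        return False
    cardiovascular = record.sofa_subscores.get("cardiovascular", 0)
    other_systems = [
        score for system, score in record.sofa_subscores.items()
        if system != "cardiovascular"
    ]
    return cardiovascular >= 1 or any(score >= 2 for score in other_systems)


# Example: a 16-day stay with a renal sub-score of 2 satisfies the definition.
example = Day14Record(icu_stay_days=16, sofa_subscores={"cardiovascular": 0, "renal": 2})
print(meets_cci_definition(example))
```

Note the asymmetry in the thresholds: a cardiovascular sub-score of 1 is sufficient, whereas any other organ system must reach 2.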

Flowchart of the study.
Data extraction
Data from patients diagnosed with CCI and HF during their ICU stay were collected, focusing on laboratory data obtained from the 14th day post-admission. The dataset included demographic details, SOFA scores, laboratory results, and comorbid conditions. Diagnoses were categorized according to the International Classification of Diseases, 9th and 10th editions. Key variables included age, gender, body mass index (BMI), heart rate (HR), systolic blood pressure (SBP), diastolic blood pressure (DBP), mean blood pressure (MBP), oxyhemoglobin saturation (SpO2), urine output, and laboratory tests evaluating blood counts, electrolyte levels, and liver and kidney function. Missing values were addressed using the K-nearest neighbors (KNN) imputation method, which has been shown to outperform other techniques in managing missing data in cardiovascular cohort studies. 28 Variables with more than 50% missing data, as well as patients aged over 89 years, were excluded. The primary outcome of the study was in-hospital mortality.
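To make the imputation step concrete, the following is a minimal, simplified sketch of KNN imputation in plain Python. It is not the implementation used in the study, which does not report the software or the number of neighbors; the function names, the choice of `k`, and the toy data are assumptions for illustration. A missing value is filled with the mean of that variable among the `k` most similar patients, where similarity is measured only over variables observed in both records.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def _distance(a: Row, b: Row) -> float:
    """Root-mean-square difference over features observed in both rows.
    Returns infinity when the two rows share no observed feature."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows: List[Row], k: int = 2) -> List[Row]:
    """Fill each missing entry with the mean of that feature among the k
    nearest rows that have the feature observed. Entries with no usable
    donor are left as None."""
    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if math.isfinite(d[0])]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Toy data (hypothetical, already on a comparable scale): the last row is
# missing its second feature and resembles the first two rows.
toy = [[1.0, 10.0], [1.1, 12.0], [5.0, 50.0], [1.05, None]]
imputed = knn_impute(toy, k=2)
print(imputed[3])
```

Because distances depend on the scale of each variable, this kind of imputation is usually applied to variables that have been brought onto comparable scales.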
Statistical analysis
Baseline characteristics of the included patients were summarized using standard descriptive statistics. Normality of continuous variables was assessed with the Shapiro–Wilk test. Normally distributed variables were presented as mean ± standard deviation and compared using
Study design
The study population was split into a training set (70%) and a testing set (30%). Continuous variables were standardized to a mean of 0 and a standard deviation of 1 to improve model stability and comparability. Variables with no variance or high multicollinearity were excluded. Feature selection was then conducted using the recursive feature elimination with 10-fold cross-validation (RFECV) method, applying the random forest (RF) algorithm. This process iteratively ranked and pruned features based on accuracy until the optimal set of features was determined.
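The split and standardization steps can be sketched as follows. This is a minimal illustration, assuming that the scaling parameters are estimated on the training set and then reused for the testing set (the text does not state this explicitly); the function names and the random seed are hypothetical.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def split_train_test(items: Sequence, train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[list, list]:
    """Shuffle the items reproducibly and split them into training/testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_standardizer(train_values: Sequence[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation from the training data only."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)


def standardize(values: Sequence[float], mean: float, sd: float) -> List[float]:
    """Rescale values to z-scores; sd must be non-zero, which holds once
    zero-variance variables have been excluded."""
    return [(v - mean) / sd for v in values]


# Hypothetical ages for ten patients.
ages = [55.0, 61.0, 64.0, 68.0, 70.0, 72.0, 75.0, 79.0, 83.0, 88.0]
train_ages, test_ages = split_train_test(ages)
mean, sd = fit_standardizer(train_ages)
print(len(train_ages), len(test_ages))
print(standardize(test_ages, mean, sd))
```

Reusing the training-set mean and standard deviation for the testing set keeps information from held-out patients out of the preprocessing step.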
A range of ML algorithms, including RF, KNN, support vector machine (SVM), extreme gradient boosting (XGBoost), Naive Bayes (NB), light gradient boosting machine (LGBM), and adaptive boosting (AdaBoost), were assessed for developing predictive models. Hyperparameter optimization was performed using GridSearchCV within a 10-fold cross-validation (CV) framework. Model performance was evaluated based on area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, and F1 score. Decision curve analysis (DCA) was employed to assess clinical utility, and calibration curves were generated to compare predicted probabilities with observed outcomes. External validation in an independent cohort was used to assess the model's generalizability. Additionally, we compared the performance of our model with the widely used SOFA score, calculated on day 14 of ICU admission. We calculated the AUROC for in-hospital mortality prediction based on the SOFA score in both the derivation and validation cohorts, and then compared these AUROCs with those of the ML models. Finally, SHAP and LIME methods were applied to analyze the model's outputs. SHAP, based on game theory, was used to evaluate feature importance and visualize its impact on predictions, while LIME provided localized explanations for individual predictions, enhancing model transparency.
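Because AUROC is the primary performance measure, a short sketch of what it quantifies may help. The AUROC equals the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function below computes it directly from that definition; the labels and scores are made-up values, not study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.
    labels: 1 = died in hospital, 0 = survived; scores: predicted risk."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three of the four (non-survivor, survivor) pairs are
# ranked correctly, so the AUROC is 0.75.
print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))
```

The same function applies to any single continuous score, for example a day-14 SOFA score used as the only predictor.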
Results
Baseline characteristics
A total of 780 patients with CCI and HF were included in the derivation cohort (451 males [57.8%]), while 610 patients with CCI and HF comprised the validation cohort (343 males [56.2%]). Patients were then categorized into survival and non-survival groups based on their outcomes at discharge. Table 1 presents the baseline characteristics of the derivation cohort, stratified by survival outcome, with an in-hospital mortality rate of 29.5% (
Baseline characteristics of the derivation cohort.
SOFA: Sequential Organ Failure Assessment; BMI: body mass index; HR: heart rate; RR: respiratory rate; SBP: systolic blood pressure; DBP: diastolic blood pressure; MBP: mean blood pressure; SpO2: oxyhemoglobin saturation; pO2: partial pressure of oxygen; pCO2: partial pressure of carbon dioxide; RBC: red blood cell; MCH: mean corpuscular hemoglobin; MCHC: mean corpuscular hemoglobin concentration; MCV: mean corpuscular volume; RDW: red blood cell distribution width; WBC: white blood cell; PT: prothrombin time; PTT: partial thromboplastin time; INR: international normalized ratio; BUN: blood urea nitrogen; AF: atrial fibrillation; AKI: acute kidney injury; CHD: coronary heart disease; CKD: chronic kidney disease; COPD: chronic obstructive pulmonary disease; MI: myocardial infarction; RF: respiratory failure.
Features selected and model performance
The RFECV method identified 11 key predictors from the training dataset, which yielded the highest accuracy (Figure 2). These predictors include: SOFA score, SpO2, DBP, SBP, BUN level, age, RDW, BMI, urine output, pH, and platelet count. Subsequently, predictive models were developed using seven ML algorithms based on these 11 predictors: RF, KNN, SVM, XGBoost, NB, LGBM, and AdaBoost. Hyperparameter tuning was performed using GridSearchCV within a 10-fold CV framework to optimize model performance, as detailed in Supplemental Material 3: Table S3. For the SVM model, the optimized hyperparameters were: cost = 26.68, degree = 1, gamma = 23.36, kernel = polynomial, and type = C-classification.
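One consequence of the reported SVM hyperparameters is worth spelling out. Assuming the common LIBSVM-style polynomial kernel, K(u, v) = (gamma * <u, v> + coef0)^degree, a degree of 1 makes the kernel an affine function of the dot product, so the fitted decision boundary is linear in the standardized features. The kernel form and the `coef0` default of 0 are assumptions here, since the text reports neither the implementation nor a `coef0` value.

```python
from typing import Sequence


def polynomial_kernel(u: Sequence[float], v: Sequence[float],
                      gamma: float, degree: int, coef0: float = 0.0) -> float:
    """LIBSVM-style polynomial kernel: (gamma * <u, v> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# With degree = 1 and coef0 = 0 the kernel is just a rescaled dot product,
# so it is additive in its second argument, like a linear kernel.
u, v, w = [1.0, 2.0], [3.0, 4.0], [0.5, -1.0]
v_plus_w = [a + b for a, b in zip(v, w)]
lhs = polynomial_kernel(u, v_plus_w, gamma=23.36, degree=1)
rhs = (polynomial_kernel(u, v, gamma=23.36, degree=1)
       + polynomial_kernel(u, w, gamma=23.36, degree=1))
print(abs(lhs - rhs) < 1e-9)
```

Under these assumptions, gamma acts only as a scaling factor on the dot product rather than adding non-linear flexibility.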

Results of the recursive feature elimination with 10-fold cross-validation method for identifying the predictive features.
In the derivation cohort, the SVM model demonstrated strong performance, achieving an AUROC of 0.781 (95% CI: 0.712–0.863), accuracy of 0.748, sensitivity of 0.739, specificity of 0.691, and an F1 score of 0.613 (Table 2, Figure 3A). In the validation cohort, its performance remained effective, with an AUROC of 0.675 (95% CI: 0.617–0.790), accuracy of 0.645, sensitivity of 0.607, specificity of 0.656, and an F1 score of 0.443 (Table 2, Figure 3B). Among the six alternative models (RF, KNN, LGBM, NB, AdaBoost, and XGBoost), the AUROCs were 0.759 (95% CI: 0.640–0.785), 0.759 (95% CI: 0.583–0.804), 0.744 (95% CI: 0.619–0.789), 0.751 (95% CI: 0.659–0.792), 0.721 (95% CI: 0.609–0.761), and 0.682 (95% CI: 0.553–0.710) in the derivation cohort (Table 2 and Figure 3A), and 0.643 (95% CI: 0.567–0.708), 0.646 (95% CI: 0.554–0.721), 0.615 (95% CI: 0.543–0.687), 0.638 (95% CI: 0.576–0.678), 0.661 (95% CI: 0.599–0.773), and 0.616 (95% CI: 0.598–0.784) in the validation cohort (Table 2). The SVM algorithm consistently outperformed the other models across both cohorts. DCA of the SVM model in both cohorts (Supplemental material 4: Figure S1) highlighted substantial clinical benefit across a range of decision thresholds. Calibration curves (Supplemental material 5: Figure S2) further confirmed the SVM model's accuracy, with the model's predictions closely aligning with the ideal calibration curve. For comparison, the day-14 SOFA score yielded AUROCs of 0.664 (95% CI: 0.622–0.705) in the derivation cohort and 0.629 (95% CI: 0.574–0.683) in the validation cohort (Supplemental material 6: Figure S3), which were lower than those of the SVM model (0.781 and 0.675, respectively). Based on these results, the SVM model was chosen for further explainability analysis.

ROC curve of in-hospital mortality. (A) ROC curves comparing the seven machine learning models in the derivation cohort. (B) ROC curves of the validation cohort based on the SVM model. ROC: receiver operating characteristic curve; SVM: support vector machine.
Performance of the seven machine learning models in the derivation and validation cohorts.
RF: random forest; KNN: K-nearest neighbors; SVM: support vector machine; AUROC: area under the receiver operating characteristic curve; XGBoost: extreme gradient boosting; NB: Naive Bayes; LGBM: light gradient boosting machine; AdaBoost: adaptive boosting.
Explainability of SVM model
The testing set data were analyzed to assess the SHAP values within the SVM model. Figure 4A provides a summary plot illustrating the influence of each feature. Subsequent analysis of effect directionality revealed that elevated SOFA score, BUN level, and RDW correlated with increased mortality risk, while higher SpO2, DBP, and SBP were inversely related to mortality risk. Figure 4B ranks features by mean absolute SHAP value, showing that the features reflecting circulatory dysfunction (SOFA score, SpO2, DBP, and SBP) were the most influential. Partial dependence plots demonstrated the relationships between individual features and mortality risk, with positive correlations for SOFA score, BUN level, age, RDW, and platelet count, and negative correlations for SpO2, DBP, SBP, BMI, urine output, and pH (Supplemental material 7: Figure S4).

Interpretation of the support vector machine model. (A) Summary plot based on the SHapley Additive exPlanations (SHAP) values. A higher SHAP value of a feature suggests a higher risk contribution. Colors on the plot denote the magnitude of the feature values, wherein high values are indicated in yellow, while low values are shown in purple. (B) Importance ranking of the 11 identified features according to the mean (|SHAP value|).
SHAP and LIME methods were then applied to assess the impact of clinical features on prognostic outcomes in two patients with differing in-hospital outcomes (i.e., survival vs. non-survival). The SHAP waterfall plots used the mean prediction of the dataset for in-hospital mortality as a reference point, visually illustrating how each feature influenced prediction adjustments (Figure 5). For Patient 1, who died, the SHAP values (Figure 5A) indicated a 47.1% mortality risk, with contributing factors including a BMI of 18.6 kg/m², DBP of 46.9 mmHg, SOFA score of 7, SpO2 of 96.5%, SBP of 109 mmHg, urine output of 2856 ml, age of 71 years, and BUN level of 50 mg/dL. In contrast, a platelet count of 61 × 10⁹/L and a pH of 7.42 were associated with a lower mortality risk. The LIME results for Patient 1 showed a mortality probability of 48% (Supplemental material 8: Figure S5A). In Patient 2, who survived, SHAP analysis suggested a 75.5% survival probability, with low SOFA scores and high SpO2 levels contributing to a higher likelihood of survival (Figure 5B). Correspondingly, LIME results for Patient 2 indicated an 83% survival probability (Supplemental material 8: Figure S5B). To ensure compliance with guidelines for AI-based clinical research, the TRIPOD + AI checklist was completed (Supplemental material 9: Table S4).
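The additive logic behind a SHAP waterfall plot can be illustrated with an exact Shapley computation on a toy model. The sketch below is not the study's SVM and does not use the SHAP library: `toy_risk`, its coefficients, and the patient and reference values are all hypothetical. It also uses a single reference patient as the baseline, which simplifies the dataset-mean reference described above. Each feature's contribution is its average marginal effect over all orders in which features can be switched from the reference value to the patient's value.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values by enumerating all feature orderings.
    Feasible only for a handful of features (n! orderings)."""
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]  # switch feature i to the patient's value
            new_output = model(current)
            contributions[i] += new_output - previous_output
            previous_output = new_output
    return [c / len(orderings) for c in contributions]


def toy_risk(features: Sequence[float]) -> float:
    """Hypothetical risk score with an interaction term (not the study model)."""
    sofa, spo2, dbp = features
    return (0.20 + 0.03 * sofa - 0.01 * (spo2 - 95.0)
            - 0.002 * (dbp - 60.0) + 0.001 * sofa * (95.0 - spo2))


patient = [7.0, 92.0, 47.0]     # SOFA, SpO2 (%), DBP (mmHg); made-up values
reference = [4.0, 97.0, 60.0]   # made-up reference patient
phi = shapley_values(toy_risk, patient, reference)

# Additivity: reference output plus all contributions equals the prediction.
print(abs(toy_risk(reference) + sum(phi) - toy_risk(patient)) < 1e-9)
```

This additivity is what lets a waterfall plot start at the reference prediction and arrive at the individual prediction by stacking one bar per feature.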

Interpretation of the support vector machine model in two patients with distinct prognostic outcomes based on the SHapley Additive exPlanations (SHAP) method. (A) Patient 1, who died, was correctly predicted to die. (B) Patient 2, who survived, was correctly predicted to survive. The plots show the features that contribute most to shifting the model output. Arrows indicate whether mortality risk is elevated (yellow arrows) or reduced (red arrows).
Discussion
This study developed and validated seven ML models to predict in-hospital mortality in patients with CCI and HF, utilizing multicenter data from over 200 hospitals within the MIMIC and eICU-CRD databases. The RFECV method identified 11 key predictors, ranked by their SHAP values: SOFA score, SpO2, DBP, SBP, BUN level, age, RDW, BMI, urine output, pH, and platelet count. Among the models, the SVM exhibited the best performance. To enhance model interpretability, both SHAP and LIME methods were applied, providing insight into the decision-making processes of the SVM model.
CCI refers to a condition in which patients who recover from an initial acute illness continue to experience severe organ dysfunction, necessitating prolonged ICU care. 29 This ongoing condition places considerable strain on healthcare resources, particularly through increased ICU occupancy. Studies have reported elevated mortality rates among CCI patients with HF, with rates of 28.6% and 33.6% observed in Japan and China, respectively.27,30 In the current study, in-hospital mortality was approximately 30%, emphasizing the need for more accurate prognostic assessments.
The RFECV method in this study identified key predictors essential for assessing mortality risk in CCI patients with HF. As a result, the model incorporated only the most fundamental and common clinical prognostic factors. Despite significant differences between the cohorts, especially in the SOFA score, the model retained moderate discrimination, with AUROCs of 0.781 and 0.675 in the derivation and validation cohorts, respectively, suggesting a degree of generalizability across clinical settings. The use of SHAP and LIME techniques in this analysis mitigated the challenges of “black box” ML models, enhancing transparency. This clarity is critical for improving outcomes in CCI and HF patients.
The impact of clinical features on individual patient outcomes was assessed through detailed analyses of two patients from the testing set. The SHAP and LIME algorithms revealed how various clinical indicators influenced in-hospital mortality risk. For Patient 1 (Figure 5A), the ML model predicted a 47.1% likelihood of mortality, as determined by the SHAP method. Factors contributing to increased risk included BMI, DBP, SOFA score, SpO2, SBP, urine output, age, BUN level, and RDW, while higher platelet counts and improved pH levels helped lower the risk. The patient ultimately succumbed to the disease, confirming the model's prediction. In contrast, Patient 2 (Figure 5B) had a 75.5% survival probability based on SHAP results, driven by low SOFA scores and high SpO2 levels. The patient survived at discharge, validating the prediction. Similar results were observed with the LIME method, which corroborated the SHAP findings and provided additional insights into the directional effects of these features. These analyses enhanced clinical understanding and demonstrated the model's potential for broad clinical application, supporting earlier and more targeted interventions to improve patient outcomes (Supplemental material 8: Figure S5A-B).
This study identified a correlation between increased in-hospital mortality risk in patients with CCI and HF and markers of circulatory dysfunction, namely a higher SOFA score and lower SpO2, DBP, and SBP. The SOFA score, a widely used tool for assessing organ dysfunction, evaluates the severity of dysfunction across six systems: respiratory, circulatory, renal, hematologic, hepatic, and central nervous systems. It has been established as a reliable mortality predictor in HF patients and validated as an independent risk factor in the CCI population.11,31 In this study, the SOFA score demonstrated the strongest predictive value, holding the highest weight within the SVM model for CCI and HF patients. Additionally, previous research has shown that lower BP in HF patients is associated with poorer outcomes, primarily due to decreased cardiac output and impaired pump function. 32 Moreover, reduced SpO2 levels, indicative of worsening HF, significantly impacted patient prognosis. 33
Population aging contributes to increased morbidity and healthcare costs, particularly among older adults, who represent a substantial proportion of CCI cases. A previous study indicated that advanced age significantly heightened mortality risk for critically ill individuals with HF. 34 Our data align with this trend, revealing a notable difference in median age between non-surviving and surviving patients (74.0 years vs. 70.5 years,
Additionally, our study identified elevated BUN levels and reduced urine output as key predictors of in-hospital mortality in patients with CCI and HF. Low urine output, common among ICU patients, is strongly linked to poor outcomes due to its association with renal parenchymal damage. 36 Decreased renal function can lead to fluid retention, increased cardiac load, and aggravated HF symptoms. The interaction between progressive cardiovascular and renal dysfunction often intensifies CCI, creating a feedback loop that exacerbates the disease burden and contributes to cardiorenal syndrome. 37 Previous ML analyses have recognized BUN, as a marker of renal function, as a significant predictor of mortality in HF patients.14,38 These findings highlight the importance of comprehensive clinical assessments for mortality risk prediction through ML models.
The identified variables, including age, SOFA score, SpO2, BP, and BUN, exhibit statistically significant differences between survivors and non-survivors, yet their clinical value lies in their translation into actionable strategies. In the context of a patient's CCI diagnosis on day 14 of ICU admission, these variables should inform clinical decision-making. For instance, in Patient 1 (Figure 5A), who ultimately died, the model predicted a high mortality risk. The key contributors to this prediction included low BMI, low DBP, high SOFA score, elevated BUN, and moderately high urine output. BMI was the largest contributor to mortality risk in this patient, potentially indicating malnutrition and prompting the need for urgent nutritional intervention. The low DBP and high SOFA score suggest poor circulatory function and organ failure, warranting prompt vasopressor use, infection control, and hemodynamic optimization. A relatively low SpO₂ in this clinical context may reflect impaired gas exchange, requiring escalation in respiratory support. Furthermore, elevated BUN and altered urine output indicate possible early renal dysfunction, supporting the need for close renal monitoring and potential nephrology consultation. This patient-specific interpretation demonstrates how SHAP-based model outputs can guide personalized, targeted interventions to improve patient outcomes.
The lower AUROCs observed with the day-14 SOFA score indicate that a standard severity-of-illness index may be less suited for predicting in-hospital mortality in patients with CCI and HF than our SVM model. This result suggests that incorporating more detailed features can improve prognostic accuracy beyond generic ICU scoring systems. Though the SVM model's sensitivity (0.607) and F1 score (0.443) remain moderate, indicating a risk of missed high-risk patients and some misclassifications, it still outperformed the day-14 SOFA score in distinguishing high-risk patients. However, the drop in AUROC from 0.781 in the derivation cohort to 0.675 in the validation cohort indicates potential overfitting or dataset bias. Additionally, it should be noted that the enrolled HF patients in our study were exclusively those admitted to ICUs, and variability in ICU admission criteria among institutions, especially pronounced within the validation cohort (eICU-CRD) comprising more than 200 United States hospitals, may substantially impact model generalizability. Significant baseline differences, including higher SOFA scores and poorer arterial blood gas parameters, were observed in the validation cohort, indicating greater disease severity. Moreover, in-hospital mortality was notably higher in the derivation cohort (29.5%) compared to the validation cohort (22.8%), which may reflect differences in ICU admission criteria, patient management, and treatment effectiveness. Future research incorporating domain adaptation or recalibration techniques using detailed institutional and patient-level data may further enhance model robustness. As shown in Supplemental material 2: Table S2, there were significant differences in baseline characteristics between these cohorts, highlighting the challenge of applying a single SVM model across heterogeneous populations.
Despite this variability, DCA (Supplemental material 4: Figure S1) indicates a net clinical benefit over “treat all” or “treat none” strategies, and calibration curves (Supplemental material 5: Figure S2) suggest acceptable agreement between predicted and actual outcomes. In real-world settings, borderline predictions may warrant closer scrutiny, while integrating clinical context can mitigate misclassification. Broader validation in more diverse cohorts, continuous re-calibration, and CV are essential to enhance robustness. Such steps could further refine the model's reliability and clinical impact, ensuring it consistently outperforms standard ICU scoring systems for patients with CCI and HF.
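For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at a given threshold probability: the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold. The sketch below uses the standard net-benefit formula with made-up counts; it does not reproduce the curves in the supplemental figure, and the function names are hypothetical.

```python
def net_benefit(true_positives: int, false_positives: int,
                n_patients: int, threshold: float) -> float:
    """Net benefit of acting on model predictions at a threshold probability."""
    weight = threshold / (1.0 - threshold)
    return true_positives / n_patients - (false_positives / n_patients) * weight


def net_benefit_treat_all(events: int, n_patients: int, threshold: float) -> float:
    """Net benefit of treating every patient as high risk."""
    prevalence = events / n_patients
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up example: 100 patients, 35 deaths, threshold probability 0.2.
# The "treat none" strategy has a net benefit of 0 by definition.
model_nb = net_benefit(true_positives=30, false_positives=20, n_patients=100, threshold=0.2)
treat_all_nb = net_benefit_treat_all(events=35, n_patients=100, threshold=0.2)
print(model_nb > treat_all_nb > 0.0)
```

A model adds clinical value at a given threshold only when its net benefit exceeds both reference strategies, which is the comparison a decision curve displays across thresholds.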
To better understand the model's clinical safety and usability, we performed an error analysis in the validation cohort. Among 139 patients who died, 55 (39.6%) were misclassified as survivors (false negatives), often exhibiting intermediate SOFA scores and non-critical vital signs. Among 471 survivors, 162 (34.4%) were predicted to die (false positives), frequently associated with older age or mild laboratory abnormalities. False positives may lead to overtreatment, but false negatives pose greater clinical risk due to potential undertreatment. These findings underscore the importance of clinician oversight and appropriate threshold selection when applying ML predictions to critical care decision-making.
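The counts in this error analysis are enough to rebuild the confusion matrix and recompute the standard metrics. In the sketch below, the true-positive and true-negative counts are obtained by subtraction from the figures in the paragraph above (139 deaths with 55 false negatives; 471 survivors with 162 false positives); the function name is hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


deaths, false_negatives = 139, 55
survivors, false_positives = 471, 162
metrics = classification_metrics(
    tp=deaths - false_negatives,      # 84 deaths correctly flagged
    fn=false_negatives,
    fp=false_positives,
    tn=survivors - false_positives,   # 309 survivors correctly cleared
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values are close to, though not identical in every decimal to, the validation-cohort metrics reported in the Results. The precision term also shows that, at this operating point, most patients flagged as high risk survived, which is relevant when choosing an alert threshold.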
One of the most important next steps for applying this ML-based method in day-to-day clinical practice is integrating it with hospital-based electronic health records. Through these platforms, patient-specific information can directly power integrated ML functions, providing real-time updates on the likelihood of in-hospital mortality for patients with CCI and HF. A secure cloud infrastructure can host the model, which can be accessed via an application programming interface that gathers relevant ICU metrics and delivers continuous prognostic insights to the clinical team. Whenever the predicted risk surpasses predefined thresholds, immediate alerts would prompt specialized evaluations or more intensive care. This strategy optimizes patient-flow management and ensures consistent, transparent use of the model's outputs in routine critical care, ultimately establishing a direct route from predictive analysis to actionable plans that support improved decision-making.
This study has several limitations. First, as the data are retrospectively extracted from the MIMIC and eICU-CRD databases, selection and information biases cannot be fully ruled out. Additionally, the absence of cardiology-specific parameters (e.g., brain natriuretic peptide levels, detailed echocardiographic data, and New York Heart Association [NYHA] classifications) prevented us from distinguishing HF subtypes, such as HF with reduced ejection fraction or HF with preserved ejection fraction, and from incorporating severity grading into the analysis, which significantly affects prognosis and limits the model's clinical applicability. Consequently, the analysis focuses on more universally available laboratory indicators to ensure broader applicability within the ICU setting. Second, the study relies on laboratory data from the 14th day of ICU admission, which fails to capture the dynamic changes in disease progression or organ function over time. For example, dynamic measures such as lactate clearance, which could offer deeper insights into the evolving clinical course of HF patients, are not included. 39 Future integration of dynamic trajectory models for SOFA scores may offer deeper insights into the evolving clinical course of patients with CCI and HF. While all eligible patients from the MIMIC and eICU-CRD databases are included based on the established criteria, the overall sample size remains limited, which may impact the robustness of the findings and increase the risk of overfitting, thus compromising the model's generalizability. Additionally, the analysis focuses solely on in-hospital mortality, omitting longer-term outcomes such as post-discharge survival and functional status, which are equally vital for clinical decision-making. Moreover, although external validation was conducted using the eICU-CRD database, this dataset predominantly includes Western populations. 
Despite involving multiple independent hospitals, these institutions were all located within the United States, potentially limiting the generalizability of our results to other non-Western or low-resource settings. Nevertheless, the included cohorts represent real-world clinical scenarios from diverse academic and community hospitals in the United States, although they may not fully reflect global HF populations. Extensive external validation should be prioritized, ideally with global, multicenter enrollment and larger sample sizes that capture heterogeneous ethnicities, diverse clinical environments, and healthcare settings. Another potential avenue for refinement is phenotyping. While we concentrated on a specific CCI-HF cohort, recent ML research in other disease contexts (e.g., takotsubo syndrome) demonstrates how cluster-based methods can reveal subgroups with distinct risks and therapeutic responses. 40 Implementing similar phenotyping approaches in a broader CCI-HF population might yield more granular insights and improved risk stratification. Future prospective, multicenter studies involving internationally diverse populations, dynamic clinical variables, and broader outcome measures are essential to enhance the model's robustness and clinical relevance.
Conclusions
The SVM model developed in this study presents a reliable ML tool for predicting in-hospital mortality in patients with CCI and HF. Additionally, global and local interpretability techniques provide a comprehensive understanding of the model's predictions, potentially enhancing its clinical utility. These insights can inform targeted management strategies aimed at improving survival rates in patients with CCI and HF.
Supplemental Material
Supplemental material for Interpretable machine learning models for predicting in-hospital mortality in patients with chronic critical illness and heart failure: A multicenter study by Min He, Yongqi Lin, Siyu Ren, Pengzhan Li, Guoqing Liu and Liangbo Hu, Xueshuang Bei, Lingyan Lei, Yue Wang, Qianghong Zhang, Xiaocong Zeng in DIGITAL HEALTH:
sj-docx-1-dhj-10.1177_20552076251347785
sj-docx-2-dhj-10.1177_20552076251347785
sj-docx-3-dhj-10.1177_20552076251347785
sj-docx-4-dhj-10.1177_20552076251347785
sj-tif-5-dhj-10.1177_20552076251347785
sj-tif-6-dhj-10.1177_20552076251347785
sj-tif-7-dhj-10.1177_20552076251347785
sj-tif-8-dhj-10.1177_20552076251347785
sj-tif-9-dhj-10.1177_20552076251347785
Footnotes
Acknowledgments
The authors acknowledge Bullet Edits Limited for their assistance with linguistic editing and proofreading of the manuscript.
Ethics considerations
Approval to access the dataset (certification number: 57369428) was granted for this study. All methodologies followed relevant guidelines and regulations, including the
Author contributions
Min He and Xiaocong Zeng contributed to the conception and design of the study. Yongqi Lin, Siyu Ren, and Pengzhan Li performed the data analysis. Guoqing Liu, Liangbo Hu, and Xueshuang Bei assisted with the analysis and interpretation of results. Lingyan Lei, Yue Wang, and Qianghong Zhang helped with data collection. Min He drafted the manuscript, and Xiaocong Zeng revised it critically for important intellectual content.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grant number: 82260069).
Declaration of conflicting interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Supplemental material
Supplemental material for this article is available online.
References
