
Abstract

Background: Thrombolytic therapy is essential for acute ischemic stroke (AIS) management but poses a risk of hemorrhagic transformation (HT), necessitating accurate prediction to optimize patient care. Methods: A comprehensive search was conducted across PubMed, Web of Science, Scopus, Embase, and Google Scholar, covering studies from inception until July 10, 2024. Studies were included if they used machine learning (ML) or deep learning algorithms to predict HT in AIS patients treated with thrombolysis. Exclusion criteria included studies involving endovascular treatments and those not evaluating model effectiveness. Data extraction and quality assessment were performed following PRISMA guidelines and using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) and Prediction Model Risk of Bias Assessment Tool (PROBAST) tools. Results: Out of 1943 identified records, 12 studies were included in the final analysis, encompassing 18 007 AIS patients who received thrombolytic therapy. The ML models demonstrated high predictive performance, with area under the curve (AUC) values ranging from 0.79 to 0.95 and a pooled AUC of 0.88. Specifically, XGBoost models achieved AUCs of up to 0.953 and Artificial Neural Network (ANN) models reached up to 0.942. Sensitivity and specificity varied significantly, with the highest sensitivity at 0.90 and specificity at 0.99. Significant predictors of HT included age, glucose levels, NIH Stroke Scale (NIHSS) score, systolic and diastolic blood pressure, and radiomic features. Despite these promising results, methodological disparities and limited external validation highlighted the need for standardized reporting and further rigorous testing. Conclusion: ML techniques, especially XGBoost and ANN, show great promise in predicting HT following thrombolysis in AIS patients, enhancing risk stratification and clinical decision-making. Future research should focus on prospective study designs, standardized reporting, and integrating ML assessments into clinical workflows to improve AIS management and patient outcomes.

Keywords

machine learning, hemorrhagic transformation, thrombolysis, acute ischemic stroke, systematic review, meta-analysis, predictive models

Introduction

Stroke is the third leading cause of disability worldwide and ranks as the second leading cause of death globally, with acute ischemic stroke (AIS) being its most prevalent form.¹ The cornerstone of early AIS management is thrombolytic therapy, particularly through the prompt administration of recombinant tissue plasminogen activator (rt-PA).² While thrombolysis is an established effective intervention, it carries inherent risks, notably the risk of hemorrhagic transformation (HT). HT can occur spontaneously in cases of acute ischemic cerebral infarction, with reported incidences ranging widely from 10% to 48%, depending on various risk factors.³ These hemorrhagic events pose critical concerns due to their potential to exacerbate neurological deficits, increase disability, and in severe cases, lead to mortality.

Given the severe implications of HT, there is a growing emphasis on predicting its occurrence in AIS patients undergoing thrombolysis. Early identification of patients at risk of HT can guide clinical decisions and improve patient outcomes.⁴ Management of HT involves careful monitoring and supportive care, including blood pressure control, reversal of anticoagulation, and in some cases, surgical intervention to manage intracerebral hemorrhage.⁵ Machine Learning (ML) is emerging as a powerful tool in this context. Known for their predictive capabilities across various medical fields, ML algorithms excel at interpreting complex data patterns, which enables the development of precise, patient-specific prediction models.⁶ In the field of AIS and thrombolysis, these models show promise in determining the probability of HT events, thereby offering a new dimension of insight.⁷

Amid increasing interest in ML-based predictive models for post-thrombolysis HT, there is a pressing need for a comprehensive synthesis of the existing evidence.⁸ A systematic review and meta-analysis of relevant studies can reveal nuances about the overall efficacy, benefits, and limitations of these predictive frameworks.⁹ By pooling insights from multiple research initiatives, a clearer understanding of the predictive accuracy inherent in ML algorithms can be achieved.⁹ In light of this, our systematic review and meta-analysis aims to meticulously analyze the body of literature on ML-driven prediction models tailored for post-thrombolysis HT in AIS patients. Our goal is to provide clinicians and researchers with profound insights, fostering an understanding of the models’ potential clinical applicability while also highlighting areas needing further research and refinement.

Materials & Methods

This systematic review is anchored in the guidelines delineated by PRISMA¹⁰ and was duly registered on PROSPERO with the ID: CRD42023453818.¹¹

Eligibility Criteria

Inclusion Criteria: Studies were eligible for inclusion if they met the following conditions: 1) They focused on patients with AIS treated with thrombolysis; 2) They utilized machine learning or deep learning algorithms in predictive models; 3) The predictive models were specifically tailored for patients with AIS who had undergone thrombolytic treatment; 4) The models examined HT as a patient outcome.

Exclusion Criteria: Studies were excluded if they: 1) Included endovascular treatments such as thrombectomy or other invasive methods for AIS in the study population; 2) Built models using patient samples that included non-AIS diseases such as hemorrhagic stroke, cerebral venous sinus thrombosis, or cerebral arteritis, in addition to AIS thrombolysis patients; 3) Lacked an evaluation of model effectiveness; 4) Included post-thrombolytic treatment data among their predictive modeling variables. To keep the analysis focused, conference abstracts, review manuscripts, open letters, commentary pieces, editorials, and errata were excluded because of their brevity.

Search Strategies

Under the guidance of experts in evidence-based medicine, we conducted a comprehensive search using keywords related to ischemic stroke, hemorrhage, and artificial intelligence. A detailed overview of our search strategy is provided in Supplementary Table S1. We systematically searched five databases: PubMed, Web of Science, Scopus, Embase, and Google Scholar, from their respective inceptions until July 10, 2024. These databases also covered relevant computer science journals and conferences.

Study Selection

Study selection and review were conducted independently by two reviewers, YJ and LL. Initial steps involved eliminating duplicates, comments, conference abstracts, letters, narrative reviews, editorials, and errata. The remaining records underwent a preliminary screening based on titles and abstracts, followed by a comprehensive assessment via full-text reading. Any discrepancies between the primary reviewers were addressed through discussions, and unresolved differences were referred to a third reviewer, FL.

Data Extraction

For the eligible studies, data was independently extracted by two reviewers (YJ and FL) using a predefined form. Extracted details included: author's name, year of publication, demographic characteristics of the study population (age, gender, ethnicity/place of recruitment), sample sizes of the training, testing, and external validation cohorts, included variables (clinical information, laboratory test results, and brain imaging), imaging modalities used for model development, models based on different algorithms, and model validation information. Performance metrics extracted from the included papers included Area Under the Curve (AUC), sensitivity, specificity, and diagnostic odds ratio (DOR). Performance metrics were restricted to those pertaining to the “best-performing” model. In cases where discrepancies arose, a third reviewer (YL) was consulted to resolve disagreements. This three-tier review process ensured a rigorous and unbiased data extraction process. We outlined specific procedural steps followed to address disputes, ensuring clarity and systematic resolution. Furthermore, detailed descriptions of the training protocols followed by the reviewers were provided to ensure consistency and accuracy in data extraction. This included training on the use of the extraction form, familiarization with the inclusion and exclusion criteria, and calibration exercises to minimize variability between reviewers. Additionally, to enhance the transparency of our methodology, we listed specific software tools used for managing and analyzing the extracted data in the Supplemental Materials, allowing for replication and verification by other researchers.

Risk of Bias (RoB) Assessment and Quality of Evidence

The RoB for each study was independently assessed by two sets of investigators. Group 1, consisting of YJ and YL, and Group 2, which included LL and ZW, utilized the Prediction Model Risk of Bias Assessment Tool (PROBAST), which encompasses 20 queries spread across four key domains: participants, predictors, outcomes, and analysis. Each domain was subsequently categorized as either ‘low’ or ‘high’ risk of bias. To ensure a meticulous evaluation of adherence to reporting standards, we employed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) protocol. Any discrepancies that arose during the assessment process were mediated by a third investigator, YC, to reach a consensus. The detailed assessments, including any modifications made to the TRIPOD and PROBAST for this specific review, are documented in Supplementary Tables S2 and S3.

Data Synthesis

To visualize the types of characteristic variables in each model, network diagrams were used. The discriminative capacity of each predictive model was primarily assessed using the AUC, sensitivity, specificity, and DOR. AUC values, along with their 95% confidence intervals (CIs), were collated from the included studies. The standard error (SE) of each AUC was calculated using the actual counts of positive and negative cases provided in the studies. When 95% CIs were not explicitly mentioned, they were derived from the available data using the formula: 95% CI = effect size (AUC) ± 1.96 × SE. In situations where the SE was not reported, it was estimated using the formula that includes sensitivity, specificity, and the number of positive and negative cases, following the method described by Hanley et al, who developed a nonparametric approach to compare areas under correlated ROC curves.¹²

In addition to AUC, sensitivity and specificity for each study were systematically analyzed. Sensitivity (SEN), specificity (SPE), and DOR were calculated from individual studies, and their standard errors were derived using binomial distribution formulas. Forest plots were employed to illustrate the variability and precision of these estimates across studies through both fixed-effects and random-effects models.
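The binomial standard error mentioned above can be illustrated with a short sketch. It assumes the usual Wald form, SE = sqrt(p(1 - p)/n), which the text does not spell out; the function name `wald_ci` and the example numbers are hypothetical and are not taken from any included study.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Binomial standard error and Wald confidence interval for a proportion.

    Assumed form (not stated explicitly in the paper): SE = sqrt(p * (1 - p) / n).
    """
    se = math.sqrt(proportion * (1.0 - proportion) / n)
    return se, proportion - z * se, proportion + z * se


# Hypothetical example: a sensitivity of 0.80 estimated from 100 patients.
se, lower, upper = wald_ci(0.80, 100)
print(f"SE = {se:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# prints: SE = 0.040, 95% CI = (0.722, 0.878)
```

A Wald interval is not bounded by 0 and 1, so for proportions close to 1 the upper limit can exceed 1. That behavior would be consistent with a reported interval such as (0.970, 1.010), although the paper does not confirm which interval method was used.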

Statistical heterogeneity among the studies was evaluated using Cochran's Q-test, with significance set at P ≤ .10, and the I² statistic, where values above 50% indicated notable heterogeneity. When significant heterogeneity was observed, a random-effects model was used for pooling the AUCs, sensitivities, specificities, and DORs. If heterogeneity was minimal, a fixed-effects model was applied. Additionally, correlations between AUC values and corresponding variables such as sensitivity, specificity, and DOR were visualized using network diagrams. This analysis elucidated the relationships and potential trade-offs between model sensitivity and specificity in detecting true positives and correctly identifying negatives, respectively. All meta-analyses and network diagrams were constructed in RStudio, using R version 4.2.3.
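The pooling rule described above can be sketched in a few lines. This is a minimal illustration and not the authors' R code: it assumes inverse-variance weighting and the DerSimonian-Laird estimator for the between-study variance, neither of which is specified in the text, and it switches models on I² alone because the Q-test P value would need a chi-square distribution function. The function name `pool_estimates` and the input values are hypothetical.

```python
import math


def pool_estimates(estimates, standard_errors, i2_threshold=50.0):
    """Inverse-variance pooling with Cochran's Q, I^2, and a random-effects fallback."""
    k = len(estimates)
    weights = [1.0 / se ** 2 for se in standard_errors]
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    if i2 > i2_threshold:
        model = "random"
        weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    else:
        model = "fixed"

    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return {"model": model, "pooled": pooled, "se": pooled_se,
            "q": q, "i2": i2, "tau2": tau2}


# Hypothetical AUCs and standard errors, not values from the included studies.
result = pool_estimates([0.90, 0.82, 0.78], [0.02, 0.02, 0.03])
print(result["model"], round(result["pooled"], 3), round(result["i2"], 1))
```

With heterogeneous inputs like these, I² exceeds 50% and the random-effects estimate is returned; with near-identical inputs the function falls back to the fixed-effects estimate.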

Results

Search Results

We initially identified 1943 records across various databases, from which 564 duplicate records were removed before screening. After screening 1379 titles and abstracts, we excluded 1276 records for irrelevance or inappropriateness, leaving 103 articles for full-text assessment. Of these, 91 were further excluded due to irrelevance to the study criteria or methodological issues, resulting in 12 studies being included in the final systematic review and meta-analysis (Figure 1).

Figure 1. Flowchart of study selection process.

Basic Characteristics

A concise overview of the characteristics of the eligible studies¹³⁻²⁴ is provided in Table 1. Together, the 12 studies included 18 007 participants who had AIS and received thrombolytic therapy. The mean or median age of participants ranged from 64.00 to 69.32 years, and the proportion of male participants varied between 55.1% and 69.9%. Among the 12 studies, 8 were conducted in mainland China.¹³⁻¹⁵,¹⁷,²¹⁻²⁴ The remaining studies were from South Korea,¹⁸ Thailand,¹⁹ the United Kingdom,¹⁶ and Taiwan.²⁰ Two studies used public databases,¹⁶,²⁴ and the others used retrospectively collected data.¹³⁻¹⁵,¹⁷⁻²³ Six of the studies carried out external validation of their predictive model.¹⁴,¹⁶⁻¹⁸,²²,²⁴ Details about individual study characteristics, methodologies, and key findings are elaborated in Tables 1 and 2.

Table 1. Basic Characteristics.

Study	Region of Recruitment, Periods of Recruitment	Year of Publication	Sample Size (T/EV)	Mean Age (years)	Gender (Male vs Female)	Imaging Modality of HT
Ren et al	Training: Chongqing and Hunan, China; 2012-2021. External Validation: Hunan, China; 2012-2021	2023	355/72	67.02	64.4% versus 35.6%	CT
Lee et al	All data set: Seoul, South Korea; 2004-2008	2015	958/954	68.00	59.6% versus 40.4%	MRI
Cui et al	All data set: Chongqing and Hunan, China;	2022	322	67.39	62.0% versus 38.0%	CT
Dharmasaroja et al	All data set: Bangkok, Thailand; 2007-2010	2012	197	64.00	60.0% versus 40.0%	CT and MRI
Liu et al	Training: VISTA dataset. External Validation: Sichuan, China; 2012-2020	2022	1738/296	68.37	58.5% versus 41.5%	CT
Xu et al	All data set: Jiangsu, China; 2016-2021	2022	276	68.04	69.9% versus 35.1%	CT and MRI
McMeekin et al	Training: SITS-UK dataset; 2002-2010. External Validation: VISTA dataset	2015	2358	66.08	55.1% versus 44.9%	CT
Chung et al	All data set: Taiwan; 2009-2018	2020	331	69.12	59.9% versus 40.1%	CT
Wang et al	All data set: Shanghai, China; 2013-2016	2020	1620	66.47	64.3% versus 35.7%	CT and MRI
Liu et al	All data set: Sichuan, China; 2012-2020	2021	104	68.23	56.7% versus 43.3%	CT
Meng et al	All data set: Shandong, China; 2016-2018	2022	136	64.00	63.6% versus 36.4%	CT and MRI
Wen et al	All data set: Shenyang, China; 2019-2021	2023	6369/1921	65.00	71.0% versus 29.0%	CT

Abbreviations: T, training set; EV, external validation dataset; NCCT, non-contrast computed tomography; MRI, magnetic resonance imaging; VISTA, the Virtual International Stroke Trials Archive; SITS-UK, the Safe Implementation of Treatments in Stroke registry.

Table 2. Model Development Using Conventional Machine Learning Algorithms.

References	Model	Missing Value	No. of Variables	Imaging Variables	Variable Selection	Validation Method	External Validation	Best Performance Model	AUC (95% CI)	Other Metrics
Ren et al²¹	RF, SVM, RLR, SGD, XGB, Linear SVC	K-nearest neighbor	45	Yes	LASSO	Ten-fold Cross-Validation	Yes	XGBoost	IV: 0.95	ACC: 0.89; PPV: 0.88; NPV: 0.91; Sensitivity: 0.90; Specificity: 0.89
Cui et al²²	RLR, RF, SVM, XGB	MissForest	66	Yes	LASSO	Ten-fold Cross-Validation	No	XGBoost	IV: 0.95; EV: 0.91	ACC: 0.86; F1: 0.82; Sensitivity: 0.76; Specificity: 0.94
Lee et al¹⁸	RLR	Patients with missing data were excluded	18	No	Logistic regression	Bootstrapping	Yes	RLR	IV: 0.83; EV: 0.82	Calibration slope: 0.68
Dharmasaroja et al¹⁹	RBF, MLP, PNN, SVM, PFA, SDT	Patients with missing data were excluded	29	No	Chi-square test and T-test	Ten-fold Cross-Validation	No	PNN	IV: 0.79	ACC: 0.90; PPV: 1.00; NPV: 0.89; F1: 0.52; Sensitivity: 0.35; Specificity: 1.00
Liu et al²⁴	SVM	Multiple imputations	28	No	GINI index of RF model	Ten-fold Cross-Validation	Yes	SVM	IV: 0.87; EV: 0.74	Sensitivity: 0.67; Specificity: 0.87
Xu et al¹³	RLR, RF	Patients with missing data were excluded	61	No	Logistic regressions	Leave-One-Out Cross-Validation	No	RF	IV: 0.80	Sensitivity: 0.67; Specificity: 0.81
McMeekin et al¹⁶	Decision Tree	Patients with missing data were excluded	23	No	Logistic regressions	Ten-fold Cross-Validation	No	Decision analytic model	IV: 0.79; EV: 0.78	/
Wang et al²³	RLR, Neural Network, SVM, RF, AdaBoost	Missing-indicator method	24	No	Chi-square test and T-test	Ten-fold Cross-Validation	Yes	Neural network	IV: 0.82	/
Chung et al²⁰	ANN (multilayer perceptron architecture, a 3-layer neural network for fine-tuning)	Patients with missing data were excluded	25	No	Fisher's Exact Test and One-way ANOVA	Five-fold Cross-Validation	No	ANN	IV: 0.94	Mean precision: 0.81; Sensitivity: 0.85; Specificity: 0.93
Liu et al¹⁷	RLR	Patients with missing data were excluded	14	Yes	LASSO	Five-fold Cross-Validation	No	LASSO-based LR	IV: 0.87; EV: 0.74	Sensitivity: 0.60; Specificity: 0.85
Meng et al¹⁵	RF	Patients with missing data were excluded	20	Yes	LASSO	Five-fold Cross-Validation	No	RF	IV: 0.85	/
Wen et al¹⁴	RLR, RF, SVM, MLP, GBDT	Patients with missing data were excluded	15	No	LASSO	Five-fold Cross-Validation	Yes	RLR	EV: 0.87	Sensitivity: 0.78; Specificity: 0.88; PPV: 0.66; NPV: 0.80

Abbreviations: RF, random forest; SVM, support vector machine; RLR, regularized logistic regression; SGD, stochastic gradient descent; XGB, eXtreme gradient boosting; Linear SVC, linear support vector classification; LASSO, least absolute shrinkage and selection operator; SHAP, shapley additive explanations; RBF, radial basis function; MLP, multilayer perceptron; PNN, probabilistic neural network; PFA, principal factor analysis; SDT, single decision tree; AdaBoost, adaptive boosting.

Features Selection and Model Development

Of the 12 studies reviewed, 11 implemented conventional ML algorithms to build predictive models,¹³⁻¹⁹,²¹⁻²⁴ and one used an Artificial Neural Network (ANN) algorithm for its prediction tasks.²⁰ The algorithms employed across the studies included Decision Trees, Random Forest, Regularized Logistic Regression (RLR), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), Linear Support Vector Classification (Linear SVC), and Probabilistic Neural Network (PNN). Data completeness plays a pivotal role in predictive modeling, and the approaches to handling missing data varied. Four studies used imputation techniques, namely k-nearest neighbor imputation, MissForest, multiple imputation, or the missing-indicator method, to manage gaps in data.²¹⁻²⁴ The other studies excluded participants who had any missing data, in either feature variables or outcome variables (Table 2).¹³⁻²⁰

Important Predictors and Best Model Performance

As shown in Figure 2, several features consistently emerged as significant predictors in the ML-based models: age, glucose levels, NIH Stroke Scale (NIHSS) score, International Normalized Ratio (INR), systolic blood pressure (SBP), diastolic blood pressure (DBP), onset-to-treatment time, and white blood cell (WBC) count. Additionally, some studies incorporated imaging features derived from radiomic analysis into their predictive models. These imaging features, which assess parameters such as vascular health, local blood flow, and tissue integrity, were used to enhance the predictive accuracy of the models.

Figure 2. Key predictors identified in machine learning models.

All studies reported AUC values. Seven studies fully reported confidence intervals and SE,¹³⁻¹⁵,¹⁷,¹⁸,²¹,²⁴ one reported only the SE,²⁰ and the remaining four reported data from which the SE and confidence intervals could be calculated.¹⁶,¹⁹,²²,²³ For these studies, we used the following formulas to obtain the SE and the 95% confidence interval of the AUC:

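$$95\%\ \mathrm{CI} = \mathrm{AUC} \pm 1.96 \times \mathrm{SE}_{\mathrm{AUC}} \tag{1}$$

$$\mathrm{SE}_{\mathrm{AUC}} = \sqrt{\frac{\mathrm{AUC} \times (1 - \mathrm{AUC}) + (n_{1} - 1) \times (Q_{1} - \mathrm{AUC}^{2}) + (n_{2} - 1) \times (Q_{2} - \mathrm{AUC}^{2})}{n_{1} \times n_{2}}} \tag{2}$$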
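Equations (1) and (2) can be checked numerically with a short sketch. The text does not define Q1 and Q2; the code assumes the standard Hanley and McNeil approximations, Q1 = AUC / (2 - AUC) and Q2 = 2 AUC² / (1 + AUC), and takes n1 and n2 to be the numbers of positive (HT) and negative (non-HT) cases. The function names and the example inputs are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC following equation (2).

    Q1 and Q2 use the Hanley-McNeil approximations (an assumption here,
    since the paper does not define them).
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    numerator = (auc * (1.0 - auc)
                 + (n_pos - 1) * (q1 - auc ** 2)
                 + (n_neg - 1) * (q2 - auc ** 2))
    return math.sqrt(numerator / (n_pos * n_neg))


def auc_confidence_interval(auc, n_pos, n_neg, z=1.96):
    """95% CI following equation (1): AUC +/- 1.96 * SE."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return auc - z * se, auc + z * se


# Hypothetical study: AUC 0.85 with 50 HT cases and 200 non-HT cases.
se = hanley_mcneil_se(0.85, 50, 200)
lower, upper = auc_confidence_interval(0.85, 50, 200)
print(f"SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the SE shrinks as either group grows, studies with few HT events produce wide intervals even when the total sample is large.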

Across the evaluated models, AUC values ranged from 0.79 to 0.95, with a pooled AUC of 0.88 (95% CI: 0.87-0.89) (Figure 3). Notably, two studies utilizing the XGBoost algorithm achieved AUCs of 0.953 and 0.945,²¹,²² while a model using an ANN algorithm reported an AUC of 0.94.²⁰ For external validation, the pooled AUC was 0.80 (95% CI: 0.76-0.84),¹⁴,¹⁶⁻¹⁸,²²,²⁴ with Cui et al achieving the highest AUC of 0.91.²² Sensitivity and specificity varied considerably across studies, as depicted in the forest plots²² (Figure 4).

Figure 3. Forest plots illustrating the AUC values for internal validation (a) and external validation (b) cohorts, highlighting the discriminative ability and robustness of the machine learning models across different datasets.

Figure 4. Forest plots of model performance metrics, including sensitivity (a), specificity (b), and diagnostic odds ratio (DOR) (c), demonstrating the variability and precision of the predictive models’ performance.

The highest sensitivity was observed in Ren²¹ with 0.90 (95% CI: 0.869, 0.931), while the pooled sensitivity was 0.701 (95% CI: 0.613, 0.788). Specificity peaked in Dharmasaroja¹⁹ at 0.99 (95% CI: 0.970, 1.010) and in Cui²² at 0.94 (95% CI: 0.915, 0.965), with a pooled specificity of 0.905 (95% CI: 0.897, 0.913). DOR further highlighted the variability in model performance. Ren²¹ demonstrated a DOR of 72.82 (95% CI: 56.57, 89.07), and Wen¹⁴ a DOR of 26.00 (95% CI: 21.98, 30.02). These results underscore the variability in model performance across studies, with Ren²¹ and Chung²⁰ standing out due to their high AUCs, sensitivity, specificity, and DOR values, suggesting robust diagnostic capabilities in predicting post-thrombolysis hemorrhagic transformation in acute ischemic stroke patients (Figure 4).
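The DOR values quoted in this article can be reproduced from the sensitivities and specificities in Table 2 using the standard definition, DOR = [sensitivity / (1 - sensitivity)] / [(1 - specificity) / specificity]. The sketch below assumes that definition, which the text does not state, so the match suggests but does not confirm how the authors derived their figures. The function name is hypothetical.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR from sensitivity and specificity (standard definition, assumed here)."""
    if sensitivity >= 1.0 or specificity <= 0.0 or specificity >= 1.0:
        # The ratio is undefined (division by zero) at these boundaries.
        raise ValueError("DOR is undefined for these inputs")
    odds_positive_in_diseased = sensitivity / (1.0 - sensitivity)
    odds_positive_in_healthy = (1.0 - specificity) / specificity
    return odds_positive_in_diseased / odds_positive_in_healthy


# Sensitivity and specificity as listed in Table 2.
for study, sens, spec in [("Ren", 0.90, 0.89), ("Wen", 0.78, 0.88), ("Chung", 0.85, 0.93)]:
    print(f"{study}: DOR = {diagnostic_odds_ratio(sens, spec):.2f}")
# prints:
# Ren: DOR = 72.82
# Wen: DOR = 26.00
# Chung: DOR = 75.29
```

A specificity of exactly 1.00, as listed for Dharmasaroja et al in Table 2, makes the ratio undefined, so the sketch raises an error rather than returning a value; meta-analyses often apply a continuity correction in that situation.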

Risk of Bias and Reporting Quality

Upon evaluating the risk of bias, five ML-based studies¹⁴,¹⁵,¹⁷,²³,²⁴ and one DL-based study²⁰ showcased a low risk across all domains, as detailed in Supplementary Table S5. Conversely, some studies presented a high risk in domains such as missing data, variable outcome definitions, and validation methods. Nonetheless, the overall reporting quality was commendable, with a majority of studies adhering to over 70% of the “good” reporting criteria as per Supplementary Table S6. The funnel plot presented in Figure S1 shows variation in study outcomes, with some studies falling outside the expected range. This distribution hints at potential publication bias, where studies with positive or significant results may be more frequently published than those with negative or non-significant outcomes, potentially leading to an overestimation of the models’ effectiveness. The asymmetry observed in the funnel plot suggests small study effects or heterogeneity in methodologies and patient populations across the studies.

Discussion

Our meta-analysis underscores the significant potential of advanced ML techniques, such as XGBoost and ANN, in predicting HT following thrombolysis in patients with AIS. These ML models demonstrated excellent discriminative abilities, with high pooled AUC values, sensitivity, and specificity, indicating robust predictive capabilities. For instance, XGBoost models achieved the highest predictive performance, while ANN-based models also showed strong validity, effectively distinguishing patients with higher and lower risks of HT, which is crucial for clinical decision-making.²⁰,²² The variability in DOR underscores the diagnostic robustness of advanced ML models. For example, Chung et al achieved a DOR of 75.29, indicating a strong association between predicted risks and actual outcomes.²⁰ However, significant differences in the effectiveness of various ML models are evident, with DOR values ranging widely. The integration of comprehensive patient data has been shown to significantly enhance the predictive performance of machine learning models. Ren et al successfully combined radiomics data with clinical data, leveraging radiomics features to provide detailed insights into tissue characteristics and vascular health, thus offering a more holistic view of the patient's condition.²¹ This fusion of clinical and imaging data underscores the potential of advanced ML models to assimilate various data types and improve overall prediction accuracy and reliability.²²

When comparing different algorithmic frameworks, advanced models like XGBoost and ANN demonstrate superior capabilities over simpler models such as logistic regression (LR) and SVM.²⁵ XGBoost, a gradient boosting framework, excels at capturing complex, non-linear relationships in data through its iterative approach of building and combining multiple decision trees.²⁶ This allows XGBoost to handle interactions between variables more effectively and improve predictive performance.²⁶ ANNs, with their multi-layered architecture, are particularly adept at learning intricate patterns within data through backpropagation and non-linear activation functions.²⁷ This deep learning approach enables ANNs to model complex relationships that simpler algorithms might overlook. The flexibility of ANNs in adjusting weights and biases through extensive training processes results in high sensitivity and specificity, making them highly reliable in clinical applications.²⁸ In contrast, models like LR and SVM, while useful for linear and binary classification problems, often fall short in handling the complexity and variability inherent in medical data. LR assumes a linear relationship between predictors and outcomes, which limits its ability to model non-linear interactions effectively.²⁹ SVM, though powerful in finding optimal hyperplanes for classification, can struggle with large, high-dimensional datasets and may require extensive tuning of parameters.¹⁷ The advanced architectures of XGBoost and ANN, combined with their ability to integrate diverse data sources, make them particularly robust and reliable for clinical applications.²²,²³ These models have demonstrated high performance metrics, including AUC, sensitivity, specificity, and DOR values, indicating their strong diagnostic capabilities. This makes them crucial for improving the prediction of HT after thrombolysis in AIS patients, enabling more accurate risk stratification, better-informed clinical decisions, and ultimately improved patient outcomes.²⁶
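The iterative tree-building idea behind gradient boosting can be illustrated with a deliberately small sketch. This is not XGBoost, which adds regularization and second-order gradient information among other refinements; it is plain gradient boosting on squared error using one-split trees (stumps) on a single feature. All function names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single-threshold split that best fits the residuals (least squares)."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def stump_value(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_value(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: a U-shaped pattern (high outcome at both extremes)
# that no single threshold can describe.
xs = list(range(10))
ys = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

one_tree = fit_boosted_stumps(xs, ys, n_rounds=1, learning_rate=1.0)
many_trees = fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.3)
print("single stump MSE:", mean_squared_error(one_tree, xs, ys))
print("boosted stumps MSE:", mean_squared_error(many_trees, xs, ys))
```

On this toy pattern the combined stumps reach a lower training error than any single split, which is the non-linear flexibility the paragraph attributes to boosted trees. Training error alone says nothing about generalization, which is why the external validation results matter.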

The integration of comprehensive patient data has been shown to significantly enhance the predictive performance of these models. For instance, Ren et al successfully integrated radiomics data with clinical data, leveraging radiomics features to provide detailed insights into tissue characteristics and vascular health, thereby supplementing clinical predictors and offering a more comprehensive view of the patient's condition.²¹ The fusion of clinical and imaging data underscores the potential of ML models to assimilate various data types and improve overall prediction accuracy and reliability. Combining clinical variables such as age, blood glucose level, NIHSS score, international normalized ratio, systolic blood pressure, diastolic blood pressure, treatment start time, and white blood cell count with radiomics features has proven to be a powerful approach to enhancing model performance.³⁰ This comprehensive data integration allows for a more nuanced understanding of the patient's condition, capturing complex patterns and relationships that may not be apparent through clinical data alone. Significant predictors identified in these models highlight the multifactorial nature of HT risk, underscoring the necessity for integrating diverse data types in predictive modeling.³¹ Advanced ML algorithms, such as XGBoost and ANN, have demonstrated excellent performance metrics, including high AUC, sensitivity, specificity, and DOR values, indicating their robustness and reliability in clinical applications.²⁷ These findings suggest that integrating clinical and radiomics data using advanced ML algorithms is crucial for improving the prediction of HT after thrombolysis in AIS patients. This approach enables more accurate risk stratification, better-informed clinical decisions, and ultimately improved patient outcomes.²⁶

Our comprehensive review reveals significant methodological disparities in the reporting of HT outcomes across studies, emphasizing the need for standardized reporting practices to improve the comparability and reproducibility of predictive models. Current inconsistencies in the classification of HT categories, such as symptomatic forms like PH2 and those aggregated under categories HT1, HT2, and H, compromise the accuracy of model predictions.³² This variability, along with limited details on key variables such as the timing and severity of HT occurrences, underscores the necessity for detailed documentation, uniform classification methods, and clear reporting to optimize model validation in clinical settings.³³ Future research must prioritize enhanced methodological transparency and strict adherence to established guidelines to improve the reliability and reproducibility of ML models in clinical practice.³⁴

Furthermore, our review identifies critical barriers to the widespread adoption of ML in clinical practice, including limited external validation and a lack of transparency in model development. With only half of the models externally validated, there is a clear need for more rigorous external data to test models across varied patient demographics and clinical settings, ensuring their applicability and robustness.³⁵ The opacity surrounding these models, with few studies disclosing complete details on their algorithmic configurations, hampers the ability of other researchers to verify or adapt these models independently.³⁶ This lack of transparency can lead to difficulties in assessing the true performance and generalizability of ML models, thereby limiting their integration into clinical workflows.³⁷

To bridge the gap between promising research findings and tangible clinical impact, we advocate for a shift towards prospective research designs. Integrating ML-based assessments into routine risk evaluation protocols for patients undergoing AIS thrombolysis can enhance personalized treatment strategies.²⁸ This integration should be supported by policy changes that promote the inclusion of ML assessments in clinical workflows, ensuring that advancements in ML are effectively translated into practice. By addressing these challenges and implementing the recommended improvements, we can advance the development of reliable and clinically viable prediction models, thereby enhancing the quality of care and patient outcomes in thrombolytic therapy.²⁵ By fostering a collaborative effort towards methodological rigor, transparency, and comprehensive validation, the clinical community can harness the full potential of ML models, transforming stroke management and improving patient prognoses.³⁸ This collaborative approach is essential for overcoming existing barriers and ensuring the integration of ML models into clinical practice, ultimately leading to more effective and personalized stroke care.

Our study has several limitations. Firstly, not all included studies reported sensitivity and specificity, which constrained our ability to comprehensively evaluate the predictive performance of the models. To mitigate this, we calculated sensitivity and specificity from available positive and negative predictive values, and we attempted to contact the original researchers for additional data. Secondly, while a bivariate meta-analysis would have been more appropriate for this type of study, we were unable to perform it due to data constraints. Nevertheless, we rigorously collected and calculated metrics such as AUC, sensitivity, specificity, and DOR to provide a robust assessment of model effectiveness. Additionally, the heterogeneity in study methodologies and patient populations may impact the generalizability of our findings. Future research should aim to standardize reporting practices and conduct bivariate meta-analyses to enhance the robustness and clinical applicability of conclusions.
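
As an illustration of the back-calculation mentioned above, the following Python sketch shows one way sensitivity and specificity can be recovered from a reported PPV and NPV, under the assumption that the HT prevalence in the study cohort is also available. It also computes the DOR from the recovered values. The function names and the example cohort are hypothetical; this is a minimal sketch of the underlying algebra, not the analysis code used in this review.

```python
import math


def sens_spec_from_predictive_values(ppv, npv, prevalence):
    """Recover sensitivity and specificity from PPV, NPV, and prevalence.

    Assumption: all three inputs come from the same cohort.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    denom = ppv + npv - 1.0
    if math.isclose(denom, 0.0, abs_tol=1e-12):
        raise ValueError("PPV + NPV = 1: the system has no unique solution")
    # Share of the cohort that the model flags as positive (TP + FP).
    flagged = (prevalence - (1.0 - npv)) / denom
    if not 0.0 < flagged < 1.0:
        raise ValueError("PPV, NPV, and prevalence are mutually inconsistent")
    sensitivity = ppv * flagged / prevalence                  # TP / (TP + FN)
    specificity = npv * (1.0 - flagged) / (1.0 - prevalence)  # TN / (TN + FP)
    return sensitivity, specificity


def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = (Se / (1 - Se)) / ((1 - Sp) / Sp)."""
    return (sensitivity * specificity) / ((1.0 - sensitivity) * (1.0 - specificity))


# Hypothetical cohort of 1000 patients: TP=80, FN=20, FP=90, TN=810.
se, sp = sens_spec_from_predictive_values(ppv=80 / 170, npv=810 / 830, prevalence=0.10)
dor = diagnostic_odds_ratio(se, sp)
print(round(se, 3), round(sp, 3), round(dor, 1))  # approximately 0.8 0.9 36.0
```

The recovery only works when PPV + NPV differs from 1 and the three inputs are mutually consistent, which is why the sketch raises an error instead of returning implausible values. A DOR computed this way is also undefined when sensitivity or specificity equals 1.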

Conclusion

Our meta-analysis underscores the significant potential of advanced ML techniques, such as XGBoost and ANN, in predicting HT following thrombolysis in AIS patients. These models demonstrate robust predictive capabilities, enhancing clinical decision-making and patient outcomes through high sensitivity and specificity. Integrating diverse data types, including clinical and radiomic data, significantly improves predictive accuracy. However, methodological disparities and limited external validation challenge the reliability and reproducibility of these models. Future research should emphasize prospective study designs, standardized reporting practices, and policy changes to integrate ML assessments into routine clinical workflows. By prioritizing methodological rigor and transparency, the clinical community can fully leverage ML models to transform AIS management and improve patient prognoses.

Supplemental Material

sj-pdf-2-cat-10.1177_10760296241279800 - Supplemental material for Advanced Machine Learning Models for Predicting Post-Thrombolysis Hemorrhagic Transformation in Acute Ischemic Stroke Patients: A Systematic Review and Meta-Analysis

Supplemental material, sj-pdf-2-cat-10.1177_10760296241279800 for Advanced Machine Learning Models for Predicting Post-Thrombolysis Hemorrhagic Transformation in Acute Ischemic Stroke Patients: A Systematic Review and Meta-Analysis by You-li Jiang, Qing-shi Zhao, Ao Li, Zong-bi Wu, Lin-lin Liu, Fu Lin and Yan-feng Li in Clinical and Applied Thrombosis/Hemostasis

Supplemental Material

sj-docx-3-cat-10.1177_10760296241279800 - Supplemental material for Advanced Machine Learning Models for Predicting Post-Thrombolysis Hemorrhagic Transformation in Acute Ischemic Stroke Patients: A Systematic Review and Meta-Analysis

Supplemental material, sj-docx-3-cat-10.1177_10760296241279800 for Advanced Machine Learning Models for Predicting Post-Thrombolysis Hemorrhagic Transformation in Acute Ischemic Stroke Patients: A Systematic Review and Meta-Analysis by You-li Jiang, Qing-shi Zhao, Ao Li, Zong-bi Wu, Lin-lin Liu, Fu Lin and Yan-feng Li in Clinical and Applied Thrombosis/Hemostasis

Footnotes

Acknowledgments

We extend our deepest gratitude to Professor Qingshi Zhao and Professor Yanfeng Li for their invaluable guidance throughout this study. We are also immensely grateful to Director Zongbi Wu for his insightful assistance, particularly with the methodological aspects of our research.

Author Contributions

Z.W. and Q.Z. designed this study. Y.J., A.L., F.L., and L.L. extracted data. Y.J. and A.L. performed the statistical analyses. Y.L. and Q.Z. performed the literature search and collected the data. Y.J. wrote the manuscript. All authors provided critical feedback and contributed to the final manuscript. Correspondence and requests for materials should be addressed to Y.J.

Declaration of Conflicting Interests

The authors declare that there are no potential conflicts of interest with respect to the research, authorship, and publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the project “the Scientific Research Projects of Medical and Health Institutions of Longhua District, Shenzhen” with grant number 2020101.

ORCID iD

You-li Jiang

Supplemental Material

Supplemental material for this article is available online.

References

1. Feigin, Brainin, Norrving, et al. World Stroke Organization (WSO): global stroke fact sheet 2022. Int J Stroke. 2022;17(1):18-29. doi:https://doi.org/10.1177/17474930211065917

2. Jovin, Chamorro, Cobo, et al. Thrombectomy within 8 h after symptom onset in ischemic stroke. N Engl J Med. 2015;372(24):2296-2306. doi:https://doi.org/10.1056/NEJMoa1503780

3. Lees, Bluhmki, von Kummer, et al. Time to treatment with intravenous alteplase and outcome in stroke: an updated pooled analysis of ECASS, ATLANTIS, NINDS, and EPITHET trials. Lancet (London, England). 2010;375(9727):1695-1703. doi:https://doi.org/10.1016/S0140-6736(10)60491-6

4. Hong, Hsu, Zhang, Cheng. Neuroimaging prediction of hemorrhagic transformation for acute ischemic stroke. Cerebrovasc Dis (Basel, Switzerland). 2022;51(4):542-552. doi:https://doi.org/10.1159/000521150

5. Steiner, Al-Shahi Salman, Beer, et al. European Stroke Organisation (ESO) guidelines for the management of spontaneous intracerebral hemorrhage. Int J Stroke. 2014;9(7):840-855. doi:https://doi.org/10.1111/ijs.12309

6. Obermeyer, Emanuel. Predicting the future – big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216-1219. doi:https://doi.org/10.1056/NEJMp1606181

7. Lansberg, Schrooten, Bluhmki, Thijs, Saver. Treatment time-specific number needed to treat estimates for tissue plasminogen activator therapy in acute stroke based on shifts over the entire range of the modified Rankin Scale. Stroke. 2009;40(6):2079-2084. doi:https://doi.org/10.1161/STROKEAHA.108.540708

8. Smith, Saver, Cox, et al. Increase in endovascular therapy in Get With The Guidelines-Stroke after the publication of pivotal trials. Circulation. 2017;136(24):2303-2310. doi:https://doi.org/10.1161/CIRCULATIONAHA.117.031097

9. Singh, Lanchantin, Sekhon. Attend and predict: understanding gene regulation by selective attention on chromatin. Adv Neural Inf Process Syst. 2017;30:6785-6795.

10. Moher, Liberati, Tetzlaff, Altman, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Int J Surg. 2010;8(5):336-341. doi:https://doi.org/10.1016/j.ijsu.2010.02.007

11. Booth, Clarke, Ghersi, Moher, Petticrew, Stewart. An international registry of systematic-review protocols. Lancet (London, England). 2011;377(9760):108-109. doi:https://doi.org/10.1016/S0140-6736(10)60903-8

12. Hanley, McNeil. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29-36. doi:https://doi.org/10.1148/radiology.143.1.7063747

13. Zhang, Jiang. Machine learning-based model for prediction of hemorrhage transformation in acute ischemic stroke after alteplase. Front Neurol. 2022;13:897903. doi:https://doi.org/10.3389/fneur.2022.897903

14. Wen, Wang, Bian, et al. Machine learning-based prediction of symptomatic intracerebral hemorrhage after intravenous thrombolysis for stroke: a large multicenter study. Front Neurol. 2023;14:1247492. doi:https://doi.org/10.3389/fneur.2023.1247492

15. Meng, Wang, Liu, Shi. Prediction model of hemorrhage transformation in patient with acute ischemic stroke based on multiparametric MRI radiomics and machine learning. Brain Sci. 2022;12(7):858. doi:https://doi.org/10.3390/brainsci12070858

16. McMeekin, Flynn, Ford, Rodgers, Gray, Thomson. Development of a decision analytic model to support decision making and risk communication about thrombolytic treatment. BMC Med Inform Decis. 2015;15(1):1-11. doi:https://doi.org/10.1186/s12911-015-0213-z

17. Liu, Tao, Wang, Chen, Liu. Radiomics-based prediction of hemorrhage expansion among patients with thrombolysis/thrombectomy related-hemorrhagic transformation using machine learning. Ther Adv Neurol Disord. 2021;14:91626179. doi:https://doi.org/10.1177/17562864211060029

18. Lee, Kim, Kang, et al. A novel computerized clinical decision support system for treating thrombolysis in patients with acute ischemic stroke. J Stroke. 2015;17(2):199. doi:https://doi.org/10.5853/jos.2015.17.2.199

19. Dharmasaroja. Prediction of intracerebral hemorrhage following thrombolytic therapy for acute ischemic stroke using multiple artificial neural networks. Neurol Res. 2012;34(2):120-128. doi:https://doi.org/10.1179/1743132811Y.0000000067

20. Chung, Chan, Bamodu, Hong, Chiu. Artificial neural network based prediction of postthrombolysis intracerebral hemorrhage and death. Sci Rep-UK. 2020;10(1):20501. doi:https://doi.org/10.1038/s41598-020-77546-5

21. Ren, Song, Wang, et al. A clinical-radiomics model based on noncontrast computed tomography to predict hemorrhagic transformation after stroke by machine learning: a multicenter study. Insights Imaging. 2023;14(1):52. doi:https://doi.org/10.1186/s13244-023-01399-5

22. Cui, Song, Ren, et al. Prediction of hemorrhagic complication after thrombolytic therapy based on multimodal data from multiple centers: an approach to machine learning and system implementation. J Pers Med. 2022;12(12):2052. doi:https://doi.org/10.3390/jpm12122052

23. Wang, Huang, Xia, et al. Personalized risk prediction of symptomatic intracerebral hemorrhage after stroke thrombolysis using a machine-learning model. Ther Adv Neurol Disord. 2020;13:1279123926. doi:https://doi.org/10.1177/1756286420902358

24. Liu, Chen, Guo, Wang, Liu. Machine learning prediction of symptomatic intracerebral hemorrhage after stroke thrombolysis: a cross-cultural validation in Caucasian and Han Chinese cohort. Ther Adv Neurol Disord. 2022;15:81556828. doi:https://doi.org/10.1177/17562864221129380

25. Zhang, Jia, Shang. Research and application of XGBoost in imbalanced data. Int J Distrib Sens Netw. 2022;18(6):1975860471.

26. Huang, Chen, Deng, et al. Interpretable machine learning for predicting 28-day all-cause in-hospital mortality for hypertensive ischemic or hemorrhagic stroke patients in the ICU: a multi-center retrospective cohort study with internal and external cross-validation. Front Neurol. 2023;14:1185447. doi:https://doi.org/10.3389/fneur.2023.1185447

27. Choi, Seo, Kim, et al. Prediction of hemorrhagic transformation after ischemic stroke using machine learning. J Pers Med. 2021;11(9):863. doi:https://doi.org/10.3390/jpm11090863

28. Liu. Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model. PLoS One. 2021;16(2):e246306. doi:https://doi.org/10.1371/journal.pone.0246306

29. Xie, Hao, Chow. Predicting the easiness and complexity of English health materials for international tertiary students with linguistically enhanced machine learning algorithms: development and validation study. JMIR Med Inform. 2021;9(10):e25110. doi:https://doi.org/10.2196/25110

30. Jiang, Zhou, Yong, et al. A deep learning-based model for prediction of hemorrhagic transformation after stroke. Brain Pathol (Zurich, Switzerland). 2023;33(2):e13023. doi:https://doi.org/10.1111/bpa.13023

31. Payabvash, Acosta, Haider, et al. Prediction of clinical outcome in supratentorial intracerebral hemorrhage: application of baseline CT scan radiomics feature extraction and machine learning classifiers. Stroke. 2020;51(Suppl 1). doi:https://doi.org/10.1161/str.51.suppl_1.WMP101

32. van Kranendonk, Treurniet, Boers AMM, et al. Added prognostic value of hemorrhagic transformation quantification in patients with acute ischemic stroke. Front Neurol. 2020;11:582767. doi:https://doi.org/10.3389/fneur.2020.582767

33. Kim, Jeon, Kim. Thromboelastography for prediction of hemorrhagic transformation in patients with acute ischemic stroke. Am J Emerg Med. 2020;38(9):1772-1777. doi:https://doi.org/10.1016/j.ajem.2020.05.052

34. Krishnamoorthy, Singh, Jose K, et al. Biomarkers in the prediction of hemorrhagic transformation in acute stroke: a systematic review and meta-analysis. Cerebrovasc Dis (Basel, Switzerland). 2022;51(2):235-247. doi:https://doi.org/10.1159/000518570

35. Groot, Bindels BJJ, Ogink, et al. Availability and reporting quality of external validations of machine-learning prediction models with orthopedic surgical outcomes: a systematic review. Acta Orthop. 2021;92(4):385-393. doi:https://doi.org/10.1080/17453674.2021.1910448

36. Watson, Hutyra, Clancy, et al. Overcoming barriers to the adoption and implementation of predictive modeling and machine learning in clinical care: what can we learn from US academic medical centers? JAMIA Open. 2020;3(2):167-172. doi:https://doi.org/10.1093/jamiaopen/ooz046

37. Varghese. Artificial intelligence in medicine: chances and challenges for wide clinical adoption. Visc Med. 2020;36(6):443-449. doi:https://doi.org/10.1159/000511930

38. Chavva, Crawford, Mazurek, et al. Deep learning applications for acute stroke management. Ann Neurol. 2022;92(4):574-587.
