
Abstract

Background:

Prediction of bleeding is critical for acute myocardial infarction (AMI) patients after percutaneous coronary intervention (PCI). Machine learning methods can automatically select the combination of the important features and learn their underlying relationship with the outcome.

Objectives:

We aimed to evaluate the predictive value of machine learning methods to predict in-hospital bleeding for AMI patients.

Design:

We used data from the multicenter China Acute Myocardial Infarction (CAMI) registry. The cohort was randomly partitioned into derivation set (50%) and validation set (50%). We applied a state-of-art machine learning algorithm, eXtreme Gradient Boosting (XGBoost), to automatically select features from 98 candidate variables and developed a risk prediction model to predict in-hospital bleeding (Bleeding Academic Research Consortium [BARC] 3 or 5 definition).

Results:

A total of 16,736 AMI patients who underwent PCI were finally enrolled. Forty-five features were automatically selected and used to construct the prediction model. The developed XGBoost model showed good discrimination. The area under the receiver-operating characteristic curve (AUROC) on the derivation data set was 0.941 (95% CI = 0.909–0.973, p < 0.001); the AUROC on the validation set was 0.837 (95% CI = 0.772–0.903, p < 0.001), which was better than the CRUSADE score (AUROC: 0.741; 95% CI = 0.654–0.828, p < 0.001) and ACUITY-HORIZONS score (AUROC: 0.731; 95% CI = 0.641–0.820, p < 0.001). We also developed an online calculator with the 12 most important variables (http://101.89.95.81:8260/), and its AUROC still reached 0.809 on the validation set.

Conclusion:

For the first time, we developed the CAMI bleeding model using machine learning methods for AMI patients after PCI.

Trial registration:

NCT01874691. Registered 11 Jun 2013.

Keywords

in-hospital bleeding, machine learning, percutaneous coronary intervention, prognosis

Introduction

Dual antiplatelet therapy is critical to reduce thrombotic events for acute myocardial infarction (AMI) patients after percutaneous coronary intervention (PCI). However, it is a double-edged sword: once a patient experiences severe bleeding, the risk of death increases significantly.^1,2 Therefore, it is very important to identify AMI patients at high risk of bleeding. The traditional bleeding scores for acute coronary syndrome patients include the Can Rapid risk stratification of Unstable angina patients Suppress ADverse outcomes with Early implementation of the ACC/AHA Guidelines (CRUSADE) score³ and the Acute Catheterization and Urgent Intervention Triage Strategy and Harmonizing Outcomes with Revascularization and Stents in Acute Myocardial Infarction (ACUITY-HORIZONS) score,⁴ among others. However, these scores were derived from a limited set of clinical parameters using traditional statistical methods, so their predictive value is relatively limited. The recommendation level of the CRUSADE score in the ESC guidelines for patients with non–ST-segment elevation acute coronary syndromes was downgraded from class Ib in 2011⁵ to class IIb in 2015.⁶

Machine learning is a branch of artificial intelligence built on computer technology and big data. In recent years, with the emergence of big medical data, machine learning methods have rapidly improved the performance of predictive models. In previous studies, machine learning showed better predictive value than traditional prediction models.^7–12 However, to our knowledge, no study has yet reported an in-hospital bleeding risk prediction model for AMI patients built with machine learning. Therefore, we aimed to establish a new bleeding score by machine learning, in order to help reduce the risk of bleeding in patients with AMI.

Methods

Data description

Our study was a prospective cohort analysis of data from the China Acute Myocardial Infarction (CAMI) registry, a prospective, nationwide, multicenter observational study of AMI patients organized and conducted by Fuwai Hospital, National Center for Cardiovascular Diseases, China. The methodology of the CAMI registry (NCT01874691) has been described previously.¹³ Patients who fulfilled the inclusion/exclusion criteria were consecutively included in the current analysis. The registry includes three levels of hospitals (provincial, prefecture, and county level), which reflects the typical Chinese governmental and administrative model, and it covers all provinces and municipalities across mainland China (excluding Hong Kong and Macau). Patients with AMI were consecutively enrolled, and data were collected upon their arrival and throughout their hospital stay until discharge. Data were collected, validated, and submitted by trained clinical cardiologists or cardiovascular fellows to ensure the accuracy and reliability of data at each participating site. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Fuwai Hospital (Approval No. 2012-431), and all patients provided written informed consent.

Study population

Among 41,348 consecutive AMI patients enrolled in the registry between January 1, 2013 and June 30, 2016, 24,612 patients were excluded according to our cohort exclusion criteria: patients who did not receive PCI (n = 22,607), who received only bare metal stents (n = 892), or who did not receive dual antiplatelet therapy (n = 1113). The study population consisted of 16,736 patients, of whom 70 (0.42%) experienced in-hospital bleeding (Figure 1).
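The patient flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
# Counts as reported in the Study population paragraph.
registry_total = 41_348
excluded = {
    "no PCI": 22_607,
    "bare metal stent only": 892,
    "no dual antiplatelet therapy": 1_113,
}
bleeding_events = 70

# Exclusions should add up to the reported 24,612 patients.
total_excluded = sum(excluded.values())

# Remaining patients form the study population.
study_population = registry_total - total_excluded

# In-hospital bleeding rate as a percentage of the study population.
event_rate_pct = 100 * bleeding_events / study_population

print(total_excluded, study_population, round(event_rate_pct, 2))
```

The three exclusion counts sum to the reported total, and the remaining population and event rate agree with the figures in the text.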

Figure 1.

Patient flow chart for the study cohort.

Endpoint and follow-up

We used the Third Universal Definition of Myocardial Infarction as the diagnostic criterion for AMI.¹⁴ Major bleeding was defined according to the Bleeding Academic Research Consortium (BARC) criteria¹⁵ as BARC type 3 or 5 bleeding not related to coronary artery bypass grafting. Major bleeding events that occurred in hospital were recorded in detail.

Statistical analysis

Baseline characteristics

Continuous variables were presented as mean ± standard deviation and compared using Student’s t test. Categorical variables were expressed as number and proportion and compared by means of the chi-square test or Fisher’s exact test. All tests were two-tailed, and a value of p < 0.05 was considered statistically significant. All analyses were performed with R statistical software version 3.4.3 (R Foundation for Statistical Computing).
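As an illustration of the categorical comparison described above, the following sketch computes a Pearson chi-square test for a 2 × 2 table using only the Python standard library. It is not the authors' R code. The example counts are the female-sex row of Table 2, and the male counts are assumed to be the group size minus the female count, which ignores any missing values. No continuity correction is applied, which is also an assumption.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Female sex by group, taken from Table 2. ASSUMPTION: male counts are the
# group size minus the female count (missing values are ignored).
female_no_bleed, n_no_bleed = 3407, 16666
female_bleed, n_bleed = 30, 70

stat, p = chi_square_2x2(
    female_no_bleed, n_no_bleed - female_no_bleed,
    female_bleed, n_bleed - female_bleed,
)
print(f"chi-square = {stat:.2f}, p = {p:.2g}")
```

Under these assumptions the p-value falls below 0.001, in line with the value reported for gender in Table 2.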

Model construction and validation

We randomly split the study population into a derivation set (n = 8368, 50%) and a validation set (n = 8368, 50%). The 50/50 scheme was used to ensure stable validation results given the relatively low in-hospital bleeding rate. Based on the derivation data set with 98 candidate variables (Table 1), we developed a risk prediction model for in-hospital bleeding using the eXtreme Gradient Boosting (XGBoost) algorithm.¹⁶ XGBoost is a recently developed state-of-the-art machine learning algorithm that ensembles a series of decision trees into a stronger classifier. It also integrates a sparsity-aware algorithm designed to handle missing values, and it can automatically rank the variables that contribute most to the prediction of the clinical outcome. The model parameters were tuned using 10-fold cross-validation.
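To make the idea of ensembling decision trees concrete, the sketch below implements a minimal second-order gradient boosting classifier with depth-1 trees (stumps) in pure Python and fits it after a random 50/50 split. It is not the XGBoost library and not the authors' model: it uses the regularized split gain and leaf weight formulas from the XGBoost paper,¹⁶ but omits sparsity-aware handling of missing values and cross-validated tuning. The synthetic data, the seed, and the values of `rounds`, `eta`, and `lam` are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, g, h, lam):
    """Return (feature, threshold, left_weight, right_weight) with the largest gain."""
    G, H = sum(g), sum(h)
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            GL = sum(gi for row, gi in zip(X, g) if row[j] <= t)
            HL = sum(hi for row, hi in zip(X, h) if row[j] <= t)
            GR, HR = G - GL, H - HL
            # Regularized second-order split gain (constant factor omitted).
            gain = GL**2 / (HL + lam) + GR**2 / (HR + lam) - G**2 / (H + lam)
            if best is None or gain > best[0]:
                best = (gain, j, t, -GL / (HL + lam), -GR / (HR + lam))
    return best[1:]

def margin(stumps, row):
    """Sum of the (already shrunken) leaf weights of all stumps for one row."""
    return sum(wl if row[j] <= t else wr for j, t, wl, wr in stumps)

def boost(X, y, rounds=20, eta=0.3, lam=1.0):
    stumps = []
    for _ in range(rounds):
        p = [sigmoid(margin(stumps, row)) for row in X]
        g = [pi - yi for pi, yi in zip(p, y)]   # gradient of the log-loss
        h = [pi * (1.0 - pi) for pi in p]       # hessian of the log-loss
        j, t, wl, wr = fit_stump(X, g, h, lam)
        stumps.append((j, t, eta * wl, eta * wr))  # shrink each tree by eta
    return stumps

# Synthetic, noise-free data (ASSUMPTION: purely illustrative, not CAMI data).
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [1 if row[0] + 0.5 * row[1] > 0.8 else 0 for row in X]

# Random 50/50 split into derivation and validation indices.
indices = list(range(len(X)))
random.shuffle(indices)
half = len(indices) // 2
train, valid = indices[:half], indices[half:]

model = boost([X[i] for i in train], [y[i] for i in train])
predicted = [1 if sigmoid(margin(model, X[i])) > 0.5 else 0 for i in valid]
accuracy = sum(p == y[i] for p, i in zip(predicted, valid)) / len(valid)
print(f"validation accuracy: {accuracy:.2f}")
```

Each round fits one stump to the current gradients and hessians and adds its shrunken leaf weights to the running margin, which is the sense in which many weak trees are combined into a stronger classifier.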

Table 1.

A total of 98 variables were included for analysis.

Demographics	age, gender, marriage, medical insurance, education, BMI
Disease histories	hypertension, hyperlipidemia, diabetes, angina history, old myocardial infarction, heart failure history, emergency revascularization history, selective PCI history, CABG history, aortic disease, chronic renal failure, dialysis, COPD, rheumatism or immunological diseases, oncology, peripheral arterial disease, bleeding history
Drug use	nonsteroidal anti-inflammatory drug, immunosuppressive, nitrate history, in-hospital nitrate, Beta-blockers history, in-hospital Beta-blockers, calcium channel blockers history, in-hospital calcium channel blockers, ACEI/ARB history, in-hospital ACEI/ARB, antiarrhythmic history, in-hospital antiarrhythmic, aldosterone antagonists history, in-hospital aldosterone antagonists, diuretic history, in-hospital diuretic, statin before primary PCI, oral Chinese traditional medicine history, in-hospital oral Chinese traditional medicine, intravenous Chinese traditional medicine history, in-hospital intravenous Chinese traditional medicine, bivalirudin or not
Risk factors	present smoking, drinking history, premature coronary heart disease family history, greasy foods, regular exercise, snore
Presentation characteristics	admission diagnosis: STEMI, persistent chest pain, clinical symptoms, first symptoms feature, predisposing factors, prodromal symptoms, arrive by ambulance, heart rate, systolic blood pressure, diastolic blood pressure, cardiac shock, malignant arrhythmias, cardiac arrest, Killip class, FMC to onset of symptoms > 6 h
ECG	ST-segment change, Ischemic site of ECG: anterior wall, ST-segment resolution
Coronary angiography	triple-vessel lesion, TIMI flow grade, femoral artery puncture
Revascularization	emergency revascularization, emergency PCI, revascularization to onset of symptoms > 6 h
Laboratory test	CK-MB baseline, abnormal troponin T, abnormal troponin I, glucose, hematocrit, hemoglobin, platelets, white blood cell, neutrophils, NT-proBNP, BNP, Cholesterol, LDL-C, HDL-C, triacylglycerol, Potassium, sodium, chlorine, total bilirubin, direct bilirubin, HBA1c, Hs-CRP, creatinine clearance

ACEI, angiotensin-converting enzyme inhibitors; ARB, angiotensin receptor antagonist; BMI, body mass index; BNP, B type natriuretic peptide; CABG, coronary artery bypass grafting; CK-MB, creatine kinase isoenzyme-MB; COPD, chronic obstructive pulmonary diseases; ECG, electrocardiogram; FMC, first medical contact; HDL-C, high-density lipoprotein cholesterol; Hs-CRP, high-sensitivity C-reactive protein; LDL-C, low density lipoprotein cholesterol; NT-proBNP, N-terminal pro brain natriuretic peptide; PCI, percutaneous coronary intervention; STEMI, ST-segment elevation myocardial infarction; TIMI, thrombolysis in myocardial infarction.

The predictive performance of the constructed XGBoost model for BARC 3 or 5 bleeding was assessed in the validation data set. The discriminatory capacity of the XGBoost model was assessed by the area under the receiver-operating characteristic curve (AUROC), and calibration was assessed with the Hosmer–Lemeshow goodness-of-fit test. To gain insight into the relative contribution of each predictor to the prediction of in-hospital bleeding, we also computed the importance scores of the variables in the XGBoost model. To test the practicability and robustness of the model, we evaluated how prediction performance changed as the number of variables was reduced, eliminating the least important variable at each evaluation. We stratified the cohort into low-, medium-, and high-risk groups according to the patient distribution.
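The AUROC used here equals the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it. A minimal rank-based sketch of that computation follows; it is a generic illustration, not the authors' R code, and the example scores are made up.

```python
def auroc(scores, labels):
    """AUROC from the rank sum of the positive class, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic of the positives, scaled to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Made-up predicted risks and outcomes (1 = event).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
print(auroc(example_scores, example_labels))  # 0.75
```

In the example, three of the four event/non-event pairs are ordered correctly, which gives 0.75.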

Results

Baseline characteristics

Among the 16,736 patients, the mean age was 60.08 ± 12.64 years, and 13,311 (79.5%) were male. All 16,736 patients (100%) took aspirin, 16,552 (98.90%) took P2Y₁₂ receptor inhibitors, and 103 (0.62%) took bivalirudin. A total of 70 patients (0.42%) had in-hospital bleeding (BARC 3 or 5 bleeding). As shown in Table 2, patients who had in-hospital bleeding were older (p = 0.010), more often female (p < 0.001), and less often current smokers (p = 0.019) than patients without in-hospital bleeding. More patients in the in-hospital bleeding group had persistent chest pain (p = 0.007) and obvious clinical symptoms (p = 0.002) at onset. They also had higher heart rate (p = 0.013), lower systolic blood pressure (p = 0.001), lower diastolic blood pressure (p = 0.040), and higher Killip class (p < 0.001). In addition, more patients in the in-hospital bleeding group had triple-vessel lesions (p = 0.004) and femoral artery puncture (p < 0.001). Baseline creatine kinase isoenzyme-MB (CK-MB) (p < 0.001), white blood cell count (p = 0.011), potassium (p = 0.001), and N-terminal pro brain natriuretic peptide (NT-proBNP) (p = 0.001) were also much higher in the in-hospital bleeding group, whereas hemoglobin was much lower (p < 0.001) (Table 2).

Table 2.

Baseline characteristics of the study population.

Variables	Non-bleeding group (N = 16666)	In-hospital bleeding group (N = 70)	P-value
Age, years	59.82 ± 19.47	65.81 ± 15.68	0.010
Gender, female	3407 (20.4)	30 (42.9)	< 0.001
BMI, kg/m²	24.73 ± 11.04	22.97 ± 2.81	0.201
Current Smoking	8233 (49.7)	24 (34.8)	0.019
Admission diagnosis, STEMI	13,023 (80.0)	60 (85.7)	0.295
Hypertension	8391 (51.0)	41 (59.4)	0.201
Diabetes	3192 (19.7)	10 (14.5)	0.349
Old myocardial infarction	942 (6.1)	7 (10.6)	0.203
Peripheral arterial disease	1291 (7.7)	8 (11.4)	0.355
Bleeding history	203 (1.4)	2 (2.9)	0.615
Persistent chest pain	12,516 (75.5)	63 (90.0)	0.007
Clinical symptoms, yes	14,388 (86.5)	70 (100.0)	0.002
Heart rate, bpm	76.06 ± 16.72	81.10 ± 25.64	0.013
Systolic blood pressure, mmHg	128.77 ± 24.39	118.99 ± 29.75	0.001
Diastolic blood pressure, mmHg	79.27 ± 16.73	75.07 ± 17.41	0.040
Killip (%)			< 0.001
I	13,273 (80.3)	43 (62.3)
II	2459 (14.9)	14 (20.3)
III	394 (2.4)	3 (4.3)
IV	413 (2.5)	9 (13.0)
TIMI flow grade (%)			0.110
0	5342 (68.5)	38 (77.6)
I	917 (11.8)	3 (6.1)
II	584 (7.5)	6 (12.2)
III	955 (12.2)	2 (4.1)
Bivalirudin	103 (0.6)	0 (0.0)	1.000
Triple-vessel lesion	2946 (36.7)	30 (56.6)	0.004
Femoral artery puncture (%)	892 (5.4)	11 (15.4)	< 0.001
ECG ST-segment resolution (%)			0.065
fall back to baseline	961 (15.3)	9 (20.5)
fall back > 50%	4055 (64.4)	21 (47.7)
fall back < 50%	1281 (20.3)	14 (31.8)
CK-MB	688.49 ± 877.03	1127.83 ± 999.59	< 0.001
Glucose	8.09 ± 7.89	9.72 ± 5.69	0.090
Creatinine	81.63 ± 129.63	101.78 ± 58.26	0.200
Creatinine clearance	104.15 ± 800.69	53.55 ± 30.44	0.713
White blood count, ×10⁹/L	10.24 ± 3.54	11.33 ± 3.57	0.011
Neutrophils, ×10⁹/L	73.86 (14.69)	77.32 (13.66)	0.051
Hematocrit, (%)	41.91 ± 38.10	38.00 ± 7.38	0.398
Hemoglobin, g/L	138.96 ± 19.30	128.07 ± 25.23	< 0.001
Platelet, ×10⁹/L	210.89 ± 63.51	198.09 ± 61.38	0.095
Total cholesterol, mmol/L	4.65 ± 5.72	4.42 ± 1.19	0.743
Triglyceride, mmol/L	1.90 ± 9.16	1.65 ± 1.24	0.825
LDL-C, mmol/L	2.86 ± 2.00	2.64 ± 1.02	0.392
HDL-C, mmol/L	1.16 ± 2.32	1.17 ± 0.45	0.987
Potassium, mmol/L	3.93 ± 0.49	4.12 ± 0.99	0.001
Sodium, mmol/L	138.97 ± 5.66	139.16 ± 4.52	0.777
Total bilirubin, μmol/L	15.04 ± 8.19	14.98 ± 10.73	0.954
Direct bilirubin, μmol/L	4.64 ± 3.92	3.82 ± 3.22	0.088
Hs-CRP, mmol/L	16.31 ± 28.14	23.40 ± 41.19	0.163
NT-proBNP, pg/ml	1496.60 ± 3155.18	3168.99 ± 5273.05	0.001
In-hospital statin	1310 (9.8)	2 (3.1)	0.115
In-hospital nitrate	9746 (67.9)	44 (63.8)	0.545

Values are mean ± SD or n (%).

BMI, body mass index; CK-MB, creatine kinase isoenzyme-MB; ECG, electrocardiogram; HDL-C, high-density lipoprotein cholesterol; Hs-CRP, high-sensitivity C-reactive protein; LDL-C, low-density lipoprotein cholesterol; NT-proBNP, N-terminal pro brain natriuretic peptide; STEMI, ST-segment elevation myocardial infarction; TIMI, thrombolysis in myocardial infarction.

Prediction of in-hospital bleeding

In total, 45 features were automatically selected from the 98 candidate features (Table 1) and were used to construct the CAMI bleeding score. The discrimination ability of the different models on the derivation set and validation set, as represented by AUROC, is shown in Figure 2. The XGBoost model showed significantly better discrimination than the CRUSADE risk score and the ACUITY-HORIZONS score. On the derivation data set, the AUROC of the XGBoost model was 0.941 (95% CI = 0.909–0.973, p < 0.001), the AUROC of the CRUSADE score was 0.765 (95% CI = 0.686–0.845, p < 0.001), and the AUROC of the ACUITY-HORIZONS score was 0.639 (95% CI = 0.541–0.738, p = 0.003). Similarly, the AUROC of the XGBoost model on the validation data set was 0.837 (95% CI = 0.772–0.903, p < 0.001), which was also higher than that of the CRUSADE score (AUROC: 0.741; 95% CI = 0.654–0.828, p < 0.001) and the ACUITY-HORIZONS score (AUROC: 0.731; 95% CI = 0.641–0.820, p < 0.001) (Table 3). Calibration of the XGBoost model was evaluated using the quintile plot of observed versus predicted risk and the Hosmer–Lemeshow goodness-of-fit test (Figure 3). The Hosmer–Lemeshow test statistic was 11.507 (p = 0.201), indicating no evidence of lack of fit for the XGBoost model on the validation data set (p > 0.05).
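For readers unfamiliar with the Hosmer–Lemeshow test, the sketch below shows how the statistic is formed by comparing observed and expected event counts within groups of patients ordered by predicted risk. It is a generic illustration on simulated data, not a reproduction of the reported value. The number of groups (10), the use of g − 2 degrees of freedom, and the simulated risks are assumptions; the text does not state the grouping used for the test.

```python
import math
import random

def hosmer_lemeshow(pred, obs, groups=10):
    """Return (statistic, degrees of freedom) using equal-size risk groups."""
    pairs = sorted(zip(pred, obs))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected events in this group
        observed = sum(y for _, y in chunk)   # observed events in this group
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat, groups - 2

def chi2_sf_even_df(x, df):
    """Upper tail of the chi-square distribution; closed form for even df only."""
    if df % 2:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Simulated, well-calibrated predictions (ASSUMPTION: not CAMI data).
random.seed(1)
pred = [random.uniform(0.01, 0.3) for _ in range(2000)]
obs = [1 if random.random() < p else 0 for p in pred]

hl_stat, hl_df = hosmer_lemeshow(pred, obs)
hl_p = chi2_sf_even_df(hl_stat, hl_df)
print(f"HL statistic = {hl_stat:.3f}, df = {hl_df}, p = {hl_p:.3f}")
```

A large p-value means the observed event counts do not differ detectably from the predicted ones across risk groups; it does not by itself prove good calibration.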

Figure 2.

AUROC of the XGBoost model, the CRUSADE risk score, and the ACUITY-HORIZONS score on training data (a) and validation data (b).

Table 3.

AUROCs of the XGBoost model, CRUSADE score, and ACUITY-HORIZONS score in the derivation and validation data sets.

Model	Derivation data set	Validation data set
XGBoost	0.941 (0.909–0.973, p < 0.001)	0.837 (0.772–0.903, p < 0.001)
CRUSADE score	0.765 (0.686–0.845, p < 0.001)	0.741 (0.654–0.828, p < 0.001)
ACUITY-HORIZONS score	0.639 (0.541–0.738, p = 0.003)	0.731 (0.641–0.820, p < 0.001)

ACC, American College of Cardiology; ACUITY-HORIZONS, Acute Catheterization and Urgent Intervention Triage Strategy and Harmonizing Outcomes with Revascularization and Stents in Acute Myocardial Infarction; AHA, American Heart Association; AUROC, Area Under the Receiver-Operating Characteristic curve; CRUSADE, Can Rapid risk stratification of Unstable angina patients Suppress ADverse outcomes with Early implementation of the ACC/AHA Guidelines; XGBoost, eXtreme Gradient Boosting.

Figure 3.

Calibration plot of the XGBoost model on the validation data set.

We found that the model performance was basically stable as the number of variables decreased. When 17 variables remained in the model, the AUROC of the model was 0.830 (95% CI = 0.764–0.895, p < 0.001). Even if only 12 variables were included, the model still showed good prediction performance with AUROC of 0.809 (95% CI = 0.745–0.873, p < 0.001) on the validation data set (Figure 4).
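The variable-reduction procedure can be sketched as a loop that removes the least important feature one at a time and records a performance score at each step. The function below is a hypothetical illustration: it assumes a fixed importance ranking (the text does not say whether importance was recomputed after each removal), and `toy_score` with its made-up weights stands in for refitting the model and computing the validation AUROC.

```python
def elimination_curve(ranked_features, evaluate, min_features=1):
    """Drop the least important feature one at a time and record the score.

    ranked_features: names ordered from most to least important.
    evaluate: callable taking a list of feature names and returning a score,
              for example the validation AUROC after refitting on that subset.
    """
    curve = []
    kept = list(ranked_features)
    while len(kept) >= min_features:
        curve.append((len(kept), evaluate(kept)))
        kept = kept[:-1]  # remove the currently least important feature
    return curve

# Made-up importance weights, for illustration only (NOT results from the study).
toy_importance = {"age": 0.30, "hemoglobin": 0.25, "heart rate": 0.20, "sodium": 0.02}
ranked = sorted(toy_importance, key=toy_importance.get, reverse=True)

def toy_score(features):
    # Stand-in for "refit and compute validation AUROC"; not a real model.
    return round(0.5 + sum(toy_importance[f] for f in features) / 2, 3)

curve = elimination_curve(ranked, toy_score)
for n_features, score in curve:
    print(n_features, score)
```

Plotting the recorded score against the number of remaining features gives a curve of the kind summarized in Figure 4.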

Figure 4.

AUROC of XGBoost for different numbers of features on the validation data set. The model performance was basically stable as the number of variables decreased. When 12 variables were included, the AUROC was 0.809 (95% CI = 0.745–0.873, p < 0.001).

Variable importance

Figure 5 shows the importance scores of the 12 most important variables in the XGBoost model for the bleeding outcome. Clinical characteristics (age, systolic blood pressure, diastolic blood pressure, and heart rate) and blood test indicators (creatinine clearance rate, direct bilirubin, hematocrit, CK-MB, potassium, NT-proBNP, hemoglobin, and total bilirubin) were important predictors. We built an online calculator from these 12 clinical variables (http://101.89.95.81:8260/), which classifies patients as being at low, medium, or high risk of bleeding.
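One simple way to turn predicted risks into low, medium, and high groups is to cut the distribution of predicted risk at its tertiles. The sketch below illustrates that approach. The use of tertiles is an assumption, since the cut points used by the calculator are not reported, and the example risks are made up.

```python
import statistics

def risk_cutpoints(predicted_risks):
    """Tertile cut points of the predicted-risk distribution (ASSUMED scheme)."""
    low_cut, high_cut = statistics.quantiles(predicted_risks, n=3)
    return low_cut, high_cut

def risk_group(risk, low_cut, high_cut):
    """Assign a risk group from the two cut points."""
    if risk < low_cut:
        return "low"
    if risk < high_cut:
        return "medium"
    return "high"

# Made-up predicted risks for nine hypothetical patients.
example_risks = [0.001, 0.002, 0.003, 0.004, 0.006, 0.008, 0.012, 0.020, 0.050]
low_cut, high_cut = risk_cutpoints(example_risks)
risk_groups = [risk_group(r, low_cut, high_cut) for r in example_risks]
print(risk_groups)
```

Cut points derived from the derivation set would normally be fixed and then applied unchanged to new patients.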

Figure 5.

Relative contribution of the 12 most important features.

Validity analysis of scores in STEMI and NSTEMI subgroups

In subgroup analysis, validation data set (n = 8368) was divided into STEMI group (n = 4792) and NSTEMI group (n = 3576). The XGBoost model showed similar predictive value in the STEMI group (AUROC: 0.826) and NSTEMI group (AUROC: 0.821), and outperformed the CRUSADE risk score and ACUITY-HORIZONS score, respectively (Table 4).

Table 4.

Validity analysis of scores in STEMI and NSTEMI subgroups.

	AUROC
	STEMI (n = 4792)	NSTEMI (n = 3576)
XGBoost	0.826 (95% CI = 0.751–0.900)	0.821 (95% CI = 0.668–0.974)
CRUSADE score	0.737 (95% CI = 0.649–0.824)	0.812 (95% CI = 0.625–0.999)
ACUITY-HORIZONS score	0.748 (95% CI = 0.660–0.836)	0.701 (95% CI = 0.507–0.895)

ACUITY-HORIZONS, Acute Catheterization and Urgent Intervention Triage Strategy and Harmonizing Outcomes with Revascularization and Stents in Acute Myocardial Infarction; AUROC, area under the receiver-operating characteristic curve; CI, confidence interval; CRUSADE, Can Rapid risk stratification of Unstable angina patients Suppress ADverse outcomes with Early implementation of the ACC/AHA Guidelines; NSTEMI, Non-ST-segment elevation myocardial infarction; STEMI, ST-segment elevation myocardial infarction; XGBoost, eXtreme Gradient Boosting.

Discussion

With the rapid development of medical technology, clinical prediction models and risk assessment have entered a new era of large samples and artificial intelligence. This study analyzed the multicenter, large-sample (16,736 patients) CAMI study database, which includes the patients’ basic clinical information, coronary angiography data, treatment, NT-proBNP, and so on, for a total of 98 baseline indicators. For the first time, the advanced XGBoost method was used to establish a bleeding risk prediction model for patients with AMI. The main findings are as follows: (1) we established a good predictive model using machine learning to identify the risk of in-hospital bleeding in AMI patients, and the model showed good predictive performance in both the training set and the validation set; (2) the XGBoost risk prediction model predicted better than the traditional CRUSADE score and ACUITY-HORIZONS score in both the training set and the validation set; and (3) we set up a calculator based on the 12 most important clinical parameters, which can be applied in clinical practice.

It is generally believed that acute coronary syndrome patients have a higher risk of ischemic events during the in-hospital period, and D’Ascenzo et al.¹⁷ previously demonstrated a significant excess of average daily ischemic risk in the first 2 weeks. As a result, early antithrombotic therapy is important to prevent ischemic events during hospitalization. However, it is a double-edged sword: once a patient experiences severe bleeding, the risk of death increases significantly. Therefore, it is very important to identify AMI patients at high risk of bleeding. However, the predictive value of the bleeding scores recommended by the guidelines is relatively limited,¹⁸ and related research is lagging behind. This study established a prediction model of in-hospital bleeding in AMI patients using a machine learning method. The CAMI bleeding score showed good discrimination in both the derivation data set (AUROC: 0.941) and the validation data set (AUROC: 0.837). In addition, the goodness-of-fit test showed no evidence of poor fit of the XGBoost model on the validation data set (p = 0.201). Machine learning is a branch of artificial intelligence based on computer technology and big data. In recent years, it has shown good clinical application value in many medical fields, including oncology,⁷ neurology,¹⁹ diabetes,²⁰ and pathological diagnosis.²¹ It has also come to the fore recently in the cardiovascular field, including cardiovascular imaging (such as cardiac ultrasound¹² and cardiac CT¹¹) and the prediction of adverse cardiovascular outcomes, such as death after PCI,⁹ rehospitalization for heart failure,⁹ and in-hospital death after transcatheter aortic valve replacement.⁸ For the first time, we assessed and reported the risk of in-hospital bleeding in patients with AMI by machine learning.

It is worth noting that this study included more candidate risk predictors and found that, in addition to traditional bleeding risk factors such as creatinine clearance rate, age, and systolic blood pressure, some nontraditional factors were also related to bleeding, such as serum potassium, NT-proBNP, and bilirubin. This often happens with machine learning methods. For example, Al’Aref et al.¹⁰ predicted in-hospital death after PCI by machine learning, and their risk score incorporated both traditional indicators, such as body mass index and serum creatinine, and nontraditional indicators, such as day of the week. There are two reasons for this. First, previous logistic regression models, which select variables based on a significant bivariate relationship with the primary outcome, tend to ignore the potential prognostic value of weaker risk factors and their interactions. Second, continuous variables are often divided into intervals, which may also lead to a loss of predictive information.

To the best of our knowledge, no study has reported the relationship between serum bilirubin or potassium and in-hospital bleeding in patients with AMI. However, some studies have shown that serum bilirubin and potassium are associated with poor in-hospital prognosis of AMI.^22,23 Recently, a meta-analysis of 323,891 patients²⁴ showed that total bilirubin was significantly positively related to in-hospital cardiovascular mortality and major adverse cardiac events. Celik et al.²⁵ reported that serum bilirubin levels were independently associated with no-reflow and in-hospital major adverse cardiac events in patients with STEMI undergoing primary PCI. Similarly, previous studies have shown that serum potassium is associated with poor prognosis in patients with AMI.^26,27 Goyal et al.²⁶ showed a U-shaped relationship between serum potassium levels and in-hospital mortality. The advantage of a machine learning score is that it can identify correlations between variables and prognosis regardless of whether the correlation is linear. Our results suggest that bilirubin and serum potassium are closely related to bleeding in AMI. In addition to being related to the severity of AMI, bilirubin and serum potassium may also reflect hepatic or renal insufficiency. It is widely accepted that the risk of bleeding is significantly increased in patients with hepatic or renal insufficiency.

Our study showed that the machine learning model had a higher predictive value for identifying in-hospital bleeding than the traditional CRUSADE score and the ACUITY-HORIZONS score. Machine learning has a clear advantage over traditional statistical analysis. In 2016, Obermeyer and Emanuel²⁸ noted that traditional prognostic models are limited to a small number of variables, whereas machine learning can greatly improve prognostic performance. In our CAMI bleeding score, 45 valuable indicators were selected from 98 candidate indicators, including clinical baseline data, coronary angiography data, drug treatment, and biomarkers. In comparison, the traditional CRUSADE score published in 2009 included eight variables, and its predictive value for bleeding was moderate (C statistic 0.72)³ in the original study. Likewise, the ACUITY score⁴ published in 2010 included seven variables for bleeding, with a C statistic of 0.74. In short, with the increasing number of variables in current large clinical electronic databases and the integration of new biomarkers, machine learning models are better able to sort and analyze massive patient data. They overcome the limitations in accuracy and scope of application of traditional statistical models and provide better prediction than the traditional scores in the real world.

At the end of the study, we analyzed the relative importance of the selected variables. When the number of variables was reduced from 45 to 12, the model’s AUROC in the validation set decreased only from 0.837 to 0.809, which still represents good predictive value. Although 12 variables are more than the traditional bleeding scores use, all of them are easy to obtain clinically. In addition, we built a calculator from these 12 clinical variables, which is expected to be easily applied in clinical practice. This is in line with the general trend of artificial intelligence in cardiovascular precision medicine and improves the prediction of in-hospital bleeding in AMI. In the future, we look forward to further evaluating the predictive value of this model through external validation.

Limitations

First, our model has not been externally validated, and external validation in large samples should be conducted in the future to further evaluate its predictive value. Second, although the XGBoost model has been validated in the internal validation data set, there may still be overfitting. Third, this study excluded patients who did not receive PCI, so further study is needed to determine whether the model is suitable for patients receiving thrombolytic therapy or conservative treatment. Fourth, because the parameters recorded in different clinical center databases differ, and the number of parameters included in the machine learning model is higher than in the traditional scores, the model may be inconvenient to use. Therefore, we selected 12 clinical parameters from the 45 included parameters according to their relative importance and built a calculator to facilitate clinical use. Fifth, since most of the patients included in this study were taking clopidogrel, the predictive value of the model in patients taking ticagrelor or prasugrel remains to be studied.

Conclusion

This is the first study to use the large-sample, multicenter CAMI database for this purpose, and the results showed that the XGBoost method can establish a risk model that accurately predicts in-hospital bleeding in patients with AMI, with a predictive value better than that of the traditional bleeding scores. We also optimized the model parameters and set up a calculator based on the 12 most important clinical parameters, which is expected to be used in clinical practice.

Footnotes

Acknowledgements

We thank all staff members for data collection, data entry, and monitoring as part of this study.

Declarations

ORCID iD

Jinqing Yuan

References

1. Cohen. Predictors of bleeding risk and long-term mortality in patients with acute coronary syndromes. Curr Med Res Opin 2005; 21: 439–445.

2. Eikelboom, Mehta, Anand, et al. Adverse impact of bleeding on prognosis in patients with acute coronary syndromes. Circulation 2006; 114: 774–782.

3. Subherwal, Bach, Chen, et al. Baseline risk of major bleeding in non-ST-segment-elevation myocardial infarction: the CRUSADE (Can Rapid risk stratification of Unstable angina patients Suppress ADverse outcomes with Early implementation of the ACC/AHA Guidelines) Bleeding Score. Circulation 2009; 119: 1873–1882.

4. Mehran, Pocock, Nikolsky, et al. A risk score to predict bleeding in patients with acute coronary syndromes. J Am Coll Cardiol 2010; 55: 2556–2566.

5. Hamm, Bassand, Agewall, et al. ESC Guidelines for the management of acute coronary syndromes in patients presenting without persistent ST-segment elevation: the Task Force for the management of acute coronary syndromes (ACS) in patients presenting without persistent ST-segment elevation of the European Society of Cardiology (ESC). Eur Heart J 2011; 32: 2999–3054.

6. Roffi, Patrono, Collet, et al. 2015 ESC Guidelines for the management of acute coronary syndromes in patients presenting without persistent ST-segment elevation: Task Force for the Management of Acute Coronary Syndromes in Patients Presenting without Persistent ST-Segment Elevation of the European Society of Cardiology (ESC). Eur Heart J 2016; 37: 267–315.

7. Burki TK. Predicting lung cancer prognosis using machine learning. Lancet Oncol 2016; 17: e421.

8. Hernandez-Suarez, Kim, Villablanca, et al. Machine learning prediction models for in-hospital mortality after transcatheter aortic valve replacement. JACC Cardiovasc Interv 2019; 12: 1328–1338.

9. Zack, Senecal, Kinar, et al. Leveraging machine learning techniques to forecast patient prognosis after percutaneous coronary intervention. JACC Cardiovasc Interv 2019; 12: 1304–1311.

10. Al’Aref, Singh, van Rosendael, et al. Determinants of in-hospital mortality after percutaneous coronary intervention: a machine learning approach. J Am Heart Assoc 2019; 8: e011160.

11. Singh, Al’Aref, Van Assen, et al. Machine learning in cardiac CT: basic concepts and contemporary data. J Cardiovasc Comput Tomogr 2018; 12: 192–201.

12. Narula, Shameer, Salem Omar, et al. Machine-learning algorithms to automate morphological and functional assessments in 2D echocardiography. J Am Coll Cardiol 2016; 68: 2287–2295.

13. Yang, et al. The China Acute Myocardial Infarction (CAMI) registry: a national long-term registry-research-education integrated platform for exploring acute myocardial infarction in China. Am Heart J 2016; 175: 193–201.e3.

14. Thygesen, Alpert, Jaffe, et al. Third universal definition of myocardial infarction. Eur Heart J 2012; 33: 2551–2567.

15. Mehran, Rao, Bhatt, et al. Standardized bleeding definitions for cardiovascular clinical trials: a consensus report from the Bleeding Academic Research Consortium. Circulation 2011; 123: 2736–2747.

16. Chen, Guestrin. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, CA, 13–17 August 2016, pp. 785–794. New York: ACM.

17. D’Ascenzo, Biolè, Raposeiras-Roubin, et al. Average daily ischemic versus bleeding risk in patients with ACS undergoing PCI: insights from the BleeMACS and RENAMI registries. Am Heart J 2020; 220: 108–115.

18. Baber. Predicting risk for bleeding after PCI: another step in the right direction but work remains. Int J Cardiol 2018; 254: 45–46.

19. Claassen, Doyle, Matory, et al. Detection of brain activation in unresponsive patients with acute brain injury. N Engl J Med 2019; 380: 2497–2505.

20. Gulshan, Peng, Coram, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016; 316: 2402–2410.

21. Jha, Topol EJ. Adapting to artificial intelligence: radiologists and pathologists as information specialists. JAMA 2016; 316: 2353–2354.

22.

Gul

Uyarel

Ergelen

, et al. Prognostic value of total bilirubin in patients with ST-segment elevation acute myocardial infarction undergoing primary coronary intervention. Am J Cardiol 2013; 111: 166–171.

23.

Shen

Zeng

, et al. Prognostic value of total bilirubin in patients with acute myocardial infarction: a meta-analysis. Medicine (Baltimore) 2019; 98: e13920.

24.

Lan

Liu

, et al. Is serum total bilirubin a predictor of prognosis in arteriosclerotic cardiovascular disease? A meta-analysis. Medicine (Baltimore) 2019; 98: e17544.

25.

Celik

Kaya

Akpek

, et al. Does Serum Bilirubin level on admission predict TIMI flow grade and in-hospital MACE in patients with STEMI undergoing primary PCI. Angiology 2014; 65: 198–204.

26.

Goyal

Spertus

Gosch

, et al. Serum potassium levels and mortality in acute myocardial infarction. JAMA 2012; 307: 157–164.

27.

Colombo

Kirchberger

Amann

, et al. Association of serum potassium concentration with mortality and ventricular arrhythmias in patients with acute myocardial infarction: a systematic review and meta-analysis. Eur J Prev Cardiol 2018; 25: 576–595.

28.

Obermeyer

Emanuel

EJ.

Predicting the future – big data, machine learning, and clinical medicine. N Engl J Med 2016; 375: 1216–1219.
