Abstract

Objective

The purpose of this study was to develop and validate an improved cancer-associated venous thromboembolism (CA-VTE) risk prediction model based on a semi-supervised learning (SSL) algorithm.

Methods

This study used a combined retrospective and prospective cohort design. First, data from 2100 cancer patients in a tertiary hospital in Beijing were retrospectively collected, including a “labeled cohort” with CA-VTE outcomes (N = 1036) and an “unlabeled cohort” without outcomes (N = 1064). Then, another dataset was prospectively collected as an external validation set (N = 321). Eight supervised machine learning (ML) algorithms were used to develop CA-VTE risk prediction models, and one SSL algorithm was used to improve the generalizability of the models (pre- and post-imputation ML models). Model performance was evaluated using the Area Under the Curve (AUC) and the Brier score in the prospective cohort and compared with that of the Khorana score.

Results

The eight post-imputation ML models (AUC: 0.816-0.868; Brier score: 0.118-0.160) performed better on the external validation set than the pre-imputation models (AUC: 0.798-0.841; Brier score: 0.133-0.171). In contrast, the AUC of the Khorana score remained unchanged (AUC: 0.693), while its Brier score increased (Brier score: 0.172 vs 0.178).

Conclusion

Based on a retrospective and prospective cohort study design, this study developed eight ML models that outperformed the Khorana score. Using an SSL algorithm improved the external validation performance of the models and enhanced prediction accuracy. This study can provide an important reference for the early identification of high-risk factors and for stratified preventive care for CA-VTE.

Keywords

venous thromboembolism; cancer; semi-supervised learning; risk factors; prediction model

Introduction

Deep vein thrombosis (DVT) and/or pulmonary embolism (PE) occurring in cancer patients are defined as cancer-associated venous thromboembolism (CA-VTE).¹ CA-VTE occurs in 4∼20% of cancer patients at some stage, with the highest risk immediately following cancer diagnosis.² Previous studies showed that the presence of cancer increased the risk of VTE by 4- to 9-fold.^3–5 VTE is a common and life-threatening condition in patients with cancer. It can increase the need for emergency care visits and/or hospitalization, prolong therapeutic anticoagulation, raise the risk of bleeding and recurrent VTE, cause other symptoms or morbidity, delay or interrupt anti-cancer treatment, increase mortality and health-care costs, and worsen overall survival and quality of life.¹ Therefore, it is crucial to identify CA-VTE related risk factors, assess risk, and manage prevention.⁶

CA-VTE related risk factors and mechanisms are multidimensional and interactive, including individual-related factors, tumor-related factors, treatment-related factors, and multiple biomarkers.¹ Many cancer-specific VTE risk assessment or prognostic prediction tools using clinical features, biomarkers, and genetic information have been developed, such as the Khorana Score,⁷ Vienna CATS Score,⁸ Protecht Score,⁹ CONKO Score,¹⁰ ONKOTEV Score,¹¹ COMPASS-CAT Score,¹² CATS/MICA Score,¹³ Tic-ONCO Score,¹⁴ MD Anderson Cancer Center CAT Score,¹⁵ and Rising-VTE/NEJ037 Score.^16,17 However, many of them have low discriminatory capability and have not yet been validated.^2,18,19 The Khorana Score remains the most widely used tool, but it shares these limitations and needs further research to modify it and improve its generalizability.^20,21

In recent years, machine learning (ML) algorithms have been increasingly applied in research on clinical prediction models. Many studies have used ML algorithms to develop CA-VTE prediction models, achieving higher prediction accuracy than traditional tools such as the Khorana Score.^22–28 However, the following limitations hinder the clinical application and generalizability of these ML-based CA-VTE models: i. insufficient sample size to train models: a large sample size, for example at least 10∼20 Events Per Variable (EPV),^29,30 is required to train accurate ML-based CA-VTE models and reduce the risk of overfitting, yet few studies met this requirement; ii. lack of external validation: most ML-based CA-VTE models and previous traditional tools have not yet been sufficiently validated; iii. missing VTE labels in clinical practice: not all cancer patients undergo routine venous ultrasound or pulmonary artery CT examination, so a large number of cases lack VTE outcome labels. Most ML-based CA-VTE models were trained using supervised learning (SL) algorithms, which require every patient to have a VTE label.

Fortunately, semi-supervised learning (SSL) algorithms have received increasing attention for their ability to use labeled and unlabeled data simultaneously for model development and improvement.³¹ Several studies have used SSL algorithms to develop diagnosis and prognosis prediction models for diseases such as osteoporosis,³² colorectal cancer,^33,34 type 2 diabetes,³⁵ sepsis,³⁶ and cardiovascular disease³⁷ based on image data and electronic health records (EHR), suggesting that SSL algorithms may have the potential to enhance the generalizability of clinical prediction models. However, few studies have used SSL algorithms with labeled and unlabeled EHR data to improve the generalizability of CA-VTE models. Therefore, the purpose of this study was to apply SSL algorithms to improve and validate CA-VTE risk prediction models for healthcare providers, based on a retrospective and prospective cohort study.

Material and Methods

Study Design and Participants

The Retrospective and Prospective Cohort Study

The retrospective cohort of 2100 cancer patients (with CA-VTE label: N = 1036; without CA-VTE label: N = 1064), who were treated in a tertiary hospital in Beijing from January 2017 to October 2019, was used as the training set for model development. The prospective cohort of 321 cancer patients, who were treated in the same hospital from November 2019 to October 2021, was used as the external validation set to evaluate model performance and generalizability.

The Inclusion and Exclusion Criteria

The inclusion criteria comprised five items: i) age ≥18 years; ii) hospital stay ≥48 h; iii) a confirmed pathological diagnosis of a malignant tumor before the diagnosis of CA-VTE; iv) at least one routine blood test result and one D-Dimer test result; v) informed consent (only required for the prospective cohort).

The exclusion criteria include three items: i) having a diagnosis of acute leukemia; ii) being pregnant or lactating; iii) having a diagnosis of VTE (DVT or PE) upon admission or receiving anticoagulation treatment instead of thromboprophylaxis.

Sample Size

This study included 30 candidate predictors, and the sample sizes of the two datasets were estimated separately. For the training set, the 10 EPV rule of thumb²⁹ requires at least 300 outcome events for 30 candidate predictors (10 × 30); 316 patients developed CA-VTE, which meets this requirement. For the external validation set, a previous study³⁸ suggested that the sample size should be at least 200 cases, with 100 positive cases and 100 negative cases. The sample size of the negative group (n = 243) meets the requirement, while the number of positive cases (n = 78) falls below the recommended threshold.
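The EPV check described above is simple arithmetic and can be sketched as follows. This is a minimal illustration using the event and predictor counts reported in this section; the function name `events_per_variable` is hypothetical and not part of the study's analysis code.

```python
def events_per_variable(n_events: int, n_candidate_predictors: int) -> float:
    """Return the number of outcome events available per candidate predictor."""
    if n_candidate_predictors <= 0:
        raise ValueError("n_candidate_predictors must be positive")
    return n_events / n_candidate_predictors


# Counts reported in this section: 316 CA-VTE events, 30 candidate predictors.
n_events, n_predictors = 316, 30
epv = events_per_variable(n_events, n_predictors)
required_events = 10 * n_predictors  # minimum implied by the 10 EPV rule of thumb
print(f"EPV = {epv:.2f}; events required = {required_events}; met = {n_events >= required_events}")
```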

Candidate Predictors

Based on a literature review of CA-VTE related risk factors, this study included a total of 30 candidate predictors from four dimensions: patient-related factors, cancer-specific factors, treatment-related factors, and laboratory variables (Supplementary Figure 1). More details about the candidate predictors can be found in our previous studies.^26,39

Outcomes

The outcome of this study was the occurrence of CA-VTE (1 = yes; 0 = no), including CA-DVT and CA-PE. The diagnosis of CA-DVT was objectively confirmed by color Doppler ultrasonography during hospitalization. The diagnostic methods for CA-PE included computed tomography (CT), magnetic resonance imaging (MRI), pulmonary arteriography, radionuclide lung ventilation or blood flow perfusion scanning, etc.

Ethical Considerations

This study was approved by the Institutional Review Board of Peking University (IRB00001052-18037).

Data Collection, Model Development, and Model Validation

Data Collection

All data were manually collected from the electronic medical record system (EMRS) of a tertiary hospital by two well-trained researchers using a standard case report form (CRF). To control data quality, all CRFs were entered twice and double-checked in the Epidata software (v 3.1). In both the retrospective and prospective cohorts, all candidate predictors were recorded before screening for CA-VTE.

Data Preprocessing

Before model development, the missing rate of each candidate variable was calculated. Categorical variables had no missing values. The missing rates of continuous variables were low (0.2%∼0.9%), and missing continuous values were filled with the median of the corresponding variable.
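Median filling of a continuous variable can be sketched as below. This is an illustrative sketch only: the helper `impute_with_median` and the D-Dimer values are hypothetical, and `None` is assumed to mark a missing entry.

```python
from statistics import median
from typing import List, Optional, Sequence


def impute_with_median(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]


# Hypothetical D-Dimer values (µg/L) with one missing entry.
d_dimer = [320.0, None, 760.0, 1500.0, 410.0]
filled = impute_with_median(d_dimer)
print(filled)
```

One design point worth noting, stated here as an assumption rather than as the study's procedure: when a validation set is involved, the fill value is usually taken from the training data so that no information from the validation cohort enters preprocessing.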

Statistical Description and Statistical Inference

Continuous variables were described by median with interquartile range. Categorical variables were described by frequency and percentage. The chi-square test, Mann-Whitney test, Kruskal-Wallis test, and one-way ANOVA were appropriately used to conduct univariable analysis.

Predictors Selection

Predictor selection was conducted in the training set using univariable analysis and Lasso regression. Variables with a P-value <.100 in univariable analysis and a non-zero coefficient in the Lasso regression were entered into the model.

Model Development and Validation Based on SSL Algorithm

This study used the self-training algorithm, a typical SSL algorithm, to perform pseudo-label imputation of CA-VTE and model retraining. Model development and validation with the self-training algorithm proceeded as follows: i) Eight ML algorithms, including linear discriminant analysis (LDA), logistic regression (LR), classification and regression tree (CART), random forest (RF), gradient boosting machine tree (GBM), extreme gradient boosting tree (XGB), support vector machine (SVM), and artificial neural network (ANN), were first used to train the “before imputed models” in the “labeled cohort” (N = 1036, before imputed training set); ii) The best model was selected to perform “CA-VTE pseudo label imputation” on the “unlabeled cohort” (N = 1064); iii) The “labeled cohort” and “unlabeled cohort” were merged into the “imputed training set” (N = 2100), which was used to retrain the eight “imputed models”; iv) The external validation performance of the “imputed models” and the “before imputed models” was compared in the same prospective cohort (N = 321); v) The optimal model was selected, presented, and reported. To determine the best hyperparameters of CART, RF, GBM, XGB, SVM, and ANN, five-fold cross-validation (repeated three times) and the grid search method were used in both the “before imputed training set” and the “imputed training set”. The study flowchart is shown in Figure 1.
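Steps i) to iii) can be sketched as a single pass of self-training. The sketch below is not the study's implementation: the toy one-feature learner `train_midpoint_model` stands in for the eight ML algorithms, the names `self_train_once`, `Trainer`, and `Model` are hypothetical, and the pseudo-label cutoff of 0.5 is an assumption because the text does not state the threshold used for imputation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Model = Callable[[Row], float]                      # feature row -> predicted risk
Trainer = Callable[[List[Row], List[int]], Model]   # (X, y) -> fitted model


def self_train_once(train: Trainer, X_labeled: List[Row], y_labeled: List[int],
                    X_unlabeled: List[Row], cutoff: float = 0.5
                    ) -> Tuple[Model, Model, List[int]]:
    """Fit on labeled data, pseudo-label the unlabeled data, then refit on both."""
    # Step i: fit the "before imputed" model on the labeled cohort only.
    before_model = train(X_labeled, y_labeled)
    # Step ii: impute pseudo labels for the unlabeled cohort (cutoff is assumed).
    pseudo_labels = [1 if before_model(row) >= cutoff else 0 for row in X_unlabeled]
    # Step iii: merge both cohorts and retrain the "imputed" model.
    imputed_model = train(X_labeled + X_unlabeled, y_labeled + pseudo_labels)
    return before_model, imputed_model, pseudo_labels


def train_midpoint_model(X: List[Row], y: List[int]) -> Model:
    """Toy learner: logistic risk centred on the midpoint of the two class means."""
    pos = [row[0] for row, label in zip(X, y) if label == 1]
    neg = [row[0] for row, label in zip(X, y) if label == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda row: 1.0 / (1.0 + math.exp(-(row[0] - midpoint)))


# Hypothetical one-feature data: four labeled and four unlabeled patients.
X_lab, y_lab = [[1.0], [2.0], [8.0], [9.0]], [0, 0, 1, 1]
X_unlab = [[0.5], [3.0], [7.0], [9.5]]
before_model, imputed_model, pseudo = self_train_once(
    train_midpoint_model, X_lab, y_lab, X_unlab)
print(pseudo)
```

Both returned models can then be scored on the same external validation cohort, which corresponds to step iv).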

Figure 1.

The study flowchart.

Model Performance

Model performance was compared among the eight ML models and the Khorana Score. This study comprehensively evaluated model performance across four dimensions: discrimination, calibration, clinical utility, and model improvement. i) Model discrimination: AUC and receiver operating characteristic (ROC) curves; ii) Model calibration: Brier Score and calibration curves; iii) Clinical utility: decision curve analysis (DCA) curves; iv) Model improvement: category-based net reclassification index (Category-based NRI) and integrated discrimination improvement (IDI).
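The two headline metrics, AUC and Brier Score, can be computed directly from outcomes and predicted risks. The sketch below uses their standard definitions (AUC as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half; Brier Score as the mean squared difference between predicted risk and outcome). The function names and the six example patients are hypothetical.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative cases (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = CA-VTE) and predicted risks for six patients.
y = [0, 0, 1, 1, 0, 1]
risk = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
print(f"AUC = {auc(y, risk):.3f}, Brier = {brier_score(y, risk):.3f}")
```

A higher AUC indicates better discrimination, while a lower Brier Score indicates better agreement between predicted and observed risk.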

Model Presentation and Report

The recommended optimal CA-VTE risk prediction model needs to be presented, reported, and explained. Among the eight ML models trained in this study, LDA, LR, and CART are interpretable models, while RF, GBM, XGB, SVM, and ANN are black-box models. If the optimal model is LDA or LR, it will be presented as a nomogram; if the optimal model is CART, it will be presented as a decision tree; if the optimal model is a black-box model such as RF, GBM, XGB, SVM, or ANN, it can be explained at two levels: the dataset level and the instance level.^40,41 At the dataset level, a global explanation method such as variable importance measure (VIM) ranking will be used. At the instance level, a local explanation method such as the break-down plot will be used.^40,41

We used Statistical Package for Social Sciences 20.0 and R 3.6.1 (https://www.r-project.org/) to conduct statistical analysis. A two-sided P value <.05 was regarded as statistically significant. This study was written and reported in accordance with the TRIPOD + AI guideline and the BMJ step-by-step guide.^42,43

Results

Characteristics of the Retrospective Cohort and Prospective Cohort

The comparison of variable characteristics between the two cohorts is shown in Table 1. A total of 2100 patients (mean [SD] age, 58.71 [12.51] years; 1060 [50.50%] male) and 321 patients (mean [SD] age, 62.54 [13.28] years; 194 [60.40%] male) were included in the retrospective cohort and prospective cohort, respectively. There were 316 patients (15.00%) who developed confirmed CA-VTE in the retrospective cohort, and a total of 78 patients (24.30%) were diagnosed with CA-VTE in the prospective cohort. The two cohorts differed significantly in the following variables: Gender, Age, Smoking, Bed rest, Site of tumor, Tumor stage, Radiotherapy, Targeted or immunotherapy, CVC, PICC, Transfusion, NSAID, Previous VTE history, Edema, CCI, BMI, Hb, FIB, Khorana Score, Khorana Score + D-dimer, Khorana + D-dimer risk level, and CA-VTE.

Table 1.

Characteristics of the Retrospective Cohort and Prospective Cohort (n (%) / Mean (SD)).

Variables	Value	Retrospective Cohort (N = 2100)	Prospective Cohort (N = 321)	P
Gender	Men	1060 (50.5)	194 (60.4)	.001
	Women	1040 (49.5)	127 (39.6)
Age	Year	58.71 (12.51)	62.54 (13.28)	<.001
Smoking	Yes	260 (12.4)	24 (7.5)	.014
	No	1840 (87.6)	297 (92.5)
Drinking	Yes	226 (10.8)	23 (7.2)	.060
	No	1874 (89.2)	298 (92.8)
Bed rest	<3 days	747 (35.6)	86 (26.8)	.003
	≥3 days	1353 (64.4)	235 (73.2)
Site of tumor	Low risk	455 (21.7)	34 (10.6)	<.001
	High risk	1159 (55.2)	184 (57.3)
	Very high risk	486 (23.1)	103 (32.1)
Tumor stage	I	110 (5.2)	8 (2.5)	<.001
	II	212 (10.1)	23 (7.2)
	III	234 (11.1)	77 (24.0)
	IV	1131 (53.9)	183 (57.0)
	X	413 (19.7)	30 (9.3)
Chemotherapy	Yes	840 (40.0)	133 (41.4)	.670
	No	1260 (60.0)	188 (58.6)
Surgery	Yes	946 (45.0)	136 (42.4)	.401
	No	1154 (55.0)	185 (57.6)
Radiotherapy	Yes	58 (2.8)	1 (0.3)	.014
	No	2042 (97.2)	320 (99.7)
Targeted or immunotherapy	Yes	232 (11.0)	118 (36.8)	<.001
	No	1868 (89.0)	203 (63.2)
CVC	Yes	494 (23.5)	19 (5.9)	<.001
	No	1606 (76.5)	302 (94.1)
PICC	Yes	293 (14.0)	71 (22.1)	<.001
	No	1807 (86.0)	250 (77.9)
Transfusion	Yes	656 (31.2)	79 (24.6)	.019
	No	1444 (68.8)	242 (75.4)
NSAID	Yes	805 (38.3)	155 (48.3)	.001
	No	1295 (61.7)	166 (51.7)
Lymphadenopathy	Yes	163 (7.8)	21 (6.5)	.512
	No	1937 (92.2)	300 (93.5)
Previous VTE history	Yes	43 (2.0)	34 (10.6)	<.001
	No	2057 (98.0)	287 (89.4)
Varicose veins	Yes	70 (3.3)	9 (2.8)	.742
	No	2030 (96.7)	312 (97.2)
Edema	Yes	52 (2.5)	101 (31.5)	<.001
	No	2048 (97.5)	220 (68.5)
ICU/CCU	Yes	135 (6.4)	19 (5.9)	.275
	No	1965 (93.6)	302 (94.1)
CCI	/	6.46 (3.38)	7.58 (3.72)	<.001
BMI	kg/m²	22.86 (3.82)	22.15 (4.12)	.002
WBC	10⁹/L	6.20 (2.97)	6.30 (3.05)	.593
PLT	10⁹/L	241.97 (106.39)	235.68 (101.14)	.321
Hb	g/L	121.52 (22.81)	114.04 (23.37)	<.001
PT	s	12.20 (3.02)	12.41 (5.57)	.299
APTT	s	30.46 (4.07)	30.16 (3.51)	.225
TT	s	14.79 (13.55)	15.23 (8.06)	.565
FIB	g/L	3.65 (1.10)	3.35 (0.92)	<.001
D-Dimer	µg/L	760.45 (1640.74)	922.49 (1705.12)	.101
Khorana Score	/	1.75 (1.00)	1.93 (0.95)	.002
Khorana risk level	Low risk	180 (8.6)	20 (6.2)	.257
	Moderate risk	1508 (71.8)	230 (71.7)
	High risk	412 (19.6)	71 (22.1)
Khorana Score + D-dimer	/	2.29 (1.21)	2.58 (1.12)	<.001
Khorana + D-dimer risk level	Low risk	106 (5.0)	9 (2.8)	<.001
	Moderate risk	1140 (54.3)	134 (41.7)
	High risk	854 (40.7)	178 (55.5)
CA-VTE	Yes	316 (15.0)	78 (24.3)	<.001
	No	1784 (85.0)	243 (75.7)

†CVC, central venous catheter; PICC, peripherally inserted central catheter; NSAID, nonsteroidal anti-inflammatory drugs; ICU/CCU, intensive care unit or cardiology intensive care; CCI, Charlson Comorbidity Index; BMI, body mass index; WBC, white blood cell count; PLT, platelet count; Hb, hemoglobin; PT, prothrombin time; APTT, activated partial thromboplastin time; TT, thrombin time; FIB, fibrinogen.

Univariate Analysis and Predictor Selection

In the Before imputed training set, univariate analysis (P ≤ .100) and LASSO regression (non-zero regression coefficient) screened out 15 and 17 variables, respectively (Supplementary Table 1 and Table 2). In the Imputed training set, univariate analysis and LASSO regression screened out 22 and 16 variables, respectively (Supplementary Table 1 and Table 2). Finally, 13 and 14 variables that met both screening criteria were selected in the Before imputed training set and the Imputed training set, respectively (Table 2).

Table 2.

Results of Predictor Selection in Before Imputed Training set and Imputed Training set.

Methods	Criteria	Before Imputed Training Set	Imputed Training Set
Univariate analysis	P ≤ .100	15	22
LASSO	β ≠ 0	17	16
AND	AND	13	14

Hyperparameter Tuning and Model Training

This study trained eight models (LDA, LR, CART, RF, GBM, XGB, SVM, and ANN) in the Before imputed training set and the Imputed training set, respectively. Five-fold cross-validation (repeated three times) and the grid search method were used for hyperparameter tuning, with AUC as the reference indicator. Hyperparameter tuning results can be found in Supplementary Table 2. LDA and LR have no hyperparameters and do not require tuning.
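The tuning procedure (grid search scored by repeated five-fold cross-validation) can be sketched in a few lines. This is an illustration under stated assumptions, not the study's R code: the folds below are plain rather than stratified, the names `repeated_kfold`, `grid_search`, and `toy_score` are hypothetical, the grid values are invented, and the scorer is a stand-in for "fit the model on the training indices and return the AUC on the held-out indices".

```python
import random
from itertools import product
from statistics import mean
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, held-out indices)


def repeated_kfold(n_samples: int, n_splits: int = 5, n_repeats: int = 3,
                   seed: int = 0) -> Iterator[Split]:
    """Yield n_splits * n_repeats train/held-out index splits (not stratified)."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        for fold in range(n_splits):
            held_out = order[fold::n_splits]
            held_out_set = set(held_out)
            train = [i for i in order if i not in held_out_set]
            yield train, held_out


def grid_search(param_grid: Dict[str, Sequence],
                score_split: Callable[[dict, List[int], List[int]], float],
                n_samples: int) -> Tuple[dict, float]:
    """Return the parameter combination with the highest mean score across splits."""
    names = sorted(param_grid)
    best_params, best_score = {}, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        avg = mean(score_split(params, train, held_out)
                   for train, held_out in repeated_kfold(n_samples))
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score


def toy_score(params: dict, train: List[int], held_out: List[int]) -> float:
    """Stand-in scorer; a real one would fit on `train` and return AUC on `held_out`."""
    return 0.80 - 0.01 * abs(params["max_depth"] - 3) - 0.0002 * abs(params["n_trees"] - 200)


grid = {"max_depth": [2, 3, 4], "n_trees": [100, 200]}  # hypothetical grid
best_params, best_score = grid_search(grid, toy_score, n_samples=1036)
print(best_params, round(best_score, 3))
```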

Model Validation and Model Performance

Model Discrimination

The model discrimination results for the Before imputed training set, Imputed training set, and External validation set are shown in Table 3 and Supplementary Figure 2. In both the training and external validation sets, the AUCs of the imputed models were higher than those of the before imputed models. For the same algorithm, the AUC of the model combined with D-Dimer was higher than that of the model without D-Dimer. The only exception to both patterns among the eight ML models was the RF model in the training sets, whose AUC was 1.000 in every case. Among the different models, the RF model had the highest AUC in both the Before imputed and Imputed training sets, whereas its AUC in the external validation set was intermediate.

Table 3.

Results of Model Performance in Training set and Validation set.

Models	Before Imputed Models				Imputed Models				Before Imputed Models				Imputed Models
	Before Imputed Training Set(N = 1036)				Imputed Training Set(N = 2100)				Validation Set(N = 321)				Validation Set(N = 321)
	AUC		Brier Score		AUC		Brier Score		AUC		Brier Score		AUC		Brier Score
D-Dimer	Yes	No	Yes	No	Yes	No	Yes	No	Yes	No	Yes	No	Yes	No	Yes	No
LDA	0.799	0.783	0.141	0.146	0.866	0.838	0.093	0.100	0.803	0.787	0.137	0.144	0.867	0.854	0.120	0.124
LR	0.799	0.784	0.140	0.145	0.867	0.842	0.092	0.099	0.807	0.791	0.140	0.146	0.856	0.846	0.124	0.127
CART	0.866	0.848	0.106	0.112	0.902	0.866	0.064	0.079	0.798	0.721	0.171	0.182	0.816	0.732	0.160	0.184
RF	1.000	1.000	0.027	0.034	1.000	1.000	0.012	0.016	0.832	0.775	0.134	0.149	0.855	0.822	0.126	0.138
GBM	0.858	0.832	0.117	0.130	0.961	0.936	0.048	0.063	0.823	0.754	0.137	0.161	0.847	0.822	0.139	0.146
XGB	0.842	0.818	0.125	0.135	0.886	0.858	0.085	0.098	0.822	0.782	0.135	0.146	0.863	0.825	0.119	0.132
SVM	0.848	0.820	0.128	0.142	0.926	0.886	0.065	0.087	0.827	0.794	0.133	0.139	0.844	0.813	0.133	0.143
ANN	0.817	0.798	0.133	0.141	0.882	0.853	0.084	0.095	0.841	0.755	0.133	0.159	0.868	0.819	0.118	0.139
Khorana	0.640	0.585	0.171	0.175	0.692	0.617	0.120	0.124	0.693	0.611	0.172	0.180	0.693	0.611	0.178	0.187

†LDA, linear discriminant analysis; LR, logistic regression; CART, classification and regression tree; RF, random forest; GBM, gradient boosting machine tree; XGB, extreme gradient boosting tree; SVM, support vector machine; ANN, artificial neural network; Khorana, Khorana Score.

Model Calibration

The model calibration results for the three datasets are also shown in Table 3 and Supplementary Figure 3. In the training sets, the Brier Scores of all imputed models were lower than those of the before imputed models. In the external validation set, they were lower for most models, the exceptions being GBM with D-Dimer (0.137 vs 0.139), SVM with D-Dimer (0.133 vs 0.133), and CART without D-Dimer (0.182 vs 0.184). For the same algorithm, the Brier Score of the model combined with D-Dimer was lower than that of the model without D-Dimer. Among the different models, the RF model had the lowest Brier Score in both the Before imputed and Imputed training sets, whereas its Brier Score in the external validation set was intermediate.

Model Clinical Utility

Figure 2 presents the DCA curves of the before imputed models and the imputed models in the external validation set. In the external validation set, the DCA curves of the eight imputed models were all closer to the upper right of the coordinate axes than those of the before imputed models. The clinical net benefit of the eight imputed models was higher than that of the eight before imputed models within the same probability threshold range.
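Each point on a DCA curve is a net benefit value at one probability threshold. The sketch below uses the standard decision-curve definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), together with the "treat all" reference strategy. The function names and the six example patients are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = CA-VTE) and predicted risks for six patients.
y = [0, 0, 1, 1, 0, 1]
risk = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
nb_model = net_benefit(y, risk, threshold=0.30)
nb_all = net_benefit_treat_all(y, threshold=0.30)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

Evaluating `net_benefit` over a range of thresholds and plotting the results gives the decision curve; a model is clinically useful at a threshold when its curve lies above both the "treat all" curve and the "treat none" line at zero.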

Figure 2.

DCA curves for eight models and Khorana Score in validation set. ‡Figure A, B, C, and D show DCA curves in validation set; “_DD” indicates models combined with D-dimer; “_imp” indicates the imputed models; †LDA, linear discriminant analysis; LR, logistic regression; CART, classification and regression tree; RF, random forest; GBM, gradient boosting machine tree; XGB, extreme gradient boosting tree; SVM, support vector machine; ANN, artificial neural network; Khorana, Khorana Score.

Model Improvement

Taking into account model discrimination, calibration, and clinical utility, the RF model was determined to be the optimal model. Table 4 presents the comparison of two model improvement indicators between the before imputed and imputed RF models: Category-based NRI and IDI. There was no significant difference in Category-based NRI between the two models at either cut-off (Category-based NRI = −0.009, P = .765; Category-based NRI = 0.027, P = .434). There was a significant difference in IDI between the two models (IDI = 0.083, P < .001).

Table 4.

Category-Based NRI and IDI of RF Models in Validation set.

Before Imputed Models	Imputed Models	Cut-Off	Category-Based NRI/IDI	95%CI	P
RF_DD	RF_DD_imp	0.387	−0.009	(−0.066∼0.048)	.765
RF_DD	RF_DD_imp	0.450	0.027	(−0.040∼0.093)	.434
RF_DD	RF_DD_imp	NA	0.083	(0.060∼0.105)	<.001

‡RF, random forest; “_DD” indicates models combined with D-dimer; “_imp” indicates the imputed models; NRI, net reclassification index; IDI, integrated discrimination improvement.
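The two improvement indicators in Table 4 can be computed from paired predictions of an old and a new model. The sketch below uses the standard two-category definitions: NRI is the net proportion of events moved up across the cut-off minus the net proportion of non-events moved up, and IDI is the change in mean predicted risk among events minus the change among non-events. It does not compute confidence intervals or P values. The function names and example data are hypothetical; only the cut-off value 0.387 is taken from Table 4.

```python
from typing import Sequence


def category_nri(y: Sequence[int], p_old: Sequence[float],
                 p_new: Sequence[float], cutoff: float) -> float:
    """Two-category NRI: net upward reclassification in events minus non-events."""
    def net_up(label: int) -> float:
        pairs = [(po >= cutoff, pn >= cutoff)
                 for yi, po, pn in zip(y, p_old, p_new) if yi == label]
        up = sum(1 for old_high, new_high in pairs if new_high and not old_high)
        down = sum(1 for old_high, new_high in pairs if old_high and not new_high)
        return (up - down) / len(pairs)
    return net_up(1) - net_up(0)


def idi(y: Sequence[int], p_old: Sequence[float], p_new: Sequence[float]) -> float:
    """IDI: mean risk change among events minus mean risk change among non-events."""
    def mean_change(label: int) -> float:
        changes = [pn - po for yi, po, pn in zip(y, p_old, p_new) if yi == label]
        return sum(changes) / len(changes)
    return mean_change(1) - mean_change(0)


# Hypothetical outcomes and predicted risks from a before imputed and an imputed model.
y = [1, 1, 1, 0, 0, 0]
p_before = [0.30, 0.50, 0.60, 0.40, 0.20, 0.10]
p_imputed = [0.45, 0.55, 0.70, 0.30, 0.20, 0.05]
nri_value = category_nri(y, p_before, p_imputed, cutoff=0.387)
idi_value = idi(y, p_before, p_imputed)
print(f"NRI = {nri_value:.3f}, IDI = {idi_value:.3f}")
```

Because NRI only counts patients who cross the cut-off while IDI uses the full predicted risks, the two indicators can disagree, as they do in Table 4.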

Model Report and Interpretation

Supplementary Figure 4 and Figure 3 present the variable importance measure (VIM) rankings of the eight models and the Khorana score in the training set before and after imputation, respectively. The VIM rankings of the eight models before and after imputation are similar. The top two predictors are D-Dimer and age, while the top five predictors are mostly D-Dimer, age, previous VTE history, bed rest, and CCI.

Figure 3.

Variable importance (VIM) of eight models and Khorana Score in imputed training set. ‡Figure A, B, and C show VIM in imputed training set; “_DD” indicates models combined with D-dimer; “_imp” indicates the imputed models; †LDA, linear discriminant analysis; LR, logistic regression; CART, classification and regression tree; RF, random forest; GBM, gradient boosting machine tree; XGB, extreme gradient boosting tree; SVM, support vector machine; ANN, artificial neural network; Khorana, Khorana Score; §DDimer, D-Dimer; VTEHistory, previous VTE history; NSAID, nonsteroidal anti-inflammatory drugs; TumorStage, tumor stage; CCI, Charlson Comorbidity Index; Hb, hemoglobin; WBC, white blood cell count; APTT, activated partial thromboplastin time; tumor_of_site, site of tumor; PICC, peripherally inserted central catheter; ICUCCU, intensive care unit or cardiology intensive care; K_DDimer, D-Dimer≥243 µg/L; K_Hb_EPO, Hb < 100 g/L or using erythropoietin drugs; K_WBC, WBC>11 × 10⁹/L; K_PLT, PLT≥350 × 10⁹/L; K_BMI, BMI≥24 kg/m².

Taking the RF_DD and RF_DD_imp models as examples, three patients were randomly selected from the external validation set, and the break-down plot, a local interpretation method, was used to interpret their predictions. The three patient IDs were 103 (CA-VTE = no), 194 (CA-VTE = yes), and 298 (CA-VTE = no). The interpretation results are shown in Figure 4.

Figure 4.

Break-down plot of RF models in validation set. ‡RF, random forest; “_DD” indicates models combined with D-dimer; “_imp” indicates the imputed models; †Three patients were randomly selected from the validation set: ID = 103 (CA-VTE = No), 194 (CA-VTE = Yes), and 298 (CA-VTE = No); §Intercept, RF model intercept, baseline risk of CA-VTE for this patient; prediction, the probability of CA-VTE for this patient predicted by RF model; ξDDimer, D-Dimer; Bed, bed rest≥3 days; VTEHistory, previous VTE history; NSAID, nonsteroidal anti-inflammatory drugs; TumorStage, tumor stage; CCI, Charlson Comorbidity Index; Hb, hemoglobin; WBC, white blood cell count; APTT, activated partial thromboplastin time; tumor_of_site, site of tumor; PICC, peripherally inserted central catheter; ICUCCU, intensive care unit or cardiology intensive care; Target_Immuno, targeted therapy or immunotherapy.

Discussion

This study is based on a retrospective cohort of 2100 cases (1036 labeled and 1064 unlabeled) and a prospective cohort of 321 cases. The SSL (self-training) algorithm was used to perform “labeled outcome imputation” of CA-VTE in the unlabeled cohort. CA-VTE risk prediction models based on eight machine learning algorithms (LDA, LR, CART, RF, GBM, XGB, SVM, and ANN) were trained and validated, and the models before and after imputation were compared in terms of discrimination, calibration, clinical utility, and model improvement indicators. The results were also compared with the Khorana Score, and the models with and without D-Dimer were compared as well.

More Predictive Factors Were Selected from the Imputation Training set

Two methods were used to screen predictive factors: univariate analysis (P ≤ .100) and LASSO regression (retaining variables with non-zero coefficients). More predictive factors were screened from the imputation training set than from the before imputation training set. In the univariate analysis, 15 variables with P ≤ .100 were selected from the before imputation training set, while 22 variables with P ≤ .100 were selected from the imputation training set. Using LASSO regression analysis, 17 and 16 variables were selected from the training sets before and after imputation, respectively (Table 2). When both criteria were applied together, 13 and 14 variables were selected, respectively. Compared with before imputation, two variables (site of tumor and radiotherapy) were added after imputation, and one variable (targeted or immunotherapy) was removed.

The “site of tumor” variable added after imputation is important. The classification of “site of tumor” is mainly based on the Khorana Score⁷ and other tumor-specific scores.^8,13 When considering cancer-related variables, non-specific scores mostly include only one variable, “patients with advanced cancer”.^44–46 However, the CA-VTE risk varies among patients with different tumor types. The “site of tumor” variable did not meet the criterion (P ≤ .100) in the univariate analysis of the before imputation training set but met it in the imputation training set, which may be related to the expanded sample size. Some studies suggest that the larger the sample size, the more easily statistical tests reach significance.⁴⁷ Therefore, it is necessary to expand the sample size as much as possible when conducting research on predictive models. When selecting predictive factors, the screening criteria can be appropriately relaxed (such as P ≤ .100, P ≤ .150, etc) to avoid missing important predictive factors. At the same time, even if some important predictive factors are not statistically significant, they should still be included in the model when supported by literature evidence and professional understanding.

The Model Performance After Imputation Was Superior to That Before Imputation

This study compared the performance of the models before and after imputation across four dimensions: model discrimination (Table 3 and Supplementary Figure 2), calibration (Table 3 and Supplementary Figure 3), clinical utility (Figure 2), and model improvement metrics (Table 4). The overall performance of the models after imputation was superior to that before imputation. This study also compared the performance of the eight CA-VTE models with and without D-Dimer, as well as the performance of these eight CA-VTE models against the Khorana score. For the same model, performance was better with D-Dimer than without it. All eight CA-VTE models outperformed the Khorana score. Among the eight CA-VTE models, the RF model demonstrated the best overall performance. Wang et al,⁴⁸ who compared the effectiveness of nine machine learning algorithms in predicting VTE risk in general patients (27.13% were cancer patients), also found that the RF model performed better than the others. Similarly, Lei et al,²⁴ who compared five models (RF, Adaboost, Xgboost, logistic regression, and KNN) for predicting VTE risk in lung cancer patients, also found the RF model to be optimal. Therefore, this study recommends the RF model combined with D-Dimer as the optimal model.

Based on the comparison of model metrics across multiple dimensions, this study found that the models after imputation demonstrated better performance upon external validation. This finding is consistent with conclusions from other related studies. Hou et al,⁴⁹ using an SSL algorithm developed by their team with 271 labeled cases and 19,945 unlabeled cases, improved the prediction accuracy for the risk of type 2 diabetes patients. Within the same dataset, their SSL model achieved an AUC of 0.763, whereas the LASSO model only achieved an AUC of 0.488. Chi et al³⁴ developed a semi-supervised logistic regression (SSLR) algorithm, which showed superior overall performance compared to LR, NB, RF, SVM, and ANN models in predicting mortality risk in colorectal cancer patients. To date, few studies have applied SSL algorithms to train CA-VTE risk prediction models based on data after “label outcome imputation”. By applying the SSL algorithm, this study made use of the large amount of clinical “unlabeled” data, expanded the sample size of the training set, and uncovered more predictors, which benefits the external validation and generalization of the model.

Interpretability Analysis Can Be Performed for Black-Box Machine Learning Models

Unlike traditional models (eg, LR, COX) that use regression coefficients, ORs (or HRs), score tools, and nomograms for interpretation, the optimal model in this study is RF. The Random Forest (RF) is an ensemble model composed of multiple CART decision trees. Ensemble models belong to the category of black-box models and are not easily interpretable. Therefore, this study employed two interpretation methods for black-box models – Variable Importance Measure (VIM) and break down plot – to provide explanations at the dataset level and the instance level, respectively.

This study compared the VIM ranking results of the eight models and the Khorana score on the training sets before and after imputation (Supplementary Figure 4). The ranking results were similar across the eight models, with slight differences for the Khorana score. It was observed that predictors such as D-Dimer, age, prior VTE history, bed rest, and CCI were consistently ranked relatively high across all eight models, generally within the top five positions. This indicates the strong association of these five predictors with CA-VTE; even when different algorithms are used for modeling, it does not affect their importance ranking. Yuan et al⁵⁰ developed a nomogram based on LR for predicting gastric cancer-associated VTE, which included predictors such as advanced malignant tumor (OR = 3.870, P < .001), CVC (OR = 2.239, P < .001), D-Dimer (OR = 2.096, P < .001), ECOG score (OR = 1.406, P = .001), and adjuvant chemotherapy (OR = −0.366, P = .550). In the study by Lei et al²⁴ predicting lung cancer-associated VTE, according to the RF model's ranking, prior VTE history was the most important variable among many, followed by tumor stage, with age being less important. The VTE risk prediction model for Chinese breast cancer patients developed by Li et al⁵¹ also included predictors such as age, D-Dimer, and number of cardiovascular comorbidities.

At the instance level, this study used the break down plot, taking the RF_DD model with the best overall performance as an example, to provide detailed, personalized explanations for the predictions of three randomly selected patients from the external validation set (ID = 103, CA-VTE = no; ID = 194, CA-VTE = yes; ID = 298, CA-VTE = no) (Figure 4). For patient 194, who developed CA-VTE, both the pre- and post-imputation RF_DD models predicted high risk (RF_DD: cut-off = 38.7%, predicted risk 44.8%; Imputed RF_DD: cut-off = 45.0%, predicted risk 47.0%). The break down plot shows the magnitude and direction (ie, increasing or decreasing risk) of each predictor's contribution to the predicted risk for patient 194. Furthermore, under the break down plot method, the same value of a predictor may contribute differently for different patients. This reflects the complexity of CA-VTE pathogenesis, in which multiple risk factors act in combination and the relationship between risk factors and CA-VTE risk is not simply linear. Such explanations aid personalized risk assessment and can guide CA-VTE prevention and management. Black-box model interpretation techniques such as the break down plot, LIME, and Shapley are increasingly applied in disease risk prediction and are attracting significant attention.^40,52 Lundberg et al⁵³ used various machine learning algorithms for modeling and applied the Shapley method to dynamically assess and interpret the predicted risk of intraoperative hypoxemia. Karhade et al⁵⁴ used multiple machine learning algorithms to predict the 90-day and 1-year mortality risk in patients with spinal metastases and employed the LIME method to interpret the prediction results.
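The instance-level decomposition described above can be sketched as follows. This is a simplified illustration of a sequential break-down attribution, not the study's code: predictors are fixed one at a time to the patient's values across a background dataset, and each change in the mean prediction is recorded as that predictor's contribution. The toy risk model, the three binary predictors (D-Dimer high, prior VTE, bed rest), the background data, and the predictor order are all hypothetical.

```python
from itertools import product


def break_down(predict, background, instance, order):
    """Sequential break-down attribution for one instance.

    Returns the mean prediction on the background data (intercept) and a
    dict mapping each predictor index to its contribution.
    """
    rows = [list(r) for r in background]
    mean_pred = sum(predict(r) for r in rows) / len(rows)
    intercept = mean_pred
    contributions = {}
    for j in order:
        # Fix predictor j to the instance's value in every background row.
        for r in rows:
            r[j] = instance[j]
        new_mean = sum(predict(r) for r in rows) / len(rows)
        contributions[j] = new_mean - mean_pred
        mean_pred = new_mean
    return intercept, contributions


def toy_risk(row):
    """Hypothetical non-additive model: D-Dimer matters more with prior VTE."""
    d_dimer_high, prior_vte, bed_rest = row
    return 0.1 + 0.3 * d_dimer_high + 0.2 * d_dimer_high * prior_vte + 0.1 * bed_rest


# Hypothetical background data: every combination of the three binary predictors.
background = [list(combo) for combo in product([0, 1], repeat=3)]

patient_a = [1, 1, 0]  # D-Dimer high, prior VTE, no bed rest
patient_b = [1, 0, 0]  # D-Dimer high, no prior VTE, no bed rest
order = [1, 0, 2]      # prior VTE first, then D-Dimer, then bed rest

intercept_a, contrib_a = break_down(toy_risk, background, patient_a, order)
intercept_b, contrib_b = break_down(toy_risk, background, patient_b, order)
```

Both hypothetical patients have a high D-Dimer value, yet its contribution is larger for patient A because of the interaction with prior VTE, which mirrors the observation that the same predictor value can contribute differently for different patients. In each case the intercept plus the contributions equals the model's prediction for that patient. The contributions depend on the chosen predictor order when interactions are present.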

Strengths and Limitations

This study has the following strengths: i) An SSL algorithm was used for “outcome label imputation” in an “unlabeled cohort,” thereby expanding the training set sample size, uncovering more useful predictors, improving the external validation performance of the CA-VTE prediction model, and enhancing its generalizability. ii) The performance of eight ML CA-VTE prediction models was comprehensively compared with that of the Khorana score across multiple dimensions (discrimination, calibration, clinical utility, and model improvement metrics). This comparative analysis identified the optimal models in this study (RF_DD and Imputed RF_DD). iii) Interpretations of the optimal black-box model were provided at both the dataset level and the instance level. This involved comparing the importance of different predictors and offering personalized explanations using patient examples, which helps clinical staff understand and apply the model, thereby facilitating CA-VTE risk assessment and preventive management.
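The idea of “outcome label imputation” can be illustrated with a generic pseudo-labeling sketch. This section does not specify the SSL algorithm used in the study, so the approach below (a nearest-centroid classifier that imputes labels only for confidently classified unlabeled patients) is an assumption chosen for brevity, and all function names, thresholds, and data values are hypothetical.

```python
from math import dist


def fit_centroids(X, y):
    """Mean feature vector per outcome class (0 = no CA-VTE, 1 = CA-VTE)."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_proba(centroids, x):
    """Pseudo-probability of class 1: closer to centroid 1 gives a higher value."""
    d0, d1 = dist(x, centroids[0]), dist(x, centroids[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def impute_labels(X_lab, y_lab, X_unlab, confidence=0.8):
    """Add confidently pseudo-labeled patients to the labeled training set."""
    centroids = fit_centroids(X_lab, y_lab)
    X_aug, y_aug = list(X_lab), list(y_lab)
    for x in X_unlab:
        p = predict_proba(centroids, x)
        if p >= confidence:
            X_aug.append(x)
            y_aug.append(1)
        elif p <= 1 - confidence:
            X_aug.append(x)
            y_aug.append(0)
        # Ambiguous patients stay unlabeled and are left out.
    return X_aug, y_aug


# Hypothetical single-predictor data (eg, a scaled laboratory value).
X_lab = [[0.2], [0.4], [2.0], [2.4]]
y_lab = [0, 0, 1, 1]
X_unlab = [[0.1], [2.5], [1.25]]

X_aug, y_aug = impute_labels(X_lab, y_lab, X_unlab)
```

In this sketch, two of the three unlabeled patients receive an imputed outcome and enlarge the training set, while the ambiguous one is excluded. A model retrained on the enlarged set would then be evaluated on an independent external validation set, as was done in this study.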

This study also has the following limitations: i) Study design: This is a single-center study with retrospective and prospective cohorts. Model training was primarily based on the retrospective cohort. The generalizability of the models still requires further validation through future multi-center, prospective studies. Nevertheless, the application of the SSL algorithm in this study improved the models' external validation performance to some extent. ii) Predictors: This study included readily available multi-dimensional clinical predictors (tumor-related, individual, treatment-related, and laboratory indicators) but could not incorporate novel tumor-specific molecular markers that are currently of widespread interest, such as soluble P-selectin. iii) Sample size: The sample size of this study was somewhat insufficient. The pre-imputation training set met the 5 Events Per Variable (EPV) rule but not the 10 EPV rule, while the post-imputation training set met the 10 EPV rule. In the external validation set, the total sample size and the number of negative cases largely met the requirements, but the number of positive cases was slightly below the required sample size. However, using the SSL algorithm to perform “outcome label imputation” on the “unlabeled cohort” expanded the training set sample size to a certain degree.
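The EPV rules mentioned above reduce to simple arithmetic: the number of outcome events divided by the number of candidate predictors. The sketch below shows the check. The event and predictor counts are hypothetical placeholders chosen only to reproduce the pattern described (5 EPV met before imputation, 10 EPV met after) and are not the study's actual figures.

```python
def events_per_variable(n_events, n_candidate_predictors):
    """EPV = number of outcome events / number of candidate predictors."""
    if n_candidate_predictors <= 0:
        raise ValueError("n_candidate_predictors must be positive")
    return n_events / n_candidate_predictors


def epv_rules_met(n_events, n_candidate_predictors):
    """Report the EPV and whether the 5 EPV and 10 EPV rules are satisfied."""
    epv = events_per_variable(n_events, n_candidate_predictors)
    return {"epv": epv, "meets_5_epv": epv >= 5, "meets_10_epv": epv >= 10}


# Hypothetical counts, not taken from the study.
pre_imputation = epv_rules_met(n_events=120, n_candidate_predictors=15)
post_imputation = epv_rules_met(n_events=240, n_candidate_predictors=15)
```

With these placeholder counts, the pre-imputation set has an EPV of 8 (5 EPV rule met, 10 EPV rule not met) and the post-imputation set an EPV of 16 (both rules met).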

Conclusion

In this study, we employed the SSL algorithm to perform CA-VTE “outcome label imputation” on an unlabeled cohort. This approach expanded the sample size, identified additional predictors, and improved the external validation performance of the eight ML CA-VTE prediction models, all of which outperformed the Khorana score. These findings support the clinical application of high-accuracy ML prediction models for CA-VTE and can assist healthcare professionals in implementing CA-VTE risk assessment and management. Future research should focus on external validation of CA-VTE models in multi-center, prospective, large-sample cohorts.

Supplemental Material

Supplemental material, sj-docx-1-cat-10.1177_10760296261416914 for Semi-Supervised Learning to Improve Generalizability of Cancer Associated-Venous Thromboembolism Risk Prediction Models by Shuai Jin, Chong Wang, Dan Qin, Baosheng Liang, Lichuan Zhang, Weiyin Gao, Xiao Wang, Bo Jiang, Benqiang Rao, Hanping Shi, Lihui Liu and Qian Lu in Clinical and Applied Thrombosis/Hemostasis

Footnotes

Acknowledgments

This work was partially supported by Department of Gastrointestinal Surgery in Beijing Shijitan Hospital, Capital Medical University. It was also conducted and supervised by the Nursing School of Peking University, Nursing School of Capital Medical University, and Public Health School of Peking University. We are also grateful to Lichuan Zhang, Xiaoxia Wei, Weiyin Gao, Xiao Wang, and others for their assistance in data collection.

ORCID iD

Shuai Jin

Ethics Statement

This study was approved by the Institutional Review Board of Peking University (IRB00001052-18037). Informed consent was obtained from all participants in the prospective cohort.

Author Contributions Statement

Shuai Jin: Data curation, Formal analysis, Visualization, Writing-original draft, Writing-review & editing, and Funding acquisition; Chong Wang: Data curation, Writing-original draft, and Writing-review & editing; Dan Qin: Data curation, Writing-original draft, and Writing-review & editing; Baosheng Liang: Conceptualization, Methodology, and Writing-Review & Editing; Lichuan Zhang: Data curation, Writing-original draft, and Writing-review & editing; Weiyin Gao: Data curation and Writing-review & editing; Xiao Wang: Data curation and Writing-review & editing; Bo Jiang: Conceptualization, Methodology, Resources, and Writing-review & editing; Benqiang Rao: Conceptualization, Resources, and Writing-review & editing; Hanping Shi: Conceptualization, Resources, Writing-review & editing, and Supervision; Lihui Liu: Resources, Writing-review & editing, and Supervision; Qian Lu: Conceptualization, Funding acquisition, Writing-original draft, Writing-review & editing, Supervision, and Project administration.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Beijing Municipality, the Capital Medical University Research and Cultivation Foundation, the National Key Research and Development Project of China, and the Capital Medical University Basic Clinical Collaborative Research Project on Digital Intelligence Nursing (grant numbers 7244285, PYZ23028, 2017YFC1309204, SZHL23Q01, and SZHL23Q08).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Declaration of Generative AI and AI-Assisted Technologies in the Writing Process

No AI tools/services were used during the preparation of this work.

Data Availability Statement

The data and code that support the findings of this study are available from the corresponding author upon reasonable request.

Supplemental Material

Supplemental material for this article is available online.

References

1. Khorana, Mackman, Falanga, et al. Cancer-associated venous thromboembolism. Nat Rev Dis Primers. 2022;8(1):11. doi:10.1038/s41572-022-00336-y

2. Drăgan. Novel insights in venous thromboembolism risk assessment methods in ambulatory cancer patients: From the guidelines to clinical practice. Cancers (Basel). 2024;16(2):458-478. doi:10.3390/cancers16020458

3. Streiff, Holmstrom, Angelini, et al. Cancer-Associated venous thromboembolic disease, version 2.2024, NCCN clinical practice guidelines in oncology. J Natl Compr Cancer Netw JNCCN. 2024;22(7):483-506. doi:10.6004/jnccn.2024.0046

4. Mulder, Horvàth-Puhó, van Es, et al. Venous thromboembolism in cancer patients: A population-based cohort study. Blood. 2021;137(14):1959-1969. doi:10.1182/blood.2020007338

5. Xiong, Chatani, Yamashita. Cancer-associated venous thromboembolism: Changes over the past 20 years. JACC CardioOncol. 2023;5(6):773-774. doi:10.1016/j.jaccao.2023.10.007

6. Dave, Khorana. Management of venous thromboembolism in patients with active cancer. Clevel Clin J Med. 2024;91(2):109-117. doi:10.3949/ccjm.91a.23017

7. Khorana, Kuderer, Culakova, et al. Development and validation of a predictive model for chemotherapy-associated thrombosis. Blood. 2008;111(10):4902-4907. doi:10.1182/blood-2007-10-116327

8. Dunkler, Marosi, et al. Prediction of venous thromboembolism in cancer patients. Blood. 2010;116(24):5377-5382. doi:10.1182/blood-2010-02-270116

9. Verso, Agnelli, Barni, et al. A modified Khorana risk assessment score for venous thromboembolism in cancer patients receiving chemotherapy: The Protecht score. Intern Emerg Med. 2012;7(3):291-292. doi:10.1007/s11739-012-0784-y

10. Pelzer, Sinn, Stieler, et al. Primary pharmacological prevention of thromboembolic events in ambulatory patients with advanced pancreatic cancer treated with chemotherapy? Dtsch Med Wochenschr. 2013;138(41):2084-2088. doi:10.1055/s-0033-1349608

11. Cella, Di Minno, Carlomagno, et al. Preventing venous thromboembolism in ambulatory cancer patients: The ONKOTEV study. Oncologist. 2017;22(5):601-608. doi:10.1634/theoncologist.2016-0246

12. Gerotziafas, Taher, Abdel-Razeq, et al. A predictive score for thrombosis associated with breast, colorectal, lung, or ovarian cancer: The prospective COMPASS-cancer-associated thrombosis study. Oncologist. 2017;22(10):1222-1231. doi:10.1634/theoncologist.2016-0414

13. Pabinger, van Es, Heinze, et al. A clinical prediction model for cancer-associated venous thromboembolism: A development and validation study in two independent prospective cohorts. Lancet Haematol. 2018;5(7):e289-e298. doi:10.1016/s2352-3026(18)30063-2

14. Munoz Martin, Ortega, Font, et al. Multivariable clinical-genetic risk model for predicting venous thromboembolic events in patients with cancer. Br J Cancer. 2018;118(8):1056-1061. doi:10.1038/s41416-018-0027-8

15. Rojas-Hernandez, Tang, Sanchez-Petitto, et al. Development of a clinical prediction tool for cancer-associated venous thromboembolism (CAT): The MD Anderson cancer center CAT model. Support Care Cancer. 2020;28(8):3755-3761. doi:10.1007/s00520-019-05150-z

16. Tsubata, Hotta, Hamai, et al. A new risk-assessment tool for venous thromboembolism in advanced lung cancer: A prospective, observational study. J Hematol Oncol. 2022;15(1):40. doi:10.1186/s13045-022-01259-7

17. Tsubata, Kawakado, Hamai, et al. Identification of risk factors for venous thromboembolism and validation of the Khorana score in patients with advanced lung cancer: Based on the multicenter, prospective rising-VTE/NEJ037 study data. Int J Clin Oncol. 2023;28(1):69-78. doi:10.1007/s10147-022-02257-y

18. Huang, Chen, Meng, et al. External validation of the Khorana score for the prediction of venous thromboembolism in cancer patients: A systematic review and meta-analysis. Int J Nurs Stud. 2024;159:104867. doi:10.1016/j.ijnurstu.2024.104867

19. Yan, Samarawickrema, Naunton, et al. Models for predicting venous thromboembolism in ambulatory patients with lung cancer: A systematic review and meta-analysis. Thromb Res. 2024;234:120-133. doi:10.1016/j.thromres.2024.01.003

20. Khorana, Francis. Risk prediction of cancer-associated thrombosis: Appraising the first decade and developing the future. Thromb Res. 2018;164(Suppl 1):S70-S76. doi:10.1016/j.thromres.2018.01.036

21. Patell, Zwicker, Singh, et al. Machine learning in cancer-associated thrombosis: Hype or hope in untangling the clot. Bleed Thromb Vasc Biol. 2024;3(Suppl 1):21-29. doi:10.4081/btvb.2024.123

22. Kawaler, Cobian, Peissig, et al. Learning to predict post-hospitalization VTE risk from EHR data. AMIA Annu Symp Proc. 2012;2012:436-445. doi: 23304314

23. Ferroni, Zanzotto, Scarpato, et al. Risk assessment for venous thromboembolism in chemotherapy-treated ambulatory cancer patients. Med Decis Making. 2017;37(2):234-242. doi:10.1177/0272989X16662654

24. Lei, Zhang, et al. Development and validation of a risk prediction model for venous thromboembolism in lung cancer patients using machine learning. Front Cardiovasc Med. 2022;9:845210. doi:10.3389/fcvm.2022.845210

25. Danilatou, Nikolakakis, Antonakaki, et al. Outcome prediction in critically-ill patients with venous thromboembolism and/or cancer using machine learning algorithms: External validation and comparison with scoring systems. Int J Mol Sci. 2022;23(13):7132-7156. doi:10.3390/ijms23137132

26. Jin, Qin, Liang B-S, et al. Machine learning predicts cancer-associated deep vein thrombosis using clinically available variables. Int J Med Inf. 2022;161:104733. doi:10.1016/j.ijmedinf.2022.104733

27. Mantha, Chatterjee, Singh, et al. Application of machine learning to the prediction of cancer-associated venous thromboembolism. Res Sq. 2023;8(3):1-31. doi:10.21203/rs.3.rs-2870367/v1

28. Meng, Wei, Fan, et al. Development and validation of a machine learning model to predict venous thromboembolism among hospitalized cancer patients. Asia-Pac J Oncol Nurs. 2022;9(12):100128. doi:10.1016/j.apjon.2022.100128

29. Riley, Ensor, Snell KIE, et al. Calculating the sample size required for developing a clinical prediction model. BMJ Br Med J. 2020;368:m441. doi:10.1136/bmj.m441

30. Riley, Snell KIE, Archer, et al. Evaluation of clinical prediction models (part 3): Calculating the sample size required for an external validation study. Br Med J. 2024;384:e074821. doi:10.1136/bmj-2023-074821

31. van Engelen, Hoos. A survey on semi-supervised learning. Mach Learn. 2020;109(2):373-440. doi:10.1007/s10994-019-05855-6

32. Zheng, Wang, Zhou X-Y, et al. Semi-supervised learning for bone mineral density estimation in Hip X-ray images. In: de Bruijne, Cattin, Cotin, Padoy, Speidel, Zheng, Essert, eds. Medical image computing and computer assisted intervention – MICCAI 2021. Springer International Publishing; 2021:33-42.

33. Sun, et al. Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images. Nat Commun. 2021;12(1):6311. doi:10.1038/s41467-021-26643-8

34. Chi, Tian, et al. Semi-supervised learning to improve generalizability of risk prediction models. J Biomed Inform. 2019;92:103117. doi:10.1016/j.jbi.2019.103117

35. Hou, Guo, Cai. Surrogate assisted semi-supervised inference for high dimensional risk prediction. J Mach Learn Res JMLR. 2023;24:1-58.

36. Jiang, Gai, Treggiari, et al. Soft phenotyping for sepsis via EHR time-aware soft clustering. J Biomed Inform. 2024;152:104615. doi:10.1016/j.jbi.2024.104615

37. Zhou, Liu, et al. Semi-supervised learning for multi-label cardiovascular diseases prediction: A multi-dataset study. IEEE Trans Pattern Anal Mach Intell. 2024;46(5):3305-3320. doi:10.1109/tpami.2023.3342828

38. Riley, Debray TPA, Collins, et al. Minimum sample size for external validation of a clinical prediction model with a binary outcome. Stat Med. 2021;40(19):4230-4251. doi:10.1002/sim.9025

39. Jin, Qin, Wang, et al. Development, validation, and clinical utility of risk prediction models for cancer-associated venous thromboembolism: A retrospective and prospective cohort study. Asia-Pac J Oncol Nurs. 2025;12:100691. doi:10.1016/j.apjon.2025.100691

40. Christoph. Interpretable machine learning: a guide for making black box models explainable. https://christophm.github.io/interpretable-ml-book/index.html, 2020.

41. Biecek, Burzykowski. Explanatory model analysis: explore, explain and examine predictive models. Chapman and Hall/CRC; 2021.

42. Collins, Moons KGM, Dhiman, et al. TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. Br Med J. 2024;385:e078378. doi:10.1136/bmj-2023-078378

43. Efthimiou, Seo, Chalkou, et al. Developing clinical prediction models: A step-by-step guide. Br Med J. 2024;386:e078276. doi:10.1136/bmj-2023-078276

44. Wells, Anderson, Rodger, et al. Evaluation of D-dimer in the diagnosis of suspected deep-vein thrombosis. N Engl J Med. 2003;349(13):1227-1235. doi:10.1056/NEJMoa023153

45. Autar. The management of deep vein thrombosis: The Autar DVT risk assessment scale re-visited. J Orthop Nurs. 2003;7(3):114-124. doi:10.1016/S1361-3111(03)00051-7

46. Cronin, Dengler, Krauss, et al. Completion of the updated Caprini risk assessment model (2013 version). Clin Appl Thromb Hemost. 2019;25:1076029619838052. doi:10.1177/1076029619838052

47. Mascha, Vetter. Significance, errors, power, and sample size: The blocking and tackling of statistics. Anesth Analg. 2018;126(2):691-698. doi:10.1213/ane.0000000000002741

48. Wang, Yang, Liu, et al. Comparing different venous thromboembolism risk assessment machine learning models in Chinese patients. J Eval Clin Pract. 2020;26(1):26-34. doi:10.1111/jep.13324

49. Hou, Guo, Cai. Surrogate assisted semi-supervised inference for high dimensional risk prediction. J Mach Learn Res. 2021;24(265):1-58.

50. Yuan, Zhang, et al. A nomogram for predicting risk of thromboembolism in gastric cancer patients receiving chemotherapy. Front Oncol. 2021;11:598116. doi:10.3389/fonc.2021.598116

51. Qiang, Wang, et al. Development and validation of a risk assessment nomogram for venous thromboembolism associated with hospitalized postoperative Chinese breast cancer patients. J Adv Nurs. 2021;77(1):473-483. doi:10.1111/jan.14571

52. Watson, Krutzinna, Bruce, et al. Clinical applications of machine learning algorithms: Beyond the black box. BMJ Br Med J. 2019;364:1-4. doi:10.1136/bmj.l886

53. Lundberg, Nair, Vavilala, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. 2018;2(10):749-760. doi:10.1038/s41551-018-0304-0

54. Karhade, Thio, Ogink, et al. Predicting 90-day and 1-year mortality in spinal metastatic disease: Development and internal validation. Neurosurgery. 2019;85(4):e671-e681. doi:10.1093/neuros/nyz070
