
Abstract

Study Design

Retrospective Cohort Study.

Objectives

This study aimed to develop survival prediction models for spinal Ewing’s sarcoma (EWS) based on machine learning (ML).

Methods

We extracted clinical data on EWS cases diagnosed between 1975 and 2016 from the SEER registry. Clinical features were selected with three feature selection methods. Four ML algorithms (Cox, random survival forest (RSF), CoxBoost, DeepCox) were trained to predict the overall survival (OS) and cancer-specific survival (CSS) of spinal EWS. The concordance index (C-index), integrated Brier score (IBS) and mean area under the curve (AUC) were used to assess the prediction performance of the different ML models. The initial ML models with the best performance on each evaluation index (C-index, IBS and mean AUC) were finally stacked into ensemble models, which were compared with the traditional TNM stage model by 3-/5-/10-year Receiver Operating Characteristic (ROC) curves and Decision Curve Analysis (DCA).

Results

A total of 741 patients with spinal EWS were identified. The C-index, IBS and mean AUC for the final ensemble ML model during independent testing were .693/.158/.829 in predicting OS and .719/.171/.819 in predicting CSS. The ensemble ML model also achieved AUCs of .705/.747/.851 for predicting 3-/5-/10-year OS during independent testing and .734/.779/.830 for predicting 3-/5-/10-year CSS, both of which outperformed the traditional TNM stage model. DCA curves also showed the advantages of the ensemble models over the traditional TNM stage model.

Conclusion

ML was an effective and promising technique in predicting survival of spinal EWS, and the ensemble models were superior to the traditional TNM stage model.

Keywords

Deep learning, Ewing’s sarcoma, Spinal cancer, Machine learning, Survival prediction

Introduction

Malignant primary bone tumors are rare,¹ and tumors of the spine are particularly difficult to treat surgically because of the surrounding vessels and nerves. These tumors are highly aggressive and carry an unfavorable prognosis; among them, spinal Ewing’s sarcoma (EWS) presents a great challenge for clinicians.² Malignant spinal cord compression, which may cause neurological disability (up to 5% of all patients with cancer), is an important and dangerous complication for patients with spinal EWS.³ Local management strategies generally include palliative radiotherapy, posterior surgical decompression with or without instrumentation, or total en bloc spondylectomy. Clinicians must therefore choose the appropriate treatment to maximize the patient's survival, and accurate prediction of survival outcomes is of great significance for treating and understanding the disease. Clinicians usually adopt proportional hazard models to estimate the survival of cancer patients.⁴ However, these models rely on linearity assumptions and fail to capture non-linear relationships among features, which are more common in real-life settings.⁵

Recently, machine learning (ML) has become increasingly popular in survival prediction⁶ because of its powerful capacity to integrate features with non-linear relationships. Several ML algorithms have been adopted to predict the survival of cancer patients, with encouraging prediction accuracy.⁷ ML-based prediction models are therefore expected to accurately predict the prognosis of rare diseases such as spinal Ewing’s sarcoma. However, no ML-based model exists for survival prediction in spinal Ewing’s sarcoma. This study aims to develop such survival prediction models based on common ML algorithms.

Methods

Sources of Databases

The Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI) is an authoritative source of information on cancer incidence and survival in the United States. Currently, the program collects and publishes cancer incidence and survival data from population-based cancer registries covering approximately 34.6 percent of the U.S. population. The database contains basic information (age, race, gender, etc.), information on the diagnosis and treatment of the tumor (tumor size, grade, surgery, etc.) and other information (marital status, insurance, etc.). Ethical approval was not required for this study because the SEER database is free of any sensitive patient information or identifiers.

Study Criteria

For this analysis, we selected data from 1975 to 2016. Based on the International Classification of Diseases for Oncology, version 3 (ICD-O-3), we searched the SEER database to identify all registered cases of Ewing’s sarcoma (ICD-O-3 code 9260). The primary tumor site was restricted to the vertebral column (ICD-O-3 code C41.2) and the pelvis (ICD-O-3 code C41.4). This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Data Preprocessing

All potential features of patients (including marital status, race, gender, age, overall survival status, cancer-specific survival status, survival time, tumor size, tumor extension, lymph metastasis, distant metastasis, surgery type, and tumor grade) were selected for further analyses. The primary outcomes of interest were overall survival (OS) and cancer-specific survival (CSS). Only cases that were histologically confirmed and under active follow-up were included in our study. Cases whose SEER cause-specific death classification was “missing/unknown COD” or “N/A not first tumor” were excluded.

Surgery of the primary site is coded according to the SEER Program Code Manual using source documents. In our study, surgical procedures were grouped into 6 categories: amputation; biopsy; local excision; no surgery; radical excision; and surgery, NOS. Radiotherapy was reclassified as yes or none/unknown. The SEER field “Reason no cancer-directed surgery” (cancer-directed surgery) was reclassified as surgery performed or not performed. Marital status and insurance were each reclassified as yes or no.

Missing data were imputed with different methods according to the variable type of each feature. Predictive mean matching, logistic regression and polytomous regression were used for continuous, binary and categorical features, respectively.⁸ Predictive mean matching is a semi-parametric method that restricts the imputations to the observed values. Logistic regression and polytomous regression are the default methods for imputing binary and unordered categorical features in the “mice” package in R, a general-purpose package for multivariate imputation in ML tasks.
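To make the idea of predictive mean matching concrete, the following Python sketch imputes a continuous variable from a single predictor. It is a simplified illustration under stated assumptions, not the “mice” implementation: the function name `pmm_impute`, the single-predictor linear model, the donor pool size `k`, and the toy values are hypothetical, and the parameter-draw step used by full multiple imputation is omitted.

```python
import random

def pmm_impute(x, y, k=3, seed=0):
    """Fill None entries of y using simplified predictive mean matching on x."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)

    # Ordinary least-squares line fitted on the complete cases only.
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(value):
        return intercept + slope * value

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = predict(xi)
        # Donors: the k observed cases whose predicted values are closest.
        donors = sorted(observed, key=lambda d: abs(predict(d[0]) - target))[:k]
        # The imputed value is a donor's observed value, never the raw prediction.
        completed.append(rng.choice(donors)[1])
    return completed

# Hypothetical toy data: age as predictor, tumor size with one missing value.
ages = [10, 20, 30, 40, 50, 25]
sizes = [40.0, 55.0, 70.0, 90.0, 110.0, None]
imputed = pmm_impute(ages, sizes)
```

Because the imputed value is copied from a donor, it always lies among the observed values, which is the property the paragraph above attributes to predictive mean matching.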

Feature Selection and Training

All included data went through 3 feature selection methods: Cox, random survival forest (RSF) and CoxBoost. For Cox feature selection, a univariate Cox regression followed by a multivariate Cox regression was performed to assess the factors associated with survival. Features with P values < .05 were retained for subsequent model development. The magnitude of each association was expressed as a hazard ratio (HR) with a 95% confidence interval (CI). For RSF, we implemented a feature-importance ranking algorithm based on the random survival forest model to output the significant features. For CoxBoost, we developed a likelihood-based boosting model and retained the variables with P values < .05 as features associated with survival.
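The univariate screening step can be illustrated with a short Python sketch that fits a one-covariate Cox model by Newton-Raphson and computes a two-sided Wald P value, keeping features with P < .05. This is a minimal sketch and not the authors' implementation: the function names `fit_univariate_cox` and `screen_features`, the Breslow treatment of ties, and the toy data are assumptions introduced here.

```python
import math

def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-9):
    """Fit a one-covariate Cox model by Newton-Raphson (Breslow ties).

    Returns (beta, standard_error, two_sided_wald_p).
    """
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored subjects only contribute to risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            grad += x_i - mean               # score contribution
            info += s2 / s0 - mean * mean    # observed information
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value

def screen_features(times, events, features, alpha=0.05):
    """Return names of features whose univariate Wald P value is below alpha."""
    kept = []
    for name, values in features.items():
        _, _, p = fit_univariate_cox(times, events, values)
        if p < alpha:
            kept.append(name)
    return kept

# Hypothetical toy data: x = 1 marks a group that tends to die earlier.
times = [2, 3, 5, 7, 12, 8, 9, 11, 15, 20]
events = [1] * 10
x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
beta, se, p = fit_univariate_cox(times, events, x)
hazard_ratio = math.exp(beta)
```

The hazard ratio is exp(beta), and a 95% confidence interval follows as exp(beta ± 1.96 × se). A multivariate model would replace the scalar score and information with a gradient vector and Hessian matrix.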

We split the data into training and testing sets (training:testing = 9:1), and all training data went through 5-fold cross-validation. Four ML algorithms (Cox, RSF, CoxBoost, DeepCox) were trained to predict the overall survival (OS) and cancer-specific survival (CSS) of spinal Ewing’s sarcoma. The Cox model is a semi-parametric model for survival analysis. It measures the impact of the covariates and assumes that the log-hazard of every patient is a linear combination of the patient's features.⁹ RSF is a popular non-linear ML model for survival analysis.¹⁰ Using tree structures based on random forests, it generates ensemble estimates of the cumulative hazard function. CoxBoost is a semi-parametric survival model designed to handle high-dimensional datasets by fitting Cox models with likelihood-based boosting for a single endpoint or competing risks.¹¹ DeepSurv (the DeepCox model in this study) is a multi-layer perceptron, a feed-forward network whose output layer is a Cox regression; it predicts a patient’s risk of death and is parameterized by the weights of the network.
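The 9:1 split followed by 5-fold cross-validation can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `split_train_test` and `k_fold`, the random seed, and the choice to round the test set size up are assumptions (rounding up happens to reproduce the 666/75 split reported in Table 1, but the source does not state how the split was computed).

```python
import math
import random

def split_train_test(n, test_fraction=0.1, seed=0):
    """Shuffle indices 0..n-1 and hold out a fraction for independent testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n * test_fraction)  # assumption: round the test size up
    return indices[n_test:], indices[:n_test]

def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

train_idx, test_idx = split_train_test(741)
folds = list(k_fold(train_idx, k=5))
print(len(train_idx), len(test_idx))      # 666 75
print([len(val) for _, val in folds])     # [134, 133, 133, 133, 133]
```

Each model would be fitted on the `train` part of a fold and scored on its `validation` part, with the held-out `test_idx` cases used only once for independent testing.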

The concordance index (C-index), integrated Brier score (IBS) and mean area under the curve (AUC) were used to assess the prediction performance of the different initial ML models. The C-index can be interpreted as the probability of concordance between the observed and the predicted survival; a higher C-index indicates better performance of the prediction model. The IBS is also known as the prediction error rate, and a lower IBS indicates better prediction performance. The mean AUC is defined as the integral of the time-dependent AUC curve over the survival time (T) divided by the length of the integration interval.
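Two of these metrics are simple enough to sketch directly in Python. The block below computes Harrell's C-index from risk scores and approximates the mean AUC by trapezoidal integration of time-dependent AUC values. It is a hedged illustration: the function names, the choice of Harrell's estimator, the trapezoidal rule and the toy numbers are assumptions, since the source does not specify the estimators used.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: a higher risk score should mean a shorter survival."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable

def mean_auc(time_points, auc_values):
    """Trapezoidal integral of AUC(t) divided by the length of the interval."""
    area = 0.0
    for k in range(1, len(time_points)):
        width = time_points[k] - time_points[k - 1]
        area += width * (auc_values[k] + auc_values[k - 1]) / 2.0
    return area / (time_points[-1] - time_points[0])

# Hypothetical toy inputs, not values from the study.
c_perfect = concordance_index([1, 2, 3, 4], [1, 1, 1, 1], [4, 3, 2, 1])
c_reversed = concordance_index([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 3, 4])
auc_bar = mean_auc([0, 5, 10], [0.7, 0.8, 0.9])
```

A C-index of 1 means every comparable pair is ranked correctly and .5 corresponds to random ranking. The IBS is omitted here because it needs censoring weights that depend on the estimator chosen.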

The initial ML models with the best performance on each evaluation index (C-index, IBS and mean AUC) during independent testing were selected and stacked as the final ensemble model, which was compared with the traditional TNM stage model by 3-/5-/10-year Receiver Operating Characteristic (ROC) curves and Decision Curve Analysis (DCA). Code for the ensemble model is available at https://github.com/Huatsing-Lau/Spinal-EWS-Surv. All data processing was conducted in Python (version 3.6). Two-tailed P values < .05 were considered significant.
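For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at a given threshold probability. The Python sketch below computes it for a model and for the "treat all" reference strategy. It is a simplified illustration under stated assumptions: the function names and toy data are hypothetical, and the outcome is treated as a known binary status at a fixed time horizon, so censoring before that horizon is ignored.

```python
def net_benefit(outcomes, predicted_risks, threshold):
    """Net benefit = TP/n - (FP/n) * threshold / (1 - threshold)."""
    n = len(outcomes)
    true_pos = sum(1 for y, p in zip(outcomes, predicted_risks)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(outcomes, predicted_risks)
                    if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: classify every patient as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: 1 = died within the horizon, 0 = survived.
outcomes = [1, 1, 0, 0]
risks = [0.9, 0.6, 0.4, 0.1]
nb_model = net_benefit(outcomes, risks, threshold=0.5)
nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
```

Evaluating both functions over a grid of thresholds and plotting the results gives the decision curve; a model is preferable where its net benefit lies above both the "treat all" curve and the "treat none" line at zero.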

Results

Patient Demographics

In total, 741 patients with spinal Ewing’s sarcoma were identified. The data selection workflow is shown in Figure 1. The average age of patients was 19.85 ± 12.15 (SD) years, 470 (63.4%) were male, and the majority of patients (85.2%) were unmarried. The stage of cancers was localized (n = 156, 21.1%), regional (n = 297, 40.1%), and distant (n = 288, 38.9%). The mean survival time was 68.40 ± 80.50 (SD) months, and 355 (47.9%) patients were still alive at the last follow-up for OS and 377 (50.9%) for CSS (Table 1). Data were missing for several variables, including race, marital status, insurance, laterality, grade, stage, TNM stage, T stage, N stage, M stage, tumor size, extension, lymph nodes invasion, and surgery of primary site. The percentages of missing values are shown in Supplementary Table 1 and ranged from 0 to 76.4%.

Figure 1.

Workflow of the study.

Table 1.

Baseline characteristics of included patients with spinal Ewing’s sarcoma.

Characteristics	Level	Overall (n =741)	Training (including validation) dataset (n = 666)	Testing dataset (n = 75)	P
Survival months (mean (SD))		68.40 (80.50)	68.67 (80.64)	65.96 (79.25)	.782
Overall survival (%)	Alive	355 (47.9)	315 (47.3)	40 (53.3)	.384
	Dead	386 (52.1)	351 (52.7)	35 (46.7)
Disease specific survival (%)	Alive	377 (50.9)	335 (50.3)	42 (56.0)	.416
	Dead	364 (49.1)	331 (49.7)	33 (44.0)
Age (mean (SD))		19.85 (12.15)	19.87 (11.89)	19.64 (14.24)	.876
Gender (%)	Female	271 (36.6)	238 (35.7)	33 (44.0)	0.2
	Male	470 (63.4)	428 (64.3)	42 (56.0)
Race (%)	American Indian/Alaska Native	13 (1.8)	11 (1.7)	2 (2.7)	.391
	Asian or Pacific Islander	37 (5.0)	34 (5.1)	3 (4.0)
	Black	21 (2.8)	21 (3.2)	0 (.0)
	White	670 (90.4)	600 (90.1)	70 (93.3)
Marital status (%)	No	631 (85.2)	567 (85.1)	64 (85.3)	1
	Yes	110 (14.8)	99 (14.9)	11 (14.7)
Insurance (%)	No	169 (22.8)	156 (23.4)	13 (17.3)	.295
	Yes	572 (77.2)	510 (76.6)	62 (82.7)
Region (%)	Alaska	3 (.4)	2 (.3)	1 (1.4)	.132
	East	249 (33.6)	228 (34.2)	21 (28.0)
	Northern Plains	65 (8.8)	58 (8.7)	7 (9.3)
	Pacific Coast	334 (45.1)	303 (45.5)	31 (41.3)
	Southwest	90 (12.1)	75 (11.3)	15 (20.0)
Sequence number (%)	More than 1	29 (3.9)	26 (3.9)	3 (4.0)	1
	One primary only	712 (96.1)	640 (96.1)	72 (96.0)
Primary site (%)	C41.2-Vertebral column	216 (29.1)	190 (28.5)	26 (34.7)	.33
	C41.4-Pelvic bones	525 (70.9)	476 (71.5)	49 (65.3)
Laterality (%)	Bilateral, single primary	4 (.5)	4 (.6)	0 (.0)	.301
	Left - origin of primary	152 (20.5)	140 (21.0)	12 (16.0)
	Not a paired site	420 (56.7)	370 (55.6)	50 (66.7)
	Right - origin of primary	165 (22.3)	152 (22.8)	13 (17.3)
Grade (%)	Grade I	55 (7.4)	49 (7.4)	6 (8.0)	.731
	Grade II	56 (7.6)	50 (7.5)	6 (8.0)
	Grade III	154 (20.8)	135 (20.3)	19 (25.3)
	Grade IV	476 (64.2)	432 (64.9)	44 (58.7)
Stage (%)	Distant	286 (38.6)	255 (38.3)	31 (41.3)	.245
	Localized	166 (22.4)	145 (21.8)	21 (28.0)
	Regional	289 (39.0)	266 (39.9)	23 (30.7)
TNM Stage (%)	II	316 (42.6)	286 (42.9)	30 (40.0)	.799
	III	145 (19.6)	131 (19.7)	14 (18.7)
	IV	280 (37.8)	249 (37.4)	31 (41.3)
T Stage (%)	T1	170 (22.9)	147 (22.1)	23 (30.6)	.419
	T2	218 (29.4)	198 (29.7)	20 (26.7)
	T3	124 (16.7)	113 (17.0)	11 (14.7)
	TX	229 (30.9)	208 (31.2)	21 (28.0)
N Stage (%)	N0	473 (63.8)	425 (63.8)	48 (64.0)	.978
	N1	124 (16.7)	111 (16.7)	13 (17.3)
	NX	144 (19.4)	130 (19.5)	14 (18.7)
M Stage (%)	M0	348 (47.0)	312 (46.8)	36 (48.0)	.228
	M1	264 (35.6)	233 (35.0)	31 (41.3)
	MX	129 (17.4)	121 (18.2)	8 (10.7)
Tumor size (mean (SD))		86.81 (54.96)	87.14 (55.85)	83.81 (46.23)	.619
Extension (%)	Extended	531 (71.7)	483 (72.5)	48 (64.0)	.156
	Local	210 (28.3)	183 (27.5)	27 (36.0)
Lymph nodes invasion (%)	No	642 (86.6)	579 (86.9)	63 (84.0)	.596
	Yes	99 (13.4)	87 (13.1)	12 (16.0)
Total number of in situ malignant tumors (mean (SD))		1.03 (.19)	1.03 (.19)	1.03 (.16)	.731
Surgery of primary site (%)	Amputation	51 (6.9)	46 (6.9)	5 (6.6)	.287
	Biopsy	38 (5.1)	36 (5.4)	2 (2.7)
	Local excision	123 (16.6)	104 (15.6)	19 (25.3)
	No surgery	407 (54.9)	368 (55.3)	39 (52.0)
	Radical excision	81 (10.9)	73 (11.0)	8 (10.7)
	Surgery, NOS	41 (5.5)	39 (5.9)	2 (2.7)
Regional lymph node surgery (%)	No	665 (89.7)	593 (89.0)	72 (96.0)	.092
	Yes	76 (10.3)	73 (11.0)	3 (4.0)
Surgical procedure of other site (%)	No	651 (87.9)	586 (88.0)	65 (86.7)	.884
	Yes	90 (12.1)	80 (12.0)	10 (13.3)
Cancer directed surgery (%)	Not performed	445 (60.1)	404 (60.7)	41 (54.7)	.379
	Surgery performed	296 (39.9)	262 (39.3)	34 (45.3)
Radiotherapy (%)	None/Unknown	250 (33.7)	229 (34.4)	21 (28.0)	.327
	Yes	491 (66.3)	437 (65.6)	54 (72.0)
Radiation sequence with surgery (%)	Both were given	192 (25.9)	164 (24.6)	28 (37.3)	.025
	No radiation and/or cancer-directed surgery	549 (74.1)	502 (75.4)	47 (62.7)
Chemotherapy (%)	No/Unknown	38 (5.1)	31 (4.7)	7 (9.3)	.143
	Yes	703 (94.9)	635 (95.3)	68 (90.7)
Regional nodes examined (mean (SD))		.43 (3.90)	.48 (4.12)	.01 (.11)	.325

Feature Selections

Three feature selection methods were used to screen for independent prognostic variables (Tables 2 and 3). With the Cox method, we identified 6 associated features for OS and 9 for CSS. Results of the univariate and multivariate Cox regressions are summarized in Supplementary Tables 2-5. With the RSF method, we identified 14 associated features for OS and 11 for CSS. With the CoxBoost method, we identified 9 and 12 associated features for OS and CSS, respectively. All identified features were used to train the 4 ML algorithms (Cox, RSF, CoxBoost, DeepCox).

Table 2.

Feature selections for overall survival of patients with spinal Ewing’s sarcoma.

Rating	CoxBoost	Cox	RSF
1	Age	Lymph nodes invasion (Yes)	Age
2	Tumor size	Age	Tumor size
3	Marital status (Yes)	N Stage (N1)	Marital status (Yes)
4	Stage (Localized)	Stage (Localized)	Insurance (Yes)
5	TNM Stage (IV)	Race (Black)	Primary site (C41.4-Pelvic bones)
6	T Stage (TX)	M Stage (M1)	Stage (Localized)
7	M Stage (M1)		Stage (Regional)
8	Lymph nodes invasion (Yes)		TNM Stage (IV)
9	Cancer directed surgery (Surgery performed)		T Stage (TX)
10			M Stage (M1)
11			Lymph nodes invasion (Yes)
12			Surgery of Primary Site (No surgery)
13			Cancer directed surgery (Surgery performed)
14			Radiation sequence with surgery (No radiation and/or cancer-directed surgery)

Table 3.

Feature selections for cancer-specific survival of patients with spinal Ewing’s sarcoma.

Rating	CoxBoost	Cox	RSF
1	Age	Lymph nodes invasion (Yes)	Age
2	Tumor size	Stage (Localized)	Tumor size
3	Marital status (Yes)	Age	Marital status (Yes)
4	Stage (Localized)	N Stage (N1)	Stage (Localized)
5	Stage (Regional)	Sequence number (1 primary only)	TNM Stage (IV)
6	TNM Stage (IV)	Race (Black)	T Stage (TX)
7	T Stage (TX)	M Stage (M1)	M Stage (M1)
8	M Stage (M1)	Marital status (Yes)	Lymph nodes invasion (Yes)
9	Lymph nodes invasion (Yes)	Cancer directed surgery (Surgery performed)	Surgery of Primary Site (No surgery)
10	Surgery of Primary Site (No surgery)		Cancer directed surgery (Surgery performed)
11	Cancer directed surgery (Surgery performed)		Radiation sequence with surgery (No radiation and/or cancer-directed surgery)
12	Radiation sequence with surgery (No radiation and/or cancer-directed surgery)

Predicting OS

ML models performed well in predicting OS (Figure 2A-F). The top ML models during independent testing were CoxBoost with Cox feature selection and Cox with Cox feature selection, the latter ranking first on 2 of the 3 evaluation indexes. When the CoxBoost model was combined with Cox feature selection, the average C-index was .689 and .714 during 5-fold cross-validation and independent testing for OS, respectively; the IBS was .197 and .196, and the mean AUC was .771 and .811, for training and testing, respectively. When the Cox model was combined with Cox feature selection, the average C-index was .689 and .685 during 5-fold cross-validation and independent testing for OS, respectively; the IBS was .167 and .157, and the mean AUC was .805 and .820, for training and testing, respectively.

Figure 2.

Model evaluation for overall survival (OS). (A-F) Evaluation of different machine-learning (ML) models with 3 feature selection methods. (A) Concordance index (C-index) on training dataset; (B) Integrated Brier score (IBS) on training dataset; (C) Mean area under the curves (AUC) on training dataset; (D) C-index on testing dataset; (E) IBS on testing dataset; (F) Mean AUC on testing dataset. (G-H) Comparison of the ensemble ML model with the traditional TNM stage model. (G) The C-index, IBS and mean AUC on training dataset; (H) The C-index, IBS and mean AUC on testing dataset. (I) Venn diagram of features selected by 3 different selection methods.

The final ensemble ML model performed better than the traditional TNM stage model, with a C-index, IBS and mean AUC of .693/.169/.799 during cross-validation and .693/.158/.829 during independent testing (Figure 2G-H). The ensemble ML model also achieved AUCs of .740/.771/.814 for predicting 3-/5-/10-year OS during cross-validation and .705/.747/.851 during independent testing, which were superior to those of the traditional TNM stage model (Figure 3A-C, 3G-I). DCA curves also showed the merits of the ensemble model compared with the traditional TNM stage model (Figure 3D-F, 3J-L).

Figure 3.

Model evaluation for overall survival (OS). (A-C) 3-, 5- and 10-year Area Under the Curve (AUC) for Receiver Operating Characteristic (ROC) curves of the ensemble ML model compared with the traditional TNM stage model on training dataset. (D-F) 3-, 5- and 10-year Decision Curve Analysis (DCA) curves of the ensemble ML model compared with the traditional TNM stage model on training dataset. (G-I) 3-, 5- and 10-year AUC for ROC curves of the ensemble ML model compared with the traditional TNM stage model on testing dataset. (J-L) 3-, 5- and 10-year DCA curves of the ensemble ML model compared with the traditional TNM stage model on testing dataset.

Predicting CSS

ML models performed well in predicting CSS (Figure 4A-F). The top ML models during independent testing were CoxBoost with CoxBoost feature selection, RSF with Cox feature selection and CoxBoost with Cox feature selection. When the CoxBoost model was combined with CoxBoost feature selection, the average C-index was .702 and .725 during 5-fold cross-validation and independent testing for CSS, respectively; the IBS was .203 and .203, and the mean AUC was .805 and .798, for training and testing, respectively. When the RSF model was combined with Cox feature selection, the average C-index was .702 and .704 during 5-fold cross-validation and independent testing for CSS, respectively; the IBS was .184 and .161, and the mean AUC was .787 and .769, for training and testing, respectively. When the CoxBoost model was combined with Cox feature selection, the average C-index was .697 and .713 during 5-fold cross-validation and independent testing for CSS, respectively; the IBS was .205 and .203, and the mean AUC was .800 and .814, for training and testing, respectively.

Figure 4.

Model evaluation for cancer-specific survival (CSS). (A-F) Evaluation of different machine-learning (ML) models with 3 feature selection methods. (A) Concordance index (C-index) on training dataset; (B) Integrated Brier score (IBS) on training dataset; (C) Mean area under the curves (AUC) on training dataset; (D) C-index on testing dataset; (E) IBS on testing dataset; (F) Mean AUC on testing dataset. (G-H) Comparison of the ensemble ML model with the traditional TNM stage model. (G) The C-index, IBS and mean AUC on training dataset; (H) The C-index, IBS and mean AUC on testing dataset. (I) Venn diagram of features selected by 3 different selection methods.

The final ensemble ML model performed better than the traditional TNM stage model, with a C-index, IBS and mean AUC of .709/.174/.820 during cross-validation and .719/.171/.819 during independent testing (Figure 4G-H). The ensemble ML model also achieved AUCs of .762/.783/.820 for predicting 3-/5-/10-year CSS during cross-validation and .734/.779/.830 during independent testing, which were superior to those of the traditional TNM stage model (Figure 5A-C, 5G-I). DCA curves also showed the merits of the ensemble model compared with the traditional TNM stage model (Figure 5D-F, 5J-L).

Figure 5.

Model evaluation for cancer-specific survival (CSS). (A-C) 3-, 5- and 10-year Area Under the Curve (AUC) for Receiver Operating Characteristic (ROC) curves of the ensemble ML model compared with the traditional TNM stage model on training dataset. (D-F) 3-, 5- and 10-year Decision Curve Analysis (DCA) curves of the ensemble ML model compared with the traditional TNM stage model on training dataset. (G-I) 3-, 5- and 10-year AUC for ROC curves of the ensemble ML model compared with the traditional TNM stage model on testing dataset. (J-L) 3-, 5- and 10-year DCA curves of the ensemble ML model compared with the traditional TNM stage model on testing dataset.

Discussion

The current study demonstrated that ML models could effectively predict the OS and the CSS of Ewing’s sarcoma. Our ensemble models were also shown to be superior to the traditional TNM stage model. To the best of our knowledge, this may be the first ML model for predicting survival of spinal Ewing’s sarcoma.

Ewing's sarcoma commonly grows in metaphyseal bone, but it may also present in the spine, especially the sacrum. Only about 8% of all EWS cases originate from the spinal region. Spinal lesions can be primary or metastatic. Moreover, spinal EWS presenting with spinal cord compression has a very low incidence, and only 69 cases had been reported in the literature up to 2018.¹² The management of Ewing's sarcoma is challenging, and surgery combined with chemotherapy and radiotherapy is usually recommended to manage the progression of neurological deficits. However, there is no globally uniform treatment standard for Ewing's sarcoma because of its rarity and the limited clinical experience accumulated. Only a few studies^13–15 have reported the clinical outcomes of spinal Ewing’s sarcoma, based on data from their own institutions. In contrast, the SEER database includes a large number of patients with rare cancers, with clinical data covering demographics, treatments and outcomes. Thus, many previous studies have analyzed the survival of spinal Ewing’s sarcoma based on clinical data from the SEER database.^16–19 However, these studies only used conventional statistical analysis to find correlated prognostic factors, and they did not predict patient-specific outcomes, which are very important for clinicians in communicating with cancer patients and their families.

In this big-data era, large amounts of medical record information, especially cancer data, are generated every day. However, accurate prediction of OS and CSS remains one of the most interesting and challenging tasks for doctors. With a stronger ability to handle large volumes of high-dimensional data than conventional statistical methods, ML methods have become a popular tool for medical researchers.⁶ It has been reported that, with the application of ML techniques, the accuracy of cancer outcome prediction improved by 15%-20% in 2014.²⁰ With the construction of more public databases and the improvement of ML algorithms, ML methods will be a promising tool for inference in the cancer domain for clinical management and treatment decisions.

The prognosis of cancer is correlated with multidimensional factors, so conventional linear statistical models may not perform reliably in predicting survival.^5,21 In order to develop non-linear prediction models, many researchers have adopted ML algorithms to predict cancer prognosis.^22,23 Karhade et al²⁴ developed prediction models for 5-year survival of spinal chordoma based on several ML algorithms, and they found that the Bayes Point Machine achieved the best performance. Ryu et al²⁵ predicted the survival of patients with spinal and pelvic chondrosarcoma using deep survival neural networks, and the prediction performance was promising (mean AUC was .85). Ryu et al²⁶ also developed ML models to predict the survival of spinal ependymomas based on the SEER database, and the ML model achieved an AUC of .74 for predicting 5-year OS of spinal ependymoma and an AUC of .81 for predicting 10-year OS. However, these studies all framed survival prediction as a classification problem when developing ML models. Few studies have developed ML models that predict survival outcomes with time information.⁵

In the current study, all ML models were trained with different feature sets and went through 5-fold cross-validation. We combined 3 feature selection methods with 4 ML algorithms in order to identify the model with the best survival prediction performance. For feature selection, we found several features that are well-recognized prognostic factors in spinal Ewing’s sarcoma. Arshi et al² reported that age, race and tumor size were independent predictors for OS in patients with spinal Ewing’s sarcoma, and that age and tumor size were independent predictors for CSS. Similarly, Chen et al²⁷ revealed that age, race, tumor stage, and surgery were independent risk factors for OS in pelvic Ewing’s sarcoma. Furthermore, David et al²⁸ analyzed epidemiologic and survival trends in adult primary bone tumors of the spine and found government insurance, tumor size >5 cm and high tumor grade to be associated with worse overall survival of spinal tumors. They also found that surgical resection and chemotherapy were associated with improved survival for spinal Ewing’s sarcoma. However, other features were identified as significant prognosticators in this study through our ML-based feature selection methods. With both RSF and CoxBoost feature selection, “Marital status (Yes)” and “Lymph nodes invasion (Yes)” were predictors for both OS and CSS. This may highlight the ability of advanced feature selection methods based on ML algorithms like CoxBoost and RSF to uncover features that may have a non-linear relationship with patients’ survival outcomes.^5,21

For survival prediction, most ML models turned out to have similar accuracy for spinal Ewing’s sarcoma, consistent with a previous study.⁷ We therefore applied an ensemble algorithm to stack the top ML models from independent testing into our final model, which was demonstrated to be superior to the traditional TNM stage model. Nevertheless, the DeepSurv algorithm showed no apparent advantage over the other algorithms. ML methods also have their limitations: their predictions can be hard to interpret, and they are often regarded as uninterpretable black boxes.^5,7 Thus, more feature selection methods and more ML algorithms may be needed in the future to validate our findings.

Given its scale and wide coverage, the SEER database is well suited to the study of rare tumors such as spinal EWS. However, the SEER database provides limited information on tumor genetic profiles and bio-molecular markers,²⁹ which usually play a highly relevant role in the OS and CSS of tumors. The availability of such biomarker or genetic information depends on improvements in, and the completeness of, SEER data collection.³⁰ Another limitation of the current research is the limited amount of data, which makes it challenging to train ML algorithms. We selected the latest and most comprehensive data from the SEER dataset to include as much recent data as possible. However, we did not have an external dataset to test the generalization of the optimal model, and new data may be needed for further validation. Moreover, given the limited amount and long time span of our data, the evolution of treatment modalities over decades may have introduced observation biases between patients. Nonetheless, ML techniques have shown satisfactory results in analyzing such heterogeneous and complex data with missing values.²⁶ While we are actively working on further validation of the ML models, we encourage more researchers to share their own data. We hope the validated model can be made available online as a prediction tool for spinal EWS in the future.

Conclusion

ML was an effective and promising technique in predicting survival of spinal Ewing’s sarcoma, and the ensemble models were demonstrated to be superior to the traditional TNM stage model. More feature selection methods and more ML algorithms are needed in the future to validate our findings.

Supplemental Material

Supplemental Material - Machine Learning Predict Survivals of Spinal and Pelvic Ewing’s Sarcoma with the SEER Database

Supplemental Material for Machine Learning Predict Survivals of Spinal and Pelvic Ewing’s Sarcoma with the SEER Database by Guoxin Fan, Sheng Yang, Jiaqi Qin, Longfei Huang, Yufeng Li, Huaqing Liu, Xiang Liao in Global Spine Journal

Footnotes

Acknowledgments

The authors thank the SEER database for the availability of the data. We also thank the colleagues (J.Z., X.L., C.F.) who provided assistance in the prior work.

Declaration of Conflicting Interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Guangdong Basic and Applied Basic Research Foundation (2019A1515111171) and National Natural Science Foundation of China (82102640) were received in support of this work. The funders had no role in study design, data collection, data analysis, interpretation, writing of this report and in the decision to submit the paper for publication.

ORCID iD

Huaqing Liu

Supplemental Material

Supplemental material for this article is available online.

References

1. Weber, Damron, Frassica, Sim. Malignant bone tumors. Instr Course Lect. 2008;57:673-688.

2. Arshi, Sharim, Park, Yazdanshenas, Bernthal, et al. Prognostic determinants and treatment outcomes analysis of osteosarcoma and Ewing sarcoma of the spine. Spine J. 2017;17:645-655. doi:10.1016/j.spinee.2016.11.002

3. Boussios, Cooke, Hayward, Kanellos, Tsiouris, Chatziantoniou, et al. Metastatic Spinal Cord Compression: Unraveling the Diagnostic and Therapeutic Challenges. Anticancer Res. 2018;38:4987-4997. doi:10.21873/anticanres.12817

4. Yan, Huang, Liu, Zhu, et al. Nomograms for predicting the overall and cause-specific survival in patients with malignant peripheral nerve sheath tumor: a population-based study. J Neurooncol. 2019;143:495-503. doi:10.1007/s11060-019-03181-4

5. Matsuo, Purushotham, Jiang, Mandelbaum, Takiuchi, Liu, et al. Survival outcome prediction in cervical cancer: Cox models vs deep-learning model. Am J Obstet Gynecol. 2019;220:381.e1-381.e14. doi:10.1016/j.ajog.2018.12.030

6. Kourou, Exarchos, Karamouzis, Fotiadis. Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal. 2015;13:8-17. doi:10.1016/j.csbj.2014.11.005

7. Song, Gao, Tan, Qiu, Zhou, Zhao. Multiple Machine Learnings Revealed Similar Predictive Accuracy for Prognosis of PNETs from the Surveillance, Epidemiology, and End Result Database. J Cancer. 2018;9:3971-3978. doi:10.7150/jca.26649

8. van Buuren, Groothuis-Oudshoorn. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software. 2011;45:1-67.

9. Cox. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological). 1972;34:187-202.

10. Ishwaran, Kogalur, Blackstone, Lauer. Random survival forests. The Annals of Applied Statistics. 2008;2:841-860.

11. Binder, Schumacher. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics. 2008;9:14. doi:10.1186/1471-2105-9-14

12. Boussios, Hayward, Cooke, Zakynthinakis-Kyriakou, Tsiouris, Chatziantoniou, et al. Spinal Ewing Sarcoma Debuting with Cord Compression: Have We Discovered the Thread of Ariadne? Anticancer Res. 2018;38:5589-5597. doi:10.21873/anticanres.12893

13. Oitment, Bozzo, Martin, Rienmuller, Jentzsch, Aoude, et al. Primary sarcomas of the spine: population-based demographic and survival data in 107 spinal sarcomas over a 23-year period in Ontario, Canada. Spine J. 2020;21:296-301. doi:10.1016/j.spinee.2020.09.004

14. Fletcher, Marasigan JAM, Hiatt, Anderson, Taboada, Schwend. Primary Spinal Epidural/Extramedullary Ewing Sarcoma in Young Female Patients. J Am Acad Orthop Surg Glob Res Rev. 2019;3:e19.00072. doi:10.5435/JAAOSGlobal-D-19-00072

15. Chen, Zheng, Fan, Wang. Treatment Outcomes and Prognostic Factors of Patients With Primary Spinal Ewing Sarcoma/Peripheral Primitive Neuroectodermal Tumors. Front Oncol. 2019;9:555. doi:10.3389/fonc.2019.00555

16. Deb, Brewster, Pendharkar, Veeravagu, Ratliff, Desai. Socioeconomic Predictors of Surgical Resection and Survival for Patients With Osseous Spinal Neoplasms. Clin Spine Surg. 2019;32:125-131. doi:10.1097/bsd.0000000000000738

17. Mukherjee, Chaichana, Parker, Gokaslan, McGirt. Association of surgical resection and survival in patients with malignant primary osseous spinal neoplasms from the Surveillance, Epidemiology, and End Results (SEER) database. Eur Spine J. 2013;22:1375-1382. doi:10.1007/s00586-012-2621-4

18. Mukherjee, Chaichana, Adogwa, Gokaslan, Aaronson, Cheng, et al. Association of extent of local tumor invasion and survival in patients with malignant primary osseous spinal neoplasms from the surveillance, epidemiology, and end results (SEER) database. World Neurosurg. 2011;76:580-585. doi:10.1016/j.wneu.2011.05.016

19. Mukherjee, Chaichana, Gokaslan, Aaronson, Cheng, McGirt. Survival of patients with malignant primary osseous spinal neoplasms: results from the Surveillance, Epidemiology, and End Results (SEER) database from 1973 to 2003. J Neurosurg Spine. 2011;14:143-150. doi:10.3171/2010.10.Spine10189

20. Cruz, Wishart. Applications of machine learning in cancer prediction and prognosis. Cancer Inform. 2007;2:59-77.

21. Matsuo, Purushotham, Moeini, Machida, Liu, et al. A pilot study in using deep learning to predict limited life expectancy in women with recurrent cervical cancer. Am J Obstet Gynecol. 2017;217:703-705. doi:10.1016/j.ajog.2017.08.012

22. Thio, Karhade, Ogink, Raskin, De Amorim Bernstein, Lozano Calderon, et al. Can Machine-learning Techniques Be Used for 5-year Survival Prediction of Patients With Chondrosarcoma? Clin Orthop Relat Res. 2018;476:2040-2048. doi:10.1097/corr.0000000000000433

23. Bartholomai, Frieboes. Lung Cancer Survival Prediction via Machine Learning Regression, Classification, and Statistical Techniques. Proc IEEE Int Symp Signal Proc Inf Tech. 2018;2018:632-637. doi:10.1109/isspit.2018.8642753

24. Karhade, Thio, Ogink, Kim, Lozano-Calderon, Raskin, et al. Development of Machine Learning Algorithms for Prediction of 5-Year Spinal Chordoma Survival. World Neurosurg. 2018;119:e842-e847. doi:10.1016/j.wneu.2018.07.276

25. Ryu, Seo, Lee. Novel prognostication of patients with spinal and pelvic chondrosarcoma using deep survival neural networks. BMC Med Inform Decis Mak. 2020;20:3. doi:10.1186/s12911-019-1008-4

26. Ryu, Lee, Kim, Eoh. Predicting Survival of Patients with Spinal Ependymoma Using Machine Learning Algorithms with the SEER Database. World Neurosurg. 2018;124:e331-e339. doi:10.1016/j.wneu.2018.12.091

27. Chen, Long, Liu, Xing, Duan. Characteristics and prognosis of pelvic Ewing sarcoma: a SEER population-based study. PeerJ. 2019;7:e7710. doi:10.7717/peerj.7710

28. Kerr, Dial, Lazarides, Catanzano, Lane, Blazer, et al. Epidemiologic and survival trends in adult primary bone tumors of the spine. Spine J. 2019;19:1941-1949. doi:10.1016/j.spinee.2019.07.003

29. Nathan, Pawlik. Limitations of claims and registry data in surgical oncology research. Ann Surg Oncol. 2008;15:415-423. doi:10.1245/s10434-007-9658-3

30. Ostrom, Gittleman, Kruchko, Louis, Brat, Gilbert, et al. Completeness of required site-specific factors for brain and CNS tumors in the Surveillance, Epidemiology and End Results (SEER) 18 database (2004-2012, varying). J Neurooncol. 2016;130:31-42. doi:10.1007/s11060-016-2217-7
