Sage Journals: Discover world-class research

Abstract

Gliomas are the most common central nervous system tumors exhibiting poor clinical outcomes. The ability to estimate prognosis is crucial for both patients and providers in order to select the most appropriate treatment. Machine learning (ML) allows for sophisticated approaches to survival prediction using real world clinical parameters needed to achieve superior predictive accuracy. We employed Cox Proportional hazards (CPH) model, Support Vector Machine (SVM) model, Random Forest (RF) model in a large glioma dataset (3462 patients, diagnosed 2000–2018) to explore the most optimal approach to survival prediction. Features employed were age, sex, surgical resection status, tumor histology and tumor site, administration of radiation therapy (RT) and chemotherapy status. Concordance index (c-index) was employed to assess the accuracy of survival time prediction. All three models performed well with prediction accuracy (CI 0.767, 0.771, 0.57 for CPH, SVM, RF models respectively) with the best performance achieved when incorporating RT and chemotherapy administration status which emerged as key predictive features. Within the subset of glioblastoma patients, similar prediction accuracy was achieved. These findings should prompt stricter clinician oversight over registry data accuracy through quality assurance as we move towards meaningful predictive ability using ML approaches in glioma.

Keywords

Artificial intelligence cancer registry data Mining electronic health records machine learning

Introduction and background

Survival prediction in Glioma – a clinical necessity

Gliomas are the most common central nervous system tumors. Gliomas are typically managed by maximal safe resection followed by radiation therapy, chemotherapy or in rare cases observation depending on the histology and clinical context.^1–3 The survival of glioma remains overall extremely poor with a 5-years overall survival less than 35%.⁴ The ability to estimate prognosis is crucial for both patients and providers in order to select the most appropriate treatment that is sufficiently aggressive to allow for tumor control while minimizing adverse long-term effects on normal tissue. Clinicians also consider prognosis in order to appropriately de-escalate treatment. When prognosis is poor, emphasis should be placed on patients’ quality of life and best supportive care. Multiple attempts have been made to design robust scoring systems predictive of outcome for both low⁵ and high-grade glioma.^3,6 Survival prediction have mostly relied on clinical parameters (age, Karnofsky Performace Status (KPS), gender) and resection or methylation status as well as statistical models to create prognostic groups.^7–9 These survival predictions can lack generalisability secondary to: (1) small cohorts of patients, (2) the inclusion of primarily clinical trial patients and (3) management of patients at tertiary academic centers with extensive expertise versus available options in the real-world. The current approaches present limitations as in the real-world: (1) most glioma patients are treated off study; (2) outside of centers of excellence; (3) do not necessarily benefit from expert pathology review or molecular analysis; (4) significant tumor heterogeneity, health disparities and geographical constraints further undermine the ability to predict survival across settings. Existing evidence already suggests that patients falling outside of these settings may have poorer outcomes^10–12 and therefore existing scoring systems may not necessarily reflect their prognosis.

Machine learning for survival prediction in large-scale data sets

Machine learning (ML) can allow for more sophisticated approaches to clinical, molecular and imaging data to predict risk and survival.^13–20 Given the rapid growth of large-scale data in this space and acknowledging the unavoidable limitations in data collection, curation and as a result analysis, we aimed to explore the effectiveness of both ML and statistical approaches to predict survival in glioma patients using a set of commonly available clinical features in a real-world evidence cohort. This cohort represents one such glioma dataset representative of a high volume publicly funded system – the BC Cancer registry, including patients of all glioma histological subtypes treated largely off trial over the course of nearly 20 years in the province of British Columbia, Canada. As neuro-oncology increasingly moves to harness such data sets to identify clinically meaningful biomarkers, we anticipate significant growth in the use of ML approaches to data sets of this nature given the potential limitations of the most frequently used approach Cox Proportional hazards (CPH) model when modelling complex interactions as encountered in large-scale oncology data.²¹ Considering the distinct pathological and prognostic features of glioblastoma, specific subset analysis was additionally carried out also aimed at patients with a specific histological diagnosis of glioblastoma.

Material and methods

Clinical need for survival prediction

In clinical practice we noted that survival discussion and inquiry into potential survival time by patients and family represented a corner stone of the joint decision-making process between, patient, family and oncologist towards optimal management given a range of options with varying length of treatment, aggressiveness of management and toxicity profile.

Study cohort selection

Based on the clinical need for real-world survival prediction, we intended to cover a wide time period to reflect the data heterogeneity in this patient population and variability over time. As a result, following research ethics board approval, we selected patients with a histologic diagnosis of glioma captured in the BC Cancer Registry who presented to a cancer center in British Columbia between 2000 and 2018, resulting in data from 3907 patients. Patients who received treatment out of province (14) or for whom any of the features necessary for the analysis were not captured (317), were excluded. Only patients with a pathological diagnosis of glioma were included and uncommon glioma histologies with less than 0.5% recorded cases in the dataset were excluded. Overall, 3462 patients were included in the analysis. In keeping with recognised distinctive pathological and prognostic features patients with a specific histological diagnosis of glioblastoma were also analysed a separate subset.

Techniques employed in the analysis

Cox Proportional hazards model

The CPH model is the linear regression model most widely used^21,22 in survival studies to predict the risk of an outcome based on multiple variables.²³ We built a CPH model using clinical features in the training data as covariates to allow for comparison of performance accuracy between the most commonly used model and other increasingly employed models.^{14,21,24–27}

Support vector machine (SVM) model

Support vector machine is an increasingly utilised ML algorithm that functions by separating observations into classes^26,28 and its use has been described in the context of glioma survival.²⁹ We optimized a linear SVM classifier through a hyperparameter search to find the best regularization hyperparameter which was used to train the classifier using the entire training set.

Random forest (RF) model

Like SVM, RF is also increasingly utilised for survival prediction.^26,30,31 The random forest consists of 1000 decision trees trained using the training dataset. Risk score prediction was the average across all trees in the forest.²⁵ The importance score of each feature was calculated by the decrease in concordance index if said feature was made unavailable by assigning random value to it for all patients.³²

Training and testing datasets

We split the entire dataset into a training and a testing dataset in a 7:3 ratio by random sampling. Each dataset contained the following features: age, sex, administration of chemotherapy status, surgical resection status, radiation therapy status, tumor histology and tumor site. Baseline characteristics differences between the training and testing datasets were assessed using Chi-square test, p < 0.05 was considered significant.

Modeling and prediction

Three modeling methods were implemented using open source python libraries, scikit-survival by Pölsterl et al.^33–35 Each model was trained using the training set and then was applied to predict a risk score and survival function for each patient in the testing set. Predicted median survival time was determined as the survival time at 0.5 survival probability derived from the survival function. The accuracy of the survival predictions were evaluated by Concordance Index (c-index), calculated using the python package Lifelines (https://doi.org/10.5281/zenodo.1252342).

Kaplan Meier (KM) survival curves

Kaplan Meier survival curves were plotted to compare the survival time distribution of training and testing datasets. A Log-rank test^22,36 was used to determine if there is a significant difference between the survival distributions of the training and testing dataset, using the Python package Lifelines (https://doi.org/10.5281/zenodo.1252342). As described by Chen et al.,³⁶ the concordance index was employed to compare predicted risk score and observed survival time.

While there is currently no established consensus on how to approach representation of this type of predictive survival analysis in figures, KM survival curves were plotted to compare the median survival time predicted by the models and the clinically recorded survival time of patients in the testing dataset as other authors have employed similar approaches.^28,37,38 Patients who were still alive at the time of the analysis were removed for plotting these two KM curves since model predictions would not include censoring status at the end of the study and a KM curve of predicted median survival for those patients whose censoring status is unknown would not be appropriate. In order to show potential clinical application, we used both c-index, which is widely used to evaluate the performance of survival predictions generated using both censored and uncensored patients, and log-rank test where we compare the KM curves between recorded and predicted survival time distributions in only uncensored patients.

Receiver operating characteristic (ROC) curves

We plotted ROC curves showing the specificity and sensitivity of 1-year survival status predictions of patients in the testing dataset, generated using the CPH, SVM and RF models. Area under the curve (AUC) for each ROC curve was calculated to evaluate the performance of 1-year survival prediction of each model.

Availability of data and material

The datasets supporting the conclusions of this article were obtained from the BC Cancer Registry following approved BC Cancer Research Ethics Board Review. Given the large data set (3907 patients) spanning a large time period (2000–2018) with a significant proportion of the patients deceased and anonymization of the patients in the study, per BC Cancer Research Ethics Board Review, patient consent was not required.

Results

Clinical characteristics

3462 patients with a diagnosis of glioma treated between 2000 and 2018 were included in the analysis. 2113 (61%) were male and 1349 (39%) female. Histological distribution was: glioblastoma 1555 (45%), astrocytoma 926 (27%), oligodendroglioma 299 (9%), mixed glioma 267 (8%), anaplastic oligodendroglioma 130 (4%), glioma malignant 118 (3%), anaplastic astrocytoma 70 (2%), other glioma histologies (2%) (Table 1). 1119 (33%), 795 (23%) and 117 (3%) of tumors originated in the frontal lobe, temporal and occipital lobe respectively. 2410 (70%) had maximal safe surgical resection whilst the remainder, 1052 (30%) had biopsy only. 2730 (79%) total patients received RT. At the time of the analysis 1831 (53%) had not received chemotherapy, 1515 (44%) had received chemotherapy immediately following diagnosis, 81 (2%) received subsequent chemotherapy and 35 (1%) received both initial and subsequent chemotherapy. Molecular characterization including MGMT status and patient performance status were not captured in the BC Cancer registry data.

Table 1.

Characteristics of training and testing sets.

Age (SD)	Training set (n = 2423)	Testing set (n = 1039)	Total (n = 3462)	p-value
Age (SD)	15	15	15	p-value
Sex
Male	1483 (61.2%)	630 (60.6%)	2113 (61.0%)	0.75
Female	940 (38.8%)	409 (39.4%)	1349 (39.0%)	0.75
Histology
Glioblastoma, NOS	1085 (44.8%)	470 (45.2%)	1555 (44.9%)	0.80
Astrocytoma, NOS	636 (26.2%)	290 (27.9%)	926 (26.7%)	0.80
Oligodendroglioma, NOS	220 (9.1%)	79 (7.6%)	299 (8.6%)	0.80
Mixed glioma	185 (7.6%)	82 (7.9%)	267 (7.7%)	0.80
Anaplastic oligodendroglioma	98 (4.0%	32 (3.1%)	130 (3.8%)	0.80
Glioma, malignant	86 (3.5%)	32 (3.1%)	118 (3.4%)	0.80
Anaplastic astrocytoma	47 (1.9%)	23 (2.2%)	70 (2.0%)	0.80
Giant cell glioblastoma	19 (0.8%)	11 (1.1%)	30 (0.9%)	0.80
Gliosarcoma	20 (0.8%)	7 (0.7%)	27 (0.8%)	0.80
Pilocytic astrocytoma	16 (0.7%)	8 (0.8%)	24 (0.7%)	0.80
Fibrillary astrocytoma	11 (0.5%)	5 (0.5%)	16 (0.5%)	0.80
Site
Frontal lobe	781 (32.2%)	338 (32.5%)	1119 (32.3%)	0.22
Temporal lobe	558 (23.0%)	237 (22.8%)	795 (23.0%)	0.22
Brain, overlapping lesion	408 (16.8%)	187 (18.0%)	595 (17.2%)	0.22
Parietal lobe	372 (15.4%)	140 (13.5%)	512 (14.8%)	0.22
Occipital lobe	70 (2.9%)	47 (4.5%)	117 (3.4%)	0.22
Brain, unspecified	90 (3.7%)	28 (2.7%)	118 (3.4%)	0.22
Cerebrum	69 (2.8%)	25 (2.4%)	94 (2.7%)	0.22
Brain stem	38 (1.6%)	16 (1.5%)	54 (1.6%)	0.22
Spinal cord	13 (0.5%)	10 (1.0%)	23 (0.7%)	0.22
Cerebellum, NOS	17 (0.7%)	7 (0.7%)	24 (0.7%)	0.22
Ventricle, NOS	7 (0.3%)	4 (0.4%)	11 (0.3%)	0.22
Surgery
Surgical resection	1666 (68.8%)	744 (71.6%)	2410 (69.6%)	0.09
No surgical resection	757 (31.2%)	295 (28.4%)	1052 (30.4%)	0.09
Radiation
Radiation therapy	1906 (78.7%)	814 (78.3%)	2720 (78.6%)	0.83
No radiation therapy	517 (21.3%)	225 (21.7%)	742 (21.4%)	0.83
Chemotherapy
No chemotherapy	1274 (52.6%)	557 (53.6%)	1831 (52.9%)	0.70
Concurrent chemotherapy	1072 (44.2%)	443 (42.6%)	1515 (43.8%)	0.70
Subsequent chemotherapy	53 (2.2%)	28 (2.7%)	81 (2.3%)	0.70
Initial and subsequent chemotherapy	24 (1.0%)	11 (1.1%)	35 (1.0%)	0.70

Training and testing datasets

There was no statistically significant difference in baseline characteristics (Table 1) and for the glioblastoma subset (Supplemental Table 1) or survival distributions between the training and testing datasets (log rank test p = 0.99) (Figure 1). There was minimal difference in c-index between survival predictions made using the training and testing datasets across all models independent of all glioma patients (Figure 2(a)) or subset analysis of the glioblastoma cohort (Figure 2(b)).

Figure 1.

Kaplan Meier overall survival for the training and testing datasets for the entire dataset.

Figure 2.

Survival prediction accuracy of each model for training and testing dataset. The difference in c-index between training and testing dataset is minimal across all models, suggesting that no overfitting of the training data. (a) All comers. (b) Glioblastoma histology. The model predictions were generalizable to the unused testing data. Cox = Cox Proportional hazards (CPH) model. SVM = Support Vector Machine (SVM) model, RF = Random Forest (RF) model.

ML models for survival prediction

Concordance index (c-index) is a commonly used method comparing the ranking of recorded survival with the ranking of predicted risk score. A c-index of 0.5 is expected from random prediction and 1.0 is expected if two rankings are in perfect concordance.²² The c-index adjusted for right censoring was calculated for patients in the testing dataset predicted by each model trained using different combinations of features numbered as (1) Age, (2) Sex, (3) Tumor Histology, (4) Tumor site, (5) Tumor resection, (6) Radiation therapy (RT), (7) Chemotherapy. Prediction accuracy was the lowest when models did not take into account information on management (Figure 3). The highest survival prediction accuracy was obtained using models that take into account information on patient characteristics, tumor characteristics and cancer management with CI of 0.767, 0.771 and 0.757 for CPH, SVM and RF models respectively (Figure 3). Separate subset analysis of the patients with a glioblastoma histology (n = 1555 vs n = 3262 for all comers) was carried out revealing slightly lower prediction accuracy with CI of 0.727, 0.738, 0.733 for CPH, SVM, RF models respectively (Supplemental Figure 1).

Figure 3.

Survival prediction accuracy of each model using different combinations of features where: (1) Age, (2) Sex, (3) Tumor Histology, (4) Tumor site, (5) Tumor resection, (6) Radiation therapy (RT), (7) Chemotherapy. CPH = Cox Proportional hazards model. SVM = Support Vector Machine (SVM) model, RF = Random Forest (RF) model.

Clinical features predictive importance

The variables available in the dataset and employed in the analysis were age, sex, administration of chemotherapy, surgical resection, administration of radiation therapy, tumor histology and tumor site. The predictive value of each variable in the CPH model was calculated as the c-index of a model obtained from a univariate CPH analysis using said variable. The predictive values of these clinical variables range from 0.5 for gender, to 0.69 for age (Table 2) in the CPH model. In the RF model, each feature was assigned a feature importance score calculated by the decrease in the c-index predicted by a model that eliminated said feature (Table 3). Both models show that, other than age, treatment methods, chemotherapy and RT administration, are the most predictive features of survival time irrespective of a more general “glioma” histological classification or a specific pathologic diagnosis of “glioblastoma.” However, in the RF model within the subset of patients with a diagnosis of glioblastoma, both the administration of RT and chemotherapy carry more robust feature importance scores (0.032 and 0.051 respectively) versus (0.020 and 0.031 respectively) in the all comers cohort. In addition, in the glioblastoma subset, age, previously associated with the highest feature importance score (0.08), is overtaken by both RT and chemotherapy administration (0.051) and (0.032) respectively versus age (0.013) (Table 3). In both cohorts the administration of chemotherapy ranked higher in feature importance as compared to radiation therapy (Table 3).

Table 2.

Predictive value of each variable under cox proportional hazards (CPH) model. The predictive value of each variable is the concordance index of the testing dataset using predictions made by the model containing only this variable.

Variable	Predictive value
Variable	All patients	Glioblastoma histology
Histology	0.60
Age	0.69	0.64
Sex	0.50	0.49
Surgical resection	0.57	0.58
Radiation therapy administration	0.60	0.60
Chemotherapy administration	0.63	0.67
Tumor site	0.55	0.53

Table 3.

Most important features as identified in the random forest (RF) model. The feature importance score of each feature is shown as calculated by the decrease in the concordance index of the testing dataset predicted by RF model if this feature were not available. Features with feature importance score less than 0.001 and histology/site features applying only to relatively small portions of the data are not shown.

Feature	Feature importance score
Feature	All patients	Glioblastoma histology
Histology = glioblastoma	0.020
Histology = oligodendroglioma	0.011
Histology = astrocytoma	0.010
Age	0.080	0.013
Surgical resection	0.006	0.013
Radiation therapy administration	0.020	0.032
Chemotherapy administration	0.031	0.051

Survival models

All three models CPH, SVM and RF performed reasonably well (Figure 4(a)–(c)) as seen in the predicted survival probability for three sample patients (figure simplified to include only three sample patients for ease of interpretation). The clinical information for each patient is as follows: Patient 1: 67-year-old male diagnosed with oligodendroglioma NOS in overlapping areas of the brain who received chemotherapy and no surgical resection or radiation. Patient 2: 83-year-old female diagnosed with glioblastoma located in the cerebrum treated with surgical resection only (no chemotherapy or radiation). Patient three was a 69-year-old male diagnosed with anaplastic oligodendroglioma located in the front lobe who passed away 7 days after diagnosis not having received any therapy. Only uncensored patients were used for generation the KM survival curve as the model predictions do not include censoring status. There was no statistically significant difference between the recorded survival time distribution and the predicted median survival time distribution using the CPH and RF models for the testing dataset, p = 0.07 and p = 0.61 respectively (Figure 4(d) and (f)). The difference between SVM predicted median survival distribution was statically significant from the recorded survival time, p < 0.005 (Figure 4(e)).

Figure 4.

Predicted survival curves for three patients in the testing dataset using (a) CPH model. (b) SVM model. (c) RF model respectively. Corresponding KM survival curves for recorded survival times in testing dataset and KM survival curve for predicted median survival times using (d) CPH model. (e) SVM model. (f) RF model. CPH = Cox Proportional hazards model. SVM = Support Vector Machine (SVM) model, RF = Random Forest (RF) model.

All three models have similar sensitivity and specificity for predicting 1-year survival outcomes (Figure 5) evident in their similar ROC curves and minimal differences in AUC values: 0.854, 0.849, and 0.844 for CPH, SVM, and RF respectively for the entire patient cohort and 0.823, 0.821, and 0.825 for CPH, SVM, and RF respectively for the glioblastoma subset.

Figure 5.

Receiver operating characteristic (ROC) curves for 1-year survival prediction for (a) all glioma patients. The area under the curve (AUC) values of the ROC curves were 0.854, 0.849, and 0.844 for CPH, SVM, and RF respectively. (b) Glioblastoma histology. The area under the curve (AUC) values of the ROC curves were 0.823, 0.821, and 0.825 for CPH, SVM, and RF respectively. ROC curves show the sensitivity and the specificity of predicted 1-year survival outcomes generated using CPH, SVM and RF models.

Discussion

Machine learning as a tool towards superior prediction of clinical outcomes has increased in popularity in all domains of medicine including oncology,^{17,19,20,29,39–41} driven by the need to rapidly harness clinically relevant results when prospective data is unavailable and impossible to obtain such as in the context of the COVID-19 pandemic.

Our study is novel in two significant ways: (1) it employs an extremely large glioma patient population (3462 patients) representative of a high volume publicly funded system originating in the BC Cancer registry in British Columbia, Canada, which is extremely rare in this space for neuro-oncology survival prediction⁴² outside of public data bases¹⁷ which suffer from their own limitations and biases and (2) we deliberately explored the ability to predict survival while employing exclusively non radiomic, non-molecular data features generally available in most high volume cancer centers treating gliomas to generate comparative performance parameters grounded in an inclusive, realistic and as a result transferrable data set²⁵ We achieved excellent survival prediction accuracy with c-index ranging from 0.757 (RF model) to 0.771 (SVM model) while including the lowest common denominators embedded in most large brain tumor registries including age, sex, administration of chemotherapy status, surgical resection status, radiation therapy status, tumor histology and tumor site. Slightly lower c-index values were identified in the analysis of the subset of patients diagnosed with glioblastoma (0.727, 0.738, 0.733 for CPH, SVM, RF) likely reflecting a lower n (1555 vs 3462) but also reinforcing that the models are equally robust in this specific histological scenario associated with both less controversy in terms of establishing consistent pathological diagnosis as well as a far poorer prognosis.^3,4,6,8,9 We also note that surgical resection status again recently reported on as a significant feature defining glioma outcomes⁴³ did not rank as highly as we may have expected. We suspect that this is secondary to inconsistent coding distinguishing the extent of resection in the registry which may reflect variable capture as to what is considered gross total or subtotal resection or even biopsy in operative reports which are used as source documentation for this feature in the registry.⁴⁴

Most ML survival prediction studies aimed at patients with glioma center around MRI radiomics or histological features as a result involving smaller patient populations as proof of concept: Tan et al. 2019 (MRI radiomics, n = 147), Papp et al.⁴⁰ (PET, n = 70), Mobadersany et al.⁴¹ (histology and genomics n = 769), Mizutani et al.²⁹ (radiation dosimetry, n = 35).^29,39–41 Our patient characteristics and tumor management features are more similar to large retrospective registry studies such as the SEER database which using traditional statistical analysis that has been employed to develop nomograms in the context of low-grade glioma with 3732 patients,⁴⁵ oligodendroglioma with 2689 patients,⁴⁶ high-grade glioma (6395 patients).⁴⁷

The CPH model has been the gold standard for survival analysis. It uses a semi-parametric statistical modeling approach where the survival outcome is a linear combination of predictive variables. A disadvantage of the CPH model is that it operates on the underlying assumption that predictive variables are independent and do not interact, and their impact on survival do not change over time. We hypothesized that these assumptions are unlikely to hold true when considering: (1) the large number of predictive features potentially available in cancer patients and (2) these features are likely to interact with each other in an unforeseen manner.¹⁷ Therefore, we selected two widely used ML methods, Support Vector Machine (SVM) and Random Forest (RF), that allow the use of predictive features that have potential interactions.^17,19 The SVM approach assigns weight to each predictive feature to produce a score that maximizes concordance between predicted risk ranking and recorded survival time ranking.²⁸ By contrast the RF approach takes the average of a collection of decision trees where the branches are split based on values of the predictive features.³⁸ All three approaches ultimately produced equal or superior c-index in comparison with the literature.^{17,29,39–41,45–47}

We achieved a higher c-index using the ranking based SVM model in comparison with the other two models. However, our SVM model exhibited a difference between SVM predicted median survival time distribution and the recorded survival time. This was likely secondary to our selection of a rank based SVM approach which optimizes risk ranking. A regression based SVM model can be explored in further analysis for potential better survival time prediction³⁷ as well as survival tree analysis⁴⁸ in comparison to RF.⁴⁹ We found some parallels with Nemati et al.¹⁹ who employed real world data to predict hospital discharge for COVID-19 patients and Senders et al.¹⁷ who employed 20,821 glioblastoma patients originating in SEER database to predict survival at 1 year following diagnosis. Both employed c-index as a performance metric and focused on risk factor analysis and 1 year survival classification respectively but stopped short of comparing predicted and recorded discharge or survival time as c-index is solely based on ranking of times of events in all possible pairs.^17,19 To enhance clinical applicability, it was important to us to compare the predicted survival time to actual survival time and we employed log rank test to accomplish this using uncensored data. Further analysis alternatives could include using alternate weighting methods to include censored data and comparison with accelerated failure time (AFT) algorithms.^17,35,50

Cross-validation and its inclusion is subject to future directions in our work including to compare its impact on performance in large-scale data sets in glioma given that both cross-validation and sample splitting ultimately reflect internal validation and neither may completely mitigate overfitting although this is subject of ongoing studies and validation with other sources of data is the ultimate goal.⁵¹

Similar to other studies employing large scale retrospective data, the current study is limited by lack of information with respect to patient performance status, molecular features and the detailed timing of chemotherapy administration in relationship to diagnosis and RT administration, all currently not being collected as part of the BC Cancer registry. The management of glioma, particularly in the wake of the revisions made as a result of the 2016 WHO Classification, has evolved to transition from morphologic based “grading” to incorporate molecular features that define biological aggressiveness and tumor behavior. Although tumor sequencing and molecular classification is increasingly being incorporated in management, as of yet most datasets do not possess sufficient molecular information to incorporate into a formal analysis. As a result, it was crucial for us to include all tumors pathologically confirmed as gliomas in the analysis in a deliberate effort to circumvent the bias due to older tumor grading paradigm and in the absence of molecular characterization.^1–5 In the context of glioblastoma (WHO grade IV) however, pathological diagnosis is more consistently defined due to the identification of distinctive pathognomonic features, in particular the presence of necrosis and is employed to guide optimal management, albeit growing importance of molecular features.^6,9 A separate analysis was therefore carried out in the subset of patients with a specific diagnosis of glioblastoma to determine the applicability of the survival prediction models and to explore feature importance scores. This analysis identified a more robust feature importance of both RT and chemotherapy administration (0.032 and 0.051 respectively) versus (0.020 and 0.031 respectively) in the cohort overall with chemotherapy administration in particular, exceeding those same featured identified in the analysis of the entire cohort and overtaking age as the most important feature. From a clinical perspective this is perhaps not surprising considering the aggressive nature of the glioblastoma and the recognised importance of both management options as predictive of survival in the literature.^6,8,9,12,52

Limitations of this study are posed by the lack of vital status information for some patients and the possibility that abrupt events not directly related to patient characteristics, histology or management that may have affected outcome. Whilst we do have information on the intention of RT and chemotherapy administration, the current analysis does not take into account whether the treatment was in fact ultimately administered or completed as intended.

Ultimately both ML methods achieved good predictive ability on par with the gold standard (CPH) in a large dataset but similarly to other studies,^17,19 they did not outperform the CPH statistical model acknowledging that in the context of additional highly complex interacting features (radiomic, RT dosimetry, detailed genomic and pharmaceutical data), machine and deep learning models are likely to perform better.^13,14,16 The robust capture and inclusion of the above features comprise future directions in the field of oncology.

The patient characteristics and outcomes of our study cohort are similar to other large series^{17,29,39–41} and we determined that management as a feature was crucial in achieving superior predictive capability for all models but especially so in glioblastoma. Our study is a first step towards future investigations into the potential of involving ML models in personalized treatment planning where model predicted survival times for different treatment options can be take into consideration when determining the optimal management plan for each patient. This is especially important in glioma management, where the intricacies of administration of chemotherapy can be a source of clinical debate (concurrent versus sequential, number of cycles and patient selection). The fact that these aspects of management are often incompletely captured and hence often used as a dichotomy RT (yes/no) and chemotherapy (yes/no) should be remedied. The fact that our models identified chemotherapy and RT administration as the most important features both in the context of a more general glioma diagnosis as well as in the specific context of a specific histologic diagnosis of glioblastoma, reinforces the importance of consistent and detailed capture of patient management in order to advance ML approaches to large scale data. Future studies are required to address the issue of how ML encapsulates the a priori complexity of clinical decision making and the implications for patient outcomes juxtaposed with the ability to create clinically meaningful ML models that appropriately disentangle the multiple factors involved.

Our efforts in this study highlight both the need to create reliable clinician/ML connections and the need for increasingly robust datasets that capture the intricacies of patient management in large scale registries. This means more clinical oversight of data coding in registries as well as quality assurance of patient management as is now increasingly performed via peer review in tertiary care institutions. The ability to work from a platform of consensus will allow for meaningful conclusions based on ML eventually on par with those currently obtained from prospective trials. Ongoing efforts and future directions involve in depth survival modeling aimed specifically at the management and outcomes of elderly patients with a glioma diagnosis as well as that of patients with lower grade gliomas and incorporation of large-scale systemic management data into existing models.

Conclusions

We achieved superior survival prediction performance with the aforementioned ML studies as compared to other ML and non-ML approaches in the literature while employing exclusively widely available clinical sets of features. The administration of chemotherapy and RT emerged as key features in both general glioma histology and in glioblastoma specifically raising questions about the potential for superior results that may be achieved through further optimisation and clinical oversight of large-scale real-world datasets particularly with respect to how the intricacies of patient management are captured to allow for clinically relevant results to be generated by ML approaches.

Supplemental Material

Supplemental Material - Machine learning based survival prediction in Glioma using large-scale registry data

Supplemental Material for Machine learning based survival prediction in Glioma using large-scale registry data by Rachel Zhao, Ying Zhuge, Kevin Camphausen and Andra Krauze in Health Informatics Journal

Footnotes

Acknowledgements

We acknowledge the contribution of Dr N. Coleman who kindly reviewed the manuscript and provided suggestions for its relevance to the field. We also acknowledge the contributions of Ryan Proulx and the staff at Safe Software and Dr Iulian Badragan, medical physicist at BC Cancer Surrey for their assistance with some of the data related technical aspects involved in this analysis.

Author contributions

R.Z. conceived the study, organised patient registry data, created and optimized non-machine and machine learning models and co-drafted the manuscript.

Y.Z. reviewed the manuscript and conceived the original joint study of machine learning based glioma patient outcomes analysis employing large scale data.

K.C. reviewed the manuscript and conceived the original joint study of machine learning based glioma patient outcomes analysis employing large scale data.

A.K. conceived the study, participated in its design and coordination, collected patient data and co-drafted the manuscript.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Porte-Hungerford Neuro-Oncology Grant held by Dr A. Krauze and the BC Cancer Summer Studentship Program and NCI NIH intramural program (ZID BC 010990).

Availability of data and material

ORCID iD

Rachel Zhao

Supplemental Material

Supplemental material for this article is available online.

Appendix 1

References

Morshed

Young

Hervey-Jumper

, et al. The management of low-grade gliomas in adults. J Neurosurg Sci 2019; 63(4): 450–457.

van den Bent

Smits

Kros

, et al. Diffuse infiltrating oligodendroglioma and astrocytoma. J Clin Oncol 2017; 35(21): 2394–2401.

Geurts

van den Bent

, On high-risk, low-grade glioma: what distinguishes high from low? Cancer 2019; 125(2): 174–176.

Lapointe

Perry

Butowski

. Primary brain tumours in adults. Lancet 2018; 392(10145): 432–446.

Franceschi

Mura

Lamberti

, et al. Concordance between RTOG and EORTC prognostic criteria in low-grade gliomas. Future Oncol 2019; 15(22): 2595–2601.

Corso

Bindra

Mehta

. The role of radiation in treating glioblastoma: here to stay. J Neurooncol 2017; 134(3): 479–485.

Curran

Jr Scott

Horton

, et al. Recursive partitioning analysis of prognostic factors in three radiation therapy oncology group malignant glioma trials. J Natl Cancer Inst 1993; 85(9): 704–710.

Mirimanoff

Gorlia

Mason

, et al. Radiotherapy and temozolomide for newly diagnosed glioblastoma: recursive partitioning analysis of the EORTC 26981/22981-NCIC CE3 phase III randomized trial. J Clin Oncol 2006; 24(16): 2563–2569.

Bell

Pugh

McElroy

, et al. Molecular-based recursive partitioning analysis model for glioblastoma in the temozolomide era: a correlative analysis based on NRG oncology RTOG 0525. JAMA Oncol 2017; 3(6): 784–792.

10.

Zhao

Zeng

DeVries

, et al. Optimising management of the elderly patient with glioblastoma – a nomogram and online tool based on BC Cancer Registry real world data. Neurooncol Adv 2021; 4(1): vdac052.

11.

Harrison

Anderson

Cachia

, et al. Clinical trial participation of patients with glioblastoma at the university of Texas MD anderson cancer center. Eur J Cancer 2019; 112: 83–93.

12.

Krauze

Megan

Theresa

, et al. The addition of valproic acid to concurrent radiation therapy and temozolomide improves patient outcome: a correlative analysis of RTOG 0525, SEER and a phase II NCI trial. Cancer Stud Ther 2020; 5(1): vi13.

13.

Peeken

Goldberg

Pyka

, et al. Combining multimodal imaging and treatment features improves machine learning-based prognostic assessment in patients with glioblastoma multiforme. Cancer Med 2019; 8(1): 128–136.

14.

Booth

Williams

Cardoso

, et al. Machine learning and glioma imaging biomarkers. Clin Radiol 2020; 75(1): 20–32.

15.

Davatzikos

Barnholtz-Sloan

Bakas

, et al. AI-based prognostic imaging biomarkers for precision neuro-oncology: the ReSPOND consortium. Neuro Oncol 2020; 22(6): 886–888.

16.

Hsu

Hsieh

, et al. Machine learning-based radiomics for molecular subtyping of gliomas. Clin Cancer Res 2018; 24(18): 4429–4436.

17.

Senders

Staples

Mehrtash

, et al. An online calculator for the prediction of survival in glioblastoma patients using classical statistics and machine learning. Neurosurgery 2020; 86(2): E184–E192.

18.

Han

Kolli

Gransar

, et al. Machine learning based risk prediction model for asymptomatic individuals who underwent coronary artery calcium score: comparison with traditional risk prediction approaches. J Cardiovasc Comput Tomogr 2020; 14(2): 168–176.

19.

Nemati

Ansary

Nemati

, Machine-learning approaches in COVID-19 survival analysis and discharge-time likelihood prediction using clinical data. Patterns (N Y) 2020; 1(5): 100074.

20.

Wang

Reddy

. Machine learning for survival analysis: a survey. ACM Comput Surv 2017; 51.

21.

Moncada-Torres

van Maaren

Hendriks

, et al. Explainable machine learning can outperform cox regression predictions and provide insights in breast cancer survival. Scientific Rep 2021; 11(1): 6968.

22.

Bewick

Cheek

Ball

, Statistics review 12: survival analysis. Crit Care 2004; 8(5): 389–394.

23.

Cox

. Regression modelsand life-tables. J Royal Statis Soc Ser B 1972; 34(2): 187–220.

24.

Prelaj

Boeri

Robuschi

, et al Machine learning using real-world and translational data to improve treatment selection for NSCLC patients treated with immunotherapy. Cancers (Basel) 2022; 14(2): 435.

25.

Tasci

Zhuge

Camphausen

, et al. Bias and class imbalance in oncologic data-towards inclusive and transferrable AI in large scale oncology data sets. Cancers (Basel) 2022; 14(12): 2897.

26.

Zhao

Krauze

. Survival prediction in gliomas: current state and novel approaches. In: Debinski

(ed), Gliomas [Internet]. Brisbane, AU: Exon Publications, 2021.

27.

Mirza

Wang

, et al. Machine learning and integrative analysis of biomedical big data. Genes (Basel) 2019; 10(2): 87.

28.

Van Belle

Pelckmans

Van Huffel

, et al. Support vector methods for survival analysis: a comparison between ranking and regression approaches. Artif Intell Med 2011; 53(2): 107–118.

29.

Mizutani

Magome

Igaki

, et al. Optimization of treatment strategy by using a machine learning model to predict survival time of patients with malignant glioma after radiotherapy. J Radiat Res 2019; 60(6): 818–824.

30.

Ishwaran

Kogalur

Blackstone

, et al. Random survival forests. Annals Appl Statis 2008; 2(3): 841–860.

31.

Chen

Breman

. Using random for learn imbalanced data, 2004.

32.

Paul

Dupont

. Inferring statistically significant features from random forests. Neurocomputing 2015; 150: 471–480.

33.

Pölsterl

Navab

Katouzian

. An efficient training algorithm for kernel survival support vector machines. ArXiv 2016.

34.

Pölsterl

Gupta

Wang

, et al. Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients. F1000Res 2016; 5: 2676.

35.

Pölsterl

Navab

Katouzian

. Fast training of support vector machines for survival analysis. Cham: Springer International Publishing, 2015.

36.

Chen

H-C

Kodell

Cheng

, et al. Assessment of performance of survival prediction models for cancer prognosis. BMC Med Res Methodol 2012; 12(1): 102.

37.

Bellera

MacGrogan

Debled

, et al. Variables with time-varying effects and the cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol 2010; 10: 20.

38.

Mogensen

Ishwaran

Gerds

. Evaluating random forests for survival analysis using prediction error curves. J Stat Softw 2012; 50(11): 1–23.

39.

Tan

Wang

X-C

, et al. Improving survival prediction of high-grade glioma via machine learning techniques based on MRI radiomic, genetic and clinical risk factors. Eur J Radiol 2019; 120: 108609.

40.

Papp

Pötsch

Grahovac

, et al. Glioma survival prediction with combined analysis of in vivo (11) C-MET PET features, ex vivo features, and patient features by supervised machine learning. J Nucl Med 2018; 59(6): 892–899.

41.

Mobadersany

Yousefi

Amgad

, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A 2018; 115(13): E2970–e2979.

42.

Patil

Somasundaram

Waite

, et al. Independently validated sex-specific nomograms for predicting survival in patients with newly diagnosed glioblastoma: NRG oncology RTOG 0525 and 0825. J Neurooncol 2021; 155(3): 363–372.

43.

Molinaro

Hervey-Jumper

Morshed

, et al. Association of maximal extent of resection of contrast-enhanced and non-contrast-enhanced tumor with survival within molecular subgroups of patients with newly diagnosed glioblastoma. JAMA Oncol 2020; 6(4): 495–503.

44.

Zhao

Zeng

DeVries

, et al. Optimizing management of the elderly patient with glioblastoma: survival prediction online tool based on BC cancer registry real-world data. Neurooncol Adv 2022: 4(1): vdac052.

45.

Zhao

Chen

Hao

, et al. A nomogram for predicting individual prognosis of patients with low-grade glioma. World Neurosurg 2019; 130: e605–e612.

46.

Brandel

Alattar

Hirshman

, et al. Survival trends of oligodendroglial tumor patients and associated clinical practice patterns: a SEER-based analysis. J Neurooncol 2017; 133(1): 173–181.

47.

Yang

Yao

Long

, et al. Prognostic nomograms for primary high-grade glioma patients in adult: a retrospective study based on the SEER database. Biomed Res Int 2020; 2020: 1346340.

48.

Hsich

Gorodeski

Blackstone

, et al, Identifying important risk factors for survival in patient with systolic heart failure using random survival forests. Circ Cardiovasc Qual Outcomes 2011; 4(1): 39–45.

49.

Nasejje

Mwambi

Dheda

, et al. A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data. BMC Med Res Methodol 2017; 17(1): 115.

50.

Vock

Wolfson

Bandyopadhyay

, et al. Adapting machine learning techniques to censored time-to-event health record data: a general-purpose approach using inverse probability of censoring weighting. J Biomed Inform 2016; 61: 119–131.

51.

Simon

Subramanian

M-C

, et al. Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data. Brief Bioinform 2011; 12(3): 203–214.

52.

Wang

Won

, et al. Validation and simplification of the radiation therapy oncology group recursive partitioning analysis classification for glioblastoma. Int J Radiat Oncol Biol Phys 2011; 81(3): 623–630.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.

0.00 MB

0.44 MB