Abstract
Study Design
Retrospective study.
Objectives
To develop machine learning (ML) models to predict recurrent lumbar disc herniation (rLDH) following percutaneous endoscopic lumbar discectomy (PELD).
Methods
We retrospectively analyzed 1159 patients who had undergone single-level PELD for lumbar disc herniation (LDH) between July 2014 and December 2019 at our institution. Various preoperative imaging variables and demographic metrics were included in the analysis. Student’s t test and the Chi-squared test were applied for univariate analysis, which served as feature selection for the ML models. We established 6 ML models to predict rLDH: artificial neural networks (ANN), Extreme Gradient Boost classifier (XGBoost), KNeighborsClassifier (KNN), Decision tree classifier (Decision Tree), Random forest classifier (Random Forest), and support vector classifier (SVC).
Results
Of the 1159 patients, 130 (11.22%) were diagnosed with rLDH. The time to recurrence was 10.25 ± 11.05 months. Body mass index (BMI) (P = .027), facet orientation (FO) (P < .001), herniation type (P = .012), Modic changes (P = .004), and disc calcification (P = .013) were significant factors in univariate analysis (P < .05). XGBoost, Random Forest, and ANN showed good areas under the curve (AUC): .9315, .9220, and .8814, respectively.
Conclusion
We developed one deep learning model and 2 ensemble models with good performance in predicting rLDH following PELD. Predicting re-herniation before surgery has the potential to optimize decision-making and meaningfully decrease the rate of rLDH following PELD. Our ML model identified higher BMI, lower FO, Modic changes, disc calcification in a non-protrusive region, and herniation type (noncontained herniation) as significant features for predicting rLDH.
Introduction
Percutaneous endoscopic lumbar discectomy (PELD) is a widely used treatment for lumbar disc herniation (LDH) because of its advantages of less soft tissue damage, few postoperative complications1,2 and favorable long-term results. 3 Despite positive short-term outcomes, however, some patients still require reoperation after PELD. Re-herniation is among the most common reasons for reoperation after PELD, 4 with rates ranging from 0-12.5%. 2 A number of studies have reported risk factors for recurrent lumbar disc herniation (rLDH), such as increased age, body mass index (BMI), more severe disc degeneration, increased sagittal range of motion, higher lumbar lordosis (LL) and sacral slope (SS), the course of disease, and Modic changes.2,5,6 Nevertheless, factors supported by some studies showed no significant impact on rLDH in others,5,7 and contradictory results have also been reported.8,9 As such, the first purpose of this research was to further investigate the risk factors related to rLDH.
Multivariate logistic regression analysis relies on predefined relationships and is therefore not robust to complex relationships between characteristics and outcomes. 10 Machine learning (ML) is better suited than conventional statistical methods to finding meaningful patterns in multidimensional data. 11 In recent years, the number of studies using ML to predict postoperative outcomes of spine diseases has increased remarkably. To our knowledge, few studies have used ML to predict possible rLDH after PELD, and no deep learning algorithm has yet been applied to this task. The second goal of this study was to develop ML models that identify patients at risk of rLDH and then to compare the performance of deep learning, ensemble models, and weak classifiers.
Methods
Patient Population
We retrospectively analyzed 1159 patients who had undergone single-level PELD for LDH between July 2014 and December 2019 at the Department of Spine Surgery, Zhongda Hospital affiliated to Southeast University. The inclusion criteria were: (1) LDH diagnosed as soft protrusion (not caused by disc calcification) according to clinical manifestations and radiological characteristics; (2) deterioration or no remission of symptoms after 6 weeks of conservative treatment. Patients were excluded if they: (1) had undergone previous spinal surgery; (2) failed to complete follow-up; (3) lacked clinical and imaging data. The operations were performed by 2 senior surgeons following standard procedures. We defined rLDH as the recurrence of leg pain, with or without lower back pain, after a pain-free interval of at least 1 month, caused by recurrent disc herniation at the same segment and confirmed by MRI.
Examined Variables
Table. Imaging Variables and Demographic Metrics.
Abbreviations: rLDH, recurrent lumbar disc herniation; BMI, body mass index; PT, pelvic tilt; SS, sacral slope; LL, lumbar lordosis; DHI, disc height index; FO, facet orientation; FT, facet tropism.
Statistical Analysis and ML Development
Univariate analysis was performed using SPSS Statistics software (version 23.0). Student’s t test and the Chi-squared test were used, with significance set at P < .05. For feature selection, factors with P < .1 in univariate analysis were entered into the ML models for identifying rLDH.
Data were normalized with the MinMaxScaler approach of the Sklearn package. The synthetic minority oversampling technique (SMOTE) 19 was used to reduce the effect of class imbalance on the results, an approach that has been confirmed in previous research. The total dataset was randomly divided into a training set and a test set at a ratio of 7:3.
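The three preprocessing steps named above (min-max normalization, SMOTE-style oversampling, and a 7:3 random split) can be illustrated with a minimal, dependency-free sketch. This is not the code used in the study: the study relied on the Sklearn package, whereas the functions `min_max_scale`, `smote_like`, and `split_70_30` below are hypothetical stand-ins written with the standard library only, and the toy values are invented for illustration.

```python
import random

def min_max_scale(rows):
    """Rescale every column to the range [0, 1] (min-max normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic minority rows by interpolating between a row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(0) if rng is None else rng
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

def split_70_30(X, y, rng=None):
    """Shuffle the indices and cut them at 70% for training, 30% for testing."""
    rng = random.Random(0) if rng is None else rng
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = round(0.7 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

# Invented toy rows: [BMI, facet orientation]; label 1 = rLDH, 0 = no rLDH.
X = [[22.0, 50.0], [24.5, 48.0], [26.0, 44.0], [23.0, 52.0], [25.0, 47.0],
     [21.5, 55.0], [27.0, 46.0], [29.0, 38.0], [30.5, 36.0], [28.0, 40.0]]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_scaled = min_max_scale(X)
minority = [row for row, label in zip(X_scaled, y) if label == 1]
new_rows = smote_like(minority, y.count(0) - len(minority))
X_bal = X_scaled + new_rows          # classes are now balanced
y_bal = y + [1] * len(new_rows)
X_train, y_train, X_test, y_test = split_70_30(X_bal, y_bal)
```

The order of the steps follows the order in which the text lists them. A common alternative is to split first and oversample only the training portion, so that the test set contains real patients only; which order was used in the study is not stated beyond the description above.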
Machine learning models were developed with the Sci-Kit Learn package in Python (version 3.7.6): Extreme Gradient Boost classifier (XGBoost), KNeighborsClassifier (KNN), Decision tree classifier (Decision Tree), Random forest classifier (Random Forest), and support vector classifier (SVC). Parameters were tuned by grid search on the training set, evaluated with 10-fold cross-validation. The KNN, Decision Tree, Random Forest, and SVC models were built with the best parameters found by grid search, whereas each parameter of XGBoost was tuned manually. Artificial neural networks (ANN) were built with the Keras package of Tensorflow (version 2.3). In the ANN framework, dense layers used the relu activation function, and dropout layers were added after the dense layers to inhibit overfitting. A gradient descent method was used to reduce the loss of the ANN (Figure 1). The following measures were calculated to evaluate the models’ performance on the test set: receiver operating characteristic (ROC) curve, area under the curve (AUC) score, accuracy score, recall score, F1 score, and precision score.
Figure 1. Inner structure of the ANN model. Dropout layers were added to inhibit overfitting. ANN, artificial neural networks.
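As an illustration of the tuning procedure described in this section, the sketch below runs a grid search over the number of neighbours of a small k-nearest-neighbour classifier and scores each candidate by 10-fold cross-validation. It is a simplified stand-in, not the study's code: the study used the Sci-Kit Learn package, whereas `knn_predict`, `cv_accuracy`, and `grid_search` are hypothetical helpers written with the standard library, and both the candidate grid and the toy data are assumptions.

```python
import random

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def cv_accuracy(X, y, k, n_folds=10):
    """Mean accuracy over n_folds folds; fold f holds out rows f, f + n_folds, ..."""
    scores = []
    for f in range(n_folds):
        held = set(range(f, len(X), n_folds))
        tr_X = [x for i, x in enumerate(X) if i not in held]
        tr_y = [t for i, t in enumerate(y) if i not in held]
        hits = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in held)
        scores.append(hits / len(held))
    return sum(scores) / n_folds

def grid_search(X, y, grid, n_folds=10):
    """Score every candidate in the grid and return the best one with all scores."""
    results = {k: cv_accuracy(X, y, k, n_folds) for k in grid}
    best = max(results, key=results.get)
    return best, results

# Invented toy training set: two noisy clusters in normalized feature space.
rng = random.Random(0)
X_train, y_train = [], []
for _ in range(60):
    label = rng.randint(0, 1)
    centre = 0.7 if label else 0.3
    X_train.append([rng.gauss(centre, 0.1), rng.gauss(centre, 0.1)])
    y_train.append(label)

best_k, cv_scores = grid_search(X_train, y_train, grid=[1, 3, 5, 7])
```

Only the training set enters the search, so the test set stays untouched until the final evaluation. The selected value `best_k` would then be used to refit the model on the full training set.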
Results
Table. Variables between Cases with rLDH and with Non-rLDH in Univariate Analysis.
Abbreviations: rLDH, recurrent lumbar disc herniation; BMI, body mass index; PT, pelvic tilt; SS, sacral slope; LL, lumbar lordosis; DHI, disc height index; FO, facet orientation; FT, facet tropism.
XGBoost, Random Forest, and ANN showed AUC scores of .9315, .9220, and .8814, respectively, whereas KNN, Decision Tree, and SVC had lower AUC scores of .8081, .7571, and .6842, respectively (Figure 2). XGBoost, Random Forest, ANN, and KNN had higher accuracy scores (.8641, .8236, .843, and .8091, respectively) than Decision Tree and SVC (.7654 and .6845). The recall scores of XGBoost, Random Forest, ANN, KNN, Decision Tree, and SVC were .8269, .8462, .891, .9103, .8109, and .7083, respectively. XGBoost, Random Forest, and ANN had good precision scores (.8958, .8123, and .8152), whereas KNN, Decision Tree, and SVC had lower precision scores (.7594, .7463, and .68). XGBoost, Random Forest, ANN, and KNN had higher F1 scores (.86, .8289, .8515, and .828, respectively) than Decision Tree and SVC (.7773 and .6939) (Table 3).
Figure 2. ROC curve of models. Note: Receiver operating characteristic (ROC) curve, Artificial neural networks (ANN), Extreme Gradient Boost classifier (XGBoost), KNeighborsClassifier (KNN), Decision tree classifier (Decision Tree), Random forest classifier (Random Forest), support vector classifier (SVC).
Table 3. Performance of Models. Artificial neural networks (ANN), Extreme Gradient Boost classifier (XGBoost), KNeighborsClassifier (KNN), Decision tree classifier (Decision Tree), Random forest classifier (Random Forest), support vector classifier (SVC).
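For readers less familiar with the measures reported above, the following sketch shows how accuracy, precision, recall, F1, and AUC are computed from true labels and predicted scores. The function names and the toy labels are illustrative assumptions, not the study's code; the 0.5 decision threshold is also an assumption. The last line recomputes the F1 score of XGBoost from its reported precision and recall.

```python
def classification_scores(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels (1 = rLDH)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy test set: true labels and predicted probabilities of rLDH.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in y_score]  # assumed 0.5 threshold

scores = classification_scores(y_true, y_pred)
auc = roc_auc(y_true, y_score)

# Consistency check on the reported XGBoost precision and recall.
xgb_f1 = f1_from(0.8958, 0.8269)
```

On the toy data, 3 of the 4 positive cases and 3 of the 4 negative cases are classified correctly, so accuracy, precision, recall, and F1 all equal 0.75, and the AUC is 14/16 = 0.875. The value of `xgb_f1` rounds to .86, in agreement with the F1 score reported for XGBoost.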
Discussion
Predicting rLDH may help surgeons identify patients at risk, select the surgical approach, and decrease the rate of rLDH, which is the most common reason for reoperation after PELD. The results on the test set showed that XGBoost, Random Forest, and ANN had good predictive capacity. However, there is still a long way to go before ML models of rLDH play a role in clinical practice. We consider the current study a necessary step toward a mature medical assistance tool in future research. We identified higher BMI, lower FO, the presence of Modic changes, and herniation type (noncontained herniation) as significant risk factors for rLDH. Moreover, disc calcification in a non-protrusive region may be a protective factor after PELD.
In recent years, a few models have been applied to extract information from complicated data for predicting rLDH. Jia et al. developed a nomogram, with factors selected by a LASSO regression model, to predict rLDH within 6 months after PELD. 6 A retrospective nonlinear multiple logistic regression prediction model has been built for rLDH after PELD. 20 Harada et al. identified significant risk factors for rLDH after microdiscectomy via an ML approach. 11 This study presents a set of predictive models for rLDH following PELD. To our knowledge, this is the first study to use a deep learning algorithm for this task. Ensemble models such as XGBoost and Random Forest are designed to combine weak classifiers to improve predictive ability. Compared with previous studies, we used more measures to evaluate the models comprehensively, such as F1 score, recall score, and precision score. ANN, XGBoost, and Random Forest had good AUC scores, which indicates that deep learning and ensemble models have similar fitting capacity on data of a low order of magnitude. XGBoost, Random Forest, and ANN also had high F1 scores; the F1 score is the harmonic mean of recall and precision.
Previous studies have identified a few potential risk factors associated with rLDH, including age, BMI, the course of disease, disc degeneration, adjacent-level disc degeneration, sagittal range of motion, DHI, LL, SS, Modic changes, migration grade, and smaller-sized herniated discs.5-7,21 Compared with a recent ML-based study, 11 our research encompasses more radiological characteristics, such as FO, FT, and disc calcification in a non-protrusive region, which may better reflect the population undergoing PELD and enlarge the scope of predictors.
BMI was also a significant factor in this study. Studies have frequently suggested a positive relationship between high BMI and rLDH after PELD,22,23 which is consistent with our previous study. 24 In biomechanical terms, obesity noticeably increases the load on the spine, which leads to disc damage such as perturbation of nutrient supply, cell apoptosis, inflammation, additional innervation, and subsequent degeneration. 25 An increase in pressure within the disc results in a greater burden and thus shear strain on the posterolateral part of the annulus fibrosus, 22 which may significantly influence the biomechanical characteristics and morphology of postoperative discs. 26 Our analysis also identified Modic changes as a risk factor for rLDH, in accordance with reports that rLDH after PELD preferentially occurs when Modic changes or herniated cartilage are present.6,27,28
Li et al. reported that FO and FT play important roles in the development of rLDH after discectomy. 15 A similar result was presented by another retrospective study with more than 5 years of follow-up. 29 When placed in a flexed or imbalanced position, facet joints and intervertebral discs may degenerate faster than their symmetric counterparts. 15 Finite element analysis and contour map visualization confirmed that FT and high sagittal orientation may increase the risk of rLDH by increasing ipsilateral disc pressure. 30 Our research indicates that low FO is a significant risk factor for rLDH, whereas there was no significant correlation between rLDH and FT. This discrepancy may be related to differences between PELD and open discectomy, which the previous studies investigated. Further verification of FO and FT in prospective studies with larger samples is necessary.
Our results showed that patients with rLDH had a higher proportion of noncontained herniation, which includes transligamentous extrusion and sequestered disc herniation. 16 Li et al. suggested that herniation type (transligamentous extrusion) increases recurrence after discectomy. 15 However, another study took the opposite view that subligamentous disc herniation is a risk factor for requiring surgical treatment of a first rLDH. 31 Disc calcification in a non-protrusive region may be a protective factor after PELD. Calcification is an adaptive response to environmental stimuli, and thus the inflammatory response of discs may be weaker during the calcification period. 24
The first limitation of this work is that the data were retrieved from a single institution, and further external validation is needed to verify the generalizability of the models. Second, the retrospective design may limit the strength of the present evidence. Third, the black box problem is a common issue in the application of ML: most AI technologies operate on opaque logic and are hard to understand.
Conclusions
Predicting rLDH after PELD is meaningful for choosing a suitable surgical approach. The study identified high BMI, low FO, Modic changes, and herniation type (noncontained herniation) as significant risk factors for rLDH. Disc calcification in a non-protrusive region may be a protective factor after PELD. Two ensemble models and one deep learning model achieved good predictive performance.
Footnotes
Authors’ Contributions
GuanRui Ren and Lei Liu contributed to the study conception and design. Data collection was performed by PeiYang Wang, Wei Zhang, Hui Wang, MeiJi Shen, LiTing Deng, YuAo Tao, Xi Li, and JiaoDong Wang. Analysis was performed by GuanRui Ren. The first draft of the manuscript was written by GuanRui Ren and Lei Liu. ZhiYang Xie, YunTao Wang, and XiaoTao Wu commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics Approval
The study was approved by the ethics committee of Zhongda Hospital Affiliated to Southeast University (reference number 2021ZDSYLL223-P01).
Consent to Participate
This study met the requirements for exemption from informed consent, and the exemption was approved by the ethics committee of Zhongda Hospital Affiliated to Southeast University.
Availability of Data and Material
The datasets analysed during the current study are available from the corresponding author on reasonable request.
