Abstract
Purpose:
Previous studies have shown conflicting results regarding the factors affecting the clinical outcome after fusion for degenerative spondylolisthesis. However, no study has compared the best and worst clinical outcome groups using patient-reported outcome measures. We aimed to compare the characteristics of patients with best and worst outcomes following single-level lumbar fusion for degenerative spondylolisthesis.
Methods:
A total of 200 patients who underwent single-level interbody fusion with a minimum 2-year follow-up were included. We excluded patients with surgical complications already known to be associated with poor postoperative outcomes, including pseudoarthrosis and postoperative infection. According to their 2-year postoperative Oswestry Disability Index scores, patients were divided into two groups: Best and Worst. Demographic, clinical, and radiographic variables were compared between the two groups.
Results:
Compared with patients in the Best group, those in the Worst group were older (59.5 and 67.0 years, respectively; p = 0.012; odds ratio [OR], 1.143; 95% confidence interval [CI], 1.030–1.269) and had a longer duration of pain from onset (2.6 and 7.2 years, respectively; p = 0.041; OR, 1.021; 95% CI, 1.001–1.041). On receiver operating characteristic analysis, the optimal cutoff value of pain duration from onset was ≥3.5 years. Patients in the Worst group had a lower preoperative angular motion than those in the Best group (8.3° and 12.7°, respectively; p = 0.016; OR, 0.816; 95% CI, 0.691–0.963).
Conclusions:
Patients with degenerative spondylolisthesis who had a good clinical outcome after single-level lumbar interbody fusion were relatively young, had a shorter symptom duration before surgery, and had higher preoperative instability than patients with a poor postoperative clinical outcome. These findings should be considered preoperatively when deciding on an appropriate individual treatment plan.
Introduction
Traditionally, degenerative spondylolisthesis has been treated with interbody fusion surgery. 1–3 In the United States, approximately 83% to 96% of patients with lumbar spinal stenosis with or without degenerative spondylolisthesis who have undergone surgical treatment have been shown to eventually undergo fusion surgery. 4,5
Previous studies have examined the factors associated with poor clinical outcome after lumbar fusion surgery for spinal stenosis with or without spondylolisthesis. 6–10 Age, sex, workers' compensation insurance status, operation time, number of fusion segments, antidepressant medication history, back pain as the predominant symptom, and symptom duration were demographic factors suggested by previous studies to be related to clinical outcome. In addition, multifidus intermuscular adipose tissue, high preoperative pelvic tilt (PT), and postoperative pelvic incidence-lumbar lordosis mismatch (PI-LL mismatch) were suggested as related radiographic factors. However, many of the suggested factors showed conflicting results in previous studies and remain controversial. For instance, a recent retrospective study 7 suggested that age is related to poor clinical outcomes; however, other studies proposed that age is not significantly related. 6,9 Many previous comparative studies found no significant difference between their comparison groups and therefore failed to show statistically significant results for most of the parameters examined.
Some previous studies 11,12 grouped and compared patients according to the presence or absence of specific parameters. In contrast, our study formed groups based on the Oswestry Disability Index (ODI) score 13 (Best [the best clinical outcome group] vs. Worst [the worst clinical outcome group]) rather than on specific parameters. The present study also focused on the extreme tails of the outcome distribution, which could emphasize the distinguishing differences between the two groups. Identifying these characteristics with this new study design might be helpful when considering surgical treatment for degenerative spondylolisthesis. If a factor is modifiable, it may be advisable to correct it preoperatively. Patients with non-modifiable factors should be informed that a poor postoperative outcome may be expected, and treatment other than fusion might be recommended.
The aim of our study was to compare the demographic, clinical, and radiographic characteristics of patient groups defined by the best and worst patient-reported outcome measure (PROM) scores after single-level posterior lumbar interbody fusion for degenerative spondylolisthesis.
Materials and methods
Patient population
After institutional review board approval (SMC-2017-11-125) was received, a retrospective review was performed of consecutively collected data from patients who underwent single-level posterior lumbar fusion surgery for degenerative spondylolisthesis between 2010 and 2016. All operations were performed by two senior surgeons at a single institution. The inclusion criteria were as follows: (1) single-level degenerative spondylolisthesis (Meyerding grade 1) 14 ; (2) single-level lumbar interbody fusion with a minimum 2-year complete follow-up (clinical and radiographic). Degenerative spondylolisthesis was defined as a degree of slip of >3 mm, measured using the Taillard technique, on standing lateral radiography. 15 All the included patients had mechanical back pain for more than 3 months and did not respond to conservative management. The exclusion criteria were as follows: (1) multiple-level degenerative spondylolisthesis; (2) isthmic spondylolisthesis; (3) high-grade lumbar spondylolisthesis; (4) loss to follow-up before 2 years; (5) preoperative definite neurologic deficit, grade IV or greater motor weakness; (6) postoperative surgical complications already known to be associated with poor postoperative outcomes, including pseudoarthrosis, infection, interbody cage subsidence, screw loosening or rod breakage, and adjacent segment pathologies. We aimed to determine which demographic, clinical, and preoperative radiologic characteristics were associated with poor postoperative outcome, barring known surgical factors.
Surgical technique
All the patients underwent single-level posterior lumbar interbody fusion followed by instrumentation. After laminectomy of the lower half of the proximal lamina and upper one-third of the distal lamina, bilateral foraminotomy, total facetectomy, and subtotal discectomy were performed. A bilateral titanium bullet-type cage packed with local laminectomy autograft, cancellous allograft chips, and demineralized bone matrix was inserted via the transforaminal approach. No posterolateral fusion was performed. Our surgical technique is similar to transforaminal lumbar interbody fusion (TLIF), wherein the cage is inserted using the transforaminal approach with slight retraction of the traversing root. However, it differed from minimally invasive TLIF in that a tubular retractor was not used.
Demographic and clinical analysis
Preoperative demographic data were gathered from medical records. Demographic categories included age, sex, body mass index (BMI), bone mineral density (BMD), and fusion level (L3–L4, L4–L5, and L5–S1). The clinical categories included the mean follow-up period, preoperative and last follow-up ODI score, dominant symptom (lower back pain/leg radiating pain), walking distance without cessation, and symptom duration from onset. In addition, we determined the American Society of Anesthesiologists (ASA) score and the presence or absence of underlying diseases, including hypertension, diabetes mellitus, and psychological history. Patients completed the ODI survey preoperatively and postoperatively at 3, 6, and 12 months and then annually.
Radiographic analysis
Data collection was carried out by two independent senior spine surgeons who were not involved in the surgical treatment. Preoperative and postoperative radiographic data were collected consecutively. Inter- and intra-observer reliability was measured by kappa analysis. After 3 weeks, the same two senior spine surgeons measured the radiographic data a second time. Intraclass correlation coefficients (ICCs) were used to evaluate inter- and intra-observer reliability. 16
Two different classification systems 17,18 were used to assess radiologic fusion and to exclude pseudoarthrosis. Only the patients with solid fusion 17,18 were included in this study. Radiographic categories included segmental angle, disk height, degree of slip, pelvic incidence (PI), lumbar lordosis (LL), PT, sacral slope, C7 sagittal vertical axis (C7SVA), and PI-LL mismatch. Sagittal alignment was evaluated using conventional standing and flexion/extension lumbar radiographs and 36-inch lateral standing films of the entire spine and both femoral heads. The segmental angle was measured using the Cobb method between the upper end plate of the proximal fused vertebral body and the lower end plate of the distal fused vertebral body. Disk height was calculated as the average of the anterior and posterior disk heights, and the degree of slip was calculated using the Taillard technique. 15 The C7SVA was measured as the horizontal distance between the C7 plumb line and the posterosuperior corner of the sacrum.
Data and statistical analysis
According to the 2-year postoperative ODI score, patients were divided into two groups. Patients with minimal disability (ODI score ≤20%) and severe disability (ODI score >40%) were assigned to the Best and Worst groups, respectively. In assessing patients by ODI score, the present study followed the traditional method of dividing the patients into several categories. 13,19 Our study focused on the tails at the extremes (Best vs. Worst), and the two patient groups were identified separately. These groups were compared based on demographic, clinical, and radiographic parameters.
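The grouping rule above can be sketched in a few lines. This is only an illustration of the stated thresholds (ODI ≤20% for Best, ODI >40% for Worst); the function name, the constants, and the example scores are hypothetical and are not taken from the study data.

```python
from typing import Optional

BEST_MAX_ODI = 20.0   # minimal disability: ODI <= 20%
WORST_MIN_ODI = 40.0  # severe disability: ODI > 40%


def assign_outcome_group(odi_2yr: float) -> Optional[str]:
    """Return 'Best', 'Worst', or None for a 2-year postoperative ODI score (%)."""
    if not 0.0 <= odi_2yr <= 100.0:
        raise ValueError("ODI score must be between 0 and 100")
    if odi_2yr <= BEST_MAX_ODI:
        return "Best"
    if odi_2yr > WORST_MIN_ODI:
        return "Worst"
    # Intermediate scores (20% < ODI <= 40%) belong to neither tail.
    return None


# Hypothetical scores, for illustration only.
scores = [6, 20, 21, 40, 41, 44]
groups = [assign_outcome_group(s) for s in scores]
```

Patients whose score falls between the two thresholds are left out of the comparison, which is what restricts the analysis to the extreme tails.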
Statistical analysis was performed using SPSS version 18.0 software (SPSS Inc., Chicago, IL). For continuous variables, independent t tests were used to assess differences in mean values between the two groups. For categorical variables, Pearson's chi-square test or Fisher's exact test was used. After univariate logistic analysis, multiple logistic regression analysis was used to adjust for the effects of the multiple covariates predictive of Best in comparison with Worst. Multivariable analyses were performed using all variables that had a significance of <0.05 in the univariate analysis. The preoperative ODI score was also included as an explanatory variable in the multivariable analysis. A p value of <0.05 was considered statistically significant. Receiver operating characteristic (ROC) analysis was used to estimate the best cutoff points for the continuous variables that showed statistically significant results on multivariable analysis. 20 The area under the curve (AUC) was calculated as a measure of accuracy.
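As an illustration of how a cutoff can be read from an ROC analysis, the sketch below picks the threshold that maximizes Youden's J (sensitivity + specificity − 1). The text does not state which optimality criterion was used in SPSS, so Youden's J is an assumption here, and the function name and the example durations are hypothetical.

```python
from typing import List, Tuple


def youden_cutoff(values: List[float], is_worst: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A patient is predicted 'Worst' when value >= cutoff.
    """
    n_pos = sum(is_worst)
    n_neg = len(is_worst) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    best = (float("nan"), 0.0, 0.0)
    best_j = -1.0
    # Every observed value is a candidate threshold.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, w in zip(values, is_worst) if w and v >= cutoff)
        tn = sum(1 for v, w in zip(values, is_worst) if not w and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best_j:  # ties keep the smallest cutoff
            best_j = j
            best = (cutoff, sens, spec)
    return best


# Hypothetical symptom durations (years) and outcome labels, not study data.
durations = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]
worst = [False, False, False, False, True, True, False, True, True, True]
cutoff, sens, spec = youden_cutoff(durations, worst)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can select a different threshold on the same data.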
Results
A total of 200 patients met the inclusion criteria; of these, 72 and 48 were allocated to the Best and Worst groups, respectively (Figure 1). Only 16 patients were lost to follow-up and dropped out before completion of the study, and 200 patients were finally selected. There was no statistically significant difference in preoperative ODI scores between the two groups (Best and Worst: 54.4 ± 36.4 and 52.6 ± 16.1, respectively; p = 0.797). However, 2 years after the operation, a significant difference was found between the two groups (Best and Worst: 6.2 ± 4.5 and 44.4 ± 3.5, respectively; p < 0.001). Compared with the patients in the Best group, those in the Worst group were older and had a longer duration of pain from onset. Patients in the Best group were approximately 7.5 years younger than those in the Worst group (59.5 ± 7.0 vs. 67.0 ± 8.6 years, respectively; p = 0.001). Moreover, the patients in the Worst group had preoperative pain for about 4.5 years longer than those in the Best group (7.2 ± 7.8 vs. 2.6 ± 2.3 years, respectively; p = 0.008).

Figure 1. Best and Worst groups based on 2-year postoperative ODI score. The patients were divided into two groups: those with minimal (ODI score ≤20%) and severe (ODI score >40%) disability were assigned to the Best and Worst groups, respectively. Left and right arrows indicate Best and Worst, respectively. Best, best clinical outcome group. Worst, worst clinical outcome group. ODI, Oswestry disability index.
For the radiographic parameters, patients in the Worst group had a lower preoperative angular motion (Best, 12.7° ± 6.1°; Worst, 8.3° ± 3.8°; p = 0.010). None of the other parameters showed a statistically significant difference, including postoperative PI-LL mismatch (p = 0.213). The baseline characteristics of the Best and Worst groups are shown in Tables 1 and 2.
Table 1. Comparison of demographic and clinical parameters between Best and Worst based on ODI scores after posterior lumbar interbody fusion with single-level degenerative spondylolisthesis.
Bold p values indicate statistical significance.
Best: best clinical outcome group; Worst: worst clinical outcome group; ODI: Oswestry disability index.
Table 2. Comparison of radiographic parameters between Best and Worst based on the ODI after posterior lumbar interbody fusion with single-level degenerative spondylolisthesis.
Bold p values indicate statistical significance.
Best: best clinical outcome group; Worst: worst clinical outcome group; ODI: Oswestry disability index; PI-LL: pelvic incidence minus lumbar lordosis mismatch.
The univariate analysis revealed comparable results for the effect of the parameters between the two groups. Multiple logistic regression analysis was performed to determine which factors were related to Worst compared with Best. Among the demographic and clinical parameters, older age (p = 0.012; odds ratio, 1.143; 95% confidence interval [CI], 1.030–1.269) and longer pain duration (p = 0.041; odds ratio, 1.021; 95% CI, 1.001–1.041) were statistically significant factors. Among the preoperative radiographic parameters, smaller preoperative angular motion (p = 0.016; odds ratio, 0.816; 95% CI, 0.691–0.963) was the only statistically significant parameter associated with the Worst group. Among the parameters, age was the most strongly correlated with the Worst group (Table 3).
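The odds ratios above are per-unit effects (per year of age, per degree of angular motion). As a rough reading aid, the sketch below converts a reported odds ratio and its 95% CI back to a log-odds coefficient and approximate standard error, and scales the odds ratio to a larger difference. It assumes the usual log-linear logistic model and a Wald-type interval (z = 1.96); the function names are hypothetical, and the 7.5-year and 4.4° differences are simply the gaps between the reported group means.

```python
import math
from typing import Tuple


def log_odds_summary(or_value: float, ci_low: float, ci_high: float,
                     z: float = 1.96) -> Tuple[float, float]:
    """Recover the logistic coefficient and approximate SE from an OR and its CI."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_for_difference(or_per_unit: float, units: float) -> float:
    """Scale a per-unit odds ratio to a difference of `units` (log-linear assumption)."""
    return math.exp(math.log(or_per_unit) * units)


# Reported values: age OR 1.143 (95% CI 1.030-1.269); angular motion OR 0.816.
beta_age, se_age = log_odds_summary(1.143, 1.030, 1.269)
or_age_7_5_years = odds_ratio_for_difference(1.143, 7.5)
or_motion_4_4_deg = odds_ratio_for_difference(0.816, 4.4)
```

Under these assumptions, a 7.5-year age difference corresponds to an odds ratio of roughly 2.7 for belonging to the Worst group, and 4.4° more angular motion to an odds ratio of roughly 0.4. These are back-of-the-envelope figures, not results reported by the study.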
Table 3. Univariate and multivariate logistic regression analyses of factors distinguishing between Best and Worst on the basis of ODI score after posterior lumbar interbody fusion with single-level degenerative spondylolisthesis.
Bold p values indicate statistical significance.
Best: best clinical outcome group; Worst: worst clinical outcome group; ODI: Oswestry disability index; CI: confidence interval.
Given that symptom duration from onset showed a statistically significant intergroup difference, a sub-analysis was performed to determine its cutoff value. On ROC analysis, the optimal cutoff value of pain duration from onset for poor clinical outcome was ≥3.5 years, with 62.9% sensitivity and 72.4% specificity (AUC, 0.684; 95% CI, 0.559–0.809; p = 0.008) (Figure 2).

Figure 2. ROC curve for pain duration from onset. The optimal cutoff for maximum sensitivity and specificity was ≥3.5 years (AUC, 0.684; 95% CI, 0.559–0.809; p = 0.008). ROC, receiver operating characteristic. AUC, area under the curve. CI, confidence interval.
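To put the reported sensitivity (62.9%) and specificity (72.4%) at the ≥3.5-year cutoff in context, the following sketch derives Youden's J and the likelihood ratios from those two numbers. These derived quantities are not reported in the study; the function name is hypothetical and the calculation is plain arithmetic on the published values.

```python
from typing import Dict


def diagnostic_summary(sensitivity: float, specificity: float) -> Dict[str, float]:
    """Derive Youden's J and likelihood ratios from sensitivity and specificity."""
    if not (0.0 < sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    return {
        "youden_j": sensitivity + specificity - 1.0,
        # How much more likely a long duration is in Worst than in Best patients.
        "lr_positive": sensitivity / (1.0 - specificity),
        # How much less likely a short duration is in Worst than in Best patients.
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Sensitivity and specificity reported for pain duration >= 3.5 years.
summary = diagnostic_summary(0.629, 0.724)
```

A positive likelihood ratio of about 2.3 and an AUC of 0.684 both suggest that symptom duration alone discriminates only modestly between the two groups.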
Reliability test of radiographic parameters
All ICC values for both the inter-rater and intra-rater reliability of the radiographic parameters showed excellent agreement. The intra-rater and inter-rater reliability were classified as excellent (intra-rater: average ICC = 0.94; range, 0.76–1.0; inter-rater: average ICC = 0.92; range, 0.70–1.0). The reliability was good to excellent for the two classification systems 17,18 used to assess radiologic fusion and to exclude pseudoarthrosis (intra-rater: average ICC = 0.88; range, 0.74–1.0; inter-rater: average ICC = 0.96; range, 0.80–1.0).
Discussion
Barring known surgical factors such as infection, pseudoarthrosis, or mechanical failure, we aimed to compare the demographic, clinical, and radiographic characteristics of patient groups defined by the best and worst PROM scores after single-level posterior lumbar interbody fusion for degenerative spondylolisthesis. In our study, the range of postoperative ODI scores for the 200 patients was remarkably broad, from 0 (no disability and no treatment indicated) to 50 (pain affects daily activities and remains the main problem, requiring a detailed investigation). Our study focused on the extreme tails and compared Best and Worst based on the 2-year postoperative ODI score (patients with minimal disability [ODI score ≤20%] in the Best group and patients with severe disability [ODI score >40%] in the Worst group). This study was designed to emphasize the distinguishing differences between the two groups.
In our study, among the demographic and clinical characteristics, older age and longer duration of pain were statistically significant factors associated with poor clinical outcomes. Previous studies have reported conflicting results for numerous factors related to poor clinical outcomes. A retrospective study of 239 patients with degenerative spondylolisthesis who underwent posterior lumbar fusion revealed that good clinical outcome correlated with shorter symptom duration, and not with age, sex, the degree of listhesis, or operation time. 10 However, the study population was relatively young (mean, 44.7 years), and the results might have differed if the study had included older female patients. Contrary to the results of the present study, Fleming et al. 21 suggested that symptoms lasting up to 2 years prior to surgery might not be a useful predictor of postoperative disability scores. The patients were grouped into those with a preoperative symptom duration of <1 year, 1–2 years, or >2 years. No patient-reported outcome differed among the patient cohorts. However, their study did not take into account patients with surgical complications such as infection or nonunion. Compared with the present study, their study included patients with relatively shorter symptom durations. In our study, the average symptom durations of the Best and Worst groups were 2.6 and 7.2 years, respectively, and the cutoff value of pain duration from onset was 3.5 years. In addition, the minimum follow-up period in that study was only 6 months, compared with at least 2 years in the present study, and isthmic and degenerative spondylolistheses were included together. We hypothesized that many previous comparative studies failed to show a statistically significant difference between comparison groups, which resulted in conflicting outcomes. Badhiwala et al. 22 recently reported a propensity score-matched analysis of the effect of older age on perioperative outcomes in patients with degenerative spondylolisthesis. The study cohort consisted of 2238 patients. Propensity score matching was balanced for gender, race, underlying disease, the type of fusion, and the number of fusion levels. The authors suggested that fusion surgery might be performed safely in older patients. However, these previous studies did not consider the stability of degenerative spondylolisthesis.
In a study of 364 patients with degenerative spondylolisthesis or spinal stenosis who underwent fusion, Pearson et al. 23 suggested that patients with predominant leg pain had a more significant postoperative improvement than those with predominant lower back pain. However, the study by Pearson et al. excluded unstable degenerative spondylolisthesis and did not report on surgical complications, including infection and pseudoarthrosis. In a study of 7618 patients with lumbar degenerative disease, McGirt et al. 24 reported that employment status, the baseline clinical score, psychological distress, education level, workers' compensation status, symptom duration, race, the ASA score, age, predominant symptom, smoking status, and insurance status were associated with postoperative outcome, but not gender, BMI, or BMD. In the present study, only a small number of patients had been diagnosed with psychological disease, including only 3 patients (6.3%) in the Worst group. Many patients who suffered from psychological distress might not have been diagnosed with psychological disease or might not have wanted to reveal their psychological history. Moreover, the morbidity of the patients in the previous study was worse than that in our study. We assumed that the differences in results were due to differences in ASA scores (only three patients [2.5%] in our study had an ASA score of 3, whereas approximately 37% [2826 patients] had an ASA score of 3 in the previous study). Ondeck et al. 25 compared the ASA score, the modified Charlson Comorbidity Index, and the modified Frailty Index, and found that the ASA score and age had overall similar or better discriminative abilities for perioperative adverse outcomes following posterior lumbar fusion. However, that study used adverse events and not PROMs. Furthermore, a previous study suggested that BMI and gender were not significantly associated with outcome, which was similar to our findings (p = 0.499 for BMI and p = 0.124 for sex).
Studies have revealed conflicting results pertaining to the efficacy of decompression and fusion compared with that of decompression alone for the treatment of symptomatic degenerative spondylolisthesis. 26,27 These results raised the question of whether all patients with degenerative spondylolisthesis require fusion. 28,29 In particular, it has been suggested that fusion is necessary in cases wherein instability is likely to occur due to wide decompression in degenerative spondylolisthesis. 27 The present study showed that patients with a poor postoperative clinical outcome had relatively low preoperative instability (segmental angular motion) compared with patients with a good clinical outcome (p = 0.016, Table 3).
Other radiographic parameters, including PT, PI-LL mismatch, and C7SVA, showed no statistically significant results. These findings differ from those of previous studies. One hundred patients with degenerative spondylolisthesis who underwent short-level posterior lumbar interbody fusion were evaluated using PROMs. The patients were divided according to the efficacy of the surgery by comparing preoperative and 2-year postoperative PROM scores. Makino et al. suggested that older age and spinopelvic malalignment, including high preoperative PT and a postoperative increase in PI-LL mismatch, were related to poor clinical outcome. 7 However, that study included patients with surgical complications, including adjacent segment disease and nonunion. Additionally, PI-LL mismatch correction was ineffective because 43 of 48 patients retained the PI-LL mismatch postoperatively. In a retrospective study, Kim et al. suggested that patients with degenerative spondylolisthesis who showed PT improvement following posterior lumbar interbody fusion had better clinical outcomes than those who did not show improvement in PT. 11 However, of 220 patients reviewed, only 18 patients were included. The study indicated that patients with nonunion were excluded; however, it made no mention of other surgical complications. Aoki et al. compared two groups on the basis of postoperative PI-LL mismatch and reported that lower back pain while standing was most strongly related to PI-LL mismatch. The study suggested that surgeons should pay attention to sagittal spinopelvic alignment and avoid postoperative PI-LL mismatch even when performing short-segment lumbar interbody fusion. 12 In the present study, the average postoperative PI-LL mismatches of patients in the Best and Worst groups were only 1.1° and 4.7°, respectively. In contrast, approximately 60% of the patients in the previous study had PI-LL mismatch.
Our study had some limitations. First, it was retrospective and had a relatively small sample size, although the data used were collected consecutively. Second, the study design, which focused on the tails of the outcome distribution, did not adequately represent all the analyzed patients. Third, the 2-year postoperative ODI scores cannot represent all aspects of the patients' clinical outcomes. Fourth, there could be selection bias, as the present study ruled out surgical factors that are well known to be associated with poor postoperative clinical outcomes, such as infection, interbody cage subsidence, screw loosening or rod breakage, adjacent segment pathologies, and pseudoarthrosis. However, the present study addressed this limitation with detailed inclusion and exclusion criteria. Additionally, only 16 patients were lost to follow-up in the study. Fifth, as the present study only included the early (<2 years) postoperative period, no additional analysis was conducted on adjacent segment disease. Lastly, the present study did not consider the baseline functional status of the patients, such as preoperative disability or pain.
Conclusion
After ruling out surgical complications already known to be associated with poor postoperative outcome, such as pseudoarthrosis and postoperative infection, the present study compared the demographic, clinical, and radiographic characteristics of the Best and Worst patient groups after single-level lumbar fusion for degenerative spondylolisthesis. We conclude that patients with a good postoperative clinical outcome were relatively young, had a shorter symptom duration before surgery, and had higher preoperative instability than patients with a poor clinical outcome. These findings should be considered preoperatively when deciding on the appropriate treatment plan for each individual.
Footnotes
Author contributions
Se-Jun Park conceptualized and designed the work and drafted the manuscript. Keun-Ho Lee drafted the manuscript and analyzed the data set. Chong-Suh Lee interpreted the data. Ki-Tack Kim contributed to the conception or design of the work. All authors read and approved the final manuscript.
Availability of data and materials
The datasets are available on request due to privacy or other restrictions.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethics approval
Ethics approval and consent to participate; the need for approval was waived by IRB (SMC-2017-11-125).
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
