Abstract
Study Design
Retrospective Cohort Study.
Objectives
Lumbar spinal fusion is increasingly utilized in the United States, a trend attributed to an aging population with an increasing prevalence of degenerative spinal disease. Primary approaches include anterior lumbar interbody fusion (ALIF) and posterior or transforaminal lumbar interbody fusion (PLIF/TLIF), for which indications are preference driven. This study’s objectives were to compare the clinical, radiographic, and functional outcomes of these approaches for lumbar spondylolisthesis.
Methods
We conducted a retrospective chart review of 1156 (267 ALIF; 889 PLIF/TLIF) patients who underwent ALIF or PLIF/TLIF due to spondylolisthesis at the L4-5, L5-S1, or L4-S1 levels at a large tertiary referral center from 2010 through 2018. Univariate and multivariable linear and logistic regressions were used to compare outcomes including radiologic fusion, change in lumbar lordosis, postoperative complications, and pre- to postoperative changes in quality-of-life (QoL) variables.
Results
Between propensity-weighted cohorts, multivariable regression showed markedly increased odds of radiographic fusion for the ALIF cohort (Odds Ratio [OR]: 2.50, 95% Confidence Interval [CI]: 1.51-4.15). ALIF patients had shorter lengths of stay and less blood loss. However, there was no difference in the odds of reoperation within 1 year or of any complication. There was also no significant group difference in change in Cobb angle or in any of the QoL variables.
Conclusions
These results indicate that ALIF may have greater radiologic success of fusion. However, when controlling for confounders, there is no difference in clinical outcomes between the approaches. Further research should evaluate the long-term cost-effectiveness of the 2 procedures.
Introduction
Lumbar spinal fusion is increasingly utilized in the United States (US), a trend attributed to an aging population with an increasing prevalence of degenerative spinal disease.1,2 Degenerative lumbar spondylolisthesis is a pathologic condition of the lumbar spine characterized by the displacement of one superior vertebra over the adjacent caudal vertebra with an intact posterior arch. 3 The prevalence of degenerative lumbar spondylolisthesis has been reported to range from 6% to greater than 20%, and it is more commonly diagnosed in patients older than 50. 3 Consequently, the procedural volume and instrumentation cost place a substantial burden on the US healthcare system, with Medicare spending $778 million on lumbar spine surgeries in 2003. 4 In 2018, spine fusion was the most costly principal operating room procedure, with stays totaling $14.1 billion in aggregate costs. 5 As the prevalence of lumbar fusion surgeries has continued to rise, 4 it is critical to improve our understanding of the outcomes and complications related to these procedures.
The primary approaches for lumbar spine fusion include anterior lumbar interbody fusion (ALIF) and posterior or transforaminal lumbar interbody fusion (PLIF/TLIF). ALIF is the more expensive procedure as it may necessitate longer lengths of stay and vascular surgery involvement. 6 While both approaches have been studied extensively, there are no clear indications for which approach a surgeon should utilize and so the decision is often a result of surgeon and patient preference.7–13 For instance, in patients with extensive abdominal scarring or a significantly elevated body mass index (BMI), the posterior approach is often preferred. 14 Conversely, for those with a history of posterior cerebrospinal fluid (CSF) leaks following a microdiscectomy, the anterior approach may be more appropriate. 14 From a radiographical and complications perspective, studies have shown ALIF to be associated with greater restoration of disc height, segmental lordosis, whole lumbar lordosis, and lower odds of dural injury.15,16 Additionally, ALIF was shown to deliver better improvement in patient-reported outcomes (PROs). 16 However, ALIF required longer hospitalizations and was associated with higher odds of blood vessel injuries, ileus, wound infection, deep vein thrombosis (DVT), and 30-day readmission compared to PLIF/TLIF.15,17 These factors all significantly increased the 30- and 90-day costs of ALIF compared to PLIF/TLIF. 17 Nonetheless, these reported results generally fail to control for pre-operative and intraoperative factors which may influence indications and outcomes of ALIF vs PLIF/TLIF in patients with spondylolisthesis. Furthermore, ALIF procedures can be further subdivided into those with and without posterior fixation. 
Stand-alone constructs have been shown to have higher rates of radiographic nonunion than those including posterior instrumentation in patients with spondylolisthesis.18,19 Still, data regarding clinical outcomes and PROs for these procedural subgroups remain limited.
Therefore, we aimed to assess differences in (1) radiographic outcomes and (2) clinical outcomes and PROs between patients receiving ALIF vs PLIF/TLIF after controlling for patient presentations. Secondarily, we compared radiographic outcomes, clinical outcomes, and PROs for ALIF patients stratified by posterior fixation.
Methods
Study Population
Institutional review board (IRB) approval was obtained for this retrospective chart review of patients who underwent ALIF or PLIF/TLIF at the L4-5, L5-S1, or L4-S1 levels due to spondylolisthesis at a large tertiary referral center from 2010 through 2018. Inclusion criteria were age 18 or greater at the time of surgery, surgery for spondylolisthesis with or without spinal stenosis/radiculopathy, and a minimum of 1-year follow-up. Patients younger than 18 years of age at the time of surgery and patients with spinal malignancy or infection were excluded to minimize potential confounders.
Data Collection
The electronic medical record was queried for patient clinical information. The following preoperative variables were collected: age at surgery, date of birth, sex, race, driving time to the hospital, BMI, diabetes, tobacco use, alcohol use, osteoporosis, osteopenia, and surgical indications. The following intraoperative variables were collected: levels of fusion, operative time, blood loss, whether the surgery was a revision, use of allograft or autograft, use of bone morphogenetic protein 2 (BMP-2), epidural placement, and posterior fixation for patients undergoing ALIF. The following postoperative outcomes were collected for the initial inpatient stay: length of stay (LOS), discharge disposition, ileus, durotomy, surgical site infection, DVT, urinary tract infection (UTI), pneumonia, sepsis, blood transfusion, intubation days, and surgical intensive care unit (SICU) or medical intensive care unit (MICU) stay. Additionally, revision surgery within 1 year, 90-day readmission, and 90-day emergency department (ED) visits were collected.
Imaging analysis was conducted to determine lumbar lordosis and bony fusion based on postoperative X-rays collected up to 2 years following surgery. Radiographic analysis was completed on the Impax and XERO viewing platforms (Agfa, Mortsel, Belgium) independently by 2 neuroradiologists. Cobb angles for lumbar lordosis were measured from L1 to L5 with the inbuilt angle measurement tool. Fusion was graded according to the Brantigan, Steffee, Fraser (BSF) scale 20 as follows: BSF-1 describes radiographic pseudarthrosis; BSF-2 represents radiographic locked pseudarthrosis, where there is fusion centrally into the cage (possible fusion); and BSF-3 describes radiographic fusion (definite fusion). An additional category, unable to determine, was also recorded. When the 2 ratings disagreed, we coded success of fusion as the lower of the 2 ratings for analysis.
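The rule for resolving disagreement between the 2 readers can be illustrated with a minimal Python sketch. The function name `consensus_bsf`, the integer coding of the BSF grades, and the handling of the "unable to determine" category (represented here as `None`) are assumptions made for illustration only; the study does not describe how undetermined ratings were combined, and its analyses were performed in R.

```python
from typing import Optional

# BSF grades as integers: 1 = not fused, 2 = possibly fused, 3 = definitely fused.
# None stands in for the "unable to determine" category (an assumption of this sketch).


def consensus_bsf(rating_a: Optional[int], rating_b: Optional[int]) -> Optional[int]:
    """Return the lower (more conservative) of two BSF grades.

    Assumption: if either reader could not determine fusion, the consensus is
    left undetermined (None). The source does not state how such cases were handled.
    """
    if rating_a is None or rating_b is None:
        return None
    for rating in (rating_a, rating_b):
        if rating not in (1, 2, 3):
            raise ValueError(f"BSF grade must be 1, 2, or 3, got {rating!r}")
    return min(rating_a, rating_b)


# Hypothetical paired readings for three patients (not study data).
paired_readings = [(3, 3), (3, 2), (1, None)]
consensus = [consensus_bsf(a, b) for a, b in paired_readings]
```

Taking the minimum biases the consensus toward "not fused", which is the conservative direction when fusion success is the outcome of interest.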
For quality-of-life (QoL) variables, the Knowledge Program, an institutional database of patient-completed surveys, was queried for pre- and postoperative measurements. The following QoL and psychological status variables were collected: Patient Health Questionnaire 9 (PHQ-9), Perceived Deficits Questionnaire (PDQ), Patient-Reported Outcomes Measurement Information System (PROMIS), Oswestry Disability Index (ODI), and EuroQol-5D (EQ-5D).
Data Analysis
All statistical analyses were performed in R, version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria). All tests were two-sided, and significance was set at P < 0.05. Propensity score weighting was utilized to balance preoperative and intraoperative variables between the ALIF and PLIF/TLIF groups. Preoperative covariates included age, sex, race, drive time to hospital, diabetes, BMI, tobacco use, osteoporosis, osteopenia, prior spinal diagnoses (lumbar stenosis, foraminal stenosis, spondylolisthesis, degenerative disc disease, pseudarthrosis/hardware failure, and adjacent segment disease), other prior surgeries, pre-surgery Cobb angle, and pre-operative PRO scores (PDQ, PHQ-9, EQ-5D, PROMIS-GH Mental T-score, and PROMIS-GH Physical T-score). Intraoperative covariates included level of analyzed surgery (L4-L5, L4-S1, L5-S1), operative time, revision surgery, BMP-2 use, and epidural use. A generalized boosted model (GBM) within the twang package in R was utilized to estimate propensity scores and weight observations accordingly. 21 In the present study, the GBM estimated the probability of belonging to the ALIF group. This machine learning method can capture relationships between the dependent variable and covariates, even those that are nonlinear, and automatically handles missing data. 21 Covariate balance was assessed by computing group means and standard deviations for continuous variables and proportions for categorical variables. Covariates with an absolute standardized bias of less than 0.2 were considered well-balanced.
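The balance check and the effective sample size reported alongside the weighted cohorts can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the analyses were run with the twang package in R, the function names are hypothetical, the ages and weights are made up, and the denominator used here (the unweighted pooled standard deviation) is one common choice that may differ from the one twang applies for a given estimand. The effective sample size uses the Kish approximation.

```python
from math import sqrt
from statistics import variance


def weighted_mean(values, weights):
    """Weighted arithmetic mean of a covariate."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def abs_standardized_bias(x_treated, w_treated, x_control, w_control):
    """Absolute difference in weighted means, scaled by a pooled SD.

    Assumption: the pooled SD is computed from the unweighted samples.
    """
    diff = weighted_mean(x_treated, w_treated) - weighted_mean(x_control, w_control)
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return abs(diff) / pooled_sd


# Made-up ages and propensity weights for illustration only (not study data).
age_alif = [52, 58, 61, 49, 66]
w_alif = [1.0, 1.2, 0.8, 1.5, 0.9]
age_plif = [63, 59, 70, 55, 67, 61]
w_plif = [0.7, 1.1, 0.5, 1.6, 0.6, 1.0]

bias = abs_standardized_bias(age_alif, w_alif, age_plif, w_plif)
well_balanced = bias < 0.2  # threshold stated in the Methods
n_eff_alif = effective_sample_size(w_alif)
```

With equal weights the effective sample size equals the number of patients; the more unequal the weights, the smaller it becomes, which is why it is reported next to the weighted cohort characteristics.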
Univariate analyses (Fisher’s exact test for categorical variables and Mann-Whitney U tests for continuous variables) were conducted to compare postoperative outcomes between the ALIF and PLIF/TLIF groups. Postoperative outcomes were also compared between ALIF patients who had received posterior fixation and those who had not. Additionally, multivariable linear and logistic regression models were fit using the survey package in R. 22 Outcomes of interest were the dependent variables in these regression models, with surgical approach (ALIF vs PLIF/TLIF) as the independent variable. Unbalanced covariates and any covariates related to the outcome at the univariate level were included in the models to obtain doubly robust estimates of group differences.
Logistic regression was used for the following outcomes: revision surgery within 1 year, 90-day readmission, 90-day ED visit, discharge disposition (home vs other), blood loss, and other complications (yes vs no). The other complications included ileus, durotomy, surgical site infection, DVT, UTI, pneumonia, sepsis, blood transfusion, intubation days, SICU stay, or MICU stay. Linear regression with logarithmic transformation was used for length of stay, total blood loss, change in lumbar lordosis, and change in patient-reported QoL variables.
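When an outcome is log-transformed, the coefficient on the group indicator is read as a ratio of geometric means, so it is usually reported as a percent difference. The Python sketch below shows that back-transformation. It assumes a natural logarithm and assumes that percent differences such as the 19% shorter length of stay reported later in this article come from this kind of conversion; the function names are hypothetical, and the coefficients are back-calculated for illustration rather than taken from the study's tables.

```python
from math import exp, log


def percent_change_from_log_beta(beta: float) -> float:
    """Convert a coefficient on a natural-log outcome to a percent change."""
    return (exp(beta) - 1.0) * 100.0


def log_beta_from_percent_change(percent: float) -> float:
    """Inverse conversion: percent change to a log-scale coefficient."""
    if percent <= -100.0:
        raise ValueError("a decrease of 100% or more has no finite log-scale coefficient")
    return log(1.0 + percent / 100.0)


# Back-calculated, illustrative coefficients (assumptions, not reported values).
beta_los = log_beta_from_percent_change(-19.0)         # 19% shorter length of stay
beta_blood_loss = log_beta_from_percent_change(-57.0)  # 57% less blood loss
```

The conversion is multiplicative, so a negative coefficient of a given size corresponds to a smaller percent decrease than the percent increase implied by a positive coefficient of the same size.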
Secondarily, the radiographic outcomes, clinical outcomes, and PROs between ALIF with and without posterior fixation were compared using Fisher’s exact test for categorical variables and two-sample t-tests or Mann-Whitney U tests for continuous variables.
Results
Patient Characteristics Before and After Propensity Score Weighting.
Neff – effective sample size after applying propensity score weights, SD – standard deviation.
Radiographic Outcomes for ALIF vs PLIF/TLIF
Counts of Fusion Success Ratings by Two Radiologists for Patients Who Were Rated by Both Radiologists.
Fusion was graded according to the Brantigan, Steffee, Fraser (BSF) scale as follows: BSF-1 = radiographic pseudarthrosis (“Not Fused”), BSF-2 = fusion centrally into cage (“Possibly Fused”), and BSF-3 = radiographic fusion (“Definitely Fused”).
Unadjusted Frequency of Patients’ Post-surgery Fusion Status and Change in Cobb Angle.
P < 0.05 in bold. Fusion was graded according to the Brantigan, Steffee, Fraser (BSF) scale as follows: BSF-1 = radiographic pseudarthrosis (“Not Fused”), BSF-2 = fusion centrally into cage (“Possibly Fused”), and BSF-3 = radiographic fusion (“Definitely Fused”).
Results of Multivariable Logistic and Linear Regression Models Incorporating Propensity Score Weights for Radiologic Outcomes.
aOdds ratio for logistic regression models and beta estimate for linear regression models. Linear regression models in italics. P < 0.05 in bold. Fusion was graded according to the Brantigan, Steffee, Fraser (BSF) scale as follows: BSF-1 = radiographic pseudarthrosis (“Not Fused”), BSF-2 = fusion centrally into cage (“Possibly Fused”), and BSF-3 = radiographic fusion (“Definitely Fused”).
Clinical and Patient-Reported Outcomes for ALIF vs PLIF/TLIF
Unadjusted Comparison of Clinical and Patient Reported Outcomes.
aConsists of at least one of ileus, durotomy, surgical site infection, DVT, UTI, pneumonia, sepsis, transfusion, SICU, MICU, ICU bounce back, or intubation. P < 0.05 in bold.
Results of Multivariable Logistic and Linear Regression Models for Clinical and Patient Reported Outcomes.
aOdds ratio for logistic regression models and beta estimate for linear regression models. Linear regression models in italics.
bConsists of at least one of ileus, durotomy, surgical site infection, DVT, UTI, pneumonia, sepsis, transfusion, SICU, MICU, ICU bounce back, or intubation. P < 0.05 in bold.
Radiographic, Clinical, and Patient Reported Outcomes for ALIF with vs without Posterior Fixation
Unadjusted Frequency (Percentage) of ALIF Patients’ Post-surgery Fusion Status, Stratified by Posterior Fixation.
P < 0.05 in bold. Fusion was graded according to the Brantigan, Steffee, Fraser (BSF) scale as follows: BSF-1 = radiographic pseudarthrosis (“Not Fused”), BSF-2 = fusion centrally into cage (“Possibly Fused”), and BSF-3 = radiographic fusion (“Definitely Fused”).
Unadjusted Comparison of Clinical and Patient Reported Outcomes for ALIF Patients, Stratified by Posterior Fixation.
aConsists of at least one of ileus, durotomy, surgical site infection, DVT, UTI, pneumonia, sepsis, transfusion, SICU, MICU, ICU bounce back, or intubation.
Discussion
Lumbar spine surgery for degenerative disease is an increasingly prevalent procedure within the US. Due to the volume of surgery and the costs of instrumentation, it is critical to understand the varying outcomes of different approaches. The present analysis compared the radiographic, clinical, and patient-reported outcomes following ALIF (anterior approach) and PLIF/TLIF (posterior approach). There was significant covariate imbalance despite the use of highly robust propensity score weighting algorithms, suggesting varying patient selection and intraoperative factors between procedures. Additionally, while rates of fusion were greater in the ALIF group, there were no differences in terms of complications, reoperations, readmissions, changes in Cobb angle, or changes in PROs between the ALIF and PLIF/TLIF groups. There were, however, large differences in length of stay and blood loss. Together, our results suggest that ALIF is associated with greater rates of radiographic fusion, but similar clinical and functional outcomes.
Our propensity score weighting model demonstrated covariate imbalance following implementation. For one, diabetes remained less likely in the ALIF group. Diabetes may have served as a surrogate for iliac artery calcification, as diabetics have been shown to have greater development of atherosclerosis compared to non-diabetics. 23 This is of note as surgeons often consider significant iliac artery calcification or atherosclerotic disease to be an exclusion criterion for ALIF, and diabetics have been shown to have lower rates of fusion in comparison to matched controls.24,25 Additionally, a greater proportion of patients received L4-L5 surgery with PLIF/TLIF while a greater proportion of patients received L5-S1 surgery with ALIF. The interspaces of the lumbar spine have been associated with different rates of fusion due to biomechanical differences. 26 A greater proportion of patients receiving PLIF/TLIF also had lumbar stenosis, which may have been significant as the surgeon may have opted to perform a concomitant laminectomy. As such, the choice to conduct ALIF vs PLIF/TLIF for lumbar spondylolisthesis may be multifactorial, considering both patient comorbidities and presentation.14,27 Furthermore, the greater use of BMP-2 in the ALIF group can readily be explained by the fact that single-level ALIF is a Food and Drug Administration-approved indication for BMP-2 usage due to strong evidence of improved rates of fusion, while PLIF and TLIF are off-label uses with lower quality of evidence for improved fusion rates. 28 A meta-analysis by Hofstetter et al. found that the use of BMP in ALIF resulted in an increase in fusion rates from 79.1% (95% CI: 57.6%-91.3%) in the control group to 96.9% (95% CI: 92.3%-98.8%) in the BMP-treated group (P < 0.01). 29 However, the use of BMP in PLIF/TLIF had a minimal effect on fusion rates, with the BMP group achieving 95.0% (95% CI: 92.8%-96.5%) compared to 93.0% (95% CI: 78.1%-98.0%) in the control group. 29 Similar findings were reported in a meta-analysis by Lytle et al. that found fusion rates did not differ significantly between different doses of BMP for posterior interbody fusions. 30
Radiographic Outcomes for ALIF vs PLIF/TLIF
Markedly higher rates of fusion were observed for patients who received ALIF when compared to those receiving PLIF/TLIF for spondylolisthesis in our study. This may be partially attributed to the greater use of BMP-2 in the ALIF patients. In contrast, a meta-analysis by Phan et al showed no difference in fusion rates between ALIF and TLIF (88.6% vs 91.9%; P = 0.23). 15 The studies included in this meta-analysis had heterogeneous definitions of successful fusion, and we believe our results are the first to describe the radiographic success of fusion of ALIF in comparison to PLIF/TLIF. When assessing fusion success radiographically using the BSF scale, we report ALIF having greater than twice the odds of radiographic fusion even after balancing covariates. This is akin to the greater restoration of disc height for ALIF in comparison to PLIF/TLIF reported by Tye et al 16 (3.5 vs 6.7 mm, P = 0.01) and Lightsey et al 31 (8.7 vs 3.6 mm, P < 0.001). Both studies and Phan et al also reported a greater improvement of segmental lordosis for ALIF in comparison to PLIF/TLIF.15,16,31 The present analysis investigated changes in lumbar lordosis and found no differences between the 2 groups, in contrast to the findings of Lightsey et al and Phan et al.’s meta-analysis (weighted mean difference = 6.33°, P = 0.03).15,31 This may be because our study controlled for many covariates that previous radiographic studies did not. Overall, the present analysis demonstrated a greater likelihood of radiographic fusion for ALIF, but no difference in global restoration of lumbar lordosis.
Clinical and Patient-Reported Outcomes for ALIF vs PLIF/TLIF
Regarding clinical and patient-reported outcomes, our analysis found no differences in complications, reoperations, readmissions, or changes in PROs between the ALIF and PLIF/TLIF groups. However, ALIF patients had a 19% shorter length of stay and 57% less blood loss. This contrasts with Phan et al.’s findings of longer hospitalization and higher risk of blood vessel injury with ALIF. 15 Similarly, Qureshi et al.’s study of an administrative database found that ALIF carried higher odds of ileus, wound infection, DVT, and 30-day readmission compared to PLIF/TLIF. 17 However, Rathbone et al.’s meta-analysis of 21 studies comparing ALIF vs PLIF/TLIF found results similar to ours, in that blood loss was significantly lower in patients undergoing ALIF compared to both TLIF (mean difference [MD] = −192.65 mL, 95% CI: −256.41 mL to −128.90 mL, I² = 94%) and PLIF (MD = −186.61 mL, 95% CI: −355.18 mL to −18.04 mL, I² = 94%). 32 This meta-analysis also found a shorter length of stay when comparing ALIF to TLIF (MD = −0.71 days, 95% CI: −1.42 days to 0.00 days, I² = 0%) but no difference between ALIF and PLIF. 32 With regard to patient-reported outcomes, both Tye et al and Lightsey et al reported greater improvements in PROs in ALIF patients.16,31 The present analysis builds on the findings of these previous studies by elaborating upon differences in outcomes between ALIF and PLIF/TLIF in balanced cohorts. As our results demonstrate that ALIF may have a shorter length of stay and less blood loss, future studies may wish to investigate the cost-effectiveness of this more expensive procedure in similarly presenting patients. 6
Radiographic, Clinical, and Patient Reported Outcomes for ALIF with vs without Posterior Fixation
The present analysis showed that radiographic fusion was greater for single-level ALIF procedures with posterior fixation in the univariate analysis, but found no differences in two-level procedures. There were no differences in clinical outcomes or PROs between the ALIF cohort that received posterior fixation and the cohort that did not. In a cohort of 80 patients who received a one- or two-level fusion for degenerative disk disease and lytic spondylolisthesis, McCarthy et al found radiographic fusion in 100% of patients with posterior instrumentation and in 65% of patients with stand-alone constructs. 18 Similarly, in Anjerwalla et al.’s radiographic study of 117 patients who underwent ALIF for discogenic back pain, the fusion rates were 51% for stand-alone ALIF, 58% with translaminar screws, 89% with unilateral pedicle screws, and 88% with bilateral pedicle screws (P < 0.01). 19 Our radiographic results support the findings of prior studies. While no differences in radiographic fusion were detected between our two-level cohorts, this may be due to the limited number of patients who received two-level ALIF without posterior fixation (n = 8). In a large registry analysis of 1377 patients, Laiwalla et al. found operative, symptomatic nonunion rates to be lower for ALIF with posterior supplementary fixation (Hazard Ratio [HR] = 0.22, 95% CI = 0.06−0.76). 33 However, operative nonunion was rare in both techniques (<5%) and there were no differences in operative adjacent segment disease. Our results also show no difference in revision rates, complications, and PROs. 33 As such, the radiologic benefits of posterior fixation may not be clinically apparent. Because posterior fixation requires longer operative times, it can predispose to greater intraoperative complications. 34 At our institution, the decision to add posterior fixation was based on various factors such as expected stability after ALIF, degree of spinal deformity, bone quality, risk of complications from posterior instrumentation, and patient-specific factors. Further research is required to identify the patients who require improved stability with posterior fixation and for whom it is cost-effective.
Limitations
This study is not without its limitations. As it was a retrospective analysis, many of the patient characteristics and clinical outcomes relied on the accurate documentation of clinicians. This is a single-center study, and operative outcomes and cost can vary by region. 35 Furthermore, our imaging analysis for evidence of radiographic fusion had relatively low rates of definitive or possible bony fusion. This may be due to the use of plain radiographs to assess the success of fusion, as CT has been shown to be the optimal tool to assess fusion. 36 Because CT scans were not available for all patients included within this study, we opted to use universally available plain radiographs, which were reviewed separately by 2 neuroradiologists. As inter-reviewer agreement was high, our results may be interpreted as internally valid. Furthermore, a study by Fogel et al. has shown equal accuracy of fusion assessment via plain radiographs or CT scans when compared to surgical exploration of fusion in interbody fusion surgeries. 20 Lastly, there remained covariate imbalances following propensity score weighting in our cohort comparisons. Although we also utilized multivariable regression modelling to better reflect the association of ALIF and PLIF/TLIF with outcomes, factors such as BMP-2 use and patient comorbidities may have affected outcomes. Confounders such as BMP-2 use and baseline demographics may have also contributed to the differential rates of “definite fusion” (BSF-3) between ALIF cohorts with and without posterior fixation. Future randomized controlled trials are warranted to limit the effects of patient characteristics on the understanding of outcomes.
Conclusion
As lumbar spine surgery is increasingly utilized within the US, it is critical to understand the outcomes associated with anterior and posterior approaches. This will allow surgeons to better understand the cost-effectiveness of ALIF vs PLIF/TLIF for the healthcare system. Our analysis of data from a tertiary referral center demonstrates similar outcomes between ALIF and PLIF/TLIF in patients with spondylolisthesis, but greater likelihood of radiographic fusion in the ALIF group. Additionally, there may be factors which influence patient selection for ALIF vs PLIF/TLIF. Future studies investigating optimal patient selection for these approaches and outcomes in a randomized setting are warranted. Based on our results, surgeons may wish to perform the procedure with which they are most comfortable. While PLIF/TLIF may be optimal for the healthcare system as it is less expensive and is associated with similar outcomes, further research evaluating the cost-effectiveness of the 2 procedures is warranted.
Footnotes
Author’s Note
Previous Presentations: American Association of Neurological Surgeons, May 3-6, 2024, Chicago.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
