Abstract
Study Design
Retrospective cohort study.
Objective
To evaluate the predictive performance (discrimination and calibration) of the mFI-5 for major postoperative complications in thoracolumbar spine surgery, and to compare its effectiveness in low-, moderate-, and high-complexity procedures.
Methods
The study was conducted on adult patients (>18 years) who underwent thoracolumbar spine surgery at a single tertiary care center between 2017 and 2020. The primary outcome was the incidence of major complications within 90 days. Model discrimination was assessed using the area under the receiver operating characteristic curve (AUROC), and calibration was evaluated with calibration-in-the-large (CITL), calibration slope, and the Hosmer-Lemeshow test.
Results
A total of 839 patients were included (mean age: 62.8 years; SD: 16.8). Major complications occurred in 8.2% of cases. The mFI-5 demonstrated fair discrimination (AUROC: 0.66; 95% CI: 0.60-0.70) and excellent calibration (slope = 1, CITL = 0; Hosmer-Lemeshow P = .99). Stratified analysis showed improved discrimination in high-complexity surgeries (AUROC: 0.74; 95% CI: 0.64-0.84), compared to moderate (0.62; 95% CI: 0.48-0.74) and low complexity (0.63; 95% CI: 0.50-0.74) procedures. Readmission rates were 7% at 30 days and 9% at 90 days, with a 6-month mortality rate of 1%.
Conclusion
The mFI-5 is a valuable tool for predicting major complications in thoracolumbar spine surgery, particularly in high-complexity procedures. Its predictive performance is limited in lower-complexity surgeries. Further prospective studies are needed to validate its use and enhance preoperative risk stratification.
Introduction
Postoperative complications are a common concern, associated with increased morbidity and mortality, prolonged hospital stays, and higher healthcare costs. A complication is defined as any event that deviates from the expected postoperative course. 1 Several classification systems have been developed to categorize postoperative complications, though many are difficult to interpret.2,3 Dindo and Clavien 4 introduced a classification system for general surgery that has demonstrated high reproducibility. This system categorizes complications into grades 1 to 5 based on severity5,6 and has since been applied in various specialties, including urology,7 nephrology,8,9 gastroenterology,10,11,12 and, more recently, orthopedics, trauma surgery, and spinal surgery.13,14
In our population, there is a growing number of elderly and frail patients. Global projections estimate that between 2012 and 2030, the population aged over 65 will double or even quadruple.15,16 Frailty and advanced age are associated with worse postoperative outcomes and a higher risk of complications. 17 Therefore, accurately predicting postoperative outcomes and selecting appropriate surgical candidates is crucial. The American College of Surgeons developed a widely used frailty index, originally comprising 11 items and later modified to a 5-item version.18,19 Both versions have been validated in various surgical specialties20,21,22,23 and, more recently, in spinal surgery.24,25,26,27,28,29
However, few studies have assessed the predictive performance of the 5-item Modified Frailty Index (mFI-5) for morbidity and mortality in patients undergoing spinal procedures—particularly those of low complexity. This study aims to evaluate the predictive ability of the mFI-5 for major postoperative complications (defined as Dindo-Clavien grade ≥3) and to compare its performance across surgeries of low-, moderate-, and high-complexity.
Objectives
(1) To assess the discrimination and calibration of the mFI-5 in predicting major postoperative complications (Dindo-Clavien grade ≥3) in patients undergoing thoracolumbar spinal surgery. (2) To compare the discriminatory performance of the mFI-5 across low-, moderate-, and high-complexity thoracolumbar spine surgeries. (3) To estimate the prevalence of postoperative complications within 90 days, the time to complication, unplanned readmissions at 30 and 90 days, and 6-month mortality.
Materials and Methods
Study Design
A retrospective cohort study was conducted on patients over 18 years of age who underwent thoracolumbar spinal surgery at the Italian Hospital of Buenos Aires (HIBA) between January 2017 and December 2020. Exclusion criteria included surgeries involving the cervical or cervicothoracic spine, referrals from other centers with prior interventions in the same region, and cases with sequelae (persistent pathological changes due to disease or surgery), treatment failure (eg, tumor recurrence after incomplete resection or reoperation due to psoas abscess), or unresolved intraoperative complications.
Data Collection and Variables
Data were collected from electronic medical records. Postoperative complications were analyzed up to 90 days, and mortality was assessed up to 180 days following surgery. Frailty was measured using the mFI-5 defined by the American College of Surgeons, 19 which includes the following five items: history of diabetes mellitus, hypertension requiring medication, functional dependence, chronic obstructive pulmonary disease or pneumonia, and congestive heart failure within 30 days before surgery.
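As an illustration of how the index is scored, the sketch below assumes the usual convention of one point per item present, giving a score from 0 to 5. The class and field names are hypothetical and are not taken from the study database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MFI5Items:
    """The five binary mFI-5 items (field names are illustrative only)."""
    diabetes_mellitus: bool
    hypertension_on_medication: bool
    functional_dependence: bool
    copd_or_pneumonia: bool
    chf_within_30_days: bool


def mfi5_score(items: MFI5Items) -> int:
    """Return the mFI-5 score, assuming one point per item present (0-5)."""
    return sum(
        (
            items.diabetes_mellitus,
            items.hypertension_on_medication,
            items.functional_dependence,
            items.copd_or_pneumonia,
            items.chf_within_30_days,
        )
    )


# Hypothetical patient with diabetes and medicated hypertension only.
patient = MFI5Items(True, True, False, False, False)
print(mfi5_score(patient))  # 2
```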
Complications were classified using the Dindo-Clavien scale (grades 1-5), 6 with only the most severe event recorded in cases with multiple major complications. The analysis included time to complication, unplanned readmissions, reoperations within 90 days, and mortality at 180 days. Demographic and surgical variables included age, sex, body mass index (BMI), type of pathology, and surgical complexity.
Surgical complexity was classified a priori based on procedural invasiveness and technical demands, in accordance with previously published literature.30,31 Low-complexity procedures included microdiscectomy, laminectomy, vertebroplasty, kyphoplasty, and single-level posterior lumbar instrumentation. Moderate-complexity procedures included posterior dorsal decompression and/or instrumentation, posterior lumbar instrumentation with or without interbody fusion (TLIF/PLIF) involving up to two levels, and single-level lateral or anterior lumbar interbody fusion (LLIF/ALIF). High-complexity procedures included combined anterior–posterior approaches, corpectomy, pedicle subtraction osteotomy, and multilevel instrumented fusion with deformity correction. 32
Sample Size Calculation
For adequate model validation, including calibration and discrimination, a minimum of 10-20 events per variable is recommended. 33 Given that five variables were included in the model, approximately 75 events were needed. Assuming an expected event incidence of 10%,24,26,27,33 a sample of at least 750 patients undergoing thoracolumbar spine surgery was estimated.
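The arithmetic behind this estimate can be reproduced with a short sketch. It assumes 15 events per variable, the midpoint of the recommended 10-20 range, which is the value that yields the stated 75 events; the function name and parameters are illustrative.

```python
def required_sample_size(
    n_variables: int, events_per_variable: int, event_rate_percent: int
) -> tuple[int, int]:
    """Return (events needed, patients needed) under an events-per-variable rule.

    The incidence is passed as a whole-number percentage so the calculation
    stays in integer arithmetic and avoids floating-point rounding.
    """
    if not 0 < event_rate_percent <= 100:
        raise ValueError("event_rate_percent must be between 1 and 100")
    events = n_variables * events_per_variable
    # Ceiling division: the smallest cohort expected to yield `events` events.
    patients = -(-events * 100 // event_rate_percent)
    return events, patients


# Five mFI-5 items, 15 events per variable (assumed midpoint), 10% incidence.
print(required_sample_size(5, 15, 10))  # (75, 750)
```

Under the same assumptions, the lower bound of 10 events per variable would require 500 patients and the upper bound of 20 would require 1000.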
Statistical Analysis
All patients meeting the inclusion criteria were consecutively sampled. Categorical variables were reported as counts and percentages, and continuous variables as means with standard deviations or medians with interquartile ranges, depending on their distribution.
For the first objective, model discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC), and calibration was assessed using the Hosmer-Lemeshow goodness-of-fit test. The expected/observed (E/O) event ratio, the calibration-in-the-large (CITL), and the calibration slope were used to assess systematic bias and overfitting.
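To make these measures concrete, the following minimal sketch computes the AUROC as the probability that a randomly chosen patient with a complication has a higher score than a randomly chosen patient without one (ties counted as one half), together with the E/O ratio. The data are invented for illustration, and the statistical software actually used in the study is not stated here. The calibration slope and CITL require fitting a logistic recalibration model and are not shown.

```python
from typing import Sequence


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUROC via pairwise concordance; tied pairs contribute 0.5."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordance = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordance += 1.0
            elif e == n:
                concordance += 0.5
    return concordance / (len(events) * len(non_events))


def expected_observed_ratio(
    predicted_risks: Sequence[float], outcomes: Sequence[int]
) -> float:
    """E/O ratio: sum of predicted risks divided by the observed event count."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed events")
    return sum(predicted_risks) / observed


# Invented toy data: mFI-5 scores, outcomes (1 = major complication),
# and hypothetical predicted risks for the same eight patients.
scores = [0, 0, 1, 1, 2, 2, 3, 0]
outcomes = [0, 0, 0, 1, 0, 1, 1, 0]
risks = [0.1, 0.1, 0.2, 0.2, 0.5, 0.5, 0.8, 0.1]

print(round(auroc(scores, outcomes), 3))
print(round(expected_observed_ratio(risks, outcomes), 3))
```

An E/O ratio of 1 indicates that the total number of predicted events matches the number observed; values above 1 indicate overprediction and values below 1 indicate underprediction.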
For the second objective, AUROCs were calculated for each surgical complexity level, and differences were compared using DeLong’s test (two-sided significance threshold: P < .05).
Missing Data
Data completeness was high across all variables of interest. Given the low proportion of missing data, a complete case analysis was performed, and no imputation methods were applied (Supplemental Material: Table S1).
Results
Baseline Demographic and Surgical Characteristics Stratified by Modified Frailty Index in Patients Undergoing Thoracolumbar Spine Surgery
Note. mFI, modified frailty index (5-item version); BMI, body mass index; COPD, chronic obstructive pulmonary disease; CHF, congestive heart failure; ASA, American Society of Anesthesiologists physical status classification. Major complication: Dindo-Clavien grade ≥3 within 3 months postoperatively. Surgical complexity: categorized as low, moderate, or high based on procedural invasiveness and technical demands.
The distribution of patients according to mFI-5 score was homogeneous across surgical complexity levels, with no statistically significant differences (P = .709). On average, 40% of patients had an mFI-5 score of 0, 36% had a score of 1, 20% had a score of 2, and 3% had a score of 3, regardless of surgery complexity.
Risk Stratification Showing Predicted Complication Rates by mFI-5 Score and Surgical Complexity Level
Note. mFI, modified frailty index (5-item version).
The model’s discriminatory performance was fair, with an AUROC of 0.66 (95% CI: 0.60-0.70) (Figure 1). The optimal cutoff was an mFI-5 score of ≥2, yielding 50% sensitivity and 79% specificity. Calibration was strong, with a slope of 1, CITL of 0, and a Hosmer-Lemeshow P-value of .99, indicating excellent model fit.
Figure 1. Overall discriminatory performance of the mFI-5: AUROC of 0.66 (95% CI: 0.60-0.70), indicating fair discrimination.
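The sensitivity and specificity of a score threshold can be computed as in the sketch below, which treats a score at or above the cutoff as a positive prediction. How the optimal cutoff was selected is not stated; the Youden index (sensitivity + specificity - 1) used here is an assumption chosen for illustration, and the data are invented.

```python
from typing import Iterable, Sequence


def sensitivity_specificity(
    scores: Sequence[int], outcomes: Sequence[int], cutoff: int
) -> tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff counts as positive."""
    tp = fn = tn = fp = 0
    for score, outcome in zip(scores, outcomes):
        flagged = score >= cutoff
        if outcome == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(
    scores: Sequence[int], outcomes: Sequence[int], candidates: Iterable[int]
) -> int:
    """Return the candidate cutoff maximizing Youden's J (assumed criterion)."""
    def youden_j(cutoff: int) -> float:
        sens, spec = sensitivity_specificity(scores, outcomes, cutoff)
        return sens + spec - 1.0
    return max(candidates, key=youden_j)


# Invented toy data: mFI-5 scores and outcomes (1 = major complication).
scores = [0, 0, 1, 1, 2, 2, 3, 0]
outcomes = [0, 0, 0, 1, 0, 1, 1, 0]

print(sensitivity_specificity(scores, outcomes, cutoff=2))
print(youden_cutoff(scores, outcomes, range(1, 6)))
```

For the reported cutoff of ≥2, with 50% sensitivity and 79% specificity, the Youden index would be 0.50 + 0.79 - 1 = 0.29.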
For the second objective, complication rates by surgical complexity were 5% for low-, 10% for moderate-, and 18% for high-complexity surgeries (P < .01). The discriminative ability of the mFI-5 was highest in high-complexity surgeries (AUROC: 0.74; 95% CI: 0.64-0.84) and lower in moderate- (AUROC: 0.62; 95% CI: 0.48-0.74) and low-complexity surgeries (AUROC: 0.63; 95% CI: 0.50-0.74) (Figure 2). However, the differences between the AUROCs were not statistically significant (DeLong test, P = .20).
Figure 2. Discriminative ability of the mFI-5 according to surgical complexity level. Discrimination was highest in high-complexity procedures (AUROC: 0.74; 95% CI: 0.64-0.84) and lower in low- and moderate-complexity procedures.
For the third objective, the median time to major complication was 16 days (50% of events; n = 34), and 75% of major complications (n = 52) had occurred by day 26. The rate of unplanned readmission was 7% (95% CI: 5-9%) at 30 days and 9% (95% CI: 7-11%) at 90 days. The 6-month mortality rate was 1% (95% CI: 0.5-2%).
Discussion
Patient frailty and surgical complexity are well-established predictors of adverse outcomes.30,34,35,36,37 Accordingly, identifying the impact of these variables across patient subgroups is crucial for anticipating major complications. The primary aim of this study was to evaluate the predictive performance of the 5-item Modified Frailty Index (mFI-5) and to compare its effectiveness in patients undergoing spinal surgeries of varying complexity.
Our findings confirm that the mFI-5 is a significant predictor of major postoperative complications. mFI-5 scores of 2 (OR: 3.8; SD: 1.3) and 3 (OR: 7.7; SD: 3.8) were significantly associated with major complications (both P < .01). The model demonstrated fair overall discrimination, with an AUROC of 0.66 (95% CI: 0.60-0.70), and excellent calibration, as evidenced by a calibration slope of 1, CITL of 0, and a Hosmer-Lemeshow P-value of .99. From a clinical standpoint, this strong calibration indicates that the model accurately estimates absolute risk, supporting its utility for shared decision-making even when discrimination is modest. An mFI-5 cutoff of ≥2 provided the most clinically meaningful threshold for identifying patients at increased risk.
Currently, the mFI-5 is one of the most widely used tools for evaluating frailty severity, and numerous studies,19,24,25,26,27,28,29,33,38 including our own, have confirmed its relevance as a predictor of postoperative complications. For example, Segal et al. 38 demonstrated a correlation between mFI-5 scores and the risk of complications and readmissions in patients undergoing kyphoplasty. Similarly, Weaver et al. 25 reported that higher mFI-5 scores were associated with increased rates of medical and surgical complications, as well as higher readmission and mortality rates following lumbar fusion surgery. Kang et al. 26 also found that patients with elevated mFI-5 scores had significantly higher complication rates (P = .032). Furthermore, several authors have reported that an mFI-5 score of ≥2 increases the risk of postoperative complications by a factor of 1.45 to 2.13.25,26,27
A key contribution of this study is the evaluation of mFI-5 performance across different levels of surgical complexity. Discrimination was highest in high-complexity procedures (AUROC: 0.74; 95% CI: 0.64-0.84) and only fair to poor in low- (AUROC: 0.63; 95% CI: 0.50-0.74) and moderate-complexity procedures (AUROC: 0.62; 95% CI: 0.48-0.74), although the differences between strata did not reach statistical significance (DeLong test, P = .20). This pattern suggests that frailty, as captured by the mFI-5, may exert a stronger predictive influence when surgical stress is substantial. In contrast, outcomes following less invasive procedures may be driven by additional factors not captured by the mFI-5, such as prior spinal surgery, underlying pathology, or chronological age.
Notably, previous studies evaluating the discriminative performance of the mFI-5 in spine surgery have reported overall AUROC values consistent with our findings, although without stratification by procedural complexity.27,28 Pierce et al. 28 reported an AUROC of 0.61, while Kweh et al. 27 found an AUROC of 0.66 (95% CI: 0.58-0.75) for all complications and 0.68 (95% CI: 0.62-0.74) for major complications defined as Dindo-Clavien grade ≥4.
To our knowledge, this is the first study to demonstrate that surgical complexity meaningfully alters the predictive accuracy of the mFI-5. These findings highlight a limitation of applying a uniform, frailty-based risk model across heterogeneous spinal procedures. Moreover, the literature has primarily focused on the high risk of complications in frail patients undergoing complex surgeries,14,30,37,39,40,41 leaving a gap in knowledge regarding outcomes in less complex interventions. In theory, reduced physiological reserve and impaired compensatory mechanisms, combined with a highly stressful procedure involving prolonged operating times and significant blood loss, leave frail patients particularly vulnerable to complications.42,43
From a clinical perspective, the results suggest that the usefulness of the mFI-5 is context-dependent. In high-complexity spine surgery, an mFI-5 score ≥2 should prompt greater awareness of postoperative risk, targeted preoperative optimization, and detailed informed consent discussions regarding the substantially elevated complication rate (18%) observed in this subgroup. In selected cases, it may be appropriate to consider less invasive alternatives when modifiable risk factors cannot be adequately optimized. Conversely, in low-complexity procedures, relying solely on the mFI-5 may be insufficient, underscoring the importance of integrating additional patient-, disease-, and procedure-specific risk factors into preoperative counseling.
Although mortality was not the primary outcome of this study, the observed 6-month mortality rate of 1% is clinically reassuring and consistent with previously reported rates following elective thoracolumbar spine surgery.25,33,39,40,41 In the present cohort, the low overall mortality likely reflects careful patient selection, contemporary perioperative management, and the predominance of low- and moderate-complexity surgeries. Given the limited number of mortality events, robust statistical modeling was not feasible, suggesting that the mFI-5 may be more informative for predicting major morbidity than mortality in this setting.
An important conceptual limitation of frailty indices in spine surgery should also be acknowledged. Functional dependence, a core component of the mFI-5, may be directly influenced by the severity of spinal pathology rather than by systemic frailty alone. Patients with advanced degenerative disease, spinal deformity, or myelopathy may exhibit impaired mobility and dependence as a consequence of neurological compromise, pain, or mechanical instability.36,44 In such cases, the mFI-5 may partially reflect disease-related disability rather than true physiological frailty, introducing potential confounding between frailty and pathology severity. These considerations underline the need for cautious interpretation of frailty scores in the spine surgery population.
This study has several strengths. To our knowledge, it is the first to provide a comprehensive evaluation of the mFI-5’s predictive performance—including both discrimination and calibration—while also comparing its effectiveness across surgical complexity strata. Additionally, the sample size ensured adequate statistical power for the analysis of included variables.
Nevertheless, our study has some limitations. The retrospective, single-center design limits generalizability, and selection bias related to procedure complexity cannot be excluded. The heterogeneity of spinal pathologies and the lack of differentiation between open and minimally invasive techniques may have influenced outcomes. Furthermore, the absence of external validation restricts broader clinical applicability, and the study design does not allow assessment of clinical utility or decision-making impact.
Future research—ideally prospective, randomized clinical trials—should aim to more accurately assess the impact of mFI-5 and surgical complexity on outcomes in patients undergoing thoracolumbar spinal surgery. These investigations should also consider additional independent risk factors that may influence postoperative outcomes.
In conclusion, this study supports the utility of the mFI-5 as a valuable predictor of major postoperative complications in thoracolumbar spine surgery. The index demonstrates strong predictive performance in high-complexity surgeries but limited discrimination capacity in low- and moderate-complexity procedures, highlighting the need for complementary risk assessment strategies. Further research is warranted to address current knowledge gaps and to enhance risk stratification in this patient population.
Supplemental Material
Supplemental material for The Modified 5-Item Frailty Index as a Predictor of Postoperative Complications in Patients Undergoing Spinal Surgery. Performance Comparison at Different Levels of Surgical Complexity by Gonzalo Kido, Nicolas Molho, Patricio Encinar, Camila Juana, Matias Petracchi, and Marcelo Gruenberg in Global Spine Journal
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
