Abstract
Study Design
Retrospective analysis utilizing machine learning.
Objectives
This study aims to identify the key factors influencing total charges during the primary admission period following single-level lumbar arthrodesis, using machine learning models to enhance predictive accuracy.
Methods
Data were extracted from the National Inpatient Sample (NIS) database and analyzed using various machine learning models, including random forest, gradient boosting trees, and logistic regression. A total of 78,022 unweighted cases of patients who underwent single-level lumbar arthrodesis were identified using the NIS database from 2016 to 2020. Variables included hospital size, region, patient-specific factors, and procedural details. Multivariate linear regression was also used to identify charge-related variables.
Results
The average total charge for single-level lumbar arthrodesis was $145,600 ± $102,500. Significant predictors of charge included length of stay, hospital size, hospital ownership, and region. Private investor-owned hospitals and procedures performed in the Western U.S. were associated with higher charges. Random forest models demonstrated superior predictive accuracy with an AUC of .866, outperforming other models.
Conclusions
Hospital characteristics, regional factors, and patient-specific variables significantly influence the charges of single-level lumbar arthrodesis. Machine learning models, particularly random forest, provide robust tools for predicting healthcare costs, enabling better resource allocation and decision-making. Future research should explore these dynamics further to optimize cost management and improve care quality.
Introduction
Single-level lumbar arthrodesis procedures have been a cornerstone for the treatment of degenerative conditions of the lumbar spine. From 2004 to 2015, the rate of single-level lumbar arthrodesis had increased by 62.3%, with the largest increases observed in the management of spondylolisthesis, stenosis, and disc degeneration. 1 A key component of many of these procedures involves the use of interbody devices, which help maintain disc height, provide stability, and promote fusion by serving as a scaffold for bone growth.2,3 These devices offer advantages such as improved fusion rates and better restoration of spinal alignment. 4 Despite carrying risks of reoperation with potential development of adjacent segment disease, single-level arthrodesis remains one of the most widely applicable and diverse spine interventions for pathologies affecting the lumbar spine.
As the number of lumbar fusions performed in the U.S. continues to grow, identifying the factors that most strongly affect costs for patients undergoing single-level lumbar arthrodesis becomes increasingly important. Such cost drivers have been studied extensively for other common spine procedures. Missios et al. assessed the most significant factors affecting hospitalization costs following spine surgery in the U.S. 5 The authors found that length of stay, number of admission diagnoses and procedures, hospital size and region, patient income, fusion surgery, acute renal failure, sex, and coagulopathy were all significant drivers of costs in the primary admission period. Similarly, D’Antonio et al. found that increased age and comorbid depression were associated with increased costs of care following lumbar decompression. 6 These findings underscore the importance of patient-specific, regional, and hospital-related factors in determining the economic burden of lumbar spine surgeries, serving as robust predictors of cost. While logistic regression has been the traditionally accepted and widely adopted form of data analysis, machine learning (ML) has emerged with capabilities that significantly outperform those of traditional models. ML techniques can process large amounts of data quickly, create models that adapt to new data, and capture complex, nonlinear relationships that conventional regression models might miss.7,8 These advanced capabilities enable more accurate predictions and insights, making ML a valuable tool in modern data analysis, especially in the healthcare setting. 9
Given the vast use of single-level lumbar arthrodesis in the treatment of a variety of lumbar spine pathologies, analyzing the factors that most significantly influence charges in the primary admission period following single-level lumbar arthrodesis could prove beneficial in optimizing healthcare costs. Therefore, this paper aims to identify the impact of (1) hospital size, (2) region, and (3) patient-specific factors on total charges during the primary admission period following single-level lumbar arthrodesis using machine learning models, as well as (4) explore the variability in significance of these predictors across a variety of machine learning and traditional statistical models.
Methods
Data Collection
The National Inpatient Sample (NIS) was utilized for analysis, with data spanning from January 1, 2016 to December 31, 2020, and combined with the corresponding cost-to-charge ratio database for charge assessment. Sponsored by the Healthcare Cost and Utilization Project (HCUP) under the Agency for Healthcare Research and Quality, the NIS is the largest publicly available database of inpatient healthcare in the U.S., providing comprehensive regional and national estimates on inpatient utilization, access, costs, quality, and outcomes. 10
Patient Population
Patients were selected utilizing the following International Classification of Diseases, 10th Revision (ICD-10) procedural codes for single-level lumbar interbody fusion with an interbody device: 0SG00A0, 0SG03A0, 0SG04A0, 0SG00AJ, 0SG03AJ, and 0SG04AJ. Patients under the age of 18 were excluded. ICD-10 diagnosis codes were queried to categorize patients based on diagnosis: spinal stenosis (M48.0X), spondylolisthesis (M43.1X), and degenerative disc disease (M51.XX). Patients who did not have a spinal stenosis, spondylolisthesis, or degenerative disc disease diagnosis were excluded.
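The selection rules above can be expressed as a simple record filter. The following is a minimal sketch, not the authors' extraction code: the record layout (`age`, `procedures`, `diagnoses`) and the convention of storing diagnosis codes without dots are assumptions made for illustration, while the procedure codes and diagnosis categories are those listed in the text.

```python
# Procedure codes listed in the text (single-level lumbar interbody fusion).
PROCEDURE_CODES = {"0SG00A0", "0SG03A0", "0SG04A0", "0SG00AJ", "0SG03AJ", "0SG04AJ"}

# Diagnosis prefixes from the text (M48.0X, M43.1X, M51.XX) with dots removed.
# Storing codes without dots is an assumption about the data format.
DIAGNOSIS_PREFIXES = {
    "spinal stenosis": "M480",
    "spondylolisthesis": "M431",
    "degenerative disc disease": "M51",
}


def classify_diagnoses(dx_codes):
    """Return the study diagnosis groups matched by a record's diagnosis codes."""
    groups = set()
    for code in dx_codes:
        normalized = code.replace(".", "").upper()
        for group, prefix in DIAGNOSIS_PREFIXES.items():
            if normalized.startswith(prefix):
                groups.add(group)
    return groups


def is_eligible(record):
    """Apply the inclusion and exclusion criteria to one (hypothetical) record."""
    has_procedure = any(code in PROCEDURE_CODES for code in record["procedures"])
    is_adult = record["age"] >= 18
    has_diagnosis = bool(classify_diagnoses(record["diagnoses"]))
    return has_procedure and is_adult and has_diagnosis


# Illustrative records only; these are not NIS data.
sample_records = [
    {"age": 54, "procedures": ["0SG00A0"], "diagnoses": ["M4316"]},
    {"age": 16, "procedures": ["0SG00A0"], "diagnoses": ["M4316"]},
    {"age": 61, "procedures": ["0SG03AJ"], "diagnoses": ["I10"]},
    {"age": 47, "procedures": ["0SG04AJ"], "diagnoses": ["M51.36", "M48.061"]},
]
cohort = [r for r in sample_records if is_eligible(r)]
```

In this toy input, the second record is excluded for age and the third for lacking a qualifying diagnosis.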
Variable Selection
The collected independent variables included age, sex, length of stay (LOS), hospital bed size, teaching hospital status/location, region, hospital ownership, insurance payer, comorbidity indices such as All Patient Refined Diagnosis-Related Group (APRDRG) severity and APRDRG risk mortality, wage index, and diagnosis. The wage index is a metric developed by the Centers for Medicare & Medicaid Services (CMS) to evaluate the relative wages of hospitals across different geographic regions. It is calculated as the ratio of total wage costs to total hours for all hospitals in a specific geographic area compared to the national average. 10 To determine annual hospital volume, each hospital’s unique (blinded) identifier was used to count all single-level lumbar arthrodesis procedures performed each fiscal year, and these totals were averaged over the years.
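The annual hospital volume calculation can be illustrated with a short sketch. The function name and input layout are hypothetical, and averaging over only the years in which a hospital appears in the data is an assumption; the text does not state how years without recorded cases were handled.

```python
from collections import defaultdict


def mean_annual_volume(cases):
    """Average yearly case count per hospital.

    cases: iterable of (hospital_id, year) pairs, one pair per procedure.
    Assumption: the average is taken over the years in which the hospital
    has at least one recorded case.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for hospital_id, year in cases:
        counts[hospital_id][year] += 1
    return {
        hospital_id: sum(by_year.values()) / len(by_year)
        for hospital_id, by_year in counts.items()
    }


# Illustrative input: hospital "A" has 3 cases in 2016 and 5 in 2017,
# hospital "B" has 2 cases in 2016.
example_cases = [("A", 2016)] * 3 + [("A", 2017)] * 5 + [("B", 2016)] * 2
volumes = mean_annual_volume(example_cases)
```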
Predictive Model Design
The primary admission’s total charge of care served as the target variable. Given the use of multi-year data, all charge figures were adjusted for inflation to reflect 2020 US dollar values using specific weights. 11 The total charges were subsequently divided into quartiles, with those in the fourth quartile classified as high. Predictive modeling was then utilized to identify variables associated with these high-care charges. 12
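As a rough illustration of how the target variable could be constructed, the sketch below adjusts charges to 2020 dollars and flags the top quartile. The adjustment factors are placeholder values, not real inflation data, and the function names and the quartile method (`inclusive`) are assumptions; the study's actual weights come from its cited source.

```python
import statistics

# Hypothetical placeholder factors (NOT real inflation data): multiplying a
# charge from the given year by its factor expresses it in 2020 US dollars.
PLACEHOLDER_FACTORS_TO_2020 = {2016: 1.08, 2017: 1.06, 2018: 1.04, 2019: 1.02, 2020: 1.00}


def adjust_to_2020(charge, year, factors=PLACEHOLDER_FACTORS_TO_2020):
    """Scale a charge from its discharge year to 2020 dollars."""
    return charge * factors[year]


def label_high_charge(adjusted_charges):
    """Flag charges above the third quartile cut point as high (True)."""
    _q1, _q2, q3 = statistics.quantiles(adjusted_charges, n=4, method="inclusive")
    return [charge > q3 for charge in adjusted_charges]


# Illustrative charges (already adjusted); the top two of eight are flagged.
labels = label_high_charge([10, 20, 30, 40, 50, 60, 70, 80])
```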
Statistical Analysis
Machine learning algorithms were developed in Python (version 3.11.4) using the Scikit-Learn package. Additional statistical analyses were performed using R statistical software (version 4.4.0; R Project for Statistical Computing, Vienna, Austria). A multivariate linear regression identified cost-related variables, with effect size quantified by Cohen’s F2. 13 The dataset was split: 80% for training the predictive model and 20% for validation. Evaluated algorithms included logistic regression, random forest, Naïve Bayes, decision trees, gradient boosting trees, and K-nearest neighbors. Categorical variables were one-hot encoded, and continuous variables were binned into ten equal parts. To minimize overfitting and enhance model generalizability, hyperparameters were tuned using randomized search cross-validation. Models were assessed based on precision, recall, F1-score, and AUC. The DeLong method compared AUC scores across algorithms. 14 Statistical significance was set at P < .05. These validation steps, along with performance metric evaluation, support the predictive reliability and potential clinical utility of our machine learning models.
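The preprocessing steps described above (80/20 split, binning of continuous variables, one-hot encoding) can be sketched with the Python standard library alone. This is an illustration under stated assumptions rather than the study's Scikit-Learn pipeline: all function names are hypothetical, and "ten equal parts" is interpreted here as equal-frequency (decile) bins, which the text does not specify.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) with the given test fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def decile_edges(values):
    """Nine cut points that split values into ten equal-frequency bins."""
    return statistics.quantiles(values, n=10, method="inclusive")


def bin_index(value, edges):
    """Bin number (0 to 9) for a value, given the nine cut points."""
    return sum(value > edge for edge in edges)


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator vector."""
    return [1 if value == category else 0 for category in categories]


# Illustrative data only.
train, test = train_test_split(list(range(100)))
edges = decile_edges(list(range(1, 101)))
regions = ["Northeast", "Midwest", "South", "West"]
encoded = one_hot("West", regions)
```

In practice the cut points would be estimated on the training split and then applied unchanged to the validation split, so that no information from the validation data leaks into preprocessing.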
Results
Demographics.
SD, standard deviation; APR-DRG, all patient refined diagnosis-related group.
a Hospital bed size: designated by the Healthcare Cost and Utilization Project (HCUP) based on geographic location, teaching hospital status, and number of beds.
b APR-DRG: assigned by the HCUP using software developed by 3M Health Information Systems, which measures severity of illness subclass and risk of mortality subclass within each base APR-DRG.
c Wage index: index computed by the Centers for Medicare & Medicaid Services (CMS) to measure relative hospital wage level in a geographic area compared with the national average hospital wage level.
Association of Predictors With Total Cost in Multivariate Linear Regression and Calculation of Effect Size Using Cohen’s F2.
Ref, reference group; APR-DRG, all patient refined diagnosis-related group.
Bold values are statistically significant.
In the analysis of high-charge predictors for single-level lumbar arthrodesis using machine learning models, the random forest classifier revealed several key variables with substantial influence on charge outcomes. Private investor-owned hospital ownership emerged as the most significant predictor of charge, closely followed by the wage index and LOS. Hospital characteristics, such as annual case volume, also significantly impacted charges, as did geographic factors such as the West region. Procedure characteristics, such as approach and a spondylolisthesis diagnosis, were also significant predictors of higher charges. Additionally, hospital size (large-sized hospitals), urban nonteaching hospital status, APRDRG severity, and patient age were important predictors (Figure 1).
Feature importance as determined through a random forest classifier via permutation method for prediction of high charge of care following single-level lumbar arthrodesis. APR-DRG indicates all patient refined diagnosis-related group.
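The permutation method named in the Figure 1 caption can be sketched as follows: shuffle one feature at a time and measure how much model performance drops. This is a generic illustration, not the study's code. The toy model, the use of accuracy as the scoring metric, and all names are assumptions.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled_col = [row[col] for row in X]
            rng.shuffle(shuffled_col)
            # Rebuild the data with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled_col)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier: predicts high charge (1) from feature 0 only."""
    return 1 if row[0] > 9.5 else 0


# Feature 0 drives the label; feature 1 is irrelevant noise.
X = [[i, (i * 7) % 5] for i in range(20)]
y = [1 if i > 9.5 else 0 for i in range(20)]
importances = permutation_importance(toy_model, X, y)
```

A feature the model ignores receives an importance of zero, while shuffling a feature the model relies on lowers accuracy and yields a positive importance.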
Machine Learning Algorithms Summary of Model Measures.
Precision is defined as the ratio of true positives to all predicted positives (true positives plus false positives). Recall is the ratio of true positives to the sum of true positives and false negatives. The F1-score is the harmonic mean of precision and recall.
AUC, area under the curve; 95% CI, 95% confidence interval.
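The metric definitions in the table notes can be written out directly from confusion-matrix counts. The sketch below uses the standard definitions, with the F1-score computed as the harmonic mean of precision and recall; the function name and example labels are illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels coded as 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return precision, recall, f1


# Two true positives, two false positives, no false negatives.
precision, recall, f1 = classification_metrics(
    y_true=[1, 1, 0, 0, 0, 0],
    y_pred=[1, 1, 1, 1, 0, 0],
)
```

Here precision is 0.5 and recall is 1.0, so the F1-score is about 0.667, lower than their arithmetic mean of 0.75. The harmonic mean penalizes a large gap between the two.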

Receiver operating characteristic curve with area under the curve (AUC) calculation of each machine learning algorithm in the prediction of high charges following single-level lumbar arthrodesis.
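The AUC reported for each algorithm equals the probability that a randomly chosen high-charge case receives a higher predicted score than a randomly chosen non-high-charge case, with ties counted as one half. A minimal sketch of that calculation follows; the function name and example scores are illustrative.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Three of the four positive-negative pairs are ordered correctly.
auc = roc_auc(y_true=[0, 0, 1, 1], scores=[0.1, 0.4, 0.35, 0.8])
```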
Pairwise Comparison of Machine Learning Algorithms Utilizing the DeLong Method.
X denotes a comparison that could not be made as a model could not be compared to itself.
Bold values are statistically significant.
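For readers unfamiliar with the DeLong method, the sketch below shows how two AUCs computed on the same cases can be compared. It follows the standard formulation based on per-case structural components and a normal approximation. It is an illustration rather than the code used in this study, and the function names and example scores are hypothetical.

```python
import math


def _components(pos, neg):
    """Return the AUC and the per-case structural components (V10, V01)."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(pos), v10, v01


def _cov(a, mean_a, b, mean_b):
    """Sample covariance of two equally long component lists."""
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    pos_idx = [i for i, t in enumerate(y_true) if t == 1]
    neg_idx = [i for i, t in enumerate(y_true) if t == 0]
    m, n = len(pos_idx), len(neg_idx)
    auc_a, v10_a, v01_a = _components([scores_a[i] for i in pos_idx],
                                      [scores_a[i] for i in neg_idx])
    auc_b, v10_b, v01_b = _components([scores_b[i] for i in pos_idx],
                                      [scores_b[i] for i in neg_idx])
    var_a = _cov(v10_a, auc_a, v10_a, auc_a) / m + _cov(v01_a, auc_a, v01_a, auc_a) / n
    var_b = _cov(v10_b, auc_b, v10_b, auc_b) / m + _cov(v01_b, auc_b, v01_b, auc_b) / n
    cov_ab = _cov(v10_a, auc_a, v10_b, auc_b) / m + _cov(v01_a, auc_a, v01_b, auc_b) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:
        raise ValueError("variance of the AUC difference is zero; test undefined")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value


# Illustrative scores from two hypothetical models on six cases.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
model_b = [0.6, 0.3, 0.7, 0.5, 0.4, 0.2]
auc_a, auc_b, z, p_value = delong_test(labels, model_a, model_b)
```

Because both models are scored on the same patients, the covariance term matters: ignoring it would generally misstate the uncertainty of the AUC difference.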
Discussion
Single-level lumbar arthrodesis remains one of the most widely performed spine surgeries.1,15,16 With rising charges in spine surgery, it is crucial to identify factors impacting costs for patients and hospitals. ML models provide superior tools for data analysis, identifying complex patterns and predicting outcomes better than traditional methods.17,18 This study aimed to analyze factors influencing charges of single-level lumbar arthrodesis during the primary admission period using ML models. Key hospital factors included annual case volume, medium-to-large hospital size, and private for-profit status. Regional factors such as wage index and Western U.S. location increased charges. Patient factors like surgical approach and LOS significantly affected costs. Among ML models, random forest and gradient boosting trees outperformed logistic regression, with random forest achieving the highest AUC values and narrowest confidence intervals, underscoring its reliability and predictive accuracy.
Key hospital characteristics, such as annual case volume and medium-to-large hospital size, significantly influenced postoperative care costs. Higher annual case volumes were associated with lower charges, suggesting that hospitals with more experience in these procedures operate more efficiently and cost-effectively. Previous studies have shown that increased annual case volume for spine procedures decreases costs, indicating similar efficiencies for single-level lumbar fusion.19,20 Medium and large hospitals incurred the highest charges, possibly due to unique resource allocation or specialized care protocols. By contrast, prior studies of elective spinal fusion and adult spinal deformity found that hospitals with higher costs are often smaller in bed size.21,22 This suggests that single-level lumbar arthrodesis may be managed differently depending on the hospital’s size and resources. The unexpectedly higher charges at larger hospitals warrant further investigation into the underlying factors contributing to these expenses. Private for-profit status had the greatest effect on increasing charges during the primary admission period, reflecting financial motives in for-profit institutions. A systematic review by Devereaux et al. found that private for-profit hospitals were significantly associated with increased charges compared to private not-for-profit counterparts. 23 These findings align with broader healthcare trends where for-profit hospitals prioritize revenue generation, resulting in higher patient costs. This underscores the need for targeted strategies and further research to mitigate cost drivers associated with hospital characteristics, ensuring financial considerations do not compromise care quality for patients undergoing single-level lumbar arthrodesis.
Geographic region within the U.S. significantly impacts the cost of postoperative care for single-level lumbar arthrodesis. Specifically, the Western U.S. is associated with higher charges. This trend is well-documented in spine surgery, where the West consistently exhibits the highest costs for procedures.21,22 One reason for this could be regional policy differences, such as California’s mandated minimum nurse-to-patient ratios in acute care hospitals, which increase expenses. 24 Additionally, the wage index, accounting for regional labor cost variations, significantly contributes to higher charges following primary admission for single-level lumbar arthrodesis. 19 The higher cost of living and labor in the Western U.S. necessitates higher wages for healthcare workers, increasing overall hospital expenses. These factors highlight the pervasive nature of regional influences on healthcare costs and underscore that institutional and policy-related factors significantly impact patient charges. Addressing regional disparities in healthcare costs may require tailored policy interventions and resource allocation strategies that consider the unique economic and regulatory environments of different regions. By understanding and mitigating these regional cost drivers, healthcare providers and policymakers can work towards more equitable and affordable care for patients undergoing single-level lumbar arthrodesis.
Patient-specific characteristics, such as the approach used for fusion, significantly impact charges. Anterior approaches often require collaboration between vascular and spine surgeons, adding complexity and expense. 25 While the ML models did not highlight other patient-specific factors as main contributors to charges, the logistic regression model identified significant predictors, including age and sex. These factors have been critically linked to patient charges in spine surgery. A study by Puffer et al. found that age, BMI, and comorbidities significantly impact postoperative costs, highlighting their role in determining the economic burden on patients. 26 LOS was also a major contributor to charges for single-level lumbar arthrodesis, correlating with the patient’s overall health. Boylan et al. found that longer LOS was associated with higher costs in adolescent scoliosis surgery, as extended stays often indicate more severe conditions or complications. 27 This underscores the importance of preoperative evaluation and mitigation strategies to reduce LOS and associated charges, thereby lessening the financial burden on patients.
ML models like random forest and gradient boosting trees are significantly more accurate in predicting charges compared to traditional logistic regression. Specifically, the random forest algorithm effectively identifies key factors influencing total charges and produces the highest AUC with the narrowest confidence intervals, demonstrating its robust charge predictions. This capability is valuable for optimizing resource allocation and enhancing decision-making in healthcare. The utility of these models has been extensively addressed in the literature, with studies consistently showing their superior performance compared to traditional methods. A systematic review conducted by Senders et al. evaluated the capability of ML models to serve as prediction tools for various neurosurgical outcomes and found them exceptionally accurate and reliable, significantly outperforming traditional methods. 28 Similarly, a study by Merali et al. identified random forest as the best predictive model for outcomes after degenerative cervical myelopathy surgery, showcasing its potential in forecasting surgical outcomes and guiding clinical decision-making. 29 These findings underscore the transformative potential of ML models, particularly random forest, in advancing healthcare charge prediction and management.
This study on single-level lumbar arthrodesis and its charge implications has several limitations. The retrospective nature of the NIS database introduces potential biases and limits granularity, affecting patient-specific insights. Machine learning models, while powerful, can overfit and are dependent on data quality; errors in data entry or coding can impact predictions. Despite a large sample size, potential selection bias remains due to the inclusion and exclusion criteria applied. Additionally, the focus on hospital size, region, and patient-specific factors may overlook other influential factors like surgical techniques, surgeon experience, and postoperative care. The analysis assumes comprehensive data capture, which may not always be accurate, leading to potential misestimation of prevalence and charges. The findings are specific to the NIS settings and populations, limiting generalizability to other regions or contexts. Moreover, while our machine learning models were rigorously validated, their predictive accuracy may vary when applied in different clinical settings or populations. Further external validation is warranted to confirm their generalizability and support broader clinical adoption. While valuable, these insights should be interpreted with caution and contribute to a broader discussion on cost management and procedural choices in spine surgery.
Conclusions
Understanding drivers of charges for single-level lumbar arthrodesis is crucial for developing targeted strategies to manage expenses, improve patient outcomes, and ensure the sustainability of high-quality care. This study identified significant influences on charges, including hospital characteristics, regional factors, and patient-specific variables, with machine learning models like random forest demonstrating superior predictive accuracy. By leveraging these advanced analytical tools, healthcare providers can optimize resource allocation and enhance clinical decision-making. Future research should continue to explore these charge dynamics to further refine cost-management strategies and improve care for patients undergoing this procedure.
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Mitchell K. Ng is a paid consultant at Pacira BioSciences Inc., Sage Products Inc., Alafair Biosciences Inc., Next Science LLC, Bonutti Technologies Inc., Johnson & Johnson Ethicon Inc., Hippocrates Opportunities Fund LLC, and Ferghana Partners Inc.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
