Abstract
Study Design
Narrative review.
Objectives
Artificial intelligence (AI) is being increasingly applied to the domain of spine surgery. We present a review of AI in spine surgery, including its use across all stages of the perioperative process and applications for research. We also provide commentary regarding future ethical considerations of AI use and how it may affect surgeon-industry relations.
Methods
We conducted a comprehensive literature review of peer-reviewed articles that examined applications of AI during the pre-, intra-, or postoperative spine surgery process. We also discussed the relationship among AI, spine industry partners, and surgeons.
Results
Preoperatively, AI has been mainly applied to image analysis, patient diagnosis and stratification, and decision-making. Intraoperatively, AI has been used to aid image guidance and navigation. Postoperatively, AI has been used for outcomes prediction and analysis. AI can enable curation and analysis of huge datasets that can enhance research efforts. Large amounts of data are being accrued by industry sources for use by their AI platforms, though the inner workings of these datasets or algorithms are not well known.
Conclusions
AI has found numerous uses in the pre-, intra-, or postoperative spine surgery process, and the applications of AI continue to grow. The clinical applications and benefits of AI will continue to be more fully realized, but so will certain ethical considerations. Making industry-sponsored databases open source, or at least somehow available to the public, will help alleviate potential biases and obscurities between surgeons and industry and will benefit patient care.
Introduction
Across medicine, continued advancements in computing power and data storage over the past several years have produced breakthroughs in the use and applications of artificial intelligence (AI) and its subsets such as machine learning (ML) or deep learning (DL).1,2 Spine surgery is no exception to these developments.3,4 For the past 5-10 years, AI has been applied in spine surgery to preoperative patient stratification, analysis of imaging, curation of large datasets, attempts to predict postoperative outcomes, and more. It is also being applied in multiple ways to spine research.5,6
As the use of AI is still in its relatively nascent phase, articles examining the use of AI in spine surgery have focused mainly on ways in which it may be useful to approximate – or, in some cases, supplement – conventional clinical workflow alongside surgeons or radiologists. 7 Some articles discuss future directions for AI or the limitations that AI may encounter. 8 Fewer articles explore the ethical implications or potential pitfalls of widespread AI use. 9 There is an even greater paucity of articles that discuss the use of AI for analysis of datasets that may not be available for public use. A prime example of this would be the future uses of AI for data curated by industry.
To fill this gap in the literature, the authors herein present a review of AI in spine surgery, including its use across all stages of the perioperative process and applications for research. We furthermore discuss the future applications of AI and ethical implications of its use. Particularly, we discuss the need to scrutinize the relationship between AI, surgeons, and commercial spine industry partners. As large companies embrace the use of AI and continue to collect data regarding procedures and patient outcomes, the ways in which that data is utilized by industry require careful analysis by surgeons themselves. Ultimately, we advocate for making large industry-sponsored databases open source to avoid obscurations between physicians and industry and so that the medical community may use that data to generate independent conclusions.
Applications of AI in Spine Surgery
Preoperative Uses
The primary preoperative uses of AI have been in the domains of image analysis, patient diagnosis and stratification, and aiding in the preoperative decision-making process. Regarding image analysis, multiple AI or ML algorithms have been used to accurately identify various features on X-ray (XR), computed tomography (CT), or magnetic resonance imaging (MRI) studies. This may be as simple as identifying individual vertebral levels or as complex as predicting which patients may develop pathologies in the future based on variables gleaned from imaging. Jimenez-Pastor et al and Jakubicek et al have reported on ML algorithms that achieved correct vertebral level identification rates ranging between 74.8% and 87.1% when using CT scans.10,11 AI has also been used for XR and CT imaging to assess patients’ bone mineral densities.12-14
Other researchers have used AI to stratify pathologies such as spinal stenosis, degenerative changes, scoliosis and other deformities, or severity of trauma.15-26 Wu et al 16 trained a Multi-View Correlation Network (MVC-Net) system that produced a mean error of approximately 4° when estimating Cobb angles from 526 anteroposterior and lateral XR. Niemeyer et al 24 used a DL network that produced a sensitivity of 90.2% and precision of 92.5% in grading disc degeneration among 1599 patients. Han et al 23 trained a DL network to identify lumbar neuroforaminal stenosis with a precision of 84.5% based on T1- and T2-weighted MRI scans for 200 patients. Jamaludin, Kadir, and Zisserman used a Convolutional Neural Network (CNN) framework that identified numerous aspects of spinal degeneration at a performance level that approximated human ability. 22
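To make the quantities in the Cobb angle example concrete, the following is a minimal sketch of how a Cobb angle can be derived from two endplate lines and how a mean absolute error against reference measurements is computed. It is not the MVC-Net method of Wu et al. The function names, the landmark format, and all numeric values are hypothetical and chosen only for illustration. Endplates are treated as undirected lines, so the result is folded into the range 0° to 90°; curves above 90° would need consistently ordered landmarks.

```python
import math


def line_tilt_deg(endplate):
    """Tilt of an endplate line in degrees, given two (x, y) landmarks."""
    (x1, y1), (x2, y2) = endplate
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle between two endplate lines, folded into [0, 90] degrees.

    Assumption: landmark order along each endplate is arbitrary, so the
    lines are treated as undirected.
    """
    diff = abs(line_tilt_deg(upper_endplate) - line_tilt_deg(lower_endplate)) % 180.0
    return min(diff, 180.0 - diff)


def mean_absolute_error(predicted, reference):
    """Mean absolute difference between paired angle measurements."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical landmarks (image coordinates) for the two end vertebrae.
upper = ((10.0, 40.0), (50.0, 48.0))
lower = ((12.0, 160.0), (52.0, 150.0))
print(f"Cobb angle: {cobb_angle_deg(upper, lower):.1f} degrees")

# Hypothetical model estimates vs. manual reference angles, in degrees.
model_angles = [23.5, 41.0, 12.2]
manual_angles = [25.0, 38.0, 12.0]
print(f"Mean absolute error: {mean_absolute_error(model_angles, manual_angles):.2f} degrees")
```

A mean error reported in degrees, as in the study above, can be read against this kind of calculation: it summarizes how far automated angle estimates fall from the reference measurements on average.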
For measurements of spinal deformity, Galbusera et al used AI to assess numerous spinal angles (L1-L5 lordosis, pelvic incidence, sacral slope, pelvic tilt, scoliosis measurements, etc.) for 493 patients. 25 Across all measurements, the standard error of measurements ranged from 2.7° (for pelvic tilt) to 11.5° (for L1-L5 lordosis). Doerr et al 18 used a CNN to predict the thoracolumbar injury classification and severity score (TLICS) and posterior ligamentous disruption of 111 patients with thoracolumbar trauma. The network demonstrated 95.1% accuracy for predicting injury overall and accuracy of 86.3% and 86.8% for TLICS morphology and posterior ligamentous injury, respectively.
Additionally, beyond simply analyzing imaging or generating diagnoses, the preoperative planning process involves complex decision-making regarding which patients would benefit from surgery, the type of surgery to perform, and ensuring that proper informed consent takes place. Inherent to this process are the judgement biases and heuristics that are exhibited by all surgeons. These problems are not exclusive to spine surgery but may be particularly applicable to this specialty as there is often no one definitive answer to addressing spinal pathology. To that end, multiple studies have been conducted to assess the potential beneficial role that AI may play in preoperative decision-making.26-29 Loftus et al 27 have argued that AI may assist in preoperative decision-making through the generation of more accurate prediction models, preoperative scoring systems, or decision aids. Via ML or the use of neural networks that can adapt and learn as they incorporate data, AI tools to aid in complex decision-making can improve as they are applied to more patients or images. 2 In theory, decisions supported by AI may be more objective than those made by individual surgeons. Ames et al 28 used AI-based hierarchical clustering to classify spinal deformity patients into multiple preoperative groups that each had distinct 2-year risk-benefit profiles. Preoperative stratification such as this can help surgeons determine the ideal surgical procedures for patients to produce optimal outcomes while reducing risk. These preoperative planning algorithms may be used in close conjunction with many different AI-based predictors of postoperative complications, which will be discussed later in this article.
Intraoperative Uses
Intraoperatively, AI has been used primarily for image guidance and to improve the accuracy of surgical navigation.30-36 AI has been incorporated into aspects of intraoperative imaging utilizing fluoroscopy, CT or MRI, augmented reality (AR), or virtual reality (VR). Jecklin et al 36 used ML to develop a method by which intraoperative XR studies could be converted into 3-dimensional (3D) reconstructions to aid in intraoperative navigation and decision-making. This enabled an accurate, real-time 3D depiction of spinal architecture to be created from intraoperative fluoroscopy images that are typically used to assess basic 2D anatomy or instrumentation placement. An intraoperative navigation system, the 7D Surgical System, utilizes ML to generate 3D topography of surgical anatomy using either pre- or intraoperative CT imaging. Intraoperative tools can be registered to this topography and the navigation process may be augmented by incorporation of AR tools, if available. 32 This system is incorporated more easily into existing operating room workflow as the generation of 3D topography and assessment of instrumentation insertion do not require the acquisition of intraoperative imaging studies. AI has additionally been used to improve merger of 2-dimensional (2D) images (eg, fluoroscopy) and 3D images (eg, volumetric MRI images) to improve the accuracy of intraoperative navigation based on 2D intraoperative imaging. 33 Given the numerous intraoperative navigation and robotic systems for spine surgery on the market today, 37 continued integration with AI platforms is a near certainty.
AI has also been applied to assess intraoperative pedicle screw placement and strength.35,38,39 Burstrom et al used ML analysis of cone-beam CT imaging to guide pedicle screw placement in 21 cadavers. 35 They found that their algorithm enabled accuracy of at least 86.1%, which was increased to 95.4% if patients with severe spinal deformity or prior spine surgery were excluded. Varghese, Krishnan, and Kumar developed an ML approach to assess pedicle screw pullout strength. 39 They found that a random forest regression model was able to accurately predict the strength of pedicle screws based on bone density, screw depth, and screw insertion angle. Von Atzigen et al have conducted multiple studies examining the use of AR and real-time ML to develop a method of more accurate rod bending for spinal fusions.40,41 They utilized the Microsoft HoloLens and a stereo neural network to analyze the positions of pedicle screw heads. Using that data, their AI algorithm was able to determine the rod shape, which was transmitted to the surgeon via the AR headset. They found that use of their algorithm led to decreased time required to bend each rod as well as fewer rebending maneuvers when compared to freehand or marker-based methods.
Postoperative Uses
Postoperatively, AI has been used to predict a variety of outcomes including hospital length of stay, postoperative complications, hospital readmission, proximal junctional kyphosis, postoperative pain, quality of life, or mortality after surgery for spinal tumors.42-76 Kim et al 67 used ML to identify factors associated with complications after posterior lumbar fusions. When compared to the American Society of Anesthesiologists (ASA) patient classifications, their machine learning algorithms more accurately predicted postoperative cardiac and wound complications, venous thromboembolism, and mortality. In a similar study, Arvind et al used ML to predict postoperative complications after anterior cervical surgery, finding that their algorithms also outperformed the ASA classification system. Jain et al 66 created a predictive model using AI to predict 90-day hospital readmissions and major complications after long-segment lumbar spine fusions.
Arora et al have conducted multiple studies examining factors contributing to extended length of stay for patients undergoing elective spine surgery and surgery for spinal deformity.73,74 Using ML, they created a model that could accurately predict a patient’s length of hospital stay and risk of discharge to rehabilitation when compared to the American College of Surgeons’ National Surgical Quality Improvement Program (NSQIP) prediction calculator. Furthermore, they found that an ML algorithm could predict extended length of stay for patients undergoing multilevel thoracolumbar fusions for spinal deformity, with sensitivity and specificity of 77% and 68%, respectively. Valliani et al 75 used ML to predict excessive length of stay after cervical spine surgery. They were able to accurately predict those patients who were at risk of prolonged hospital stay, with an area under the curve (AUC) of 0.87 for a single-center dataset and 0.84 for a national dataset. Stopa et al 71 validated a previously developed ML algorithm for predicting nonroutine discharges after elective lumbar surgery. This algorithm predicted nonroutine discharges with an AUC of 0.89, positive predictive value of 0.50, and negative predictive value of 0.97.
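The discharge-prediction figures above pair a high negative predictive value with a modest positive predictive value, a pattern that commonly arises when the predicted event is infrequent. The sketch below shows how sensitivity, specificity, and the two predictive values follow from a 2x2 confusion table. The counts are invented for illustration and are not taken from Stopa et al or any other cited study; the function name is likewise hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives, and
    true negatives. A metric with a zero denominator is returned as NaN.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # events correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-events correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged patients who had the event
        "npv": ratio(tn, tn + fn),          # cleared patients who did not
    }


# Hypothetical cohort: 1000 patients, 50 of whom had a nonroutine discharge.
metrics = diagnostic_metrics(tp=40, fp=40, fn=10, tn=910)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up cohort only half of the flagged patients actually have the event even though the model clears patients very reliably, which is why predictive values should be read alongside the event rate of the population in which the model was validated.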
For resection of spinal tumors, Karhade et al 55 developed multiple ML algorithms to predict 30-day mortality after surgery for spinal metastases. Using a variety of prognostic factors, the highest performing algorithm was able to accurately predict mortality. The resulting Skeletal Oncology Research Group (SORG) ML algorithm subsequently was developed into an open-access web application for predicting survival (SORG Spine Metastases Survival Calculator). The SORG ML algorithms for 90-day and 1-year survival have since been validated using additional data, including an international dataset.54,58,77 Karhade et al 78 also used ML to predict 5-year survival for patients undergoing resection of spinal chordomas. Their Bayes Point Machine algorithm demonstrated the highest performance, with a C-statistic of 0.80 and Brier score of 0.16. This algorithm was used to generate another prediction tool application for this patient population.
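For readers less familiar with the two performance measures cited for the chordoma model, the following is a minimal sketch of how a Brier score and a C-statistic can be computed. It assumes a binary outcome at a fixed time horizon with no censoring, in which case the C-statistic equals the AUC; survival analyses with censored follow-up use related but different estimators. The function names and the example probabilities are hypothetical and do not reproduce the analysis of Karhade et al.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1).

    Lower is better; predicting 0.5 for every patient yields 0.25.
    """
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def c_statistic(probabilities, outcomes):
    """Fraction of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted probabilities of the outcome and observed outcomes.
predicted = [0.9, 0.8, 0.3, 0.2, 0.6]
observed = [1, 0, 1, 0, 1]
print(f"Brier score: {brier_score(predicted, observed):.3f}")
print(f"C-statistic: {c_statistic(predicted, observed):.3f}")
```

The two measures answer different questions: the C-statistic reflects how well the model ranks patients (discrimination), while the Brier score also penalizes probabilities that are poorly calibrated.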
Applications of AI in Spine Research
AI’s ability to analyze and identify trends across massive amounts of data makes it particularly attractive for conducting research. The increased use of electronic health records (EHRs), wearable technologies, or other digital means of accruing patient information presents numerous avenues by which a wide array of data may be gleaned via automated methods. The use of large registries – eg, the NSQIP registry or the Quality Outcomes Database (QOD) created by the American Association of Neurological Surgeons – is becoming commonplace for spine research, and multicenter collaborations such as the International Spine Study Group (ISSG) and European Spine Study Group (ESSG) have merged their databases in order to train ML algorithms.2,79,80 When dependent on manual data collection, these growing registries may be difficult to maintain, audit, and update given the high volumes of data influx. Only a minority of large datasets are actually regularly audited to ensure data quality. 80 However, the use of AI to curate these datasets can streamline the input and maintenance processes. The result is relatively “real time” and organized data that may be subsequently analyzed using AI to yield the highest-quality results.
Relatedly, AI algorithms may be used to extract conclusions from a deluge of data that may be too complicated to analyze manually. The use of “big data” in research has taken off across tech, medicine, and beyond. 80 Without the use of ML or other aspects of AI, worthwhile research in this domain would not be possible. Within the field of spine surgery, clinical data from EHRs or patient-reported outcomes surveys is only the tip of the iceberg. Research is also being conducted using wearable technologies, biometric data, or the use of biological samples. 6 When adding in genetic information such as sequencing of genomics, proteomics, or other “-omics,” the amount of data for one patient can be measured in terabytes. 80 AI algorithms can be, and are being, trained to comb through this expanse of data to generate models, identify associations or biases, interpret results, and make predictions at speeds that would never be possible without modern computing power. Further advances in computing, such as quantum computing, will also unlock AI’s true potential. 81 Indeed, the applications of AI in spine surgery may only be limited by widely available high-quality data.
Moreover, while AI continues to prove its worth in the analysis of data generated by existing patients, it may also serve an important role in the generation and analysis of synthetic data. Synthetic data, as its name may suggest, is data that approximates real-world patient information (as it may be modeled on existing patient data to approximate proper distributions) but which is not exclusively derived from actual patients.5,82 This data approximates patient information that would be obtained from typical sources (eg, EHRs) and may be similar with respect to potential confounding variables, lab values, and demographic data. As shown by Greenberg et al, who compared data from cohorts each containing over 12 000 real or synthetic lumbar fusion patients, the results generated from synthetic data can be statistically similar to those generated from real data. 82 The benefits of synthetic data include the ability to create datasets regarding specific conditions. Schonfeld and Veeravagu 83 used a generative adversarial network (GAN) to create synthetic spinal radiographs on which they trained a CNN to classify the radiographs as normal or abnormal. When compared to a CNN that was trained using real patient radiographs, the synthetically trained algorithm demonstrated similar performance (AUC 0.856 for real images vs 0.830 for synthetic images). Since synthetic datasets are not derived from patients, the issues inherent to de-identification of data and the use of protected health information do not apply. There is a decreased need for institutional review board approval or other regulatory oversight for research using synthetic data. Thus, sharing of synthetic data between institutions would be easier than sharing actual patient information.
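As a deliberately simplified illustration of the idea that synthetic records are modeled on the distributions of real ones, the sketch below fits a mean and standard deviation to each numeric variable of a small cohort and then samples new records from independent normal distributions. This is an assumption-laden toy and not the method used by Greenberg et al or the GAN of Schonfeld and Veeravagu: it ignores correlations between variables, the cohort values are invented, and the function and column names are hypothetical.

```python
import random
import statistics


def fit_marginals(rows):
    """Estimate (mean, sample SD) for every numeric column in the cohort."""
    marginals = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        marginals[column] = (statistics.mean(values), statistics.stdev(values))
    return marginals


def sample_synthetic(marginals, n, seed=0):
    """Draw n synthetic records from independent normal distributions.

    Assumption: each variable is roughly normal and independent of the
    others, which real clinical data generally are not.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        {column: rng.gauss(mean, sd) for column, (mean, sd) in marginals.items()}
        for _ in range(n)
    ]


# Invented "real" cohort used only to fit the toy model.
real_cohort = [
    {"age": 54, "bmi": 27.1},
    {"age": 61, "bmi": 31.4},
    {"age": 67, "bmi": 24.8},
    {"age": 72, "bmi": 29.0},
    {"age": 58, "bmi": 33.2},
]

synthetic_cohort = sample_synthetic(fit_marginals(real_cohort), n=1000, seed=42)
real_mean_age = statistics.mean(row["age"] for row in real_cohort)
synthetic_mean_age = statistics.mean(row["age"] for row in synthetic_cohort)
print(f"real mean age: {real_mean_age:.1f}, synthetic mean age: {synthetic_mean_age:.1f}")
```

Comparing summary statistics between the real and synthetic cohorts, as in the last lines, mirrors in miniature the kind of check needed before results derived from synthetic data can be trusted.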
Future Clinical Impact of AI
As it currently stands, the clinical applications of AI are at a stage where they do not eclipse the role of physicians in reading imaging, enacting treatment plans, performing surgery, or overall caring for patients. Though AI may be superior at compiling or analyzing large quantities of data, the studies mentioned in this current article have shown that AI largely is only approximating, not yet surpassing, the clinical skills of the trained spine surgeon. In addition, AI tools have not yet convincingly led to paradigm-changing patient selection or risk modification strategies despite requiring expensive computing power. In sum, AI may be beneficial for performing tasks rapidly but is not yet of convincing utility for surgical decision-making or predictions. And, for multiple reasons, the overall quality of AI or ML research in spine surgery may be questionable (eg, studies not training AI models on sufficient numbers of instances). However, this does not mean that AI is not rapidly being instituted in myriad clinical ways for spine patients. In the not-so-distant future we may reach a point where AI-based tools are used as routine adjuncts to nearly every aspect of the perioperative process.
As we have discussed, AI has been used to stratify patients based on imaging and other comorbid factors to help guide treatment plans or to predict complications and other postoperative outcomes. As these models are honed, their ability to distinguish important prognostic factors will become more powerful and their preoperative predictive abilities will become more accurate. The ability of AI to utilize large amounts of data or comb through vast repositories of imaging will allow it to identify as-yet-unrecognized factors that affect the preoperative planning or postoperative recovery processes. Patients will be more neatly stratified into appropriate categories, which will allow them to receive surgeries that are more tailored towards their patient-specific factors. The ultimate outcome of this will be more personalized spine surgery that will generate improved outcomes for individual patients and hopefully reduce the costs of medical care. This has been identified by Mallow et al 84 in their description of the future of spine surgery, which they have named the Intelligence Based Spine Care Model.
Of course, all this depends on properly honed AI models, which, in turn, require well-curated databases. The quality of AI-generated results is only as good as the data on which the algorithms were trained. Biased data inputs will yield biased models. The same rigorous statistical principles and techniques that are applied to large datasets to ensure that there is no risk of bias for non-AI research should be applied to AI research. Moreover, researchers must avoid falling into a trap by which they seek to only display the best outputs or results from their AI models. AI is still a novel field for spine surgery and research, and those implementing AI should not assume that their results are inherently valid or convincing simply because they were generated using AI.
Ethical Implications of AI
The opportunities created by AI are exciting, but it also is important to note that the use of AI for spine surgery raises potential ethical considerations. Chief among these is how AI will be used by third parties that fall outside of the physician-patient relationship – eg, industry. Spine surgeons and commercial industry partners have a particularly strong and longstanding history of collaboration. The products and technologies created by numerous companies have resulted in more accurate, minimally invasive, and effective surgeries.85-87 Nevertheless, this relationship still requires scrutiny. As identified in a Cochrane Database review, drug and device studies that are sponsored by industry sources more often report positive conclusions than studies sponsored by other sources. 88 There are likely inherent biases to device trials when the manufacturer is assessing outcomes for their own products.
In spine surgery, there have been numerous studies that have analyzed the efficacy of various devices, implants, or other technologies.89,90 A review of these studies indicates that device trials, particularly multicenter trials, are significantly more likely to be sponsored by industry sources. 91 Almost one quarter of all spine device trials are discontinued before completion and half of device trials do not publish their results. 92 Though some of these discontinuations may be due to the high costs and organizational demands required to successfully carry out a clinical trial, the high rate of discontinuations and under-publishing certainly introduces bias to the literature regarding spine devices.
The adoption of AI runs the risk of adding to this bias. Currently, large device companies are partnering with or acquiring AI firms to streamline and augment their abilities to compile large amounts of data regarding the patients for whom their technologies are used. This may include demographic information, perioperative information, or outcomes data. Medtronic, for instance, launched its “UNiD ePRO” data collection service to improve how the company collects data regarding patients and procedures. 93 Powered by Medtronic’s Adaptive Spine Intelligence (ASI) AI platform, UNiD ePRO can extract patient data directly from EHRs and will collect data from every aspect of the perioperative process, including postoperative patient-reported outcome responses. Medtronic is optimistic that digitizing this process will significantly increase patient response rates regarding outcomes. Other companies such as Stryker and SMAIO have followed suit and are using AI to collect or interpret data regarding patients, imaging, procedures, or outcomes.94,95
These companies emphasize that the use of this data will enhance the power of their AI and other predictive algorithms to stratify patients, identify the proper surgical procedure, reduce complications, and improve patient-reported outcomes. They also highlight these developments as ways to aid surgeons. As Stryker’s CEO detailed in an interview regarding his company’s use of AI for shoulder replacements, “we already had one application for shoulder replacements where we use AI based on a scan to actually suggest to the surgeon what type of implant they should use.” 96 The same goals are no doubt being applied to spinal implants.
The use of AI to predict the ideal patient, surgical strategy, and surgical implants for a particular procedure could certainly lead to more personalized surgery. It may also lead to superior outcomes. As mentioned previously in this current article, future iterations of AI will be able to incorporate and synthesize data that may not have been considered by purely human analyses. That said, there is still the overarching threat of bias with respect to the physician-industry relationship. Device companies continue to amass large amounts of patient data and the rate of data collection will only increase in the future. Yet it is not entirely transparent how companies are using this data or how they curate their databases. Are they able to overcome the inherent flaws that may exist in industry-sponsored studies to objectively counsel on strategy and indications for their implants and enabling technologies? Biases can be difficult to edit out of AI models. 97 Or, conversely, will they be able to objectively decide when their implants may not be indicated at all? The goal of having AI recommend management strategy options will only be truly trusted and implemented when the potential financial conflicts of interest are minimized via independent scrutiny. Surgeons must be able to interpret the data that is used or generated by AI to incorporate that information into the patient’s clinical care plan. Otherwise, the ultimate outcome approximates one in which industry can dictate which procedures or which devices may be used for a particular case. Industry having a monopoly in this space may impede surgeon autonomy or decision-making, hinder important research, and increase the prices of implants or devices. Concerningly, insurance coverage or approval could then become linked to these opaque databases or AI models, requiring physicians to satisfy industry demands before approving clinical decisions, which may not actually facilitate the best patient care.
To that end, surgeons can help counter what is currently a primarily one-way flow of information. Much of this data collection may not be possible without the consent of surgeons themselves. Surgeons oftentimes must enter into an agreement with companies so that their procedure and outcomes data may be added to these repositories. Surgeons having a clear understanding of how companies plan on using this data should be a baseline requirement before they sign over permission to access it. Surgeons should also advocate for how their data could or should be used by companies. This, though, does not necessarily address the full problem. While industry may outline reasonable goals or methods for how they intend to use specific data, having a full, nuanced, objective understanding cannot be achieved without access to the data itself.
Going a step further, then, we assert that the databases that have been created by these companies should be de-identified of patient-, provider-, and facility-related information and subsequently made open source so that they may be analyzed by non-industry entities. It is imperative that these data be available to the scientific community for independent analysis and validation. Doing so would hopefully address several ambiguities. It would enable auditing of the data by which AI algorithms are trained, partially addressing the “black box” problem innate to some AI and exposing potential biases in the data that would affect the algorithms’ predictive abilities. It would allow institutions without financial incentives to compare their results to those of various companies, providing more validity to results. Spine surgery journals could aid in this endeavor by taking a hardline stance on issues of data use and open access, requesting that authors who seek to publish their results supply appropriate data sharing plans.
If not open source, then these datasets could be overseen by independent advisory boards. Certain technology companies have made a practice of selling de-identified medical data for use in research or other analytics. Doing so for industry data would be a suboptimal but feasible move by which these data could be made accessible to physicians or researchers for a certain price. Ultimately, this would also allow for monitoring of the physician-industry relationship by physicians themselves. Industry partnership is necessary to optimally support patient care in spine surgery, but each side must place the patient’s well-being above all else. This is only fully achieved by transparent data sharing between both sides. Our stance, we understand, may seem fanciful or unrealistic, but we believe that open-source databases are a necessary step in advancing patient care, fostering stronger collaborations, and accelerating scientific advancement in the field of spine.
Conclusions
AI has already found numerous uses in the pre-, intra-, or postoperative spine surgery process, and the applications of AI continue to grow. AI will enable far more efficient collection of data than has previously been achievable. It will also allow for better analysis of huge swaths of data. The clinical applications and benefits of AI will continue to be more fully realized, but so will certain ethical considerations. Large amounts of data are being accrued by industry sources for use by their AI platforms, though the inner workings of these datasets or algorithms are not well known. The authors posit that making industry-sponsored databases open source, or at least somehow available to the public, will help alleviate some of the biases and obscurities between surgeons and industry and will ultimately prove beneficial for patient care.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Correction (October 2024):
The article type has been updated to Narrative Review since its original publication.
