Abstract
Study Design
Retrospective analysis of prospectively collected data.
Objectives
To compare decision-making between an expert panel and real-world spine surgeons in thoracolumbar burst fractures (TLBFs) without neurological deficits and analyze which factors influence surgical decision-making.
Methods
This study is a sub-analysis of a prospective observational study of TL fractures. Twenty-two experts were asked to review 183 CT scans and recommend treatment for each fracture. The expert recommendation was based on radiographic review.
Results
Overall agreement between the expert panel and real-world surgeons regarding surgery was 63.2%. In 36.8% of cases, the expert panel recommended surgery that was not performed in real-world scenarios. Conversely, in cases where the expert panel recommended non-surgical treatment, only 38.6% received non-surgical treatment, while 61.4% underwent surgery. A separate analysis of A3 and A4 fractures revealed that the expert panel recommended surgery for 30% of A3 injuries and 68% of A4 injuries. However, 61% of patients with A3 fractures and 61% of patients with A4 fractures received surgery in the real world. Multivariate analysis demonstrated that a 1% increase in certainty of PLC injury led to a 4% increase in surgery recommendation among the expert panel, but only a .2% increase in the likelihood of receiving surgery in the real world.
Conclusion
Surgical decision-making varied between the expert panel and real-world treating surgeons. Differences appear to be less evident in A3/A4 burst fractures, making this specific group of fractures a real challenge independent of the level of expertise.
Introduction
The thoracolumbar junction is the region of the spine most susceptible to injury. About two thirds of all spinal fractures occur at this site. The most common mechanisms are falls from height, sports injuries, and motor vehicle accidents.1 The axial impact to the spine, with or without a concomitant bending moment, may lead to fracture of the vertebral column.2 In addition to the high impact on quality of life, there are high socioeconomic costs due to treatment, loss of work, and short- and long-term disability.2,3 Therapeutic options include non-surgical and surgical treatment in fractures without neurological compromise, while surgical treatment is more commonly recommended in cases with neurological compromise or refractory pain. Despite the frequency and severe sequelae of the fracture, treatment of thoracolumbar burst fractures (TLBF) in patients who are neurologically intact (Type A3/A4 according to the AO Spine Thoracolumbar Injury Classification System) remains controversial.
The AO Spine classification for thoracolumbar fractures has shown substantial inter- and intraobserver reliability (kappa = .62).4 A proper classification allows the comparison of different therapeutic approaches among injuries with similar morphological characteristics.
Despite the substantial reliability of the AO Spine classification in general, thoracolumbar fractures may present a heterogeneous radiological pattern, and misclassification is possible. Interestingly, the lowest agreement (kappa = .19) within the validation study was found for A4 fractures. There is an ongoing debate in literature for the best treatment of A3 and A4 fractures.5,6 Failing to recognize a specific fracture pattern might be one of the factors that contributes to a wide disparity in treatment recommendations.
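For readers less familiar with the kappa statistic cited above, the following is a minimal sketch of how Cohen's kappa corrects raw agreement between two raters for agreement expected by chance. The function name `cohen_kappa` and the example ratings are hypothetical illustrations and are not data from the validation study, which may well have used a multi-rater variant of the statistic.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases given identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# HYPOTHETICAL classifications of ten fractures by two raters (illustration only).
rater_1 = ["A3", "A3", "A4", "A4", "A3", "A4", "A3", "A3", "A4", "A4"]
rater_2 = ["A3", "A4", "A4", "A4", "A3", "A3", "A3", "A3", "A4", "A3"]
kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 7 of 10 cases, yet half of that agreement would be expected by chance alone, which is why kappa is considerably lower than the raw agreement rate.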
The AO Spine classification provides modifiers for more precise fracture analysis, arguing for or against surgical treatment. However, these modifiers are used inconsistently in both clinical and research settings. Beyond the exact fracture pattern of the bone and potential modifiers, the integrity of the posterior ligamentous complex (PLC) is thought to be very important in guiding surgeons' therapeutic decisions.7,8 Injury to the PLC affects the statics of the spine and may increase the risk of fracture subsidence.
In addition to fracture classification and related factors, the decision-making process for treatment may also be affected by variables such as the surgeon’s educational background, individual experience, regional treatment guidelines, resource availability, and patient preferences. Consequently, the extent to which radiological characteristics, such as fracture morphology, and fracture classification are influenced by a surgeon’s expertise remains uncertain. This equipoise study offers the unique opportunity to compare how the radiographic assessment influences decision making and how it differs from the actual decision making where other factors are considered.
The aims of this study are: (1) To analyze the degree to which the classification of fractures and the treatment recommendations of a panel of expert surgeons correlate with the actual classification and treatment that patients receive in a real-world setting. (2) To evaluate the influence of radiological characteristics, such as the probability of PLC injury and the M1 case-specific modifier, on decision-making.
Methods
This study was conducted by the AO Spine Knowledge Forum Trauma, an expert research study group supported by the not-for-profit organization AO Foundation whose members are generally recognized for their academic expertise in spinal trauma. The study is part of a prospective observational study of TL Fractures, the Spine TL A3/A4 study. 9
The detailed methodology is available in the article by Dandurand et al, “Understanding Decision Making as it Influences Treatment in Thoracolumbar Burst Fractures Without Neurological Deficit: Conceptual Framework and Methodology”, in this focus issue. The AO Spine Knowledge Forum Trauma completed consent and recruitment for a multicenter prospective observational study of TL fractures, the Spine TL A3/A4 study. Each enrolling center obtained approval from its local institutional review board. The baseline CT scans and conventional radiographs of 183 patients were available for this study. All patients were neurologically intact and had injuries between T11 and L2.
In the analysis presented here, we compare the expert group’s classification and treatment recommendations with the treating surgeons’ actual on-site decisions (Aim 1). Twenty-two spine trauma experts from different parts of the world were asked to recommend a treatment for 183 TLBFs. The 22 experts, all with extensive experience in the management of spinal trauma, were recruited from the AO Spine Knowledge Forum Trauma (KF Trauma). Each member of the expert panel independently reviewed the DICOM images of the 183 TL fracture cases and was asked to classify each injury based on the latest AO TL Injury Classification System and to assess the degree of certainty of PLC disruption and the degree of comminution. Finally, they were asked to recommend treatment (surgical or non-operative), to specify the type of treatment, and to state how confident they were in this recommendation. The experts were blinded to the actual treatment that each patient received within the Spine TL A3/A4 study and to any results of that study.
Each expert recommendation was compared with the treatment actually received in the real world. We used descriptive statistics and cross-tabulation to compare the overall treatment recommendation of the expert group with the actual on-site treatment. In addition, we examined the treatment decision or recommendation in relation to the fracture pattern according to the AO Spine TL injury classification system. To test validity, we used the Pearson chi-square test.
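The cross-tabulation and Pearson chi-square test described above can be sketched as follows. This is an illustrative example only: the counts are HYPOTHETICAL and are not the study data, the helper names `chi_square_2x2` and `row_percentages` are our own, and the study itself used statistical packages rather than hand-written code.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table[i][j] is the count for expert recommendation i and actual treatment j.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def row_percentages(table):
    """Share of each actual treatment within each expert recommendation."""
    return [[100.0 * cell / sum(row) for cell in row] for row in table]


# HYPOTHETICAL counts (illustration only, not the study data).
# Rows: expert recommended surgery / non-surgical care.
# Columns: patient actually received surgery / non-surgical care.
crosstab = [
    [60, 40],
    [50, 50],
]
statistic, p_value = chi_square_2x2(crosstab)
print(row_percentages(crosstab))
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

The row percentages correspond to the kind of figures reported in the Results (the share of expert surgical recommendations that were actually followed by surgery), while the chi-square test asks whether actual treatment is associated with the expert recommendation at all.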
To evaluate the influence of radiologic characteristics on decision-making (Aim 2), we focused on the presence of PLC injury and the case-specific M1 modifier (indeterminate injury to the tension band based on MRI). The experts were asked to assess the probability (0%–100%) of the presence of a PLC injury based on the available images. PLC values and the M1 modifier were used as independent variables in a multivariate logistic regression model. SPSS, R and SAS were used for data analysis.
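To illustrate the type of model described above, the following is a minimal sketch of a logistic regression with PLC certainty (0 to 100) and M1 presence as independent variables, fitted by Newton-Raphson on SIMULATED data. Everything here is an assumption made for illustration: the simulated coefficients, the sample size, and the helper names (`fit_logistic`, `solve`, `sigmoid`) are hypothetical, and the actual analysis was run in SPSS, R and SAS on the study data.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, outcomes, iterations=25):
    """Newton-Raphson fit; each row is [1.0, plc_certainty_percent, m1_present]."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        gradient = [0.0] * k
        information = [[0.0] * k for _ in range(k)]  # X' W X
        for x, y in zip(rows, outcomes):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            for i in range(k):
                gradient[i] += (y - p) * x[i]
                for j in range(k):
                    information[i][j] += p * (1.0 - p) * x[i] * x[j]
        step = solve(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# SIMULATED data with hypothetical coefficients (not the study data).
random.seed(0)
TRUE_BETA = [-2.5, 0.04, math.log(4.0)]  # intercept, per PLC point, M1 present
rows, outcomes = [], []
for _ in range(2000):
    x = [1.0, random.uniform(0.0, 100.0), 1.0 if random.random() < 0.3 else 0.0]
    p = sigmoid(sum(b * v for b, v in zip(TRUE_BETA, x)))
    rows.append(x)
    outcomes.append(1 if random.random() < p else 0)

beta = fit_logistic(rows, outcomes)
odds_ratios = [math.exp(b) for b in beta[1:]]
print("Odds ratio per 1-point rise in PLC certainty:", round(odds_ratios[0], 3))
print("Odds ratio for M1 modifier present:", round(odds_ratios[1], 2))
```

Exponentiating a fitted coefficient gives the odds ratio for a one-unit change in that variable, which is one common way that per-1% effects of PLC certainty and fold-changes for a binary modifier are reported.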
Results
Overall Recommendations
We included 183 cases of TLBFs without neurologic deficit. Twenty-two experts were asked to give a treatment recommendation after reviewing all available radiological images without knowing the actual treatment these patients had received. In total, we received 3929 aggregated recommendations.
Correlation Between the Expert Panel and Real-World Treating Surgeons
Regarding surgical treatment, the overall agreement between the expert panel and real-world surgeons was 63.2%. In contrast, in 36.8% of cases, the expert panel recommended surgery that was not performed in the real-world scenario.
Conversely, among all cases where the expert panel recommended non-surgical treatment after reviewing the imaging, only 38.6% actually received non-surgical treatment. The remaining 61.4% of patients received surgery.
Comparison Between Expert Panel Surgeons and Real-World Treating Surgeons Based on Injury Fracture Pattern
Surgery Recommendations by Expert Panel Based on Injury Pattern Groups.
Surgery Actually Performed by Real-World Treating Surgeons Based on Injury Pattern Groups.
Comparison Between Expert Panel Surgeons and Real-World Treating Surgeons Based on Distinction Between A3 and A4 Fracture Pattern.
The expert panel was as likely to recommend surgery as non-surgical care (51% vs 49%) when A3 and A4 fractures were considered together, whereas in the real world almost two thirds of patients were treated surgically (62%). When A3 and A4 fractures were explored separately, the expert panel recommended surgical care in 30% of A3 injuries and 68% of A4 injuries. In the real world, an identical proportion (61%) of patients with A3 and A4 fractures actually received surgery.
Correlation of PLC Injury Identification and M1-Modifier in Surgical Recommendation Between Expert Panel and Real-World Treating Surgeons
Influence of Surgery Recommendation Based on PLC and Modifier Likelihood Among the Panel of Experts and Real-World Treating Surgeons.
Discussion
Treatment decision-making in TL fractures without neurological compromise is controversial among spine surgeons.10-12 The recommendations may vary according to the surgeon’s experience and background, the patient’s status, comorbidities, and proper classification based upon imaging findings.10-12 In this study, we compared treatment recommendation from international experts reviewing available imaging data to on-site spine surgeons treating patients with these fractures in a real-world setting.
Interesting differences can be found between the therapeutic recommendations of the expert panel and the actual real-world decisions. When the expert group recommended surgical treatment based on imaging findings, 63.2% of the patients actually underwent surgical treatment in the real world, compared with 36.8% who were managed non-surgically.
Particularly interesting are the patients for whom the experts did not consider initial surgical treatment. Almost two thirds (61.4%) of these patients nevertheless received surgical treatment from the treating surgeon on site, while only 38.6% were treated non-surgically.
In the B1, B2 and C type subgroups of injuries, the expert panel strongly favored surgery (97%), but only 70% of these patients were actually treated with surgery in the real world. This strongly suggests that variables other than the morphological appearance of the fracture pattern on CT are directing actual real-world treatment decisions. These variables could include strong local preferences for non-surgical or surgical care among specific surgeons that override any considerations based upon morphological features and subsequent classification categories. It may be that patient preference has a stronger influence in the real world and guides treatment. Finally, issues such as individual patient characteristics (body habitus, etc) and resource availability may influence treatment decisions.
Interestingly, when comparing surgical indications in the A3/A4 fracture patterns, the differences between the expert panel and real-world treating surgeons were less evident, with an almost identical proportion of surgical and non-surgical recommendations from the expert panel (49% and 51%). In the real world, 61% of both A3 and A4 fractures were treated surgically.
The distinction between A4 and A3 fractures becomes particularly intriguing when the two are analyzed separately. The expert panel differentiated these two injury types by recommending surgery for 68% of A4 cases and non-surgical care for 70% of A3 cases. However, this distinction does not seem to fit real-world treatment decisions, as both A3 and A4 fractures underwent surgical treatment in 61% of cases, irrespective of the morphological differences between A4 and A3 injuries outlined in the AO classification.
There are several radiographic characteristics that are thought to influence the categorization of TLBFs and guide treatment recommendations. These include the degree of vertebral body comminution, local kyphosis, PLC integrity, and canal compromise. Added to this are the influence of each surgeon’s experience, local resource availability, patient preference, and deeply held local beliefs regarding the best treatment approach for these fractures. Which of these factors have a strong influence on decision-making remains unclear, although we have shown that radiographic analysis by a group of experts is not adequate on its own to identify which patients actually receive surgery in the real-world setting.
In a prospective randomized trial, Wood et al compared non-operative vs operative management of TLBF without neurological compromise. They reported that patients managed non-operatively had less pain and better function (ODI of 20 in the operative group vs 2 in the non-operative group, P < .001) without significant differences in local kyphosis (13° for those who had surgery vs 19° for those treated non-surgically, P = .003) at long-term follow-up (16 to 22 years).14
On the other hand, Siebenga et al demonstrated in a multicenter randomized trial that there was some patient benefit when treated surgically with a short segment fusion, with better functional outcome scores (VAS pain, VAS Spine Score and RMDQ-24).15
Finally, in a meta-analysis by Gnanenthiran et al comparing non-operative vs operative treatment for burst fractures without neurological compromise, surgical treatment showed a slight improvement in residual kyphosis (12.8
In our study, the expert panel of surgeons recommended treatment based only on radiological imaging characteristics. Many authors have analyzed the value of imaging studies when classifying thoracolumbar fractures. Rajasekaran et al evaluated the classification of 30 thoracolumbar injuries by 41 spine surgeons. They reported that x-rays alone were insufficient to classify fractures correctly (only 43.4% of the cases were correctly classified), that CT added significant accuracy to the classification process (a further 18.2%), and that adding MRI gave a modest gain in sensitivity.17 They concluded that CT was mandatory for proper classification.
When analyzing the factors that increased the likelihood of a surgical recommendation from the expert panel, we observed that PLC injury and the presence of the M1 case-specific modifier were the most important. We did not analyze the M2 (patient-specific comorbidity) modifier as the expert panel was unaware of baseline clinical information. Expert surgeons were asked to rate their certainty of a PLC injury from 0% to 100%. Among the expert panel, the likelihood of indicating surgery increased by 4% for each 1% increase in suspicion of PLC injury. This contrasted with the real-world surgeons, for whom the indication for surgery was not as clearly influenced by radiographic evidence of PLC disruption, increasing by only 1% for each 1% increase in the likelihood of PLC injury. CT-suspected PLC injury therefore has an essential role in expert decision-making. However, the reliability of identifying PLC integrity on CT is also subject to error and variation,18-20 and this could affect decision-making even among experts. In addition, the presence of the M1 modifier increased the likelihood of a surgical recommendation by the expert panel 4-fold, meaning that these two factors were significant radiological predictors of surgical recommendations from the experts. Among the real-world surgeons, the presence of the M1 modifier did not reach statistical significance and therefore probably did not influence surgical decision-making. We believe that in the real world, there must be other factors influencing surgical decision-making in addition to the fracture morphology that was made available to the expert panel.
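To give a sense of scale for these per-1% effects, the sketch below shows how such an effect compounds over a larger difference in PLC certainty. It rests on stated ASSUMPTIONS: that the reported 4% and 1% increases can be read as odds ratios of 1.04 and 1.01 per percentage point of certainty, and that a patient starts from a purely hypothetical 50% baseline probability of surgery. The function names are our own and none of the printed values are study results.

```python
def odds_multiplier(odds_ratio_per_point, points):
    """Compound a per-point odds ratio over a number of percentage points."""
    return odds_ratio_per_point ** points


def shift_probability(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# ASSUMED odds ratios per percentage point of PLC certainty (see lead-in).
ASSUMED_ODDS_RATIOS = {"expert panel": 1.04, "real-world surgeons": 1.01}
BASELINE = 0.50        # hypothetical baseline probability of surgery
CERTAINTY_GAIN = 20    # hypothetical 20-point rise in PLC certainty

for group, per_point in ASSUMED_ODDS_RATIOS.items():
    multiplier = odds_multiplier(per_point, CERTAINTY_GAIN)
    probability = shift_probability(BASELINE, multiplier)
    print(f"{group}: odds x{multiplier:.2f}, probability {probability:.2f}")
```

Under these assumptions, a 20-point rise in PLC certainty roughly doubles the odds of a surgical recommendation for the expert panel (a multiplier of about 2.2), whereas the same rise changes the odds for real-world surgeons only modestly (about 1.2).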
This study has several strengths. Firstly, the retrospective analysis of prospectively collected data lends credibility to the findings. The comparison of decision-making between an expert panel and real-world spine surgeons in TLBFs without neurological deficits provides valuable insights into the factors that influence surgical decision-making. Additionally, the use of a diverse group of 22 experts to review 183 acute, baseline injury CT scans further strengthens the study.
However, there are also limitations to this study. As a sub-analysis of a prospective observational study in thoracolumbar fractures, the results might not be generalizable to other types of fractures or different patient populations. Moreover, the study only assessed the impact of radiographic review on the experts’ decision-making and did not consider other factors that may have influenced the actual treatment received by patients, such as resource availability, regional treatment protocols, or patient preferences.
Despite these limitations, the study offers valuable insights into the differences in surgical decision-making between expert panels and real-world treating surgeons. It highlights the challenges associated with A3/A4 burst fractures, emphasizing the need for further research to understand the factors that contribute to the discrepancies in surgical recommendations and actual treatments.
Conclusion
Radiologic factors are essential in decision-making for TLBFs. International experts who assessed only the available imaging agreed with the treating surgeons on site in 63% of cases when deciding on surgical treatment. On the other hand, 61% of patients for whom the experts suggested nonoperative management underwent surgical treatment, indicating the possible influence of other factors. While the expert assessors differentiated between A3 and A4 fractures, this was not the case in the real-world tendency to treat patients surgically or non-surgically. As the likelihood of PLC injury increased and modifiers were present, the likelihood of the experts recommending surgery also increased, suggesting that better identification of PLC injury may improve the capacity of the AO Classification system to predict final treatment.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was organized and funded by AO Spine through the AO Spine Knowledge Forum Trauma, a focused group of international Trauma experts. AO Spine is a clinical division of the AO Foundation, which is an independent medically-guided not-for-profit organization. Study support was provided directly through AO Network Clinical Research.
