Abstract
Study Design
Prospective Observational Study.
Objective
To determine the alignment of the AO Spine Thoracolumbar Injury Classification system and treatment algorithm with contemporary surgical decision making.
Methods
183 cases of thoracolumbar burst fractures were reviewed by 22 AO Spine Knowledge Forum Trauma experts. These experienced clinicians classified the fracture morphology, integrity of the posterior ligamentous complex and degree of comminution. Management recommendations were collected.
Results
There was a statistically significant stepwise increase in rates of operative management with escalating category of injury (P < .001). An excellent correlation existed between recommended expert management and the actual treatment of each injury category: A0/A1/A2 (OR 1.09, 95% CI 0.70-1.69, P = .71), A3/A4 (OR 1.62, 95% CI 0.98-2.66, P = .58) and B1/B2/C (OR 1.00, 95% CI 0.87-1.14, P = .99). Thoracolumbar A4 fractures were more likely to be surgically stabilized than A3 fractures (68.2% vs 30.9%, P < .001). A modifier indicating indeterminate ligamentous injury increased the rate of operative management when comparing type B and C injuries to type A3/A4 injuries (OR 39.19, 95% CI 20.84-73.69, P < .01 vs OR 27.72, 95% CI 14.68-52.33, P < .01).
Conclusions
The AO Spine Thoracolumbar Injury Classification system introduces fracture morphology in a rational and hierarchical manner of escalating severity. Thoracolumbar A4 complete burst fractures were more likely to be operatively managed than A3 fractures. Flexion-distraction type B injuries and translational type C injuries were much more likely to have surgery recommended than type A fractures regardless of the M1 modifier. A suspected posterior ligamentous injury increased the likelihood of surgeons favoring surgical stabilization.
Keywords
Introduction
Thoracolumbar burst fractures are common traumatic injuries which result from a combination of axial loading and a variable degree of flexion.1-3 Most commonly occurring at the junction of the relatively rigid kyphotic thoracic spine and more mobile lordotic lumbar spine, these fractures have the potential to cause significant disability even when not associated with neurological injury. The contemporary management of thoracolumbar burst fractures without neurological deficit remains controversial.4-6
On one hand, some have argued that there are equivalent functional outcomes between operative and non-operative treatment thereby inherently favoring the avoidance of surgical management and its attendant complications.7-9 On the other hand, surgical stabilization affords immediate surgical correction of spine alignment and obviates the purported delayed risks of post-traumatic deformity.10,11 Gertzbein et al argued there was a clear improvement between the degree of kyphotic deformity and subsequent functional outcomes with surgical stabilization and this was shown with good effect by Schnake et al.12,13 However, Thomas et al found that there was no clear link between posttraumatic kyphosis and clinical outcomes.14-19 Hence at present this brings to light a situation of true equipoise for clinicians and has formed the basis for our need to better understand the rationale used by treating surgeons who review the morphologic features on CT scans, proceed to a validated universally accepted classification system and ultimately rely on an evidence-based algorithm to select the most appropriate treatment. 20
It is vital to improve our understanding of these thoracolumbar injuries and the surgical decision-making processes in order to reduce surgeon equipoise and the variability in recommended treatments for these injuries. Indeed, there is a broad spectrum of thoracolumbar burst fractures. A purely bony incomplete A3 burst fracture morphology is clearly different from a complete A4 burst fracture with indeterminate disruption of the posterior ligamentous complex. 21 It is essential to identify the critical decision-making factors which expert clinicians currently rely upon when determining the need for surgical stabilization to construct a validated classification system which forms the basis for evidence-based decision making.
The AO Spine thoracolumbar injury classification system provides a hierarchical framework to grade fractures of escalating severity whilst taking into consideration neurological status, integrity of the posterior ligamentous complex as well as unique modifying bone diseases such as long segment ossifying bone disease. The most recently updated iteration of the AO Spine Thoracolumbar Injury Classification and Severity Score was published by Morrissey et al in 2020. 22 This algorithm and scoring system incorporates the newly proposed category of type A fractures in which an A3 fracture is awarded a score of 3 points whilst the A4 fracture is allocated 5 points. The cumulative score directs surgical treatment whereby a score of 3 or less is non-surgical and a score of 6 or more is surgical. In this AO Spine TL Injury Severity Score, all A3 fractures in neurologically intact patients without M1 modifiers would score 3 and thus be deemed appropriate for non-operative management. Conversely, any A4 fracture in a neurologically intact patient with an M1 modifier would score 6 points and necessitate surgical treatment.
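The scoring logic described above can be illustrated with a minimal Python sketch. It covers only the values stated in this paragraph (A3 = 3 points, A4 = 5 points, thresholds of 3 or less and 6 or more) for neurologically intact patients; the single point for the M1 modifier is inferred from the statement that an A4 fracture with M1 scores 6. The function and label names are hypothetical and chosen for illustration, and this is not a complete implementation of the published score.

```python
# Illustrative sketch only: point values are those quoted in the text for
# neurologically intact patients; other morphologies are deliberately omitted.
MORPHOLOGY_POINTS = {"A3": 3, "A4": 5}

# Inferred from the text: A4 (5 points) with an M1 modifier scores 6.
M1_MODIFIER_POINTS = 1


def tl_aosis_score(morphology: str, m1_modifier: bool = False) -> int:
    """Sum morphology points and the optional M1 modifier point."""
    score = MORPHOLOGY_POINTS[morphology]
    if m1_modifier:
        score += M1_MODIFIER_POINTS
    return score


def treatment_recommendation(score: int) -> str:
    """Map a cumulative score onto the treatment bands given in the text."""
    if score <= 3:
        return "non-surgical"
    if score >= 6:
        return "surgical"
    # Scores between the two stated thresholds have no fixed recommendation.
    return "indeterminate"


if __name__ == "__main__":
    for morphology in ("A3", "A4"):
        for m1 in (False, True):
            score = tl_aosis_score(morphology, m1)
            print(morphology, "M1" if m1 else "no M1", score,
                  treatment_recommendation(score))
```

Under these assumptions an A3 fracture with an M1 modifier and an A4 fracture without one both fall between the two thresholds, where the algorithm leaves the decision to the treating surgeon.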
The goal of this study is to determine the alignment of the AO Spine Thoracolumbar Injury Classification system and treatment algorithm with contemporary surgical decision making. This is achieved by comparing the responses of the expert panelists in reference to this treatment algorithm and scoring system in order to determine if there is an increased likelihood of operative management of thoracolumbar burst fractures without neurological deficit as the AO Spine category of the injury escalates and as the Severity Score increases.
Methods
A radiographic evaluation and treatment decision questionnaire was sent to 22 experienced spine trauma clinicians from the AO Spine Knowledge Forum Trauma. These expert surgeons were asked to review 183 conventional radiographs (CRs), baseline CT scans, and magnetic resonance imaging (MRI) of neurologically intact patients who had sustained thoracolumbar burst fractures between the levels of T11 and L2. After categorizing the fracture morphology using the AO Spine Thoracolumbar Injury Classification system, clinicians rated the degree of comminution and confidence that the posterior ligamentous complex was disrupted. Preferences for surgical or non-surgical treatment were collected for each case with the surgeons being blinded as to the actual treatment strategy employed.
All patients were aged 18-65 years and recruited from the observational clinical trial titled: “Thoracolumbar burst fractures (AO Spine A3, A4 fractures) in neurologically intact patients: An observational, multicenter cohort study comparing surgical vs non-surgical treatment. (Spine TL A3/4 Study)”. The fractures were all acute in nature, defined as being diagnosed within 10 days of the trauma, and carried a thoracolumbar injury classification score (TLICS) of 2 to 5. 23 Patients were excluded if they had severe medical comorbidities precluding surgery, prior spinal surgery, multi-trauma with an injury severity score greater than 16, associated malignancy leading to pathological fractures, or other severe injuries which would limit neurological assessment, or if they were current prisoners. Management preferences for other thoracolumbar fractures across the type A, B and C categories were also collected for comparison.
After collecting all incomplete and complete responses to the questionnaire, categorical data were analyzed by the chi-square test or Fisher’s exact test. Multivariable logistic regression analysis was performed to determine the likelihood of fracture type and presence or absence of the M1 modifier affecting surgical decision making. The statistical significance level was defined as P < .05. All statistical analysis was performed with SAS version 9.4 (SAS Institute Inc.) and SPSS 28.
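As a simplified illustration of the kind of comparison reported here, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval and a Pearson chi-square P value (without continuity correction) for a single 2 x 2 table. It is not the multivariable model fitted in SAS or SPSS for this study; the function name is hypothetical and the counts in the example call are invented placeholders, not study data.

```python
import math


def two_by_two_summary(a: int, b: int, c: int, d: int) -> dict:
    """Summarise a 2 x 2 table with all four cells non-zero.

    a = group 1, surgery recommended      b = group 1, non-surgical
    c = group 2, surgery recommended      d = group 2, non-surgical
    """
    odds_ratio = (a * d) / (b * c)

    # Wald confidence interval, built on the log odds ratio scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_975 = 1.959963984540054  # 97.5th percentile of the standard normal
    ci_low = math.exp(math.log(odds_ratio) - z_975 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z_975 * se_log_or)

    # Pearson chi-square statistic for a 2 x 2 table (1 degree of freedom).
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi_square / 2))

    return {
        "odds_ratio": odds_ratio,
        "ci_95": (ci_low, ci_high),
        "chi_square": chi_square,
        "p_value": p_value,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    result = two_by_two_summary(a=30, b=10, c=10, d=30)
    print(result)
    print("significant at P < .05:", result["p_value"] < 0.05)
```

When any expected cell count is small, Fisher's exact test, as named in the Methods, is the usual alternative to the chi-square approximation used in this sketch.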
The expert panel of 22 assessors was blinded to the actual recommended treatment strategy. Thus there were two treatment recommendations collected for each case. One was the recommendation by the 22 expert panel members (Expert Recommendation) and the other was the actual treatment the patient received (Real-World Treatment) in the Spine TL A3/A4 prospective observational study.
Results
Correlation Between Fracture Subtypes and Decision Making
Variations in management of type A, B and C fractures with comparison between the recommended vs real world management strategy.
Multivariable Association of Fracture Subtypes with Surgical Decision Making
Logistic Regression Analysis regarding likelihood of recommending surgical fixation rather than conservative management.
Association Between Algorithm Recommended Management and Actual Management Decision
Correlation between recommended management using the AO Thoracolumbar trauma classification system and the actual management treatment.
Discussion
The management of thoracolumbar burst fractures without neurological deficit remains controversial.12,14,24 Our study has identified that there is support for the graduated introduction of fracture morphologies by the AO Spine Thoracolumbar Injury Classification system and Treatment Algorithm and Severity Score.22,25 We determined that there was a stepwise increase in the likelihood of operative management across the individual fracture morphology categories. With respect to the broad categories, type B/C injuries had a greater likelihood of being surgically stabilized than type A injuries. More specifically, type B fractures were more likely to be operatively managed than A3/A4 injuries. Subgroup analysis also revealed that patients with the complete A4 injury subtype were more likely to have surgery recommended by the expert panel than those with the A3 morphology. However, there were some instances in which real-world management decisions did not strictly align with the recommendations of the TL Algorithm and Severity Score. For example, in the real world surgery was performed in an identical percentage of A3 and A4 fractures without distinction. As such, the AO Spine Thoracolumbar Injury Classification was designed and envisioned by clinicians to formulate consistent treatment strategies based upon a validated treatment algorithm.1,26 This is especially important because, despite a bewildering number of existing grading schemes, until now a universally applicable and validated classification and treatment algorithm has proven elusive. 27
Historically, the first attempt at a thoracolumbar fracture classification system was by Boehler in 1929 whose work was continued by Watson-Jones in 1938 by describing four basic morphologies: a simple wedge fracture, comminuted fracture, fracture dislocation and hyperextension injuries.28,29 Watson-Jones 28 also recognized the importance of the posterior ligamentous complex for stability. However, it was Holdsworth et al who divided the spine into two columns which was later further divided by Denis et al into the famous three column model still referenced today.30,31 A criticism levelled against the model proposed by Denis et al was its focus on biomechanical stability without translation clinically into whether there was any neurological deficit. 31
The ultimate goal of these historical classification systems was to identify unstable fractures and therefore those which were more likely to require surgical stabilization. McCormack introduced his load-sharing classification in 1994 which took into consideration the degree of kyphosis, degree of vertebral comminution and the apposition of the fracture fragments. 32 The focus of this scheme was to determine whether an anterior or posterior approach with short-segment fixation would be most appropriate. 32 Eventually, the Magerl 1994 system was devised as the culmination of a 10-year review of 1445 cases. 1 This detailed classification system divided injuries into three groups: type A were compression injuries, type B were distraction injuries and type C were rotational and translational injuries.1,33 Each type was divided into three groups, which in turn were divided into three subgroups. Whilst extremely detailed and allowing precise communication between clinicians, the Magerl system was rather complex and unwieldy, limiting its application in everyday use. 1 This complexity also meant that there was only moderate reliability and reproducibility, given that spine surgeons would rate the same fracture morphology differently 18% of the time at three-month intervals.34,35
Consequently, Vaccaro et al in 2005 pioneered the thoracolumbar injury severity score (TLISS) which had excellent internal validity but only fair interobserver agreement. It relied upon the three elements of the mechanism of injury, posterior ligamentous complex integrity and neurological status. 36 Finally, this led to the development of the thoracolumbar injury classification score (TLICS) which Lee et al and Koh et al determined to have acceptable intra-rater and inter-rater reliability.23,37,38 A major advantage of the TLICS was its ability to provide a treatment recommendation based upon a points-scoring system. 23 Indeed, Patel et al 39 determined a similar finding that the TLICS had excellent validity with recommendations from the algorithm matching actually employed treatment in 96% of cases.
However, the TLICS still had its own weaknesses. Firstly, there were concerns with the feasibility of assessing the posterior ligamentous complex using MRI, as well as the reproducibility across different geographic variations. This was addressed by Schroeder et al 40 in their landmark study which demonstrated there was no regional variation in the interpretation of burst fractures. Secondly, all burst fractures were as a routine allocated two points with no ability for clinicians to communicate that a minor incomplete burst fracture was not as severe as a comminuted, angulated complete burst fracture with significant canal compromise. This TLICS scoring system recommended non-surgical treatment for all patients with A3 fractures who are neurologically intact (without the M1 modifier). Nonetheless, in many parts of the world these A3 fractures are treated surgically as demonstrated in our results (Table 1). Lastly, there was a grey zone of four points where clinicians were left to make a reasonable decision. For example, a burst fracture with indeterminate PLC integrity was awarded four points and clinicians were left to decide on a treatment strategy. 41 This is accurate in its reflection of the controversial state of the literature but was a major drawback in providing a definitive treatment algorithm to clinicians. It was Kepler et al who eloquently noted that Vaccaro et al aimed to address all of these limitations of the TLICS with the revised Thoracolumbar AO Spine Injury Score (TL AOSIS) in 2015 as a compromise between the rather detailed Magerl system and perhaps the rather too simplistic TLICS algorithm.42–44
As the synthesis of these historical systems, the AO Spine Classification System distilled the essence of the 1994 Magerl scheme into three main groups. 44 Retained from the TLICS system is the importance of the posterior ligamentous complex and also the neurological status of the patient. 45 What remains a glaring area of treatment uncertainty is the A3 and A4 burst fracture in neurologically intact patients, for which various authors have argued that either nonoperative or operative management would be reasonable. Perhaps one reason this fracture subtype has resisted clear treatment recommendations is the difficulty of accurate identification, given that the A4 injury is more difficult to identify than any other fracture morphology, with a kappa score of .19. 43
In 2020, Morrissey et al 22 proposed a further adaptation of the AO Spine Thoracolumbar Injury Classification Score and Severity Score in which the A3 fracture received a score of 3 points while the A4 fracture received 5 points. 22 In this modification, non-surgical care was recommended for scores of 3 or less and surgical care for scores of 6 or higher. 22 This served to expand the grey zone of indeterminate treatment (4-5) while adding a higher score for the complete burst fracture A4. 22
This study determined that surgeons were five times as likely to recommend surgery for A4 fractures as A3 injuries even with the indeterminate presence of a ligamentous injury (M1 modifier). 22 Our findings therefore support the hierarchical nature of the current AO Spine system, which allocates the A4 fracture five points and thereby rates it as equivalent in severity to a B1 osseous tension band injury; both are allocated two points more than the incomplete burst A3 fracture, which receives three points. 46 In other words, the A4 injury is closer to a B injury than to an A3 injury, thereby reflecting its perceived severity by clinicians and need for operative intervention. This is a major advantage of the new TL AOSIS system. 46 Unfortunately, in the real world this differentiation between A3 and A4 does not translate into different rates of surgery, and even among expert panel members 32% of A4 fractures have non-surgical treatment recommended. This is likely due to a combination of different inherent patient-specific factors being important in guiding individual clinician decision making and an overall need to promote stricter adherence to this validated algorithm.
Importantly, the separation of the subtypes of fractures into discrete categories of severity has proven reproducible and demonstrated moderate interobserver reliability according to fracture subtype. This was shown by the authors themselves as well as independently by Urrutia et al.46,47 Kaul et al also verified that the AO Spine TL Injury Classification had superior reliability in identifying fracture morphology compared to the existing TLICS, which was crucial because the TL AOSIS was more complicated than the existing TLICS and the weakness of previous complex systems such as the 1994 Magerl was poor reproducibility secondary to overly detailed fracture morphology schemes.22,46,48 Schnake et al 24 also lauded the AO system for its ability to take into consideration important factors including neurological status, as well as treatment modifiers such as long segment ossifying bone disease. Pleasingly, the AO System with respect to the A3 and A4 fractures has also withstood bias due to either regional variation, as investigated by Kweh et al, or surgeon experience, as interrogated by Sadiqi et al.35,49,50
This does not mean that TL AOSIS is the ideal solution and superior to the TLICS. Indeed, An et al 51 compared the TLICS and TL AOSIS directly and determined that treatment recommendations matched actual surgical decisions more often with the TL AOSIS (98.2% vs 87.3%, P = .002). 51 They posited that the TL AOSIS weights certain fracture morphologies more heavily and therefore may be more sensitive in detecting complete burst fractures as more severe than incomplete burst fractures. Furthermore, a more contemporary evaluation of the TL AOSIS by Nagi et al determined that the TL AOSIS correlated with treatment recommendations in 88.6% of cases compared to the TLICS correlation rate of 85.7% (P = .614). 52 More than this though, the TL AOSIS achieved 95% sensitivity and 80% specificity whereas the TLICS only achieved 72.2% sensitivity and 100% specificity. 52 Despite this, Joaquim et al found that there exists a cohort of neurologically intact patients (18 of 37 cases) who actually had unstable burst fractures which were awarded three points by the AO System and therefore were recommended to have conservative management. 41 In contrast, the TLICS score which awards two points for indeterminate PLC injury and three points for confirmed PLC injury, compared to the TL AOSIS which only awards one point for PLC injury, was more accurate in this circumstance of clinical equipoise. 44
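The effect of the heavier PLC weighting in the TLICS can be illustrated with a short sketch for a neurologically intact burst fracture. The point values (two for burst morphology, two for indeterminate and three for confirmed PLC injury) and the four-point grey zone are those quoted in this Discussion. Treating three or fewer points as non-surgical and five or more as surgical follows the commonly cited TLICS thresholds and is an assumption here, as are the function and label names.

```python
# Illustrative sketch only, restricted to neurologically intact burst fractures.
TLICS_BURST_POINTS = 2  # all burst fractures receive two points (from the text)

# PLC points as quoted in the text; an intact PLC adds nothing.
TLICS_PLC_POINTS = {"intact": 0, "indeterminate": 2, "disrupted": 3}


def tlics_burst_score(plc_status: str) -> int:
    """TLICS total for a burst fracture with the given PLC status."""
    return TLICS_BURST_POINTS + TLICS_PLC_POINTS[plc_status]


def tlics_recommendation(score: int) -> str:
    """Map a TLICS total onto a treatment band.

    The four-point grey zone is stated in the text; the bands on either
    side are the commonly cited thresholds and are assumed here.
    """
    if score <= 3:
        return "non-surgical"
    if score == 4:
        return "surgeon's discretion"
    return "surgical"


if __name__ == "__main__":
    for status in ("intact", "indeterminate", "disrupted"):
        score = tlics_burst_score(status)
        print(status, score, tlics_recommendation(score))
```

Under these assumptions a burst fracture with confirmed PLC disruption reaches the surgical band on the TLICS, whereas a single modifier point added to a three-point A3 fracture in the AO system does not reach the surgical threshold of six.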
A component of the uncertainty in determining PLC injury with the current AO Spine thoracolumbar spine injury classification system is due to the fact that there is only moderate reliability of assessing the PLC as Schroeder et al 53 discovered. 53 The difficulty in reliably determining PLC integrity with an MRI, which may not be readily accessible in certain centers across the world, may at first glance appear to limit the use of the TL AOSIS. 53 However, our study investigated the importance of the M1 modifier of indeterminate ligamentous injury. It was Rajasekaran et al who astutely suggested that whilst an MRI was necessary to formally diagnose a type B injury, this imaging actually provided minimal additional information in neurologically intact patients such as those in our study.54,55 This was confirmed by our results that B and C type injuries were overwhelmingly more likely to be offered surgical stabilization than A type injuries regardless of whether the M1 modifier was present or absent.
This is not to say that the M1 modifier is irrelevant. In fact, it is crucial to be aware of the integrity of the PLC given that McAfee et al advocated for surgical treatment in unstable injuries, and that disruption of the PLC increased the risk of neurological damage and deficit from 22% to 80%. 56 However, Maheswaran et al and Tang et al have demonstrated that, with increasingly high-quality CT scans, parameters such as local kyphosis, interspinous distance and interpedicular distance can be used to determine PLC integrity without an MRI.57,58 Ganjeifar et al 59 even argued that the diagnostic results of PLC injury with a CT scan were similar to those obtained by MRI in thoracolumbar burst fractures. 59
The results of our study demonstrate that the current AO Spine Thoracolumbar Injury Classification system presents fracture categories in a generally logical manner. Clinicians were more likely to recommend operative management when confronted with higher category injuries although this was not always necessarily reproduced in the real-world environment of the Spine TL A3/A4 study. The presence of the M1 modifier did heighten the perceived need for operative management. Furthermore, within the type A injury there was a statistically significant difference in the recommended management of A4 compared to A3 injuries among the panel of experts. Interestingly, a proportion of A0/A1/A2 injuries (12%) were managed operatively despite the algorithm recommendations and further evaluation into the decision-making factors in this subgroup is also required to determine whether this was a result of A2 fractures progressing to a higher category fracture with time.
A major strength of this study was its robust study inclusion protocol and use of genuine patient cases which imparts a strong sense of internal validity. The international pool of centers from which the patients were derived is designed to confer strong external validity and generalizability. Unfortunately, there was a relatively small number of clinicians involved in evaluating the 183 selected cases. The AO Spine Thoracolumbar Injury Classification system and treatment algorithm strikes a delicate balance between brevity and accuracy. There will always be additional factors such as local kyphosis angle, degree of vertebral body comminution or degree of canal compromise which may also affect clinician decision making. Future studies could evaluate the AO Spine Thoracolumbar Injury Classification system with a larger group of surgeons from varying surgical specialties (neurosurgery and orthopedics) across the six distinct AO spine geographic regions in light of even more specific measured treatment outcomes.
Conclusion
The AO Spine Thoracolumbar Injury Classification System is a rational and hierarchical method of introducing fracture morphologies of escalating severity. This translated from expert recommendation into the clinical setting across the type A, type B and type C injury categories. Thoracolumbar A4 complete burst fractures were more likely to be operatively managed than A3 fractures. Flexion-distraction type B injuries and translational type C injuries were much more likely to be fixated than type A fractures regardless of the M1 modifier. The presence of a suspected posterior ligamentous injury did increase the likelihood of surgical stabilization. Our novel findings validate the utility of the AO Spine Thoracolumbar Injury Classification System but also highlight the need to resolve some inconsistencies between expert treatment recommendations and real-world management.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was organized and funded by AO Spine through the AO Spine Knowledge Forum Trauma, a focused group of international Trauma experts. AO Spine is a clinical division of the AO Foundation, which is an independent medically-guided not-for-profit organization. Study support was provided directly through AO Network Clinical Research.
