Abstract

We thank the authors for their contribution and commend their comprehensive and detailed work. The authors perform a systematic review of the reliability of two scoring systems for thoracolumbar spine trauma: the AO Spine Classification and the Thoracolumbar Injury Classification and Severity Scale (TLICS). We do have concerns with their interpretation of reliability as clinical utility, and we hope to highlight these concerns for the benefit of the reader. First, we do not believe the authors are making equivalent comparisons. The papers they include generally compare the morphological portion of the AO Spine classification to the comprehensive TLICS, which we use less as a morphologic classification system and more as a decision aid. The AO Spine classification includes modifiers that allow it to be used in this way and applied clinically, but the authors do not mention them in this paper. The M1 modifier, for example, is critical in assessing posterior tension band integrity; the authors do not mention it, yet posterior tension band assessment is an integral part of the TLICS classification that is being compared throughout. At best, the authors are comparing a morphological classification to a global scale used to guide treatment. In fairness, the authors recognize this limitation and identify two papers that directly compare only the injury pattern component of the TLICS score to the morphology component of the AO Spine classification. They suggest that these papers still support higher reliability for the AO classification. However, in Kaul et al,¹ the comparison is between AO fracture type (A/B/C) and TLICS morphology, which is not a comprehensive comparison. The second paper, Pishnamaz et al,² does compare AO Spine fracture subtype (A0, A1…/B1, B2…/C) to TLICS morphology and finds no difference in agreement.
Unfortunately, the conflation of classification reliability with clinical utility is a fundamental error in this paper. By way of analogy, the Glasgow Coma Scale is known to have poor intra- and inter-observer reliability,³ yet it has high clinical utility, particularly at the extremes of the scale, in guiding urgent intervention. A disagreement of one or two points on a scale may be of little concern when the clinical recommendation is the same. There is a place for both classifications, and we are proponents of the morphologic accuracy and descriptive utility of the AO classification. However, we would caution the reader to appreciate the entirety of both classification systems, and we consider both to remain clinically valuable.
