The Modified Spinal Instability Spondylodiscitis Score (mSISS): Adaptation and Validation of a Novel Classification System for Spinal Instability in Spondylodiscitis

Abstract

Study Design

Multicenter reliability and validation study.

Objectives

To adapt and validate the Spinal Instability Spondylodiscitis Score (SISS) to create a practical and reliable classification system for assessing spinal instability in pyogenic spondylodiscitis.

Methods

The original SISS was modified through structured consensus meetings within the AO Spine Trauma and Infection Knowledge Forum, resulting in the modified SISS (mSISS). The mSISS incorporates four parameters—location, extent of bone lesion, spinal alignment, and mechanical pain—to classify lesions as stable, potentially unstable, or unstable using computed tomography (CT) and clinical data. Fifteen experienced spine surgeons independently evaluated ten representative cases in two rating sessions. Intra- and interrater reliabilities were calculated using intraclass correlation coefficients (ICC) and Fleiss’ Kappa. The gold standard was established by consensus of five expert spine surgeons.

Results

The mSISS demonstrated excellent intrarater reliability for the total score (ICC 0.90, 95% CI 0.84-0.96). Interrater reliability for the total score was 0.87 (95% CI 0.77-0.97) in the first assessment and 0.89 (95% CI 0.80-0.98) in the second. Reliability of individual parameters ranged from moderate to excellent, with spinal alignment showing the highest variability but remaining within acceptable agreement levels. Agreement with the gold standard ratings was high across all parameters.

Conclusion

The mSISS is a simplified and clinically applicable scoring system for the assessment of spinal instability in pyogenic spondylodiscitis, demonstrating strong reliability among expert spine surgeons. Broader international validation is ongoing to support its integration into clinical decision-making.

Keywords

spondylodiscitis, infection, instability, surgical treatment

Introduction

The incidence of spinal infections has been rising steadily.¹ Although these infections are associated with high morbidity and mortality, there is no established classification system with associated treatment recommendations to optimize therapeutic management. While initial treatment is conservative in most cases, there are to date no widely accepted guidelines for surgical treatment decisions. In particular, even though spinal instability is known to be crucial in guiding surgical decisions, instability has yet to be defined in the context of infection.^2,3 Furthermore, wide international variability remains in the treatment of spinal conditions, including spinal infection.⁴

To address this lack of consensus, in 2022, the Spinal Instability Spondylodiscitis Score (SISS) was created to establish a universally accepted classification system to define spinal instability in the presence of spondylodiscitis and thereby to aid surgeons in the decision-making process regarding surgical stabilization using computed tomography (CT) imaging.⁵ Based on the widely accepted Spinal Instability Neoplastic Score (SINS), the SISS removes the posterolateral involvement category but incorporates bone lesion, spinal alignment, localization, and mechanical pain as the four parameters to distinguish between stable, potentially unstable, and unstable infectious lesions of the spine. While in the initial publication, the SISS was shown to have high reliability and validity in detecting unstable spinal lesions, subsequent external validation studies yielded poorer performance results.^5–7

The AO Spine Knowledge Forum Trauma and Infection is a group of expert spine surgeons developing reliable classification systems based on the systematic study of all available knowledge. Based on these classification systems, simple and clinically useful diagnostic and therapeutic algorithms are established.^8–10 Because of this expertise in developing classification systems, the AO Spine Knowledge Forum Trauma and Infection was tasked with adapting and validating the previously published SISS as a classification system to rate spinal instability in the presence of pyogenic spondylodiscitis. The ultimate goal was to make the SISS a comprehensive yet simple classification system with acceptable intra- and interrater reliabilities that can be used for both clinical and research purposes.

Methods

The presented reliability analysis was approved by the institutional ethics committee (EA1/019/21). Due to its retrospective design, the requirement for individual informed consent was waived. The study was conducted in accordance with the ethical standards of the Declaration of Helsinki.

Adaptation of the SISS

The development of the SISS has previously been described in detail and adequate reliability has been shown in a group of orthopaedic surgeons and radiologists from a single center.⁵ The SISS was now adapted by the AO Spine Knowledge Forum Trauma and Infection in three consensus meetings to make the scoring system more comprehensible and clinically applicable. The modified SISS (mSISS) includes four parameters, which are presented in Table 1.

Table 1.

Parameters of the Modified Spinal Instability Spondylodiscitis Score (mSISS)

Parameter	Score
Location
Junctional (occiput-C2, C7-T2, T11-L1, L5-S1)	3
Non-junctional	0
Bone lesion
>50% segment involvement	3
<50% segment involvement	0
Spinal alignment
Subluxation/translation	3
De novo deformity (kyphosis/scoliosis)	2
Normal alignment	0
Mechanical pain
Yes	1
No	0
Total score
0-2 stable lesion
3-4 potentially unstable lesion
5-10 unstable lesion

Location

This category defines whether the spinal lesion is located in a junctional zone of the spine. Here, it is important that the entire affected segment, including the intervertebral disc and the upper and lower vertebral body, is considered. Patients with infections of the segment C0/1, C1/2, C7/T1, T1/2, T11/12, T12/L1, or L5/S1 receive 3 points, whereas patients with infections of any other segment receive 0 points.
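The location rule can be expressed as a simple lookup. The following is a minimal sketch, not part of the study materials; the function name `location_points` and the segment label format (eg, "C7/T1") are assumptions made for illustration.

```python
# Junctional segments exactly as listed in the text above.
# The string label format ("C7/T1") is an assumption for this sketch.
JUNCTIONAL_SEGMENTS = frozenset(
    {"C0/1", "C1/2", "C7/T1", "T1/2", "T11/12", "T12/L1", "L5/S1"}
)


def location_points(segment: str) -> int:
    """Return 3 points for a junctional segment and 0 points otherwise."""
    # Normalize case and whitespace so " t12/l1 " matches "T12/L1".
    normalized = segment.strip().upper().replace(" ", "")
    return 3 if normalized in JUNCTIONAL_SEGMENTS else 0
```

For example, `location_points("L5/S1")` yields 3 points, whereas `location_points("L2/3")` yields 0 points.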

Bone Lesion

Here, the extent to which the spinal segment is affected is considered. Patients who show involvement of >50% of the spinal segment (meaning either 25% of the upper and lower vertebra each or 50% of one of the vertebrae adjacent to the affected intervertebral disc) receive 3 points, whereas patients who show involvement of <50% of the spinal segment receive 0 points.

Spinal Alignment

In this category, spinal alignment is evaluated. Standing radiographs may be used for this category if they are available but are not mandatory. Patients with subluxation of the facet joints or translation of the vertebra receive 3 points, patients with de novo deformity in the sagittal or coronal plane (de novo kyphosis or scoliosis) receive 2 points, and patients with normal alignment receive 0 points (Figure 1).

Figure 1.

Computed tomography imaging of three patients with pyogenic spondylodiscitis of the lumbar spine. (A) Spondylodiscitis of the segment L2/3 with normal alignment (0 points) and <50% segmental involvement (0 points). (B) Spondylodiscitis of the segment L2/3 with de novo kyphosis (2 points) and >50% segmental involvement (3 points). (C) Spondylodiscitis of the segment L1/2 with subluxation and translation (3 points) and >50% segmental involvement (3 points).

Mechanical Pain

Mechanical pain is defined as pain associated with spinal loading such as movement or upright posture. Patients who report mechanical pain receive 1 point.
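The scoring described in Table 1 and the four categories above can be summarized in a few lines of code. This is an illustrative sketch only; the names `Alignment`, `msiss_total`, and `msiss_category` are hypothetical and were not part of the study.

```python
from enum import Enum


class Alignment(Enum):
    """Spinal alignment categories with their point values from Table 1."""
    NORMAL = 0
    DE_NOVO_DEFORMITY = 2
    SUBLUXATION_TRANSLATION = 3


def msiss_total(junctional: bool, over_half_segment: bool,
                alignment: Alignment, mechanical_pain: bool) -> int:
    """Sum the four mSISS parameters (possible range 0 to 10)."""
    return (
        (3 if junctional else 0)           # location
        + (3 if over_half_segment else 0)  # bone lesion, >50% of the segment
        + alignment.value                  # spinal alignment
        + (1 if mechanical_pain else 0)    # mechanical pain
    )


def msiss_category(total: int) -> str:
    """Map a total score to the stability category of Table 1."""
    if not 0 <= total <= 10:
        raise ValueError("mSISS total must lie between 0 and 10")
    if total <= 2:
        return "stable"
    if total <= 4:
        return "potentially unstable"
    return "unstable"
```

As a worked example, a non-junctional lesion with >50% segmental involvement and de novo kyphosis, as in Figure 1B, scores 0 + 3 + 2 = 5 points before mechanical pain is considered and is therefore classified as unstable regardless of the pain item.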

Reliability Assessment

The results of the two reliability assessments are presented. Participating surgeons were presented with a video introduction to the classification system and a reference guide to the parameters of the mSISS including written descriptions and iconic depictions (Figure 2). To ascertain that all raters evaluated the same spinal level, the level of infection to be graded was designated. The presence of mechanical pain was obtained from the patients’ medical records and was given to the raters for calculation of the total score. Prior to the reliability assessment, two cases were presented to test the raters’ understanding of the system.

Figure 2.

Depiction of the parameters of the mSISS which was used in training the raters

High-quality CT images together with clinical data from ten patients with spinal infection chosen to represent all instability types were independently assessed by 15 investigators on two separate occasions. The second round of grading used a case order that had been scrambled using a random number generator and occurred one month after the first round.

The gold standard was established by consensus of five expert spine surgeons (ME, MA, RY, GCW, CD, MH, HC) from the five different AO regions (Middle East/North Africa, Europe, North America, Latin America, Asia Pacific) who reviewed all cases.

Statistical Analysis

Statistical analysis was performed using Fleiss’ Kappa for individual parameters (bone lesion, spinal alignment, and location) and intraclass correlation coefficients (ICCs) for the total score to assess interrater and intrarater reliabilities. The results of the Kappa statistics were interpreted according to Landis and Koch as follows: <0.2 slight, 0.2-0.4 fair, 0.4-0.6 moderate, 0.6-0.8 substantial, >0.8 excellent reliability or reproducibility.¹¹ Analysis was performed with SAS Software 9.4 TS1M8 MBCS3170 (SAS Institute Inc, Cary, NC, USA).
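For readers unfamiliar with the statistic, the following sketch shows how Fleiss’ Kappa can be computed from a cases-by-raters table of categorical ratings and mapped to the interpretation bands given above. It is not the SAS code used in this study; the function names are hypothetical, and the handling of values falling exactly on a band boundary is an assumption, as the bands above do not specify it.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' Kappa for ratings[case][rater] holding category labels.

    Every case must be rated by the same number of raters (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    counts = [Counter(row) for row in ratings]
    categories = {label for row in ratings for label in row}
    # Observed agreement: mean proportion of agreeing rater pairs per case.
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_cases
    # Chance agreement from the overall share of each category.
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_cases * n_raters)) ** 2
        for cat in categories
    )
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


def interpret_kappa(kappa):
    """Interpretation bands as stated in the text (boundaries assumed)."""
    if kappa < 0.2:
        return "slight"
    if kappa < 0.4:
        return "fair"
    if kappa < 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "excellent"
```

With these bands, a Kappa of 0.57 is interpreted as moderate and a Kappa of 0.86 as excellent.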

Results

Reliability Assessment

A total of ten cases were scored (Table 2).

Table 2.

Summary of the Results as Provided by the Gold Standard

Parameter	Gold standard results
Mechanical pain, n (%)
No	3 (30%)
Yes	7 (70%)
Location, n (%)
Non-junctional	7 (70%)
Junctional	3 (30%)
Bone lesion, n (%)
<50% segment involvement	4 (40%)
>50% segment involvement	6 (60%)
Spinal alignment, n (%)
Normal alignment	6 (60%)
De novo deformity	2 (20%)
Subluxation/translation	2 (20%)
Total score, mean (SD)	4.4 (3.3)
Stability, n (%)
Stable lesion	3 (30%)
Potentially unstable lesion	3 (30%)
Unstable lesion	4 (40%)

Intrarater Reliability

Overall intrarater reliability was 0.90 (95% CI 0.84-0.96) for the total score, with intrarater reliabilities of 0.85 (95% CI 0.76-0.94) for bone lesion, 0.67 (95% CI 0.55-0.80) for spinal alignment, and 0.83 (95% CI 0.70-0.97) for location. Of the 15 raters, twelve showed excellent intrarater reliability for the total score, seven showed excellent reliability for bone lesion, six showed excellent reliability for spinal alignment, and nine showed excellent reliability for location.

Interrater Reliability

Overall interrater reliability was 0.87 (95% CI 0.77-0.97) for the total score in the first assessment, and 0.89 (95% CI 0.80-0.98) for the second assessment. The interrater reliabilities for the individual items are given in Table 3.

Table 3.

Interrater Reliabilities for the Score’s Individual Items (Fleiss’ Kappa) and the Total Score (ICC) for Each Assessment

	Bone lesion	Spinal alignment	Location	Total score
Assessment 1 Mean (95% CI)	0.74 (0.53-0.95)	0.57 (0.46-0.68)	0.86 (0.70-1.02)	0.87 (0.77-0.97)
Assessment 2 Mean (95% CI)	0.82 (0.62-1.03)	0.77 (0.60-0.95)	0.83 (0.64-1.01)	0.89 (0.80-0.98)

Comparison With the Gold Standard

Agreement with the gold standard was excellent for all individual items of the score (Table 4). The mean difference from the gold standard total score was 0.29 (SD 1.20) for assessment 1, 0.05 (SD 1.15) for assessment 2, and 0.17 (SD 1.18) overall.

Table 4.

Agreement With the Gold Standard for the Score’s Individual Items

	Bone lesion	Spinal alignment	Location
Assessment 1 agreement in % (95% CI)	90.5 (90.4-90.5)	85.2 (85.2-85.3)	96.7 (96.6-96.7)
Assessment 2 agreement in % (95% CI)	93.3 (93.3-93.4)	92.9 (92.8-92.9)	95.7 (95.7-95.7)
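The two summary measures used for the comparison with the gold standard can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code of this study: the function names are hypothetical, all rater-case pairs are assumed to be pooled, and the difference is assumed to be taken as rater score minus gold standard score.

```python
def percent_agreement(rater_scores, gold_scores):
    """Percentage of rater-case pairs matching the gold standard.

    rater_scores[rater][case] and gold_scores[case] hold the points
    assigned for one parameter (eg, bone lesion: 0 or 3).
    """
    matches = 0
    total = 0
    for scores in rater_scores:
        if len(scores) != len(gold_scores):
            raise ValueError("each rater must score every case")
        for score, gold in zip(scores, gold_scores):
            matches += int(score == gold)
            total += 1
    return 100.0 * matches / total


def mean_difference(rater_totals, gold_totals):
    """Mean of (rater total score - gold standard total score), pooled."""
    diffs = [
        score - gold
        for scores in rater_totals
        for score, gold in zip(scores, gold_totals)
    ]
    return sum(diffs) / len(diffs)
```

For instance, two raters scoring four cases with one disagreement between them give 7 of 8 matching pairs, or 87.5% agreement.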

Discussion

We describe the modification and internal validation of the previously published SISS for evaluation of spinal instability in the presence of pyogenic spondylodiscitis. To date, there is a lack of a widely accepted and universally used definition of spinal instability in spondylodiscitis and an associated scoring system. Even though the SISS has previously been shown to have high correlation with the chosen type of treatment and an excellent interrater reliability for the overall score, the wide range of potentially unstable lesions and the overall score’s extent led us to revise and validate the scoring system to make it more clinically applicable.⁵

Medical classification systems need to be comprehensive while keeping a certain level of simplicity to remain clinically applicable. As morphological characteristics which can be reliably and reproducibly identified are crucial to any spinal classification system, the mSISS continues to incorporate clear instability parameters as defined by the widely accepted SINS – here, it is important to note that instability caused by infection is comparable to that caused by neoplastic lesions as both develop over time rather than acutely as in traumatic injuries.¹² However, the criteria defined in the SINS were adapted to represent characteristics more distinct for spinal infection and to be clearly defined in CT imaging: localization, spinal deformity, segmental affection, and mechanical pain. We chose CT imaging as the only modality to base the mSISS on as CT is not only superior in the depiction of bone lesions but also is more available compared with MRI, especially in the setting of an acute infection.

In the mSISS, several adaptations were applied. Most importantly, the vertebral body involvement category was changed to represent segmental involvement: in contrast to neoplastic lesions, spinal infection stems from the intervertebral disc rather than the vertebral body, and therefore the spinal segment including the disc and adjacent vertebrae needs to be accounted for. Furthermore, this category was simplified to only >50% and <50% involvement, as a differentiation between endplate involvement and <50% segmental involvement is not biomechanically relevant for instability.¹² Similarly, the junctional spine is most prone to biomechanical instability, which is why further distinction within the non-junctional spine does not have a clinical impact on surgical decision-making and was therefore removed in the mSISS. Lastly, mechanical pain is defined as pain that occurs with spinal loading and is therefore associated with spinal instability. Occasional non-mechanical pain does not reflect spinal instability and is therefore not required as a parameter for surgical decision-making. Overall, these categories were simplified in the mSISS, which more adequately reflects the distinction between a stable and an unstable spine and at the same time is more clinically applicable.

In the present reliability assessment, we demonstrate excellent intra- and interrater reliabilities for the total score in a multicenter group of expert spine surgeons. The Kappa values derived from this analysis were comparable to the scores reported by the group publishing the score.⁵ The application of this classification system by a broader group of spine surgeons in an international setting is a much more robust illustration of its ease of use and suggests that spine surgeons can reliably apply this system to patients with spinal infections. While clinical validation needs to be demonstrated in future studies, proof of adequate reliability is a necessary first step before widespread adoption of the system can occur.

In both the intra- and the interrater analysis, spinal alignment showed the lowest agreement between assessments or raters. However, the agreement is comparable to or better than that shown for the spinal alignment category in the SINS by the Spine Oncology Study Group in their initial reliability study.¹³ The uncertainty in this category stems from CT imaging not being ideal for evaluating spinal alignment, which would normally be evaluated in standing radiographs of the whole spine. However, this imaging is often not available in patients with a potentially unstable spine, which is why we reduced our scoring system to one imaging modality available in almost all patients with spinal infection. For the other categories, intra- and interrater reliabilities were substantial to excellent.

Compared to a recent external analysis by Xiong et al., we show markedly higher values for interrater reliabilities (0.87 vs 0.68 for the total score).⁶ Here, it is important to note that for the present analysis, modifications were made to the SISS to make it simpler and more comprehensive, thus possibly affecting measured reliabilities. In particular, the location and bone lesion categories were adapted, as biomechanically it is primarily relevant whether a lesion is located in the junctional or the non-junctional zone and whether more or less than 50% of the segment is affected. Furthermore, an intensive introduction to the scoring system was given to ascertain the raters’ understanding of the score, and scoring was performed exclusively by expert spine surgeons.

It is important to note that spinal instability remains only one factor in the decision-making process regarding the treatment of spinal infections. Other factors such as the presence of a neurological deficit or an epidural abscess as well as the overall patient status need to be taken into consideration when deciding whether surgical treatment is indicated. Thus, the mSISS is not meant to be a standalone scoring system but rather be incorporated into a framework for surgical decision-making. However, it is important to note that spinal instability is one of the key criteria in deciding whether surgery is necessary or not. Therefore, the mSISS may not only be used by spine surgeons but possibly even more importantly may aid non-surgical doctors such as radiologists or infectiologists in deciding whether a surgical consultation is necessary.

Some limitations need to be discussed. First, the study was performed retrospectively using representative images and videos of the cases, which may limit the accuracy of image interpretation. This may have caused a reduced ability to determine the full amount of infectious spinal involvement. Second, only cases of pyogenic spondylodiscitis were evaluated. Thus, our results do not allow for any conclusions regarding non-pyogenic spondylodiscitis. Furthermore, image scoring was performed by a panel of experts; future analyses with a broader range of raters are therefore necessary. Lastly, clinical outcomes were not evaluated in this study. While this type of analysis was outside our study’s scope, it should receive attention in the future.

In conclusion, the modified Spinal Instability Spondylodiscitis Score represents a comprehensive but simple classification system for evaluating spinal instability in the presence of pyogenic spondylodiscitis. While we show excellent inter- and intrarater reliabilities among the AO Spine Knowledge Forum Trauma & Infection experts, broad and cross-cultural international validation studies are still needed and are already underway. It is important to note that the mSISS exclusively evaluates spinal instability without taking clear surgical indications such as neurological impairment or epidural abscess into account.

Footnotes

Acknowledgments

AO Spine is a clinical division of the AO Foundation, which is an independent medically-guided not-for-profit organization. Study support was provided directly through AO Network Clinical Research and AO Innovation Translation Center, Clinical Evidence.

ORCID iDs

Friederike Schömig

Mohammad El-Sharkawi

Mohamed M. Aly

Ratko Yurac

Gaston Camino Willhuber

Charlotte Dandurand

Martin Holas

Maximilian Reinhold

Andrei F. Joaquim

Eugen Cezar Popescu

Ulrich J.A. Spiegl

Sebastian F. Bigdon

Ethical Considerations

The presented reliability analysis was approved by the ethics committee of Charité – Universitätsmedizin Berlin (EA1/019/21).

Consent to Participate

Due to its retrospective design, the requirement for individual informed consent was waived by the ethics committee of Charité – Universitätsmedizin Berlin.

Author Contributions

FS: Conceptualization, Methodology, Investigation, Data Curation, Writing – Original Draft, Writing – Review & Editing, Visualization, Project Administration.

NT: Visualization, Writing – Review & Editing.

ME: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

MA: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

RY: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

GCW: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

CD: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

MH: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

HSC: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

KS: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

MR: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

AFJ: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

GDS: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

CT: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

JS: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

ECP: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

UJAS: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

SFB: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

TAM: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

RB: Conceptualization, Methodology, Investigation, Writing – Review & Editing.

MP: Conceptualization, Methodology, Investigation, Writing – Review & Editing, Project Administration, Resources, Supervision.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was organized and funded by AO Spine through the AO Spine Knowledge Forum Trauma & Infection, a focused group of international experts.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author on request.

References

1. Nickerson, Sinha. Vertebral osteomyelitis in adults: an update. Br Med Bull. 2016;117(1):121-138. doi:10.1093/bmb/ldw003.
2. Lang, Rupp, Hanses, Neumann, Loibl, Alt. [Infections of the spine: pyogenic spondylodiscitis and implant-associated vertebral osteomyelitis]. Unfallchirurg. 2021;124(6):489-504. doi:10.1007/s00113-021-01002-w. German title: Infektionen der Wirbelsäule: Pyogene Spondylodiszitis und implantatassoziierte vertebrale Osteomyelitis.
3. Fisahn, Alonso, Hasan, et al. Trends in spinal surgery for Pott's disease (2000-2016): an overview and bibliometric study. Glob Spine J. 2017;7(8):821-828. doi:10.1177/2192568217735827.
4. Kramer, Thavarajasingam, Neuhoff, et al. Variation of practice in the treatment of pyogenic spondylodiscitis: a European Association of Neurosurgical Societies Spine Section study. J Neurosurg Spine. 2024;41(2):263-272. doi:10.3171/2024.2.Spine231202.
5. Schömig, Perka, et al. Georg Schmorl Prize of the German Spine Society (DWG) 2021: Spinal Instability Spondylodiscitis Score (SISS)-a novel classification system for spinal instability in spontaneous spondylodiscitis. Eur Spine J. 2022;31(5):1099-1106. doi:10.1007/s00586-022-07157-3.
6. Xiong, Huang, Narayanan, et al. External performance of the Spinal Infection Treatment Evaluation (SITE) score and Spinal Instability Spondylodiscitis Score (SISS) in predicting operative intervention for de novo spinal infections. Spine J. 2025;25:2061-2070. doi:10.1016/j.spinee.2025.03.006.
7. Pluemer, Freyvert, Pratt, et al. Ongoing decision-making dilemma for treatment of de novo spinal infections: a comparison of the Spinal Infection Treatment Evaluation Score with the Spinal Instability Spondylodiscitis Score and Spine Instability Neoplastic Score. J Neurosurg Spine. 2024;41(2):273-282. doi:10.3171/2024.2.SPINE23664.
8. Vaccaro, Lehman Jr, Hurlbert, et al. A new classification of thoracolumbar injuries: the importance of injury morphology, the integrity of the posterior ligamentous complex, and neurologic status. Spine. 2005;30(20):2325-2333. doi:10.1097/01.brs.0000182986.43345.cb.
9. Vaccaro, Schroeder, Divi, et al. Description and reliability of the AOSpine sacral classification system. J Bone Joint Surg Am. 2020;102(16):1454-1463. doi:10.2106/jbjs.19.01153.
10. Vaccaro, Lambrechts, Karamian, et al. AO Spine upper cervical injury classification system: a description and reliability study. Spine J. 2022;22(12):2042-2049. doi:10.1016/j.spinee.2022.08.005.
11. Landis, Koch. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.
12. Fisher, DiPaola, Ryken, et al. A novel classification system for spinal instability in neoplastic disease: an evidence-based approach and expert consensus from the Spine Oncology Study Group. Spine. 2010;35(22):E1221-E1229. doi:10.1097/BRS.0b013e3181e16ae2.
13. Fourney, Frangou, Ryken, et al. Spinal instability neoplastic score: an analysis of reliability and validity from the Spine Oncology Study Group. J Clin Oncol. 2011;29(22):3072-3077. doi:10.1200/jco.2010.34.3897.