Abstract
Study Design:
Retrospective.
Objectives:
To compare the violation rates obtained with 3 facet joint violation (FJV) grading systems (Babu, Shah, and Park), and to evaluate the accuracy, reliability, and association with clinical outcomes of these 3 systems.
Methods:
A total of 152 patients with lumbar spinal stenosis treated with percutaneous pedicle screw placement were enrolled in our study. FJV was evaluated on 3-dimensional lumbar CT reconstruction. Three grading systems were used to evaluate FJV: Babu’s system (grading by the severity of violation), Shah’s system (grading by side of violation), and the modified Park’s system (grading by the component causing the violation). The violation rate and observer consistency of the 3 grading systems were analyzed. Clinical outcomes were evaluated with the visual analog scale (VAS) and the Oswestry Disability Index (ODI).
Results:
Kappa coefficients of interobserver consistency for the Babu, Shah, and Park grading systems were 0.726, 0.849, and 0.692, respectively. The violation rates of the Babu, Shah, and Park grading systems were comparable at 34.54%, 32.57%, and 33.55%, respectively. In all 3 grading systems, the postoperative VAS low-back pain and ODI scores in the non-FJV groups were lower than those in the FJV groups (P < .05), and there were no significant differences between the 2 groups in VAS leg pain (P > .05).
Conclusions:
The Babu, Shah, and modified Park grading systems are reliable, and they yielded comparable violation rates. The self-reported clinical outcomes of patients with FJV were worse at the 2-year follow-up. For clinical application, it is recommended to use 2 or even 3 different grading systems together to evaluate FJV.
Introduction
Minimally invasive transforaminal lumbar interbody fusion (MIS-TLIF) has been widely used to treat lumbar degenerative diseases. 1 The accuracy of pedicle screw placement is crucial for surgery, because any screw misplacement may have catastrophic consequences. Percutaneous pedicle screw placement has the advantage of avoiding paraspinal muscle and soft tissue injury. 2 However, without direct vision, percutaneous pedicle screw placement may increase the incidence of facet joint violation (FJV). FJV is a known but not well understood complication of pedicle screw placement that damages the integrity of the superior unfused facet joint. It has been shown to be an important risk factor for adjacent segment degeneration (ASD), a worrisome long-term complication of spinal fusion surgery. 3 Therefore, it is essential to understand FJV and its grading systems, and to determine how the different grading systems relate to clinical outcomes.
To date, researchers have not reached a clear consensus on the scope of FJV. A major issue is whether to include suspected violation (abutting or within 1 mm of the articulation) or screw head impingement into the superior facet. Thus, a wide variety of grading systems for FJV have been built to answer different questions. Babu et al 4 developed a 4-grade scale according to the severity of the violation: the more severe the damage to the integrity of the upper facet joint, the higher the grade. Shah et al 5 considered whether unilateral and bilateral FJV differ, and proposed a 3-grade system comprising non-violation, unilateral violation, and bilateral violation. In addition, Park et al 6 focused on the impact of different violation components and classified violations caused by the screw or the screw head into different grades. Because of the different criteria and grading systems, the reported violation rate varies from 0% to 100%. 7 A useful grading system should also be associated with clinical outcomes. Although Jia Long 8 showed that patients with FJV had poor ODI scores and suffered low-back pain, there is a dearth of research on how the different grading systems relate to clinical outcomes. Therefore, based on a review of the literature, our study settled on 3 types of FJV grading systems: Babu’s system (grading by the severity of violation), Shah’s system (grading by side of violation), and the modified Park’s system (grading by the component causing the violation). In the present study, we retrospectively analyzed 152 patients who underwent percutaneous pedicle screw placement in MIS-TLIF to evaluate the accuracy, reliability, and clinical outcomes of these 3 systems.
Methods
General Patient Information
This was a retrospective study, so formal approval from the ethics committee of Qilu Hospital of Shandong University was not needed. Patients with single-segment lumbar spinal stenosis who underwent MIS-TLIF surgery with percutaneous pedicle screw placement from December 2015 to March 2017 were enrolled in the study.
Inclusion criteria were as follows: (1) patients with single-segment lumbar spinal stenosis treated with MIS-TLIF by the same group of surgeons; (2) localization under anteroposterior and lateral C-arm fluoroscopy.
Exclusion criteria were as follows: (1) a history of lumbar surgery; (2) lack of preoperative and/or postoperative images; (3) lack of 2-year follow-up data, including VAS and ODI scores.
A total of 152 patients were selected based on the inclusion and exclusion criteria (Figure 1). There were 49 males and 103 females, with ages ranging from 33 to 74 years (mean 53.82 ± 10.51 years). The surgical segments were L3/4 in 17 cases, L4/5 in 84 cases, and L5/S1 in 51 cases. A total of 304 superior percutaneous pedicle screws were inserted.

Figure 1. Flow diagram of the study.
Surgical Procedures
Patients were placed in a prone position under general anesthesia, prepped, and draped in standard fashion. MIS-TLIF at left L4-5 is briefly described below as an example of the surgical procedure. The L4 and L5 pedicles were marked before surgery with a fluoroscopic C-arm. A Jamshidi needle was positioned at the 3 o’clock position of the right pedicle and at the 9 o’clock position of the left pedicle, as confirmed by AP fluoroscopy. Under fluoroscopic guidance, the Jamshidi needle was incrementally advanced into the pedicle. Subsequently, K-wires were inserted through the needles. A skin incision (about 3 cm in length) was made between the L4 and L5 K-wires on the left side. Serial dilators were passed consecutively to split the muscle fibers, and a minimally invasive retractor was placed to expose the L4-5 FJ. Next, the FJ and hypertrophied ligamentum flavum were removed for complete decompression. The intervertebral disc and cartilage endplate were then removed, and bone chips and a cage were implanted into the intervertebral space. Bilateral pedicle screws were inserted over the K-wires into the pedicles under fluoroscopic guidance, and then the connecting rods and screw caps were sequentially installed.
Facet Joint Violation Evaluation and Grading Systems
Continuous scanning was performed with a Siemens Somatom Sensation 256-row spiral 3D CT scanner, and axial, sagittal, and coronal bone-window images were evaluated. FJV was evaluated and graded on CT according to the Babu, modified Park, and Shah systems. The Babu system [4] is a 4-grade scale based on the severity of violation (Figure 2). The modified Park grading system [5], which focuses on the components causing the violation, is shown schematically in Figure 3. Shah et al [6] proposed a 3-grade system classified by the side of violation, which is shown in Figure 4. The evaluation was conducted independently and blindly by 2 experienced spine surgeons. A comparison with a less experienced rater (a resident) was also performed. When the observers disagreed, the senior surgeon made the final decision.

Figure 2. Criteria for grading violation of facet joint (Babu). A: Grade 0. The screw was not in the facet and did not encroach upon the facet joint; B: Grade 1. The screw was in the lateral facet but did not enter the articular facet; C: Grade 2. The screw penetrated the articular facet by ≤ 1 mm; D: Grade 3. The screw traveled within the articular surface of the facet.

Figure 3. Criteria for grading violation of facet joint (Modified Park). A: Grade 0. No facet joint violation; B: Grade 1. Hardware within 1 mm of or abutting the facet joint, without clear joint involvement; C: Grade 2. Pedicle screw clearly within the facet joint; D: Grade 3. Pedicle screw head/connector/rod clearly within the facet joint.

Figure 4. Criteria for grading violation of facet joint (Shah). A: Grade 0. No facet joint violation; B: Grade 1. Unilateral facet joint violation; C: Grade 2. Bilateral facet joint violation.
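The relationship between the screw-level scales (Babu, modified Park) and the patient-level Shah scale can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and deriving the Shah grade from a single pair of screw-level grades is an assumption made for the example, whereas in the study each system was rated separately on CT.

```python
def is_violated(screw_grade: int) -> bool:
    """Screw-level rule: any Babu or modified Park grade above 0 counts as FJV."""
    if screw_grade not in (0, 1, 2, 3):
        raise ValueError(f"grade must be 0-3, got {screw_grade}")
    return screw_grade > 0


def shah_grade(left_grade: int, right_grade: int) -> int:
    """Patient-level Shah grade: 0 = none, 1 = unilateral, 2 = bilateral."""
    return int(is_violated(left_grade)) + int(is_violated(right_grade))


# Hypothetical patient: left superior screw at grade 2, right superior screw at grade 0.
print(shah_grade(2, 0))  # unilateral violation -> Shah grade 1
```

Counting every grade above 0 as a violation matches how the violation rates are reported in the Results, where only grade 0 screws are treated as non-violated.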
Clinical Outcomes Measurement
Patient self-reported clinical outcomes were measured before surgery and at the 2-year follow-up. The Oswestry Disability Index (ODI) was used for clinical functional evaluation. The visual analog scale (VAS) was used to assess low-back pain and leg pain (a score of 0 indicated no pain and 10 represented the worst pain).
Statistical Analysis
Statistical analysis was performed with SPSS version 22.0. Continuous variables were presented as means and standard errors. Cohen’s kappa was used to describe intraobserver and interobserver reliability. Student’s t-test was used to compare continuous variables. The paired t-test was used to evaluate the differences in the intergroup comparisons. The chi-square test and Fisher’s exact test were used to compare categorical data. P < .05 was considered statistically significant.
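For readers who want to see how the reliability statistic is obtained, the following is a minimal sketch of unweighted Cohen’s kappa in plain Python. The study itself used SPSS; the function name and the example ratings are hypothetical, and whether the original analysis used weighted or unweighted kappa is not stated, so the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must grade the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two raters' marginal proportions per grade.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical grades (0-3) assigned to 8 screws by two observers.
observer_1 = [0, 0, 1, 1, 2, 3, 0, 1]
observer_2 = [0, 0, 1, 2, 2, 3, 0, 0]
print(round(cohen_kappa(observer_1, observer_2), 3))  # 0.652
```

In this made-up example the observers agree on 6 of 8 screws (0.75), chance agreement is 18/64, and kappa is therefore 15/23, or about 0.652.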
Results
Kappa coefficients of interobserver consistency for the Babu, Shah, and Park grading systems were 0.726, 0.849, and 0.692, respectively (Table 1). Both observer 1 and observer 2 had substantial to almost perfect intraobserver agreement on the 3 FJV grading systems. The kappa coefficients between the senior surgeon and the resident also indicated substantial agreement (Table 2). In the Babu grading system, the violation rate was 34.54% (105/304), with 199 screws at grade 0, 61 at grade 1, 27 at grade 2, and 17 at grade 3. In the modified Park system, the violation rate was 33.55% (102/304), with 202 screws at grade 0, 59 at grade 1, 24 at grade 2, and 19 at grade 3. In the Shah grading system, the violation rate was 32.57% (99/304), with 75 patients at grade 0, 55 patients at grade 1, and 22 patients at grade 2. There were no significant differences in violation rate among the 3 grading systems (P > .05) (Table 1).
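The reported violation rates can be re-derived from the grade counts above, as in the following sketch. The rate arithmetic uses only numbers from the text. The significance check is an assumption: the paper does not state exactly how the 3 rates were compared, so the sketch applies a Pearson chi-square test to a 3 × 2 table and treats the systems as independent groups, which is a simplification because the same 304 screws were graded by every system. The variable and function names are hypothetical.

```python
import math

TOTAL_SCREWS = 304
# Screw-level counts for grades 0-3, as reported for Babu and modified Park.
SCREW_GRADE_COUNTS = {"Babu": [199, 61, 27, 17], "Park": [202, 59, 24, 19]}
# Patient-level counts for Shah grades 0, 1, 2 (152 patients, 2 screws each).
SHAH_PATIENT_COUNTS = [75, 55, 22]


def violated_screws():
    """Number of violated screws under each grading system."""
    counts = {name: sum(grades[1:]) for name, grades in SCREW_GRADE_COUNTS.items()}
    # Unilateral patients contribute 1 violated screw, bilateral patients 2.
    counts["Shah"] = SHAH_PATIENT_COUNTS[1] + 2 * SHAH_PATIENT_COUNTS[2]
    return counts


def chi_square_three_groups(violated, total):
    """Pearson chi-square for 3 equal-sized groups (3 x 2 table, 2 degrees of freedom)."""
    if len(violated) != 3:
        raise ValueError("the closed-form P value below is valid only for 3 groups")
    expected_violated = sum(violated) / 3  # equal group sizes
    expected_clear = total - expected_violated
    chi2 = sum(
        (v - expected_violated) ** 2 / expected_violated
        + ((total - v) - expected_clear) ** 2 / expected_clear
        for v in violated
    )
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)


counts = violated_screws()
for name, v in counts.items():
    print(f"{name}: {v}/{TOTAL_SCREWS} = {100 * v / TOTAL_SCREWS:.2f}%")
chi2, p = chi_square_three_groups(list(counts.values()), TOTAL_SCREWS)
print(f"chi-square = {chi2:.3f}, P = {p:.3f}")
```

Under these assumptions the counts reproduce the reported 34.54%, 33.55%, and 32.57%, and the test gives a P value well above .05, in line with the reported absence of a significant difference.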
Table 1. Interobserver Consistency on Babu, Shah, and Park Grading Systems.
FJV-Facet joint violation; Ob1-Observer 1; Ob2-Observer 2.
Table 2. Kappa Coefficients Between Senior Surgeon and Resident.
FJV-Facet joint violation.
The preoperative VAS leg pain score was 6.61 ± 1.51, the VAS low-back pain score was 6.14 ± 1.48, and the ODI score was 22.71 ± 6.08. At the 2-year follow-up, the VAS leg pain score was 1.81 ± 1.42, the VAS low-back pain score was 1.91 ± 1.12, and the ODI score was 10.45 ± 4.04. VAS low-back pain, VAS leg pain, and ODI scores improved significantly at the 2-year follow-up for all patients (P < .05). In all FJV grading systems, the postoperative VAS low-back pain and ODI scores in the non-FJV groups were lower than those in the FJV groups (P < .05), while there were no significant differences between the 2 groups in VAS leg pain (P > .05).
In the Babu grading system, patients with moderate to severe violation had poorer ODI scores than those with mild violation (12.25 ± 3.80 vs 11.30 ± 4.02, P = .031); however, the low-back pain scores were comparable (2.25 ± 1.24 vs 2.20 ± 1.21, P = .636) (Table 3).
Table 3. Babu Grading System Impact on Clinical Outcomes.
FJV-Facet joint violation; Pre-Preoperative data; Post-Postoperative data. P* refers to comparison between Mild FJV and Non-FJV; P# refers to comparison between Medium to severe FJV and Non-FJV; P refers to comparison between Medium to severe FJV and Mild FJV.
In the modified Park system, violations caused by the screw (G2) and by the screw head or rod (G3) were both associated with significantly higher VAS low-back pain (2.25 ± 1.19, 2.37 ± 1.39) and ODI (11.75 ± 2.75, 12.32 ± 4.40) scores than non-violation (G0) (P < .05) (Table 4).
Table 4. Modified Park Grading System Impact on Clinical Outcomes.
SH-Screw head; P* refers to comparison between screw violation and Non-FJV; P# refers to comparison between SH/rod/connector violation and Non-FJV.
In the Shah grading system, patients with bilateral FJV (grade 2) had significantly higher VAS low-back pain scores (2.45 ± 1.50 vs 1.71 ± 1.10, P = .012) and ODI scores (12.68 ± 3.12 vs 9.33 ± 3.98, P < .001) than patients without violation (grade 0). The VAS low-back pain and ODI scores in the bilateral FJV group were higher than those in the unilateral FJV group, but the differences were not statistically significant (P > .05) (Table 5).
Table 5. Shah Grading System Impact on Clinical Outcomes.
P* refers to comparison between Unilateral-FJV and Non-FJV; P# refers to comparison between Bilateral-FJV and Non-FJV; P refers to comparison between Bilateral-FJV and Unilateral-FJV.
Discussion
With advances in the understanding of biomechanics and anatomy, surgeons are gradually realizing the importance of protecting the integrity of the facet joints during pedicle screw placement. 9 Our team searched the PUBMED, MEDLINE, and WANFANG databases to select all articles presenting grading systems for FJV. A total of 18 grading systems were found. According to Kettler A et al, 10 a reliability test is essential for a grading system, and an interobserver kappa or intraclass correlation coefficient >0.60 is recommended for use. Nine grading systems met this condition; however, most of them shared the same core and differed only in wording. In summary, FJV has mainly been evaluated from 3 different aspects: (1) the severity of FJV; (2) the side of violation; (3) the violation components. Considering clinical value and practicality, we settled on 3 types of FJV grading systems: Babu’s method (grading by severity of violation), Shah’s method (grading by side of violation), and the modified Park’s method (grading by the component causing the violation). In addition, Patel et al 11 established a grading system based on autopsy, which can be used as a special evaluation method for autopsy research. The 3 grading systems mentioned above were recommended and were further evaluated according to their accuracy, reliability, and impact on clinical outcomes.
Intraobserver and interobserver reliability indicate whether a grading system is clearly defined. If a classification standard is ambiguous, 2 observers can easily reach different results. As a 3-grade scale, Shah’s classification standard is easy to understand. The original report described its interobserver reliability as almost perfect, and our results also showed k = 0.849 (almost perfect agreement). Babu’s and Park’s grading systems are both 4-grade scales. The interobserver reliability reported in the previous literature was 0.672 and 0.67, respectively. Our results showed kappa values of 0.726 and 0.692 (substantial agreement). In addition, the 2 experienced observers had substantial to almost perfect intraobserver agreement on the 3 FJV grading systems. To test reproducibility with less experienced raters, we further compared a senior surgeon with a resident, and the result showed substantial agreement. Our results indicate that grading systems for FJV should always be tested for reliability, and we also suggest that an interobserver kappa >0.60 is essential for FJV grading systems.
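The agreement labels used above can be made explicit with a small lookup. The sketch below assumes the widely used Landis and Koch bands (for example, 0.61 to 0.80 as substantial and 0.81 to 1.00 as almost perfect); the paper does not name the scale it applied, so this mapping and the function names are assumptions for illustration.

```python
def kappa_label(kappa: float) -> str:
    """Map a kappa value to a Landis and Koch agreement label (assumed scale)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise AssertionError("unreachable for kappa <= 1")


def meets_reliability_threshold(kappa: float) -> bool:
    """Apply the >0.60 interobserver threshold suggested in the text."""
    return kappa > 0.60


# Interobserver kappa values reported in this study.
for system, kappa in {"Babu": 0.726, "Shah": 0.849, "Park": 0.692}.items():
    print(system, kappa_label(kappa), meets_reliability_threshold(kappa))
```

With the reported values, Babu and Park fall in the substantial band, Shah falls in the almost perfect band, and all 3 exceed the 0.60 threshold.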
Because of differences in case populations, surgical methods, and definitions of FJV in previous studies, the reported violation rates were inconsistent. 12,13 It is critical to form a clear and unified scope of violation. The main controversy is whether to include suspected violation (abutting the FJ) or screw head impingement into the superior facet. 14,15 Based on previous biomechanical studies of facet joint damage caused by pedicle screws, 16,17 we considered a facet joint violated if any part of the screw, screw head, connector, or rod was within 1 mm of or abutting the facet joint, which is consistent with many previous studies. 18,19,20,21 In our study, the same cases of percutaneous pedicle screw placement were evaluated with all 3 systems under a unified scope of violation. The results showed that the violation rates in Babu’s, Park’s, and Shah’s systems were 34.54%, 33.55%, and 32.57%, respectively, indicating that the 3 grading systems were consistent in violation rate. In clinical practice, if a surgeon applies different evaluation systems to the same set of data and obtains a large difference in violation rates, potential errors in the evaluation process should be suspected.
The 3 FJV grading systems have different implications for estimating and predicting clinical outcomes because of their distinct perspectives. FJV is closely related to short-term low-back pain and to a long-term complication, adjacent segment degeneration. 22 Long Jia 8 retrospectively analyzed 99 patients who underwent single-segment MIS-TLIF and used Shah’s grading system to evaluate FJV. The results showed that at the last follow-up, the VAS low-back pain score (1.48 ± 0.51 vs 2.32 ± 0.72, P < .001) and ODI score (15.06 ± 3.92 vs 19.05 ± 5.30, P = .003) in the violation group were both worse than those in the non-violation group. Our results were consistent with Long Jia’s, and we further analyzed the effect of bilateral violation on clinical outcomes. The results showed that patients with a bilateral violation had worse self-reported clinical outcomes (VAS low-back pain, ODI) than patients with unilateral violation, although the differences did not reach statistical significance (P = .075, P = .095). A biomechanical test 23 confirmed that when the bilateral facet joints are destroyed, lumbar spine mobility is significantly increased and spinal stability is weakened. Therefore, we hypothesize that bilateral FJV would lead to poorer clinical outcomes and a greater risk of ASD, and we expect larger multi-center studies to confirm this in the future. In Babu’s study, patients who underwent a subsequent lumbar surgery had a significantly higher violation grade and incidence of grade 2 violations (35.7% versus 7.0%, P = .0095) at a 3-year follow-up. In addition, the incidence of grade 3 violations was greater in those who underwent additional surgery. Xu et al 24 also showed that when the superior facet joints were severely violated by screws, biomechanical function decreased significantly. Babu’s grading system was adapted to analyze the effects of violation severity on clinical outcomes.
The results showed that the VAS low-back pain and ODI scores in patients with moderate and severe violation (grades 2 and 3) were significantly higher than those in the non-violation group. In addition, patients with moderate to severe violation had poorer ODI scores than those with mild violation (12.25 ± 3.80 vs 11.30 ± 4.02, P = .031). Our work showed that the severity of FJV was closely related to postoperative clinical outcomes. Therefore, it is essential to avoid FJV, especially high-grade violation. Violation caused by the screw head, rod, or connector is commonly seen in clinical practice. Yunfeng Xu 20 pointed out that this may carry a risk of persistent low-back pain due to frequent impingement of the screw head. In the modified Park’s grading system, our results showed that the rate of violation caused by the screw head was 6.25% (19/304), and this type of violation was also associated with poor VAS low-back pain and ODI functional recovery. Thus, screws should not be inserted too deep during percutaneous screw placement, and a proper distance must be maintained to prevent the screw head from impinging on the facet joint.
Conclusion
We have tested Babu’s, Park’s, and Shah’s grading systems and recommend their application. Patients with FJV had worse self-reported clinical outcomes at the 2-year follow-up. In clinical application, it is recommended to use 2 or even 3 different grading systems simultaneously to evaluate FJV from multiple aspects.
Footnotes
Acknowledgments
We would like to acknowledge the reviewers for their helpful comments on this paper.
Authors’ Contributions
Yiwei Zhao and Suomao Yuan contributed equally to this paper. Xinyu Liu, Yiwei Zhao, and Suomao Yuan participated in the design of the study. Yiwei Zhao and Suomao Yuan collected data. Yiwei Zhao and Wubo Liu performed the statistical analysis. Yiwei Zhao and Yonghao Tian conceived of the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. It was a retrospective study, so formal patient informed consent was not needed.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
