Abstract
Background:
Several classification systems based on arthroscopy have been used to describe lesions of the ligamentum teres (LT) in young active patients undergoing hip-preserving surgery. Inspection of the LT and associated lesions of the adjacent fovea capitis and acetabular fossa is limited when done arthroscopically but is much more thorough during open surgical hip dislocation. Therefore, we propose a novel grading system based on our findings during surgical dislocation comprising the full spectrum of ligamentous-fossa-foveolar complex (LFFC) lesions.
Purpose:
To determine (1) intraobserver reliability and (2) interobserver reproducibility of our new grading system.
Study Design:
Cohort study (diagnosis); Level of evidence, 3.
Methods:
We performed this validation study on 211 hips (633 images in total) with surgical hip dislocation (2013-2021). We randomly selected 5 images per grade for each LFFC item to achieve an equal representation of all grades (resulting in 75 images). The ligament, fossa, and fovea were subcategorized into normal, inflammation, degeneration, partial, and complete defects. All surgeries were performed in a standardized way by a single surgeon. The femur was disarticulated using a bone hook, the LT was inspected, documented and resected, then the fossa and fovea were documented with the femoral head in full dislocation using a 70° arthroscope. Six observers with different levels of expertise in hip-preserving surgery independently conducted the measurements twice, and intraclass correlation coefficients (ICC) were calculated to determine (1) intraobserver reliability and (2) interobserver reproducibility of the novel grading system.
Results:
For intraobserver reliability, excellent ICCs were found in both the junior and the experienced raters for grading the ligament, fossa, fovea, and total LFFC (ICCs ranged from 0.91 to 0.99 for the LFFC score). We found excellent interobserver reproducibility between raters for all items of the LFFC (all interobserver ICCs ≥ 0.76).
Conclusion:
Our new grading system for lesions of the LFFC is highly reliable and reproducible. It covers the full spectrum of damage more precisely than arthroscopic classifications do and offers a scientific basis for standardized intraoperative evaluation.
The assessment of intra-articular lesions in joint-preserving hip surgery has traditionally focused on peripheral lesions of the chondrolabral complex. 23 While trauma is a well-known cause of central ligamentous-fossa-foveolar complex (LFFC) lesions, 6 hips with developmental dysplasia, 11,12 femoroacetabular impingement (FAI), 7,17,19 and osteoarthritis also regularly demonstrate LFFC lesions. The ligamentum teres (LT) is an innervated structure and has been recognized as a potential source of hip pain. 1,7,16,19,22 Lesions of the ligament are highly prevalent, specifically in athletes. 6,8,20 Damage to the LT is in fact the third most common injury in athletes who have undergone hip arthroscopy. 5,6 LT lesions are frequently associated with articular cartilage damage in the inferior middle part of the acetabulum as well as the apex of the femoral head. 15
An accurate, reliable, and reproducible description of such lesions is crucial to understand their origin, detect associated pathomechanisms, and potentially predict prognosis. To date, no descriptive standard that meets these requirements has emerged. Current arthroscopic descriptions of LFFC lesions have significant drawbacks. These include a lack of consideration of associated cartilage lesions at the foveal insertion and the fossa, 4,14 unreliable or missing intra- and interobserver analyses, 9,18,21 and incomplete visual analysis of the lesions. 4,14,21
In contrast to hip arthroscopy, surgical hip dislocation allows an unrestricted visualization of the LFFC, including a dynamic assessment of the hip. Based on our long clinical experience, we introduce a novel grading system for lesions of the LFFC with a systematic description of the LT, the acetabular fossa, and the perifoveolar region. The aim of this study was to evaluate (1) intraobserver reliability and (2) interobserver reproducibility of this grading system. Our hypothesis was that the new grading system would have superior intraobserver reliability and interobserver reproducibility when compared with previous grading systems.
Methods
In this diagnostic study, approved by the local institutional review board, we assessed 560 patients who gave informed consent and underwent joint-preserving surgery by a single surgeon (M.T.) at 2 institutions (2013-2019 and 2019-2021, respectively) (Figure 1).

Figure 1. Flowchart of participant selection and randomization into groups. Asterisk indicates randomly chosen.
We excluded 179 hips that underwent arthroscopy and 72 that underwent periacetabular osteotomy. Of the 309 hips that underwent surgical dislocation, 98 hips with incomplete intraoperative documentation were excluded, leaving 211 hips for further evaluation. The mean age at surgery was 31 ± 11 years (range, 17-74), and 44% of patients were male (Table 1).
Table 1. Descriptive, Radiographic, and Surgical Parameters of the Patients (N = 75 Patients; 75 Hips) a
a Continuous values are expressed as mean ± SD (range); other values are presented as No. (%). BMI, body mass index; FAI, femoroacetabular impingement.
For each hip, 3 images in total (1 of each of the structures from the LFFC) were available: 1 from the ligament, 1 from the acetabular fossa, and 1 from the perifoveolar area. This resulted in a total of 633 images. The developer of the novel grading system (V.M.S.) preliminarily graded the hips accordingly. Using a computer-generated randomization list, the first author (V.M.S.) collected consecutive images for each grade of the 3 items of the LFFC until 5 images per grade and structure were present. This resulted in a total of 75 images (25 for each component of the LFFC) in 75 different patients.
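The selection step described above can be sketched in code. This is a minimal illustration only, not the authors' actual procedure: the function name `select_images`, the tuple layout of the input, the fixed seed, and the rule that each hip contributes at most one image (inferred from "75 images in 75 different patients") are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Structures and grade labels as named in the text; the tuple layout is assumed.
STRUCTURES = ("ligament", "fossa", "perifoveolar")
GRADES = ("normal", "inflammation", "degeneration",
          "partial defect", "complete defect")


def select_images(graded_images, per_grade=5, seed=0):
    """Pick `per_grade` images for every (structure, grade) cell.

    graded_images: iterable of (hip_id, structure, grade) tuples taken from
    the preliminary grading. The list is shuffled once (standing in for the
    computer-generated randomization list) and then walked in order, taking
    images until each cell is full. A hip is used at most once (assumption).
    """
    rng = random.Random(seed)
    order = list(graded_images)
    rng.shuffle(order)

    selected = defaultdict(list)   # (structure, grade) -> list of hip ids
    used_hips = set()
    for hip_id, structure, grade in order:
        cell = (structure, grade)
        if hip_id in used_hips or len(selected[cell]) >= per_grade:
            continue
        selected[cell].append(hip_id)
        used_hips.add(hip_id)
    return dict(selected)
```

With 3 structures, 5 grades, and 5 images per cell, a fully populated selection contains 3 × 5 × 5 = 75 images, which matches the number reported in the text.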
Based on our clinical observation, we developed a grading system for the LFFC with specific categorizations for the LT, the acetabular fossa, and the perifoveolar area. All 3 items were subcategorized into normal, inflammation, degeneration, partial, and complete defects. In contrast to previously presented categorical classification systems, the proposed grading system incorporates the entire spectrum of lesions observed during surgical hip dislocation for all surgical indications. The specific definitions with a detailed description of the lesions are summarized in Table 2 and illustrated in Figure 2.
Table 2. Description of the Novel Classification System for Grading the Different Lesions of the Ligamentous-Fossa-Foveolar Complex

Figure 2. Grading system of the ligamentous-fossa-foveolar complex for lesions of the (A) ligament, (B) fossa, and (C) perifoveolar area is demonstrated using a schematic illustration (left column), an intraoperative image (right column), and the corresponding description.
The indications for surgical hip dislocation were symptomatic intra- and extra-articular FAI with or without femoral version abnormalities (not accessible using hip arthroscopy), posttraumatic lesions, and avascular necrosis of the femoral head (Table 1). We used the inventor’s original technique for hip dislocation. 13 A step-cut osteotomy of the greater trochanter was usually conducted, 2 except in cases necessitating distalization of the trochanter. Before transection of the ligament and dislocation of the femoral head, the joint was subluxated using a bone hook with the hip in flexion and external rotation. This allowed a full visualization of the pyramidal fan-shaped structure of the LT (Figure 3).

Figure 3. The femoral head is placed in external rotation and partially dislocated using a bone hook, then the ligamentous-fossa-foveolar complex is assessed and documented using a 70° arthroscope.
We used a 70° arthroscopic camera (4K Synergy Arthroscopy; Arthrex) for photographic documentation. The fossa and the perifoveolar area were inspected using a probe (Subtilis nerve root retractor cushing 10 mm; Accuratus) after transection and resection of the LT.
One of the authors (V.M.S.) blinded and randomized the 75 images. Six observers (C.A.Z., M.H., J.L., D.M., M.K.M., V.P.) with different levels of expertise in hip joint–preserving surgery independently conducted the assessments twice, with at least 1 month between assessments. The observers did not participate in the selection of the cases or in the blinding process. Three were board-certified staff hip surgeons with specific training in joint-preserving surgery (M.H., J.L., D.M.), and 3 were orthopaedic surgery residents (C.A.Z., M.K.M., V.P.). All raters were provided with schematic illustrations and intraoperative examples of the classification system (Figure 2) as well as a detailed description of the specific lesions (Table 2). The treating surgeon (M.T.) was not involved in the measurements. Given 6 observers each performed 2 measurements on 75 intraoperative images, this resulted in a total of 900 measurements for the final analysis. For final evaluation, we calculated the LFFC score, defined as the sum of the individual degeneration grades for the ligament, the acetabular fossa and the perifoveolar area for the 75 hips (corresponding to 75 patients).
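The arithmetic in this paragraph can be made concrete with a short sketch. The numeric coding of the grades (0 for normal up to 4 for a complete defect) is an assumption introduced here for illustration; the text only states that the LFFC score is the sum of the three individual grades. The names `GRADE_VALUES`, `lffc_score`, and `total_measurements` are hypothetical.

```python
# Assumed numeric coding of the five grades; the source does not state it here.
GRADE_VALUES = {
    "normal": 0,
    "inflammation": 1,
    "degeneration": 2,
    "partial defect": 3,
    "complete defect": 4,
}


def lffc_score(ligament, fossa, perifoveolar):
    """Sum of the three item grades, as the LFFC score is defined in the text."""
    return sum(GRADE_VALUES[g] for g in (ligament, fossa, perifoveolar))


def total_measurements(observers=6, sessions=2, images=75):
    """Number of single gradings entering the analysis."""
    return observers * sessions * images


# 6 observers x 2 sessions x 75 images gives the 900 measurements reported.
n_measurements = total_measurements()
```

Under this assumed coding the score would range from 0 (all three items normal) to 12 (complete defects of all three items).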
Statistical Analysis
Using the Bonett method for sample size calculation, 3 given an expected intraclass correlation coefficient (ICC) reliability of 0.85 with a precision of ±0.09, 23 a confidence level of 95%, 6 raters, and an expected dropout rate of 0%, we calculated a minimum sample size of 21 images per item. We calculated the ICC (2-way model, absolute agreement) with 95% CI to determine the intraobserver reliability and interobserver reproducibility. All statistical analyses were conducted using MedCalc Statistical Software Version 19.8 (2021).
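The sample size calculation above can be reproduced with the closed-form approximation usually attributed to Bonett. The sketch below is an illustration under two assumptions: that the stated precision of ±0.09 corresponds to a full confidence interval width of 0.18, and that the approximation n = 8z²(1 − ρ)²(1 + (k − 1)ρ)² / (k(k − 1)w²) + 1 is the formula that was used. The function name `bonett_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def bonett_sample_size(icc, width, raters, confidence=0.95):
    """Approximate number of subjects needed to estimate an ICC.

    icc:     expected ICC (rho)
    width:   desired full width of the confidence interval (w)
    raters:  number of raters (k)
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    numerator = 8 * z**2 * (1 - icc) ** 2 * (1 + (raters - 1) * icc) ** 2
    denominator = raters * (raters - 1) * width**2
    return math.ceil(numerator / denominator + 1)


# Inputs from the text: expected ICC 0.85, precision +/-0.09, 6 raters, 95% level.
n_images = bonett_sample_size(icc=0.85, width=0.18, raters=6)
print(n_images)  # 21 with these inputs
```

With these inputs the expression evaluates to about 20.6 before rounding up, which agrees with the minimum of 21 images per item stated above. No dropout adjustment is applied because the expected dropout rate was 0%.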
Results
Intraobserver Reliability
The highest intraobserver ICCs were found for ligament lesions, followed by perifoveolar and acetabular fossa lesions. Specifically, for ligament lesions, we found a mean ICC of 0.972 (95% CI, 0.926-0.985) (Table 3). For acetabular fossa lesions, we found a mean ICC of 0.883 (95% CI, 0.753-0.947). For perifoveolar lesions, we found a mean ICC of 0.946 (95% CI, 0.856-0.976). For the LFFC sum, we found a mean ICC of 0.949 (95% CI, 0.885-0.977). The mean ICC of the 3 senior raters was 0.942 (95% CI, 0.869-0.974) and that of the resident raters was 0.956 (95% CI, 0.901-0.980). The mean kappa value was 0.83 (95% CI, 0.74-0.93).
Table 3. Results of Reliability and Reproducibility Analysis a
a ICC, intraclass correlation coefficient; LFFC, ligamentous-fossa-foveolar complex.
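A kappa value such as the intraobserver value reported above compares the two grading sessions of one rater. The sketch below computes an unweighted Cohen kappa in plain Python. Whether the study used a weighted or an unweighted kappa is not stated, so the unweighted form is an assumption, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen kappa for two equally long sequences of grades."""
    if len(first) != len(second) or not first:
        raise ValueError("both sessions must grade the same, non-empty image set")
    n = len(first)

    # Proportion of images that received the same grade in both sessions.
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Agreement expected by chance from the marginal grade frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[g] * counts_second[g] for g in counts_first) / n**2

    if expected == 1:
        # Both sessions used a single identical grade; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

Because unweighted kappa treats every disagreement as equally severe, it tends to be lower than the ICC for ordered grades, which is in line with the values reported in this study.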
Interobserver Reproducibility
As with intraobserver reliability, ligament lesions had the highest ICC for interobserver reproducibility, followed by perifoveolar and acetabular fossa lesions (Table 3). Specifically, for ligament lesions, the mean ICC was 0.92 (95% CI, 0.88-0.96). For acetabular fossa lesions, the mean ICC was 0.79 (95% CI, 0.67-0.89). For perifoveolar lesions, we found a mean ICC of 0.88 (95% CI, 0.80-0.94). For the LFFC sum, we found a mean ICC of 0.93 (95% CI, 0.81-0.94). The mean ICC of the 3 senior raters was 0.89 (95% CI, 0.80-0.95), and for the resident raters, it was 0.90 (95% CI, 0.81-0.95). The mean interobserver reproducibility kappa value was 0.60 (95% CI, 0.59-0.60).
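For readers who want to reproduce this type of analysis, the following sketch computes a 2-way, absolute-agreement ICC from a table of grades. It implements the single-measures form, ICC = (MSR − MSE) / (MSR + (k − 1)MSE + k(MSC − MSE)/n). The choice of single rather than average measures is an assumption, since the text specifies only the 2-way model with absolute agreement, and the function name `icc_absolute_agreement` is hypothetical. Confidence intervals are not computed here.

```python
def icc_absolute_agreement(ratings):
    """Single-measures ICC, 2-way model, absolute agreement.

    ratings: list of n rows (one per image), each holding k grades
             (one per rater, or one per session for intraobserver use).
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 images and 2 complete rating columns")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-image mean square
    msc = ss_cols / (k - 1)                # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Under absolute agreement, a rater who consistently grades one step higher than a colleague lowers the ICC even though the ranking of the images is identical. For example, the rating pairs (0,1), (1,2), (2,3), (3,4), (4,5) give an ICC of 5/6 rather than 1.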
Discussion
Lesions of the LFFC in joint-preserving surgery are reportedly highly prevalent, although the underlying pathomechanism is not yet fully understood. Current classification systems are based on arthroscopic inspection only, which is subject to limitations regarding visibility. In addition, some of them focus on the integrity of the LT only and do not involve a description of (very commonly found) associated lesions of the acetabular fossa and the perifoveolar area. We have introduced and validated a novel grading system based on surgical hip dislocation and found it to be highly reliable and reproducible.
The lowest intraobserver values were mainly found for lesions of the acetabular fossa. This might be related to misinterpretation of the nature of the acetabular fossa, which often shows irregularities of its shape due to incomplete fusion of the triradiate cartilage. 10 In contrast to arthroscopic classifications, 4,9,14 our grading system has a very high reliability for lesions of the LT. We relate this to the unrestricted visibility of the pyramidal ligament structure when the femoral head is subluxated (Figure 3) using a bone hook. The direction of traction is parallel to the spatial orientation of the ligament, revealing the pyramidal structure and any lesions very well. During hip arthroscopy, the ligament initially relaxes under longitudinal traction and is then assessed using internal and external rotation. This can lead to an accordion effect of the ligament, which can be incorrectly identified as a partial rupture.
The values for interobserver reproducibility were generally lower than the intraobserver values. The lowest ICC of 0.76 was found for the acetabular fossa lesions. Interestingly, this evaluation was independent of the level of experience of the raters. We believe this adds to the robustness of our grading system.
In the literature, there is little information available about inter- and intraobserver variability of existing LFFC classification systems (Table 4).
Table 4. Definition of Different Classification Systems in the Literature a
a Dashes indicate not applicable. BTS, Beighton test score; NA, not available.
b A or B depending on the laxity. A, no generalized laxity (BTS <3); B, generalized laxity (BTS ≥4).
The original articles describing a relatively rough evaluation of the LT by Gray and Villar 14 and Botser et al 4 did not include reliability or reproducibility analyses. Later, Devitt et al 9 specifically assessed the reliability and reproducibility of both classifications and found only a fair agreement even with experienced observers. A main criticism was that the presence of synovitis was not in either classification but was considered an important finding. Based on this, Salas and O’Donnell 21 introduced a novel classification system that included therapeutic recommendations but did not provide intra- and interobserver analyses. Later, O’Donnell and Arora 18 published another classification system, also including synovitis as a criterion and with special consideration of joint hypermobility, but they did not perform an assessment of the reliability and reproducibility (Table 4).
Our values are substantially higher than those reported for the Gray and Villar 14 classification. In addition, our grading system includes selected features of the Salas classification 21 but on a systematic basis. The higher values might be due to the larger number of items in our grading system and the use of kappa values instead of the ICC for statistical analysis. Even when using kappa analysis for our evaluation for better comparability, we noted considerably higher values (mean intraobserver reliability: κ = 0.83 [95% CI, 0.74-0.93]; mean interobserver reproducibility: κ = 0.60 [95% CI, 0.59-0.60]).
The current study has some limitations. First, the proposed grading system is descriptive only and does not take into account clinical information. The potential use for future prognostic or therapeutic recommendations has yet to be shown. Specifically, the weighting of the individual items needs to be evaluated. Moreover, most hip surgeons use an arthroscopic approach for the treatment of FAI and may encounter difficulties in fully applying our classification system. However, in contrast to the previously developed classification systems, our system covers the entire degenerative cascade of the LFFC and is reliable and reproducible. Therefore, even with a limited application in hip arthroscopy, our grading system offers a validated tool for future studies on prognosis, treatment guidelines, and causes of lesions. A second limitation is that we used static photographs instead of video clips for validation. Since the features of our grading system are based solely on visual criteria and not tactile information, static images should suffice. In addition, we believe this feature makes our grading system more user-friendly and adds to its reproducibility.
Conclusion
We have introduced a novel grading system for the LFFC based on open surgery. The grading system appears to be robust, independent of the experience level of the raters, and highly reliable and reproducible. It seems to be superior to previously presented descriptions and offers a scientific basis for standardized intraoperative evaluation. As a first step, future studies will need to focus on correlating the grading system with magnetic resonance imaging findings so that it can be implemented in the preoperative evaluation of patients, which is a decisive factor for open joint-preserving surgery as well as hip arthroscopy. This will convert the descriptive grading system into a more clinically useful classification in open and arthroscopic surgery.
Footnotes
Final revision submitted February 23, 2022; accepted March 9, 2022.
The authors declared that there are no conflicts of interest in the authorship and publication of this contribution. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
Ethical approval for this study was obtained from the regional ethics committee for the Canton of Bern (project ID: 2018-00078).
