Abstract
Background
Currently, there is no consensus on a widely accepted technique for measuring the Hill-Sachs lesion (HSL). The purpose of this review is to provide an overview of the techniques and imaging modalities used to assess the HSL pre-operatively.
Methods
Four online databases (PubMed, Embase, MEDLINE, and COCHRANE) were searched for literature on the various modalities and measurement techniques used for quantifying HSLs, from database inception to 20 November 2021. The Methodological Index for Non-Randomized Studies tool was used to assess study quality.
Results
Forty-five studies encompassing 3413 patients were included in this review. MRA and MRI showed the highest sensitivity, specificity, and accuracy values. Intrarater and interrater agreement were highest for MRA. The most common reference tests for measuring the HSL were arthroscopy, radiography, arthro-CT, and surgical techniques.
Conclusion
MRA and MRI are reliable imaging modalities with good diagnostic test properties for assessment of HSLs. There is a wide variety of measurement techniques and imaging modalities for HSL assessment; however, comparative studies are lacking. Thus, it is not possible to comment on the superiority of one technique over another. Future studies comparing imaging modalities and measurement techniques are needed, and these should incorporate a cost-benefit analysis.
Introduction
A Hill-Sachs lesion (HSL) is a bony defect of the posterosuperolateral humeral head, often caused by prior episodes of anteroinferior glenohumeral dislocation.1,2 Recurrent instability at the glenohumeral joint is often observed after an HSL because the posterolateral aspect of the humeral head impacts the anterior glenoid, resulting in pain and difficulty moving the shoulder joint.1,3 The clinical prevalence of HSLs ranges from 70% to 90% after an anterior shoulder dislocation and may approach 100% in patients with recurrent anterior shoulder instability.4–7
Measurement of the HSL has been an area of interest for clinicians, as quantification of bone loss is crucial in treatment decisions for patients with shoulder instability. Currently, various modalities can be used to measure an HSL, such as computed tomography (CT), magnetic resonance imaging (MRI), 3D CT and 3D MRI, and magnetic resonance arthrography (MRA), among many others.8,9 In addition to the variety of imaging modalities, various measurement methods are currently available to evaluate and quantify the HSL, such as length, width, and depth measurements.8–10 The well-known “on-track” and “off-track” concept utilizes the glenoid track, which is the contact area between the humeral head and the glenoid during shoulder abduction and external rotation. This method determines whether the HSL engages the anterior glenoid rim, resulting in shoulder dislocation (termed “off-track”), or does not engage (termed “on-track”). 11
Currently, there is extensive literature reporting imaging agreement for other specific measurements such as glenoid bone loss. Most notably, Walter et al. determined the most accurate imaging techniques for measuring glenoid bone loss in anterior glenohumeral instability. 12 Despite the various modalities and methods available to measure the HSL, challenges remain in its evaluation because of the 3D shape of the humeral head and inconsistent visibility during imaging, with each method having its own pros and cons along with varying degrees of reliability.13–16 This is a major barrier because, in bipolar bone lesions in shoulder instability, minor changes in HSL measurements can have significant implications for a patient's surgical treatment. Therefore, the purpose of this review is to provide an overview of the imaging modalities and techniques used to measure the HSL and to assess their diagnostic properties. It was hypothesized that 3D computed tomography (3D-CT) and/or 3D MRI would be the most prevalent and reliable imaging modalities to quantify the HSL.
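The on-track/off-track decision can be sketched in code. This review does not state a formula, so the minimal sketch below assumes the commonly cited formulation in which the glenoid track width is taken as 83% of the glenoid diameter minus the width of any anterior glenoid defect, and the Hill-Sachs interval is the HSL width plus the intact bone bridge to the rotator cuff insertion. All function names, parameter names, and example measurements are hypothetical and for illustration only.

```python
def glenoid_track_width(glenoid_diameter_mm: float,
                        glenoid_defect_mm: float = 0.0,
                        track_fraction: float = 0.83) -> float:
    """Assumed formulation: track = fraction * glenoid diameter - defect width."""
    return track_fraction * glenoid_diameter_mm - glenoid_defect_mm


def hill_sachs_interval(hsl_width_mm: float, bone_bridge_mm: float) -> float:
    """Assumed definition: HSL width plus the bone bridge to the cuff insertion."""
    return hsl_width_mm + bone_bridge_mm


def classify_track(hsl_width_mm: float, bone_bridge_mm: float,
                   glenoid_diameter_mm: float,
                   glenoid_defect_mm: float = 0.0) -> str:
    """Return 'off-track' when the interval exceeds the glenoid track width."""
    track = glenoid_track_width(glenoid_diameter_mm, glenoid_defect_mm)
    interval = hill_sachs_interval(hsl_width_mm, bone_bridge_mm)
    return "off-track" if interval > track else "on-track"


# Illustrative (not measured) values, all in millimetres.
print(classify_track(15, 5, 28, 3))   # interval 20 vs track about 20.24
print(classify_track(18, 6, 28, 5))   # interval 24 vs track about 18.24
```

Keeping the track fraction as a parameter makes the assumption explicit and easy to change if a different definition of the glenoid track is preferred.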
Methods
Search strategy
The search terms included “shoulder,” “Hill-Sachs,” “bone loss,” and similar phrases (Appendix Table 1). PUBMED, EMBASE, MEDLINE, and COCHRANE databases were searched for literature on the reliability of imaging modalities and measurement techniques for quantifying the HSL from database inception to 20 November 2021. The search terms were then entered into Google Scholar to ensure that articles were not missed. Inclusion criteria were (1) HSL; (2) quantification by imaging modalities; (3) presentation of a method for measuring HSLs; (4) human studies; and (5) English language. The exclusion criteria were (1) measurement of other major shoulder pathologies (e.g. glenohumeral, Bankart lesions) without mention of an HSL; (2) review articles; (3) non-imaging studies; (4) cadaver studies; and (5) case reports and editorials.
Study screening
Systematic screening complied with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) and Revised Assessment of Multiple Systematic Reviews (R-AMSTAR) guidelines.17,18 Two reviewers (S.K., H.F.) independently screened the titles, abstracts, and full texts in duplicate. Discrepancies were discussed and resolved with the input of a third reviewer (A.S.). The references of included studies were also screened using the same systematic approach to capture any additional relevant articles.
Data abstraction
Data were extracted independently by two reviewers (S.K., H.F.), who recorded relevant data from included articles in a Microsoft Excel spreadsheet (Version 2016; Microsoft, Redmond, Washington) designed a priori. Authors were contacted for clarification if data were unclear or not reported. Extracted data included, but were not limited to, year and journal of publication, sample size, study design, level of evidence, and patient demographics (e.g. gender, age, etc.). Information regarding modality and measurement techniques, reliability, and diagnostic test properties was documented when present.
Statistical analysis
Due to high statistical and methodological heterogeneity, a meta-analysis could not be performed, and the results are summarized descriptively. Descriptive statistics such as mean, range, and measures of variance (e.g. standard deviations, 95% confidence intervals [CI]) are presented where applicable. The intraclass correlation coefficient (ICC) was used to evaluate inter-reviewer agreement for assessing study quality. A kappa (κ) statistic was used to evaluate inter-reviewer agreement at all screening stages. Agreement was categorized a priori as follows: an ICC/κ of 0.81 to 0.99 was considered almost perfect agreement; 0.61 to 0.80, substantial agreement; 0.41 to 0.60, moderate agreement; 0.21 to 0.40, fair agreement; and 0.20 or less, slight agreement. 19 Statistics were performed using Microsoft Excel (Version 2016; Microsoft, Redmond, Washington).
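The agreement bands above amount to a simple lookup. The sketch below is one possible encoding, not the authors' spreadsheet logic. The function name is hypothetical, and two assumptions are made because the stated bands leave them open: each upper bound is treated as an inclusive cut point (so a value such as 0.805 falls into the higher band), and a value of exactly 1.00 is grouped with "almost perfect".

```python
def agreement_category(value: float) -> str:
    """Map an ICC or kappa value onto the a priori agreement bands.

    Assumptions (not stated in the source): upper bounds are inclusive
    cut points, and 1.00 is grouped with 'almost perfect'.
    """
    if value > 1.0:
        raise ValueError("ICC/kappa cannot exceed 1.0")
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# Screening agreement values reported in this review.
print(agreement_category(0.759))  # title and abstract screening
print(agreement_category(0.838))  # full-text screening
```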
Quality assessment
The methodological quality of non-randomized studies was evaluated using the methodological index for nonrandomized studies (MINORS). 14 A score of 0, 1, or 2 is given for each of the 12 items on the MINORS checklist with a maximum score of 16 for non-comparative studies and 24 for comparative studies. Methodological quality was categorized a priori as follows: a score of 0–8 or 0–12 was considered poor quality, 9–12 or 13–18 was considered fair quality, and 13–16 or 19–24 was considered excellent quality, for non-comparative and comparative studies, respectively.
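The MINORS cut-offs can be expressed as a small function. This is a minimal sketch with a hypothetical function name. It assumes that non-integer values such as mean scores are assigned by the same inclusive upper cut points, which the text does not specify.

```python
def minors_quality(score: float, comparative: bool) -> str:
    """Categorize a MINORS score using the a priori cut-offs in the text.

    Non-comparative studies: 0-8 poor, 9-12 fair, 13-16 excellent.
    Comparative studies:     0-12 poor, 13-18 fair, 19-24 excellent.
    Assumption: values between bands (e.g. mean scores) fall into the
    higher band once they exceed the lower band's upper bound.
    """
    max_score = 24 if comparative else 16
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    poor_max, fair_max = (12, 18) if comparative else (8, 12)
    if score <= poor_max:
        return "poor"
    if score <= fair_max:
        return "fair"
    return "excellent"


# Mean scores reported in this review.
print(minors_quality(20.1, comparative=True))
print(minors_quality(9.9, comparative=False))
```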
Results
Study characteristics
The initial search on the topic yielded a total of 4250 articles. After removing 861 duplicates, a systematic screening process yielded 45 articles that met inclusion criteria (Figure 1). One study was found upon reviewing references of included studies. Of the included studies, there were 19 retrospective cohort (42%), 18 prospective cohort (40%), and seven other studies (16%). One of the included studies was a conference abstract (2%) (Table 1).

Figure 1. PRISMA flow diagram.
Table 1. Study characteristics and methodological quality.
Study quality
There was substantial agreement between the reviewers for title and abstract screening (κ = 0.759; 95% CI 0.461–0.877), and almost perfect agreement for full-text screening (κ = 0.838; 95% CI 0.662–1.000). The largest share of the studies (44%; n = 19) was level II evidence, whereas 18 studies (42%) were level III evidence, and six studies (14%) were level IV evidence. The mean MINORS scores for comparative and non-comparative studies were 20.1 and 9.9, indicating excellent and fair quality of evidence, respectively. Furthermore, there was excellent agreement between the raters in their classification of these studies (ICC = 0.99; 95% CI 0.99–0.99) (Table 1).
Distribution of modalities used and reference tests
Among the 45 studies, the distribution of modalities used as an index test was MRA (23%), MRI (23%), 3D-CT (13%), CT (11%), computed arthrotomography (CTA) (9%), ultrasound (US) (9%), radiography (5%), and 3D magnetic resonance imaging (3D-MRI) (2%). Twenty-seven out of the 42 studies (64%) reported using a reference test. The distribution of reference tests was arthroscopy (n = 23; 63.8%), surgical techniques (n = 7; 19.4%), radiographs (n = 3; 8.3%), MRI (n = 2; 5.5%), and arthro-CT (n = 1; 2.8%) (Table 1).
Patient characteristics
A total of 3413 patients and 3431 shoulders were included in this review.8–10,12,13,20–59 Of the included patients, 74% (1974 out of 2672) were male; nine studies (21%) did not report on gender distribution. The mean age was 28.8 ± 6.3 years, calculated from 37 studies (82%); eight studies (18%) did not report on age.13,21,24,25,45,49,50,55
Measurement techniques
Computed tomography (CT)
Reported measurement techniques varied amongst the studies as well as modalities.
For CT, humeral residual articular arc and percentage of articular arc loss, HSL width and depth, percentage of anterior glenoid defect, bare area, on-track and off-track, Franceschi grading, Calandra classification, Richards grading, Hall grading, Rowe grading, Flatow percentage, and linear-based and area-based methods were used.10,31,39,52 The reported sensitivity when confirmed by CT was between 20% and 65%, and the specificity was between 41.7% and 87%.25,39 Accuracy, positive predictive value (PPV), and negative predictive value (NPV) were not reported amongst the CT studies. Intrarater agreement was reported in only one study, where there was 33% agreement for on-track measurements. 56 Interrater agreement ranged between 41% and 76% (Table 2).10,25,40,52,56 Radiographs were the only reference test reported. 25
For 3D CT, circle area, height, and length of the humeral head, HSL length and depth, anatomical neck width, on-track and off-track, and clock-face methods were reported.8,9,21,22,28,38,57 Only one study included data on sensitivity, specificity, PPV, and NPV, using the Calandra method. 22 The values reported were 76.3%, 100%, 100%, and 46.2%, respectively. 22 One study reported ICC values ranging from 0.92 to 0.99 for intrarater agreement. 28 This study also reported ICC values for interrater agreement, ranging from 0.77 to 0.99 (Table 2). Arthroscopy was the only reference test reported.9,21,22,28
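For readers less familiar with the diagnostic test properties reported throughout the Results, the sketch below shows how sensitivity, specificity, PPV, NPV, and accuracy follow from a 2×2 table of index test results against a reference test. The class and function names are hypothetical, and the counts are invented purely for illustration; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # lesion present on reference test, detected by index test
    fp: int  # lesion absent on reference test, reported by index test
    fn: int  # lesion present on reference test, missed by index test
    tn: int  # lesion absent on reference test, not reported by index test


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(ConfusionCounts(tp=40, fp=5, fn=10, tn=45))
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because PPV and NPV depend on how common the lesion is in the sample, values from studies with different HSL prevalence are not directly comparable.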
Table 2. Detecting the presence of Hill-Sachs lesions with computed tomography (CT).
Computed arthrotomography (CTA)
For CTA, HSL depth, P/R (notch defect/radius) index calculation, and normal base area measurement were reported.13,27,42 When confirmed by CTA, the reported sensitivity ranged between 20% and 93%, whereas the specificity ranged between 90% and 95%. Accuracy, PPV, and NPV were reported in only one study, where they were 90%, 67%, and 98%, respectively. 30 Intrarater and interrater agreement were reported in only one study, where they were κ = 0.71 and κ = 0.30, respectively (Table 3). 13 The reference tests used were arthroscopy and AP radiographs.13,27,30
Table 3. Detecting the presence of Hill-Sachs lesions with computed arthrotomography (CTA).
Magnetic resonance imaging (MRI)
For MRI, on-track and off-track, linear-based and area-based methods, as well as a modified Cetik method were reported.12,20,40,51,56,58 When confirmed by MRI, the sensitivity values ranged between 16.7% and 96.3%, whereas the specificity values ranged from 67% to 100%.12,29,37,48,58 Accuracy of detecting the presence of HSLs with MRI ranged from 67% to 88%.37,48,58 PPV ranged from 14% to 65%, whereas NPV ranged from 85% to 91%.12,58 Intrarater agreement ranged from 41% to 86%. 58 Interrater agreement ranged from ICC = 0.33 to 1.00 (Table 4). 43 The reference tests used were arthroscopy, radiographs, and surgical techniques.23,29,33,37,43,48,51,58
Table 4. Detecting the presence of Hill-Sachs lesions with magnetic resonance imaging (MRI).
Magnetic resonance arthrography
For MRA, a method described by O’Brien et al. was reported, in which measurements of the humeral circumference and the depth of the HSL were used to analyze the lesion as the most accurate reflection of Hill-Sachs volume. 45 Sensitivity values for detecting the presence of HSLs with MRA ranged from 69% to 100%.24,26,30,32,34,41,46,47,49 Specificity values ranged from 0% to 100%.24,26,30,32,34,41,46,47,49 Accuracy varied from 81% to 100%, PPV from 45% to 100%, and NPV from 88% to 100%.24,30,32,34,41,46,47 Intrarater and interrater agreement were reported in only one study, at ICC = 1.00 and ICC = 0.97, respectively (Table 5). 45 The reference tests used were arthroscopy and surgical techniques.23,24,26,30,32,34,35,41,46,47,49
Table 5. Detecting the presence of Hill-Sachs lesions with magnetic resonance arthrography (MRA).
Ultrasound
For US, only one method was reported, a calculation of the Hill-Sachs volume using V = (4/3) × π × (a/2) × (b/2) × c, where a, b, and c represent the width, length, and depth of the lesion, respectively. 50
Only one study reported values for sensitivity (95.6%), specificity (92.8%), and accuracy (95.0%). 36 No values were reported for either intrarater or interrater agreement (Table 6). The reference tests used were arthroscopy, arthro-CT, and surgical techniques.23,36,42,47,50
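The ultrasound volume calculation can be written as a short function. This sketch reads the formula as V = (4/3)·π·(a/2)·(b/2)·c, which is how the expression is printed; whether the original authors intended the full depth c or half of it as the third semi-axis cannot be confirmed from this review. The function name and the example measurements are hypothetical.

```python
import math


def hill_sachs_volume(width: float, length: float, depth: float) -> float:
    """Ellipsoid-style volume estimate, V = (4/3) * pi * (a/2) * (b/2) * c.

    a = width, b = length, c = depth, all in the same unit (e.g. mm),
    giving a volume in that unit cubed (e.g. mm^3).
    """
    if min(width, length, depth) < 0:
        raise ValueError("measurements must be non-negative")
    return (4.0 / 3.0) * math.pi * (width / 2.0) * (length / 2.0) * depth


# Illustrative measurements in millimetres (not from any included study).
print(round(hill_sachs_volume(10.0, 20.0, 5.0), 1))
```

Algebraically the expression simplifies to π·a·b·c/3, which is a convenient check when computing the value by hand.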
Table 6. Detecting the presence of Hill-Sachs lesions with ultrasound (US).
Clinically relevant bone loss
This scoping review identified four studies (10%) that reported glenoid bone loss percentages, ranging from 8.9% to 23.5%.9,10,56,58 Threshold values varied among the modalities used in the studies. Hardy et al. assessed a threshold value that could serve as a precise risk factor for failure after an arthroscopic stabilization procedure. 27 The ratio between the depth of the Hill-Sachs lesion (D) and the humeral head radius (R) on conventional radiographs was analyzed; when the D/R ratio was more than 15%, the failure rate was 56%, compared with only 16% when the D/R ratio was less than 15%. Stefaniak et al. determined that for CT measurements, good or moderate ICC values were observed and “reasonable” or above threshold values of 30% of minimal detectable change (MDC 95%). 8 Beason et al. chose arbitrary threshold value ranges based on those previously reported in the literature for the glenoid (25%) and humeral head (<20%, 20–40%, >40%). 55 In agreement with previous studies, Ozaki et al. state that many reports suggest a large HSL to be one of the most important risk factors for postoperative recurrence after arthroscopic Bankart repair. 22 Critical sizes of these lesions have been reported as depths of more than 16% of the humeral head diameter, area more than 25% of the articular surface of the humeral head, and volume greater than 250 mm³.31,60–62 Shijith et al. determined that CT is an effective modality for assessing the amount of bone loss on the glenoid side or head of the humerus, with glenoid width bone loss of more than 9.8% or a Hill-Sachs defect of more than 14.8 mm being the critical defects after which the frequency of dislocations increases (Table 7). 39
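The D/R ratio described by Hardy et al. is a simple quotient compared against a 15% cut-off. The sketch below uses hypothetical function names and illustrative measurements. How a ratio of exactly 15% should be handled is not stated in the text, so treating it as not exceeding the threshold is an assumption.

```python
def depth_radius_ratio(depth_mm: float, radius_mm: float) -> float:
    """Ratio of HSL depth (D) to humeral head radius (R)."""
    if radius_mm <= 0:
        raise ValueError("humeral head radius must be positive")
    return depth_mm / radius_mm


def exceeds_dr_threshold(depth_mm: float, radius_mm: float,
                         threshold: float = 0.15) -> bool:
    """True when D/R is above the threshold (exactly 15% counts as below;
    that boundary handling is an assumption)."""
    return depth_radius_ratio(depth_mm, radius_mm) > threshold


# Illustrative values in millimetres (not from any included study).
print(exceeds_dr_threshold(5.0, 25.0))  # D/R = 20%
print(exceeds_dr_threshold(3.0, 25.0))  # D/R = 12%
```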
Table 7. Detecting the presence of Hill-Sachs lesions compared with various modalities.
Discussion
In the current review, there is significant variability in imaging modality and measurement techniques, with MRI and depth being the most prevalent, respectively. The current literature on the assessment of HSL demonstrates a wide range of measurement techniques and imaging modalities, with support for MRI and MRA. However, results should be interpreted with caution due to the small number of included studies for each modality, the variability in study designs, and the lack of high-quality and comparative studies. In addition, the variety of measurement techniques corroborates the lack of standardization and agreement regarding the best modality and measurement method, suggesting that at this point there is no clear superiority of one imaging modality or measurement technique over another.
MRA showed the highest sensitivity and specificity values amongst the different imaging modalities, but with values that range considerably from 69% to 100%.24,26,32,34,46,49 Accuracy (81–100%) and intra-rater and inter-rater agreement were highest for MRA compared to all other modalities. MRA is reliable in diagnosing various shoulder pathologies such as intra-articular cartilage and ligament injuries, labral tears, and rotator cuff disease, amongst others. 63 MRA can also be effective in measuring HSLs in adolescent patients and can help address bony complications of HSLs to accurately assess the lesion. 59 Another consideration is the viewing angle of the MRI. For example, the abduction-external rotation view and the apprehension test position are both recommended as useful techniques for detection of anterior shoulder instability, with the latter possibly being more beneficial in HSL examination when using indirect arthrography. 35 Despite its numerous benefits, MRA imaging has drawbacks, most notably its cost. In addition, metal in the vicinity of the lesion can interfere with the true signal. 30 Furthermore, the reproducibility and accuracy of MRA assessments are moderate. 32
Many studies have shown success with measuring HSLs with other modalities, most notably standardized CT protocols. In fact, most clinicians use CT imaging as part of their preoperative assessment process when dealing with patients with shoulder instability. However, given the variability and inconclusive results among CT studies in this review, it is difficult to draw conclusions about its reliability despite its use by clinicians. Another consideration is the potential radiation exposure patients may experience during CT imaging.58,59,64 Fortunately, newer low-dose CT scan protocols have reduced radiation exposure, thereby decreasing this concern.65,66 There is a need for analyses that directly compare imaging modalities, not only regarding accurate measurement of HSLs, but also their safety profile and potential exposure risks to patients.
In HSL measurement with 3D-CT, analysis of a two-dimensional image of a three-dimensional object leaves many discrepancies. This often leads to misinterpretation by raters due to image imperfections or measurement errors. 8 However, 3D imaging adds the benefit of modeling, which can show the nature of the defect alongside its location for considerations of operative repair. 21 MR imaging more accurately quantifies the Hill-Sachs interval, as the rotator cuff insertion is more clearly visible than with CT scans, and allows for evaluation of soft-tissue injuries accompanying primary anterior shoulder instability. 58
Unfortunately, accurate measurement of HSL volume is often difficult. The wide variety of measurement methods reflects the lack of agreement in this area, and imaging findings do not always reflect what is observed arthroscopically.59,67 Recently, arthroscopic evaluation has been questioned due to poor accuracy and reliability when compared to CT scans, with the potential of overestimating bone defects in patients with glenoid bone loss. 68 Thus, even though arthroscopy was the most commonly used reference or gold standard test in this scoping review (around 85% of the studies), there are concerns about its precision.
Preoperative planning in glenohumeral instability plays a pivotal role in determining appropriate treatment plans for patients; therefore, analyzing imaging modalities and measurements of glenoid and humeral bone loss is essential for the treatment decision-making process. Among the 45 studies, CT-based tests and magnetic resonance-based tests were the most prevalent imaging modalities used as index tests. Currently there is no consensus on an accepted threshold value for HSL that will lead to a certain surgical treatment, as its importance lies more in the bipolar defect concept. This creates numerous complications for quantifying bone loss, predicting engagement prior to surgery, and deciding the best treatment for patients with anterior glenohumeral instability. 16 In contrast, most of the attention has been given to quantifying glenoid bone defects. The threshold values of glenoid bone loss above which arthroscopic Bankart repairs may fail have been widely accepted as ≥25% glenoid width loss, equivalent to ≥19% of the glenoid length and ≥20% of the surface area created by a best-fit circle on the inferior surface of the glenoid.16,69 However, a better understanding of shoulder instability as a bipolar problem, reflected in the glenoid track concept and its potential treatment implications, warrants a more precise quantification of HSLs to offer patients the best treatment alternative.
An analysis of the quantification methods for HSLs identified in the included studies shows that measurement of the depth of the lesion is most prevalent. HSL depth measurements have been reported alongside other quantification methods such as length and width measurements. Given the limited quantitative data and variability in modalities used among all the different techniques, difficulties arise in identifying a “gold standard” for quantifying HSLs. To address the discrepancies between preoperative and intraoperative measurements of HSLs, a precise method for quantification of HSL needs to be established amongst clinicians and radiologists. Although our understanding of glenohumeral pathologies has grown exponentially, there remains a lack of consistency and agreement in the evaluation of this injury. For current surgeons, it is equally important that each technique's benefits and drawbacks are extensively studied and considered for each unique patient presentation to achieve the most accurate diagnosis of the HSL to guide intervention planning. In addition, there is a need to establish the role of imaging modalities in optimizing the decision-making process while reducing the economic burden on the healthcare system when using these resources.
Limitations
This review has limitations. Firstly, a meta-analysis was not performed as there was high statistical and methodological heterogeneity among the studies; thus, results are summarized descriptively. Furthermore, although multiple imaging modalities and measurement techniques were investigated, the evidence available for each was limited in both quality and quantity. Thus, our ability to comprehensively comment on a “gold standard” and provide meaningful recommendations is limited.
High-quality comparative studies with large sample sizes should be conducted in the future to determine an optimal imaging modality and to identify the most effective measurement technique. To that end, future studies should standardize assessments of accuracy and reliability for imaging modalities and measurement techniques in quantifying the HSL. Future studies should also assess how treatment decisions can change based on the use of MRI with or without MRA. Lastly, an economic/cost-benefit analysis of imaging modalities should be conducted to help guide clinicians and radiologists on the best modality to measure HSLs.
Conclusion
MRA and MRI are reliable imaging modalities with good diagnostic test properties for assessment of HSLs. There is a wide variety of measurement techniques and imaging modalities for HSL assessment; however, comparative studies are lacking. Thus, it is not possible to comment on the superiority of one technique over another. Future studies should directly compare the accuracy and reliability of imaging modalities and measurement techniques while also conducting cost-benefit analyses.
Supplemental Material
Supplemental material, sj-docx-1-sel-10.1177_17585732221123313 for Variability in quantifying the Hill-Sachs lesion: A scoping review by Shahrukh Khan, Ajaykumar Shanmugaraj, Haseeb Faisal, Carlos Prada, Sohaib Munir, Timothy Leroux and Moin Khan in Shoulder & Elbow
Footnotes
Contributorship
All authors contributed substantially to the conception and design, or acquisition of data, or analysis and interpretation of data; drafted the article or revised it critically for important intellectual content; provided the final approval of the version to be published; and agreed to act as guarantor of the work (ensuring that questions related to any part of the work are appropriately investigated and resolved).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.