
Abstract

Background:

Relative humeral retrotorsion (rHRT) is an osseous adaptation in overhead athletes acquired through repetitive overhead throwing. Accurate measurement of anatomic humeral retrotorsion (aHRT) is important as it aids in the determination of rHRT, which influences glenohumeral range of motion profiles. While computed tomography scans are the gold standard for assessing aHRT, their limited clinical utility has driven interest in accessible alternatives, such as diagnostic ultrasound.

Purpose/Hypothesis:

The purpose of this study was to validate a handheld ultrasound device (HH-US) as a clinically accessible tool to measure aHRT in baseball and softball athletes. It was hypothesized that a HH-US device will be reliable and valid when quantifying aHRT compared with an established benchmark, diagnostic musculoskeletal ultrasound (MSK-US).

Study Design:

Cohort study (Diagnosis); Level of evidence, 3.

Methods:

Data were collected from collegiate baseball and softball athletes at 2 local universities. Participants were uninjured at the time of testing and over 17 years old. Anatomic HRT was measured bilaterally using both MSK-US (GE Venue Go) and handheld ultrasound (GE Vscan Air) using previously established methods. The intraclass correlation coefficient, standard error of measure, and minimal detectable change, as well as Bland-Altman plots, were used to assess reliability and agreement between devices, respectively.

Results:

A total of 93 athletes were included in this study. HH-US had excellent intrarater reliability (ICC₂,₁ = 0.98; 95% CI, 0.94-0.97; SEM₉₀ = 1.77°; and MDC₉₀ = 4.12°). There was acceptable agreement between the HH-US and MSK-US. The mean difference between devices was −0.63° and 0.48° for the throwing and nonthrowing arms, respectively. Analysis of Bland-Altman plots demonstrated no significant bias across the range of measurements. HH-US measurements were completed in <2 minutes.

Conclusion:

Keywords

humeral retrotorsion MSK ultrasound baseball softball overhead athletes

Humeral torsion, described as the twist of the humerus, is defined as the angular difference between the orientation of the axis of the proximal humeral head and the distal epicondylar axis.^12,28 Anatomic humeral retrotorsion (aHRT) refers to a posteromedial orientation of the humeral head axis with respect to the distal humeral epicondylar axis.¹⁴ At birth, the humeri are positioned in bilateral aHRT.¹⁴ During growth and development, humeral anteversion (increasing glenohumeral internal rotation [GIR] and decreasing glenohumeral external rotation [GER] range of motion [ROM]) occurs, resulting in a humeral head axis that is anteromedial with respect to the distal humerus.¹⁴ However, on the throwing arm, this natural process becomes delayed when throwing begins at a young age, while the nonthrowing arm continues to progress through humeral anteversion. Although this derotation is a natural process, its progression can be affected by a multitude of variables, including sex, country of origin, handedness, and genetics, resulting in larger degrees of aHRT on the throwing arm.^6,15

Bilateral measurement of aHRT is particularly important to account for when managing overhead athletes. Relative HRT (rHRT), defined as the difference between throwing arm aHRT and nonthrowing arm aHRT, becomes most apparent during skeletal maturation, primarily in the presence of large volumes of overhead activity and sport participation.^25,29,50 More specifically, there is more aHRT preserved on the throwing arm when compared with the nonthrowing arm, and this anatomic bony adaptation has been shown to influence shoulder ROM.^16,37,41 Previous studies on normal, throwing-related adaptations in overhead athletes have demonstrated anatomic changes in the shoulder, including osseous, capsular, and soft tissue adaptations from the resultant kinetic loads on the throwing arm.¹ It has been theorized that the repetitive activity of maximum GER and abduction, a movement done during the late cocking phase of throwing, may slow the natural derotation process leading to the difference of rHRT seen in throwers.^43,44,52 Additionally, the opposing forces between the glenohumeral internal and external rotators during throwing impart rotational stresses above and below the proximal humeral physis.¹⁹ Therefore, it is a clinical necessity to account for osseous adaptation in the form of rHRT when assessing glenohumeral rotational ROM. Several authors have reported that when rotational motion is considered in the context of rHRT, GER deficits are significantly more common than GIR deficits in both injured and uninjured baseball cohorts.^16,37,41 While this has not been studied in softball athletes in isolation, it is not unreasonable to suspect similar outcomes in softball athletes, given the comparable anatomic adaptations established.

Previously established methods have been used to quantify aHRT. Computed tomography (CT) scans are considered the gold standard and have the best reported psychometric properties related to measurement error.^3,46 While CT scans are reliable and valid, quantifying aHRT using this methodology requires equipment that is not readily available to most clinicians and incurs the burdens of time, cost, and radiation exposure.^3,23,46 To accommodate the lack of clinical availability of CT scans, alternative methods have been introduced in the literature, including palpation of the humeral tubercles and clinical prediction models.^6,11,45,56 However, methodological flaws have been demonstrated in the utility of these methods, including poor validity and agreement and large degrees of error. The clinical and logistic limitations of the aforementioned methods have given rise to the popularity of diagnostic musculoskeletal ultrasound (MSK-US) as a reliable and valid measurement practice.^25,36,39,52 This method has been tested and proved to be valid when compared with CT.³⁶ Additionally, aHRT measured via MSK-US has been studied in overhead throwing athletes and has consistently demonstrated both excellent intra- and interrater reliability.^16,17,21,25 However, MSK-US devices are costly, adding to the limitations of using a reliable and validated technique to assess for aHRT in overhead athletic populations.

Many point-of-care ultrasound devices are now handheld, which allows for ease of transport and reduced cost compared with MSK-US devices.³⁰ However, concerns regarding the image quality and validity of handheld devices exist when compared with MSK-US.^8,30 Handheld ultrasound (HH-US) devices have been used in a clinical setting and boast advantages over MSK-US, including the use of a single probe device for multiple functions, compact size, ease of transport, relatively low cost, and good image quality in most functions.⁸ HH-US devices have been previously used to visualize a variety of superficial and deep anatomic structures.^8,20,24 A HH-US device could fill a gap in the clinical measurement of aHRT by offering availability at a lower cost than traditional diagnostic MSK-US. The purpose of this study was to validate a HH-US device as a clinically accessible tool to measure aHRT in baseball and softball athletes. We hypothesized that a HH-US device would be reliable and valid when quantifying aHRT compared with an established benchmark, MSK-US.

Methods

Study Design

Data for this study were prospectively collected in a collegiate baseball and softball cohort. All data were directly input into 1 Excel for Mac document (Microsoft, version 16.78.3). All participants read and signed an informed consent form approved by the University of Texas Health Science Center at Houston (HSC-MH-21-1041) before enrollment in the study.

Participants

Data were collected from 2 local universities within the baseball/softball training facilities. Inclusion criteria for this study were as follows: (1) participation in National Collegiate Athletic Association Division I baseball or softball and (2) uninjured at the time of data collection. Participants were excluded if they were <18 years old. A combination of pitchers and position players was enrolled, all with varying injury histories, to ensure sufficient variability existed.³⁶

Procedures

For clarification purposes, the authors refer to the use of the GE Venue Go diagnostic ultrasound as “musculoskeletal ultrasound” or MSK-US (performed first) and the use of GE Vscan Air as “handheld ultrasound” or HH-US (performed second).

Anatomic Humeral Retrotorsion: Musculoskeletal Ultrasound

Anatomic humeral retrotorsion was assessed bilaterally using an indirect ultrasonographic technique that has been described and validated by previous researchers.^36,39 This indirect technique measures aHRT by calculating the forearm inclination angle relative to a standardized humeral position.⁵³ Examiner 1 (S.M.K.), a physical therapist with 8 years of experience using MSK-US, performed all aHRT measurements. The examiner asked each player to lie supine on a standard treatment table in a hook-lying position to maintain a neutral lumbar spine. Ultrasound gel (Cardinal Health) was placed on a linear ML6-15 probe connected to the Venue Go R3 (GE HealthCare) ultrasound machine. Examiner 1 placed the participant's shoulder in 90° of abduction with the elbow in 90° of flexion and neutral rotation and positioned the probe over the anterior aspect of the participant's glenohumeral joint. A rolled towel was placed under the participant's humerus to maintain a neutral position of the glenohumeral joint in the coronal plane. The probe was aligned perpendicular to the long axis of the humerus in the frontal plane. With the probe level (as designated by a bubble level on the face of the probe), examiner 1 rotated the humerus until the deepest part of the bicipital groove was visualized and the apexes of the greater and lesser tubercles were parallel to the horizontal plane (Figure 1). Examiner 2 (F.A.M.), a residency-trained physical therapist with 2 years of experience and previously established reliability for inclinometry, then placed the digital inclinometer just below the shaft of the ulna to record the degrees of inclination. The digital inclinometer was zeroed to the vertical plane before the measurement. This process was repeated twice. The mean of the 2 values was used for final analysis; a third measure was performed if there was a wide discrepancy (>3°) between the first 2 measures. Intraclass correlation coefficient (ICC_2,1) with 95% confidence intervals and standard error of measure (SEM) values were calculated for aHRT in pilot testing and found to be excellent (ICC_2,1 = 0.91; 95% CI, 0.72-0.97; SEM = 2.08°).

Figure 1.

Measurement of anatomic humeral retrotorsion using a linear ultrasound probe placed over the anterior shoulder with musculoskeletal ultrasound.

Anatomic Humeral Retrotorsion: Handheld Ultrasound

Each player was assessed in the same position as previously described using the Vscan Air (GE HealthCare), using the same procedures described above (Figure 2). The linear ML3-12 probe was used for aHRT measurements. Ultrasound gel (Cardinal Health) was placed on a linear probe, and the Vscan Air iPad application was connected via Bluetooth.

Figure 2.

(A) Measurement of humeral retrotorsion using a linear ultrasound probe with the GE Vscan Air handheld ultrasound. (B) User interface via tablet and wireless GE Vscan Air probe fixed with a bubble level.

Figure 3 demonstrates the picture quality generated from both the MSK-US and HH-US for comparison of quality.

Figure 3.

Visualization of the deepest portion of the bicipital groove while the lesser and greater tubercle remain parallel using the (A) musculoskeletal ultrasound and (B) GE Vscan Air handheld ultrasound. The red lines demonstrate the humeral tubercles parallel to the horizontal plane; the orange lines represent the deepest portion of the bicipital groove.

Criterion Validity Procedures

Anatomic HRT was measured on both the throwing and nonthrowing arms twice with MSK-US during the fall 2024 baseball and softball seasons. A third measurement was done if there was a wide discrepancy (>3°) between the first 2 measures. The mean of the 2 values was used for the final analysis. The second session of aHRT measurements was conducted 2 months later using HH-US on throwing and nonthrowing arms. Two trials were documented, and the mean of the measures was used for final analysis. To demonstrate data collection efficiency using the HH-US unit, a standard stopwatch was used to mark the beginning and end of bilateral aHRT measurements for every athlete, so the total time of the measurement technique could be determined.

Reliability Procedures

To determine the intrarater reliability of the Vscan Air HH-US unit, 2 aHRT trials were documented on the throwing arm, 5 minutes apart, on each participant. As aHRT is considered a stable measure in an adult population, 5 minutes were used to decrease the burden of time on participants and study staff. The mean of the 2 values at each time point was used for final analysis.
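A test-retest design of this kind produces one row per participant and one column per time point, from which ICC_2,1 can be estimated. The following is a minimal sketch, assuming the standard two-way random-effects, absolute-agreement, single-measure form of the coefficient; the function name `icc_2_1` and the example values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant, each holding k repeated
    measurements (for a test-retest design, k = 2 time points).
    """
    n = len(scores)          # participants
    k = len(scores[0])       # repeated measurements per participant
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (no replication within a cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical aHRT values in degrees: [time point 1, time point 2].
    trials = [[12.0, 13.0], [20.5, 19.5], [8.0, 9.0], [15.0, 15.5], [25.0, 24.0]]
    print(round(icc_2_1(trials), 3))
```

Confidence intervals for the coefficient are not computed here; a statistics package would normally be used for that step.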

Statistical Analysis

A priori power analysis was performed to determine the appropriate sample size required to demonstrate the reliability of the Vscan Air based on an estimation of precision.^4,35 We used an expected reliability (ICC) value of 0.90 with a precision of ±0.05 based on our previously established reliability with an MSK-US device. With a 95% confidence level and a total of 2 repetitions per participant, it was determined that a sample size of at least 57 athletes was necessary to confidently report the reliability of the Vscan Air method for measuring aHRT via HH-US.

Data were reported as means ± standard deviations for continuous data and total count (with percentages) for categorical data. The duration of time for HH-US data collection was reported in seconds. Data on the throwing arm and nonthrowing arm for both MSK-US and HH-US were assessed for normality using the Kolmogorov-Smirnov test. All data were normally distributed (P > .20).

To determine the intrarater reliability of the HH-US unit, the intraclass correlation coefficient (ICC_2,1) was utilized. The ICC values ranged from 0 to 1; values <0.5 indicated poor reliability, 0.5 to 0.75 moderate reliability, 0.75 to 0.9 good reliability, and >0.9 excellent reliability.²⁷ Confidence intervals were calculated at the 95% level for the reliability coefficients. The error associated with a onetime aHRT measure using the HH-US unit was calculated by using the SEM: $\mathrm{SEM} = SD \times \sqrt{1 - \mathrm{ICC}}$. The SEM was then multiplied by the z score (1.64) for a 90% CI for the true score about the observed score (SEM₉₀). To determine the smallest amount of change that can be detected within the HH-US measurement, the minimal detectable change (MDC) was calculated: $\mathrm{MDC} = \mathrm{SEM} \times \sqrt{2}$. The MDC was then multiplied by the z score (1.64) for a 90% CI, indicating a true change beyond measurement error, with 90% confidence (MDC₉₀).
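The SEM and MDC calculations described above can be sketched in a few lines. This is an illustrative sketch only: the function names, the handling of values that fall exactly on a category boundary, and the example standard deviation of 10° are assumptions and do not come from the study.

```python
import math

Z_90 = 1.64  # z score used in the text for a 90% confidence level


def sem(sd, icc):
    """Standard error of measure: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc(sem_value):
    """Minimal detectable change: SEM x sqrt(2)."""
    return sem_value * math.sqrt(2.0)


def classify_icc(icc):
    """Reliability category from the thresholds quoted in the text.

    Assumption: a value exactly on a boundary (0.75 or 0.9) is assigned
    to the "good" band, since the text does not say which band owns it.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


if __name__ == "__main__":
    sd_example, icc_example = 10.0, 0.91  # hypothetical inputs
    s = sem(sd_example, icc_example)
    print("SEM   =", round(s, 2))
    print("SEM90 =", round(Z_90 * s, 2))
    print("MDC   =", round(mdc(s), 2))
    print("MDC90 =", round(Z_90 * mdc(s), 2))
    print(classify_icc(icc_example))
```

The 90% values are obtained by multiplying the SEM and the MDC by the same z score, mirroring the two-step description in the text.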

When establishing the criterion validity of the HH-US unit, 2 variables were investigated: aHRT in degrees on both the throwing and nonthrowing arms. To determine the agreement between the MSK-US and HH-US for measuring aHRT, Bland-Altman plots were generated. A 1-sample t test was used before constructing the plots to help assess if the mean difference between the 2 measurement methods was significantly different from zero.⁷ The differences between the MSK-US and the HH-US aHRT measures (MSK-US – HH-US) were then plotted against the mean of the MSK-US and the HH-US aHRT measures ([MSK-US + HH-US] / 2). The variability of the differences was examined using limits of agreement (LoAs), computed as the mean of the difference ± 1.96 × the standard deviation of the difference. LoAs show the range within which 95% of the differences between measurements are expected to fall. The areas of confidence around the mean difference and the LoAs were also calculated to estimate the size of the possible sampling error.¹⁸ The calculations used to estimate the sampling error are described in detail by Giavarina.¹⁸ The upper and lower maximum allowed differences, defined as an acceptable level of difference between the 2 measurement methods, were computed for both arms by calculating ±2 SD of the mean difference.¹⁸ The upper maximum allowed difference should be larger than the upper LoA, and the lower maximum allowed difference should be lower than the lower LoA. It is recommended that 95% of the data points lie within ±2 SD of the mean difference.¹⁸ Lastly, a linear regression was used to check for proportional bias within the Bland-Altman plot. Ludbrook³² stated that if the slope of the regression line fitted to the Bland-Altman plot is not significantly different from zero, then the proportional bias is absent.
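The agreement statistics described above (bias, LoAs, their confidence limits, and the maximum allowed differences) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the standard error formulas are the ones listed in Table 2, and the critical t value is supplied by the caller (for example, from a t table) rather than computed.

```python
import math
from statistics import mean, stdev


def summarize_agreement(bias, sd, n, t_crit):
    """Bland-Altman summary from the mean and SD of the paired differences.

    t_crit is the critical t value for n - 1 degrees of freedom,
    looked up by the caller (an assumption of this sketch).
    """
    se_bias = math.sqrt(sd ** 2 / n)      # standard error of the mean difference
    se_loa = math.sqrt(3 * sd ** 2 / n)   # standard error of each limit of agreement
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    return {
        "bias": bias,
        "t_stat": bias / se_bias,         # 1-sample t test against zero
        "loa": (lower, upper),
        "bias_ci": (bias - t_crit * se_bias, bias + t_crit * se_bias),
        "lower_loa_ci": (lower - t_crit * se_loa, lower + t_crit * se_loa),
        "upper_loa_ci": (upper - t_crit * se_loa, upper + t_crit * se_loa),
        "max_allowed": (bias - 2 * sd, bias + 2 * sd),
    }


def bland_altman(method_a, method_b, t_crit):
    """Same summary, starting from paired raw measurements (a - b)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    return summarize_agreement(mean(diffs), stdev(diffs), len(diffs), t_crit)


if __name__ == "__main__":
    # Throwing-arm summary values as reported in Table 2.
    result = summarize_agreement(bias=-0.634, sd=3.94, n=93, t_crit=1.986)
    for key, value in result.items():
        print(key, value)
```

Run on the throwing-arm summary values from Table 2, this sketch gives limits of agreement of roughly −8.36° and 7.09° and a t statistic of about −1.55. The proportional bias check (regressing the differences on the means) is not included.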

Results

The initial cohort enrolled was composed of 105 baseball and softball players. However, there were instances of missing data. For the validity aim of this study, 12 athletes initially enrolled were dismissed following the fall season and did not complete the second testing session; additionally, 1 participant's nonthrowing arm was not documented. For the reliability aim of the study, 18 athletes were not present for the second testing time point. The final sample included in our criterion validity study was 93 throwing arms and 92 nonthrowing arms, and the final sample included in our reliability study was 87 throwing arms. Table 1 describes the demographics of the baseball and softball validity cohorts. The average length of time to evaluate for bilateral aHRT with the HH-US device in this cohort was 102 ± 32 seconds.

Table 1

Demographic Data for Both the Baseball and Softball Cohorts^a

Characteristic	Baseball	Softball
Number	73	20
Age, y	20 ± 12.4	19 ± 1.3
Height, in.	73 ± 2.8	66 ± 2.9
Weight, kg	89 ± 9.5	71 ± 14.9
Throwing arm, right, No. (%)	59 (80.8)	17 (85.0)
Year in sport, No. (Freshman, Sophomore, Junior, Senior, Redshirt, Graduate student)	14, 14, 18, 8, 13, 4	8, 8, 2, 1, 1
Years participating in sport	15 ± 2.3	14 ± 1.9
Position, No.
Pitcher	38	7
Non-pitcher	35	13

^a Values are presented as mean ± standard deviation unless otherwise indicated.

Reliability Results

Intraclass correlation coefficient (ICC_2,1) with 95% confidence intervals, SEM, and MDC values were calculated for aHRT values collected via HH-US (ICC_2,1 = 0.98; 95% CI, 0.94-0.97; SEM₉₀ = 1.77°; and MDC₉₀ = 4.12°). Intrarater test-retest reliability was excellent.

Criterion Validity Results

Figures 4 and 5 provide the Bland-Altman plots for the throwing and nonthrowing arms, respectively. The raw data used to construct the Bland-Altman plots are provided in Table 2, as suggested by Giavarina.¹⁸ The mean ± SD difference between MSK-US and HH-US on the throwing and nonthrowing arms was −0.634° ± 3.94° and 0.489° ± 3.76°, respectively. The mean differences for both arms were not different from zero (throwing arm, t = −1.55, P = .12; nonthrowing arm, t = 0.125, P = .90), and 95% of the differences between measurements fell within −8.35° and 7.088° on the throwing arm and −6.88° and 7.85° on the nonthrowing arm. The confidence limits for the mean of the differences and upper and lower LoAs ranged from 1.621 to 2.811, indicating low variability in the calculated differences between the 2 measurements. The upper maximum allowed difference was larger than the upper LoA, and the lower maximum allowed difference was lower than the lower LoA for both arms. As such, the level of disagreement between the 2 methods was considered acceptable. Lastly, the linear regression results suggested no proportional bias (throwing arm, B = −0.41, P = .32; nonthrowing arm, B = 0.21, P = .53), that is, no bias in which one method of measurement produced values higher or lower than the other method by a proportional amount.²

Figure 4.

Throwing arm Bland-Altman plot of the difference between musculoskeletal ultrasound (MSK US) and handheld ultrasound (US) versus the mean of the 2 measurements. The mean of the difference is represented by the solid black line, while the dotted lines represent the upper and lower limits of agreement (LoA) from −1.96 to +1.96 SD. The shaded gray areas represent confidence limits for the mean and upper and lower LoAs. Both the x- and y-axes represent degrees.

Figure 5.

Nonthrowing arm Bland-Altman plot of the difference between musculoskeletal ultrasound (MSK US) and handheld ultrasound (US) versus the mean of the 2 measurements. The mean of the difference is represented by the solid black line, while the dotted lines represent the upper and lower limits of agreement (LoA) from −1.96 to +1.96 SD. The shaded gray areas represent confidence limits for the mean and upper and lower LoAs. Both the x- and y-axes represent degrees.

Table 2

Raw Data Used to Construct the Bland-Altman Plots^a

Data	Unit	Standard Error Formula	Standard Error	t Value for n – 1 Degrees of Freedom	Confidence (SE × t)	Confidence^b
Throwing arm
Number	93
Degrees of freedom (n – 1)	92
Difference mean (d)	−0.634	$\sqrt{SD^{2}/n}$	0.408	1.986	0.810	−1.445 to 0.176
SD	3.94
(d) – 1.96 SD (lower LoA)	−8.3568	$\sqrt{3SD^{2}/n}$	0.707	1.986	1.405	−9.762 to −6.951
(d) + 1.96 SD (upper LoA)	7.088	$\sqrt{3SD^{2}/n}$	0.707	1.986	1.405	5.682 to 8.493
(d) – 2 SD (lower maximum allowed difference)	−8.5144
(d) + 2 SD (upper maximum allowed difference)	7.2456
Nonthrowing arm
Number	92
Degrees of freedom (n – 1)	91
Difference mean (d)	0.489	$\sqrt{SD^{2}/n}$	0.392	1.662	0.651	−0.162 to 1.140
SD	3.76
(d) – 1.96 SD (lower LoA)	−6.8806	$\sqrt{3SD^{2}/n}$	0.678	1.662	1.128	−8.009 to −5.752
(d) + 1.96 SD (upper LoA)	7.8586	$\sqrt{3SD^{2}/n}$	0.678	1.662	1.128	6.730 to 8.987
(d) – 2 SD (lower maximum allowed difference)	−7.031
(d) + 2 SD (upper maximum allowed difference)	8.009

^a Raw data (difference mean, difference standard deviation, upper LoA, and lower LoA) were used to construct the Bland-Altman plots for the throwing and nonthrowing arms. The table also includes the elements needed to calculate the maximum allowed differences and confidence limits for the difference mean, upper LoA, and lower LoA. LoA, limit of agreement.

^b Confidence is the range from (“unit of interest” – confidence) to (“unit of interest” + confidence).

Discussion

The major findings of our study demonstrate that the intrarater reliability for the HH-US device in measuring aHRT is excellent, and the HH-US device is valid compared with MSK-US. Specifically, we found excellent intrarater test-retest reliability of the HH-US device (ICC_2,1 = 0.98; 95% CI, 0.94-0.97) with a small standard error of the measure (SEM₉₀ = 1.77°). Regarding the validity of the HH-US, the mean difference between MSK-US and HH-US for the throwing and nonthrowing arms was −0.634° and 0.489°, respectively; the mean differences between devices were not significantly different from zero (throwing arm, P = .12; nonthrowing arm, P = .90). Lastly, there was no proportional bias between devices, as demonstrated by the nonsignificant results of the linear regression (throwing arm, B = −0.41, P = .32; nonthrowing arm, B = 0.21, P = .53).

The results of the present study indicate that 2 people (1 sonographer and 1 measurer) can consistently measure aHRT using the GE Vscan Air. These findings support our hypothesis. The SEM₉₀, derived from the reliability statistic (ICC), enables clinicians to interpret a single aHRT score for a given patient. For example, if a given patient's aHRT measurement is 15° on the throwing arm, and the SEM₉₀ using the HH-US device is 1.77°, the true aHRT for this patient would be expected, with 90% confidence, to lie within ±1.77° of the observed measurement of 15°. Additionally, the MDC₉₀ indicates the amount of change required for a result to be considered greater than measurement error. The MDC₉₀ can be used to make decisions about aHRT measurements repeated over time in individual patients. For example, a change beyond 4.12° would be considered a change beyond what could be attributed to measurement error.

The GE Vscan Air HH-US is accurate in measuring aHRT compared with a standard MSK-US unit, supporting our hypothesis. Bland-Altman plots were constructed to compare HH-US to MSK-US, providing an interpretation of measurement agreement between the methods.³³ The interpretation of agreement was identified using several different methods with the Bland-Altman plot: (1) calculation of bias (or mean of the difference), (2) calculation of variability of the differences using LoAs, (3) calculation of the areas of confidence around the mean difference and LoAs, and (4) calculation of the maximum allowed differences. The bias, defined as the mean of the difference, identifies any systematic difference between the 2 measurement techniques. Our results show that the mean differences for both the throwing and nonthrowing arms were close to zero, indicating minimal difference between the methods; thus, the HH-US is neither underestimating nor overestimating aHRT. Moreover, the data points are scattered equally above and below without any specific trend, signifying that the HH-US device provides readings similar to those of the MSK-US device. The 95% LoAs are similar for both arms, with slightly more variability on the throwing arm; however, only 7 data points (7%) lie outside of the LoAs for both arms. Additionally, the confidence limits presented in this study are narrow, indicating a precise estimate of the agreement between MSK-US and HH-US. The maximum allowed difference aids in the interpretation of a predefined clinical agreement level, in which case, differences below or above the level are not meaningful. Because this is the first study, to our knowledge, to investigate the agreement between these 2 measurements, a statistical level of difference was computed by calculating ±2 SD of the mean difference.^18,31,34 The recommendation from Giavarina¹⁸ is that 95% of the data points lie within ±2 SD of the mean difference. Of the data, 93% and 94% lie between the maximum allowed difference values for the throwing and nonthrowing arms, respectively. In conjunction with this statistical method, the authors also identified normative aHRT values in the already published literature to determine how different the mean aHRT is using nearly identical methodology. On average, there is a 7° difference between the already published work that has investigated aHRT values among healthy college baseball players, indicating that a 7° difference between methods is likely acceptable and falls within both of our Bland-Altman plot LoAs.^37,41,49,51

Measuring aHRT is a clinical necessity in the evaluation of overhead throwing athletes. Previously established methods to measure aHRT include CT, palpation of the humeral tubercles, clinical prediction models, and MSK-US. Each method has its advantages and limitations. CT scans are considered the gold standard for measurement accuracy, but they are limited in clinical applicability and entail costs and radiation exposure to the patient.^3,23,46

Previous studies have investigated a clinical method for measuring aHRT by palpation of the humeral tubercles.¹¹ An advantage of this method is cost-effectiveness, because it requires minimal instrumentation. However, this method relies on the palpation of bicipital tuberosities and has resulted in significantly underestimated aHRT values compared with other methods.⁵⁶ When investigating an asymptomatic population, the palpation method for quantifying aHRT demonstrates good intrarater reliability (ICC = 0.849).⁴⁵ However, there are concerns with the validity of palpation when compared with other previously established methods, such as diagnostic ultrasound, yielding low agreement between the methods (r ≤ 0.326).⁴⁵ Another cost-effective method of quantifying aHRT is through a clinical prediction model.⁶ This aHRT prediction model demonstrated acceptable accuracy in estimating aHRT from a series of predictors and can be used in conjunction with a standard clinical examination to estimate aHRT.⁶ However, this model included only professional baseball pitchers, making its applicability to other age ranges, levels of play, and positions unknown. As previous research has demonstrated a strong correlation between aHRT values obtained through CT scans and MSK-US,³⁶ it is not unreasonable to assume that, given the results of the present study, HH-US is a viable, time-effective alternative to quantitatively measure aHRT. In addition to being time-efficient, when the cost of the HH-US unit is compared with the MSK-US unit, the HH-US is approximately 4.5 to 9.7 times more cost-effective. Additionally, the average time to take bilateral aHRT measurements in this study was <2 minutes, demonstrating the time efficiency of this device in addition to the reliability and validity.

The presence of rHRT is well documented in baseball players. Overhead throwing athletes typically demonstrate greater aHRT on the throwing arm as compared with the nonthrowing arm, resulting in a mean difference ranging from 9° to 23°.^10,22,54 In the authors’ clinical experience, rHRT can reach up to 30°. This rHRT has been previously investigated in its relationship to glenohumeral ROM in overhead throwing athletes, a commonly cited risk factor for injury.⁵ As HRT has been shown to explain a portion of the variance in measures of glenohumeral rotational ROM,²² it is imperative to understand the effects that rHRT has on glenohumeral rotational profiles. More specifically, significant differences between clinically measured glenohumeral ROM and HRT-corrected ROM have been clearly delineated in a large cohort of injured baseball athletes.¹⁶ The results demonstrated a paradoxical shift in the interpretation of common findings in injured baseball athletes, with 172 baseball players demonstrating an average GER ROM deficit of −10° to −13°, with no observed GIR deficit (or GIRD) when accounting for rHRT.¹⁶ These findings, alongside other studies including similar results in healthy baseball athletes,^37,41 highlight the importance of accurately measuring aHRT in a comprehensive clinical examination in an overhead throwing athlete.

A Clinical Example

To further demonstrate the relationship between rHRT and the clinical interpretation of glenohumeral ROM, we present a relevant clinical example. Table 3 displays bilateral rotational ROM, bilateral HRT, and the calculated side-to-side differences between the dominant and nondominant arms.

Table 3

Clinically Measured Objective Data^a

Measure	Dominant Arm	Nondominant Arm	Difference (D – ND)
GER, deg	110	105	+5
GIR, deg	15	35	−20
TROM, deg	125	140	−15
HRT, deg	5	25	−20

^a D, dominant; GER, glenohumeral external rotation; GIR, glenohumeral internal rotation; HRT, humeral retrotorsion; ND, nondominant; TROM, total rotational range of motion.

Based solely on the clinical ROM measurements, this athlete demonstrates a 5° gain in GER, a 20° loss of GIR, and a 15° loss of total arc of motion. Previous interpretation would classify this as a pathologic GIRD.⁹ Under these assumptions, the sports medicine team might attribute the athlete's pain and dysfunction to internal rotation loss and prescribe posterior shoulder stretches. However, when previously published HRT-corrected equations,¹⁶ which account for nonmodifiable osseous adaptations, are applied, our interpretation shifts (Table 4). This method uses the rHRT to determine the directionality of glenohumeral motion loss.

Table 4

Relative Humeral Torsion-Corrected Interpretation of Range of Motion^a

rHRT-Corrected External Rotation	rHRT-Corrected Internal Rotation
GER difference + rHRT difference =	GIR difference – rHRT difference =
5° + (−20°) = −15°	(−20°) − (−20°) = 0°

^a GER, glenohumeral external rotation; GIR, glenohumeral internal rotation; rHRT, relative humeral retrotorsion difference.
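The arithmetic in Table 4 can be written as a short sketch. The function and key names are hypothetical, and the sign convention (each difference taken as dominant minus nondominant, with HRT entered as tabulated in Table 3) is an assumption drawn from the tables rather than from the cited equations themselves.

```python
def rhrt_corrected_rom(ger_d, ger_nd, gir_d, gir_nd, hrt_d, hrt_nd):
    """rHRT-corrected side-to-side rotation differences, in degrees.

    Every difference is dominant minus nondominant, following Table 3.
    """
    ger_diff = ger_d - ger_nd     # clinical external rotation difference
    gir_diff = gir_d - gir_nd     # clinical internal rotation difference
    rhrt_diff = hrt_d - hrt_nd    # relative humeral retrotorsion difference
    return {
        "ger_corrected": ger_diff + rhrt_diff,  # GER difference + rHRT difference
        "gir_corrected": gir_diff - rhrt_diff,  # GIR difference - rHRT difference
    }


if __name__ == "__main__":
    # Values from the clinical example in Table 3.
    result = rhrt_corrected_rom(
        ger_d=110, ger_nd=105, gir_d=15, gir_nd=35, hrt_d=5, hrt_nd=25
    )
    print(result)  # {'ger_corrected': -15, 'gir_corrected': 0}
```

A negative corrected value indicates a deficit on the dominant arm in that direction of rotation.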

Applying the rHRT-corrected equations shifts the interpretation of this athlete's shoulder ROM profile. Rather than having pathologic GIRD, this athlete has a 15° deficit in GER ROM with no evidence of GIRD. In this scenario, targeted interventions should focus on soft tissue impairments related to external rotation deficits rather than internal rotation deficits. The clinical consequences of misjudging the directionality of ROM loss in the throwing athlete are not yet well established. When only clinical ROM measurements are considered, practitioners risk misidentifying the deficient rotational direction and implementing unnecessary or ineffective interventions. In the presence of a loss of glenohumeral total arc of ROM, a known risk factor for shoulder and elbow injury,^13,40,42,47,48,55 it is imperative to correctly identify the direction of motion loss. The present authors believe that interpreting these motion deficits in the context of rHRT helps to address this gap in determining directionality. This example underscores the importance of accounting for rHRT in the evaluation of glenohumeral ROM, as well as the need for an accessible way of measuring HRT, such as HH-US devices.

This study is not without limitations. The participants were collegiate baseball and softball players, leaving the utility of this device for assessing aHRT in skeletally immature athletes unknown. However, it is reasonable to suggest that similar findings would be observed in a younger population or in the general population. Anatomic HRT changes most rapidly around age 8 as players begin to throw during skeletal maturation, with the potential to continue changing until around age 16. College athletes were therefore our demographic of interest, as their growth plates are closed and their aHRT values have stabilized.^14,26,38 Additionally, interrater reliability was not studied for this HH-US device. In the absence of established interrater reliability, there is a risk of improper measurement of aHRT when multiple raters are used, which may limit the future application of this methodology if clinicians are not adequately trained in the use of HH-US. The decision to forgo an interrater reliability analysis was due to the time constraints of data collection at our university affiliates and the need to access these athletes without disrupting their academic and sports schedules. Finally, although it did not affect the results of the present study, the battery life of the HH-US device used in this study is approximately 50 minutes of continuous scan time. When large cohorts are evaluated for aHRT at a single time point, this battery life may make it impractical to complete the evaluation process.
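To give a rough sense of the battery constraint described above, the short Python sketch below estimates how many athletes could be scanned per charge. It is a back-of-the-envelope estimate, not a device specification: the function names are hypothetical, and the 2 minutes of continuous scan time per athlete is an assumption, taken as an upper bound from the reported measurement time of <2 minutes.

```python
import math


def athletes_per_charge(battery_min: float, scan_min_per_athlete: float) -> int:
    """Whole number of athletes that fit within one charge of continuous scan time."""
    if scan_min_per_athlete <= 0:
        raise ValueError("scan time per athlete must be positive")
    return math.floor(battery_min / scan_min_per_athlete)


def charges_needed(n_athletes: int, battery_min: float, scan_min_per_athlete: float) -> int:
    """Number of full charges required to scan every athlete in a cohort."""
    per_charge = athletes_per_charge(battery_min, scan_min_per_athlete)
    if per_charge == 0:
        raise ValueError("a single scan exceeds the battery's continuous scan time")
    return math.ceil(n_athletes / per_charge)


# Assumed inputs: ~50 min of continuous scan time, 2 min of scanning per athlete.
print(athletes_per_charge(50, 2))
print(charges_needed(93, 50, 2))
```

Under these assumptions, one charge covers 25 athletes, and a cohort the size of the present study (93 athletes) would require 4 charges if scanned in a single session.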

Conclusion

Our study showed that HH-US provides a reliable and valid measurement of aHRT in comparison with MSK-US and may be an accessible option for clinicians evaluating overhead athletes. Given its strong agreement with established methods, HH-US offers an efficient and cost-effective alternative to MSK-US when measuring aHRT in baseball and softball athletes. Accurate assessment of aHRT is a clinical necessity when evaluating overhead throwing athletes, as it identifies osseous adaptations that may contribute to side-to-side ROM differences.

Footnotes

Final revision submitted September 23, 2025; accepted October 19, 2025.

One or more of the authors has declared the following potential conflict of interest or source of funding: J.E.C. has received royalties and speaking fees from Arthrex, travel expenses from Pylant Medical, and support for education from MedInc of Texas. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.

Ethical approval for this study was obtained from the University of Texas Health Science Center at Houston (HSC-MH-21-1041).

ORCID iDs

Sean M. Kennedy

J. Craig Garrison

References

1. Astolfi, Struminger, Royer, Kaminski, Swanik CB. Adaptations of the shoulder to overhead throwing in youth athletes. J Athl Train. 2015;50(7):726-732.

2. Batterham AM. Bias in Bland-Altman but not regression validity analyses. Sportscience. 2004;8:42-47.

3. Boileau, Bicknell, Mazzoleni, Walch, Urien. CT scan method accurately assesses humeral head retroversion. Clin Orthop Relat Res. 2008;466:661-669.

4. Bonett DG. Sample size requirements for estimating intraclass correlations with desired precision. Stat Med. 2002;21(9):1331-1335.

5. Bullock, Faherty, Ledbetter, Thigpen, Sell TC. Shoulder range of motion and baseball arm injuries: a systematic review and meta-analysis. J Athl Train. 2018;53(12):1190-1199.

6. Bullock, Shanley, Collins, et al. Development and internal validation of a humeral torsion prediction model in professional baseball pitchers. J Shoulder Elbow Surg. 2021;30(12):2832-2838.

7. Bunce. Correlation, agreement, and Bland–Altman analysis: statistical analysis of method comparison studies. Am J Ophthalmol. 2009;148(1):4-6.

8. Burleson, Swanson, Shufflebarger, et al. Evaluation of a novel handheld point-of-care ultrasound device in an African emergency department. Ultrasound J. 2020;12:1-5.

9. Chou PP-H, Chou Y-L, Wang Y-S, Wang R-T, Lin H-T. Effects of glenohumeral internal rotation deficit on baseball pitching among pitchers of different ages. J Shoulder Elbow Surg. 2018;27(4):599-605.

10. Crockett, Gross, Wilk, et al. Osseous adaptation and range of motion at the glenohumeral joint in professional baseball pitchers. Am J Sports Med. 2002;30(1):20-26.

11. Dashottar, Borstad JD. Validity of measuring humeral torsion using palpation of bicipital tuberosities. Physiother Theory Pract. 2013;29(1):67-74.

12. Deuevoise, Hyatt, Townsend GB. Humeral torsion in recurrent shoulder dislocations: a technic of determination by x-ray. Clin Orthop Relat Res. 1971;76:87-93.

13. Dines, Frank, Akerman, Yocum LA. Glenohumeral internal rotation deficits in baseball players with ulnar collateral ligament insufficiency. Am J Sports Med. 2009;37(3):566-570.

14. Edelson. The development of humeral head retroversion. J Shoulder Elbow Surg. 2000;9(4):316-318.

15. Edelson. Variations in the retroversion of the humeral head. J Shoulder Elbow Surg. 1999;8(2):142-145.

16. Entler, Kruseman, Kennedy, et al. The role of humeral torsion on glenohumeral rotation in injured baseball players. Orthop J Sports Med. 2024;12(8):23259671241260084. doi:10.1177/23259671241260084

17. Feuerherd, Sutherlin, Hart, Saliba SA. Reliability of and the relationship between ultrasound measurement and three clinical assessments of humeral torsion. Int J Sports Phys Ther. 2014;9(7):938.

18. Giavarina. Understanding Bland Altman analysis. Biochem Med (Zagreb). 2015;25(2):141-151.

19. Greenberg, Fernandez-Fernandez, Lawrence JTR, McClure. The development of humeral retrotorsion and its relationship to throwing sports. Sports Health. 2015;7(6):489-496. doi:10.1177/1941738115608830

20. Greiner, Kaiser, Maurer, Stroszczynski, Jung EM. Wireless handheld ultrasound for internal jugular vein assessment in pediatric patients. Clin Hemorheol Microcirc. 2024;86(4):441-449.

21. Harris, Maier, Freeston, et al. Differences in glenohumeral range of motion and humeral torsion between right-handed and left-handed professional baseball pitchers. Am J Sports Med. 2022;50(9):2481-2487.

22. Helmkamp, Bullock, Rao, Shanley, Thigpen, Garrigues GE. The relationship between humeral torsion and arm injury in baseball players: a systematic review and meta-analysis. Sports Health. 2020;12(2):132-138. doi:10.1177/1941738119900799

23. Hernigou, Duparc, Hernigou. Determining humeral retroversion with computed tomography. J Bone Joint Surg Am. 2002;84(10):1753-1762.

24. Kaiser, Herr, Greiner, Stroszczynski, Jung E-M. Mobile handheld ultrasound with Vscan air for the diagnosis of deep vein thrombosis. Clin Hemorheol Microcirc. 2023;83(2):149-161.

25. Kennedy, Hannon, Conway, Creed, Garrison JC. Effect of younger starting pitching age on humeral retrotorsion in baseball pitchers with an ulnar collateral ligament injury. Am J Sports Med. 2021;49(5):1160-1165. doi:10.1177/0363546521990808

26. Kocher, Waters, Micheli LJ. Upper extremity injuries in the paediatric athlete. Sports Med. 2000;30:117-135.

27. Koo, MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163.

28. Krahl VE. The torsion of the humerus: its localization, cause and duration in man. Am J Anat. 1947;80(3):275-319.

29. Kurokawa, Yamamoto, Ishikawa, et al. Differences in humeral retroversion in dominant and nondominant sides of young baseball players. J Shoulder Elbow Surg. 2017;26(6):1083-1087.

30. M-PT, Voigt, Nathanson, et al. Comparison of four handheld point-of-care ultrasound devices by expert users. Ultrasound J. 2022;14(1):27.

31. M-J, Zhong W-H, Liu Y-X, Miao H-Z, Y-C, M-H. Sample size for assessing agreement between two methods of measurement by Bland−Altman method. Int J Biostat. 2016;12(2):/j/ijb.2016.12.issue-2/ijb-2015-0039/ijb-2015-0039.xml.

32. Ludbrook. Confidence in Altman–Bland plots: a critical review of the method of differences. Clin Exp Pharmacol Physiol. 2010;37(2):143-149.

33. Mansournia, Waters, Nazemipour, Bland, Altman DG. Bland-Altman methods for comparing methods of measurement and response to criticisms. Glob Epidemiol. 2021;3:100045.

34. MedCalc. Sample size calculation: Bland-Altman plot. MedCalc Software Ltd. Accessed March 1, 2025. https://www.medcalc.org/manual/sample-size-bland-altman.php

35. Monti, Ambrogi, Sardanelli. Sample size calculation for data reliability and diagnostic performance: a go-to review. Eur Radiol Exp. 2024;8(1):79.

36. Myers, Oyama, Clarke JP. Ultrasonographic assessment of humeral retrotorsion in baseball players: a validation study. Am J Sports Med. 2012;40(5):1155-1160. doi:10.1177/0363546512436801

37. Myers, Oyama, Goerger, Rucinski, Blackburn, Creighton RA. Influence of humeral torsion on interpretation of posterior shoulder tightness measures in overhead athletes. Clin J Sport Med. 2009;19(5):366-371. doi:10.1097/JSM.0b013e3181b544f6

38. Myers, Kennedy, Arnold, et al. A narrative review of little league shoulder: proximal humeral physis widening is only one piece of the puzzle, it is time to consider posterior glenoid dysplasia. JSES Int. 2024;8(4):724-733.

39. Noonan, Shanley, Bailey, et al. Professional pitchers with glenohumeral internal rotation deficit (GIRD) display greater humeral retrotorsion than pitchers without GIRD. Am J Sports Med. 2015;43(6):1448-1454.

40. Pozzi, Plummer, Shanley, et al. Preseason shoulder range of motion screening and in-season risk of shoulder and elbow injuries in overhead athletes: systematic review and meta-analysis. Br J Sports Med. 2020;54(17):1019-1027.

41. Reuther, Sheridan, Thomas SJ. Differentiation of bony and soft-tissue adaptations of the shoulder in professional baseball pitchers. J Shoulder Elbow Surg. 2018;27(8):1491-1496. doi:10.1016/j.jse.2018.02.053

42. Rose, Noonan. Glenohumeral internal rotation deficit in throwing athletes: current perspectives. Open Access J Sports Med. 2018;9:69-78.

43. Sabick, Kim Y-K, Torry, Keirns, Hawkins RJ. Biomechanics of the shoulder in youth baseball pitchers: implications for the development of proximal humeral epiphysiolysis and humeral retrotorsion. Am J Sports Med. 2005;33(11):1716-1722.

44. Sabick, Torry, Kim Y-K, Hawkins RJ. Humeral torque in professional baseball pitchers. Am J Sports Med. 2004;32(4):892-898.

45. Salamh, Hanney, Champion, et al. The reliability and validity of a clinical measurement proposed to quantify humeral torsion. Int J Sports Phys Ther. 2021;16(6):1504.

46. Schlemmer, Dosch, Gicquel, et al. Computed tomographic analysis of humeral retrotorsion and glenoid retroversion [in French]. Rev Chir Orthop Reparatrice Appar Mot. 2002;88(6):553-560.

47. Shanley, Kissenberth, Thigpen, et al. Preseason shoulder range of motion screening as a predictor of injury among youth and adolescent baseball pitchers. J Shoulder Elbow Surg. 2015;24(7):1005-1013.

48. Shanley, Rauh, Michener, Ellenbecker, Garrison, Thigpen CA. Shoulder range of motion measures as risk factors for shoulder and elbow injuries in high school softball and baseball players. Am J Sports Med. 2011;39(9):1997-2006.

49. Shanley, Thigpen, Clark, et al. Changes in passive range of motion and development of glenohumeral internal rotation deficit (GIRD) in the professional pitching shoulder between spring training in two consecutive years. J Shoulder Elbow Surg. 2012;21(11):1605-1612.

50. Takenaga, Goto, Tsuchiya, et al. Relationship between bilateral humeral retroversion angle and starting baseball age in skeletally mature baseball players—existence of watershed age. J Shoulder Elbow Surg. 2019;28(5):847-853.

51. Thomas, Swanik, Kaminski, et al. Humeral retroversion and its association with posterior capsule thickness in collegiate baseball players. J Shoulder Elbow Surg. 2012;21(7):910-916.

52. Tokish, Curtin, Kim Y-K, Hawkins, Torry MR. Glenohumeral internal rotation deficit in the asymptomatic professional pitcher and its relationship to humeral retroversion. J Sports Sci Med. 2008;7(1):78.

53. Whiteley, Ginn, Nicholson, Adams. Indirect ultrasound measurement of humeral torsion in adolescent baseball players and non-athletic adults: reliability and significance. J Sci Med Sport. 2006;9(4):310-318.

54. Whiteley, Ginn, Nicholson, Adams RD. Sports participation and humeral torsion. J Orthop Sports Phys Ther. 2009;39(4):256-263.

55. Wilk, Macrina, Fleisig, et al. Deficits in glenohumeral passive range of motion increase risk of elbow injury in professional baseball pitchers: a prospective study. Am J Sports Med. 2014;42(9):2075-2081.

56. Yaari, Mullaney, Fukunaga, Thein, McHugh, Nicholas SJ. Assessment of humeral torsion by palpation in baseball pitchers: a validation study. Int J Sports Phys Ther. 2020;15(6):1073.

The Use of a Handheld Ultrasound Device to Measure Humeral Retrotorsion in Baseball and Softball Athletes: A Validation Study
