Abstract
This study evaluated the repeatability and reproducibility of using high-frequency quantitative ultrasound (QUS) measurement of backscatter coefficient (BSC), grayscale analysis, and gray-level co-occurrence matrix (GLCM) textural analysis, to characterize human rotator cuff muscles. The effects of varying scanner settings across two different operators and two US systems were investigated in a healthy volunteer with normal rotator cuff muscles and a patient with chronic massive rotator cuff injury and substantial muscle degeneration. The results suggest that BSC is a promising method for assessing rotator cuff muscles in both control and pathological subjects, even when operators were free to adjust system settings (depth, level of focus, and time-gain compensation). Measurements were repeatable and reproducible across the different operators and ultrasound imaging platforms. In contrast, grayscale and GLCM analyses were found to be less reliable in this setting, with significant measurement variability. Overall, the repeatability and reproducibility measurements of BSC indicate its potential as a diagnostic tool for rotator cuff muscle evaluation.
Introduction
Rotator cuff tears are one of the most common causes of shoulder pain, affecting millions of patients globally.1,2 After rotator cuff tendon injury, the muscle progressively deteriorates with fibrosis and fat deposition, complicating surgical repair and worsening clinical outcomes.3,4 Medical imaging techniques, in particular, ultrasound (US) and magnetic resonance imaging (MRI), are critical for diagnosing rotator cuff tendon tears and for evaluating the status of the muscle. 5 Reported performance of US for rotator cuff tendon tears is similar to that of MRI, 6 though US is only moderately accurate for the diagnosis of rotator cuff fatty atrophy. 7 However, the added advantages of accessibility, portability, and cost-effectiveness make US particularly appealing.
B-mode image evaluation, the mainstay of US imaging in routine clinical practice, provides subjective rotator cuff assessment. Quantitative ultrasound (QUS) techniques have emerged, which convey more objective information about tissue status. 8 One group of techniques reduces system-dependent effects by utilizing raw radiofrequency (RF) data and calibrated phantoms with known acoustic properties to derive fundamental tissue parameters, such as backscatter coefficient (BSC).9,10 The efficacy of the BSC for normal and diseased tissue characterization has been demonstrated by several authors in a variety of organs such as the liver, 11 kidney, 12 and prostate, 13 though utilization in skeletal muscle has been lacking.
For US quantification of muscle tissue, typically post-processed B-mode images have been used, with one of the most common measures being grayscale analysis, referred to as “echo intensity.”14–17 Some authors have also extracted texture parameters from B-mode images, such as through gray-level co-occurrence matrix (GLCM) analyses.15,18 Notably, virtually all prior studies using US to quantify muscle tissue maintain constant system settings between participants regardless of size or pathology, as it is generally accepted that B-mode-based analysis is sensitive to varying settings such as beam focus, frequency, transmit and receiver gains, and time-gain compensation (TGC). 19 However, in clinical settings, radiologists and sonographers adjust these settings to optimize image quality and contrast. Variables that motivate the different settings include patient body habitus with resultant varying target tissue depths, differing disease states, and subjective operator preference of tissue contrast. 20
There is an increasing need to accelerate clinical translation of QUS. However, there are insufficient studies of repeatability and reproducibility involving muscle, particularly in accordance with guidelines endorsed by the Quantitative Imaging Biomarkers Alliance (QIBA),21,22 and there is incomplete understanding of the sources of variability (different US machines, operators) that affect QUS for this application. Thus, the purpose of this study was to assess the repeatability and reproducibility of QUS based on analysis of raw RF data (BSC) for in vivo rotator cuff muscle evaluation, and to compare these outcomes with those of grayscale and textural analysis (GLCM) applied to routine B-mode images of the rotator cuff. The effects of varying scanner settings across different operators and US machines were investigated in a healthy volunteer and a patient with chronic rotator cuff injury. We hypothesized that normalized BSC-based parameters would be repeatable and reproducible outcomes for rotator cuff muscle evaluation, and that these parameters would outperform B-mode image-based analyses in the setting of varying scanner settings.
Materials and Methods
Study Design and Ultrasound Acquisition
Our institutional review board approved this study and written informed consent was obtained. Two participants were recruited, including one healthy volunteer without shoulder complaints (healthy participant, 36-year-old man, BMI: 18.6) and one patient with known, chronic massive rotator cuff tearing in both shoulders (injured participant, 73-year-old man, BMI: 28.2). Magnetic resonance imaging exams interpreted by a musculoskeletal radiologist demonstrated that the healthy participant had intact bilateral rotator cuffs and normal supraspinatus and infraspinatus muscles (all Goutallier grade 0), whereas the injured participant had bilateral massive rotator cuff tears and diseased supraspinatus (both Goutallier grade 3) and infraspinatus (both Goutallier grade 2) muscles. 23 The participants were selected based on the fact that they had bilateral, symmetric muscles that were representative of clinically important grades. Goutallier grade 0 is entirely normal and grades 2 and 3 are the most important grades in the determination of potential surgical treatment because those are the grades where significantly higher failure rates occur.24,25 These two participants represent clinically significantly different conditions; a larger study of participants across all clinical grades was purposely not performed to minimize the introduction of other patient-specific variables in order to focus on scanner and operator effects.
Two clinical US machines with linear probes were utilized (14L5, S2000, Siemens Healthineers, Erlangen, Germany and UHF22, Vevo MD, Fujifilm Visualsonics, Toronto, ON, Canada), and both beam-formed RF signals and B-mode images were acquired. Imaging was performed by two experienced operators (E.Y.C, a musculoskeletal radiologist with 12 years of US experience, and L.T.S, a general radiologist with 9 years of US experience). The supraspinatus muscle was imaged in short axis approximately 1 inch medial to the acromion and the infraspinatus muscle was imaged in a similar plane inferior to the scapular spine, as shown in Figure 1. Presets were restored to default values before each muscle acquisition and operators were instructed to adjust depth, focus, TGC, and receiver gain (but not frequency) to optimize the image quality as would be performed in routine clinical practice. After each muscle RF acquisition, data were obtained from a commercially available homogeneous tissue-mimicking phantom (calibrated from 1 to 10 MHz) containing 117GU Zerdine formulation (Sun Nuclear, Norfolk, Virginia). After the phantom image acquisition, the presets were restored again and a separate B-mode image was acquired by each operator. Three types of data (muscle RF, phantom RF, and B-mode) were acquired bilaterally by each operator on the two participants, with three repeated measurements for each acquisition.

Imaging location of ultrasound (US) exam with corresponding MR image. (A) Photograph of a volunteer shows the imaging locations for the supraspinatus and infraspinatus muscles (red lines). The curved blue line outlines the scapular spine and acromion. (B) The sagittal T1-weighted fast-spin echo image from the healthy volunteer shows the normal supraspinatus and infraspinatus muscles without fatty infiltration (Goutallier grade 0). (C) Flow chart of study design. There are two operators, two machines, two participants, two sides, two muscle groups, 3 different acquisitions (RF muscle & phantom, B-mode images) with three repeated measurements.
In total, there were two operators (E.Y.C and L.T.S), two machines (S2000 with 14L5 and Vevo MD with UHF22), two participants (healthy and injured), two sides (right and left), two muscle groups (supraspinatus and infraspinatus), and three repeated measurements per acquisition, yielding 96 elements each for the BSC and B-mode based measurements in the dataset of this study (Figure 1(C)).
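The count of 96 elements follows directly from the factorial design. The short sketch below enumerates the design to make the arithmetic explicit; the label strings and variable names are illustrative only and are not taken from the study's processing code.

```python
from itertools import product

# Factors of the study design as described in the text.
operators = ["E.Y.C", "L.T.S"]
machines = ["S2000 14L5", "Vevo MD UHF22"]
participants = ["healthy", "injured"]
sides = ["right", "left"]
muscles = ["supraspinatus", "infraspinatus"]
repeats = [1, 2, 3]

# Every combination of factor levels is one element of the dataset.
design = list(product(operators, machines, participants, sides, muscles, repeats))
n_elements = len(design)  # 2 * 2 * 2 * 2 * 2 * 3

# Fixing operator, machine, side, and muscle leaves
# 2 participants x 3 repeats in each group.
n_groups = len(operators) * len(machines) * len(sides) * len(muscles)
per_group = n_elements // n_groups

print(n_elements, n_groups, per_group)
```

The same enumeration gives the 16 groups of six measurements that the repeatability analysis in the Results refers to.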
Analytical Method
QUS Analysis Methods
BSC (
The BSC was determined using the reference phantom method that explicitly accounts for experimental factors affecting the ultrasound signal. 26 This method involves a comparison at the same depth of the power spectra from tissue with the power spectra from a reference phantom whose BSC and attenuation coefficient are known:
where f is the frequency of the acoustic waves,
with N the number of different tissue layers,
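As an illustration of the reference phantom method described above, the following is a minimal sketch and not the authors' estimator. It assumes the commonly used form in which the tissue BSC equals the phantom BSC multiplied by the ratio of power spectra at the same depth and by an attenuation-compensation term summed over the overlying tissue layers. The function names, the linear-in-frequency attenuation model, the layer values, and the band-averaged definition of iBSC are all assumptions made for this example.

```python
import numpy as np

DB_TO_NP = np.log(10.0) / 20.0  # converts amplitude attenuation from dB to Np


def reference_phantom_bsc(ps_tissue, ps_ref, bsc_ref, freqs_mhz,
                          layers, alpha_ref, depth_cm):
    """Estimate BSC(f) with the reference phantom method (sketch).

    ps_tissue, ps_ref : power spectra at the same depth, one value per frequency
    bsc_ref           : known phantom BSC at those frequencies
    layers            : (attenuation slope in dB/cm/MHz, thickness in cm) for
                        each tissue layer between the probe and the ROI
    alpha_ref         : phantom attenuation slope in dB/cm/MHz
    depth_cm          : depth of the ROI in the phantom
    """
    f = np.asarray(freqs_mhz, dtype=float)
    att_tissue = DB_TO_NP * f * sum(a * d for a, d in layers)  # Np, one way
    att_ref = DB_TO_NP * f * alpha_ref * depth_cm              # Np, one way
    # Factor 4: round trip (x2) and power rather than amplitude (x2).
    compensation = np.exp(4.0 * (att_tissue - att_ref))
    return np.asarray(bsc_ref) * (np.asarray(ps_tissue) / np.asarray(ps_ref)) * compensation


def integrated_bsc(bsc, freqs_mhz, band):
    """Band-averaged BSC over [band[0], band[1]] MHz (assumed iBSC definition)."""
    f = np.asarray(freqs_mhz, dtype=float)
    keep = (f >= band[0]) & (f <= band[1])
    x, y = f[keep], np.asarray(bsc)[keep]
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))  # trapezoidal rule
    return area / (x[-1] - x[0])


# Synthetic check with hypothetical values: the tissue BSC is twice the phantom BSC.
freqs = np.linspace(7.0, 10.0, 31)
bsc_ref = 1e-3 * (freqs / 8.0) ** 3
layers = [(1.0, 0.5), (0.7, 1.5)]
alpha_ref, depth = 0.5, 2.0
true_bsc = 2.0 * bsc_ref

att_t = DB_TO_NP * freqs * sum(a * d for a, d in layers)
att_r = DB_TO_NP * freqs * alpha_ref * depth
system = np.exp(-((freqs - 8.5) / 2.0) ** 2)  # response shared by both scans
ps_tissue = true_bsc * system * np.exp(-4.0 * att_t)
ps_ref = bsc_ref * system * np.exp(-4.0 * att_r)

est_bsc = reference_phantom_bsc(ps_tissue, ps_ref, bsc_ref, freqs,
                                layers, alpha_ref, depth)
ibsc = integrated_bsc(est_bsc, freqs, (7.0, 10.0))
```

Because the system response appears in both spectra, it cancels in the ratio, which is the reason the method is expected to be insensitive to scanner settings as long as the tissue and phantom are scanned with the same settings.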
Image GLCM Texture Analysis
The B-mode images recorded from the two US machines were analyzed using the GLCM algorithm. 32 The GLCM is defined as a histogram of co-occurring grayscale intensity pairs in corresponding pixels of an image:
The B-mode US image is treated as a matrix of grayscale pixel intensities,
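For readers unfamiliar with the GLCM, the sketch below builds a normalized co-occurrence matrix for a single pixel offset and computes the contrast feature from it. It is a plain NumPy illustration under assumed choices (a rectangular integer-valued patch, one offset, optional symmetry) and is not the implementation used in this study; the function names and the 4 x 4 test image are hypothetical.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix for one (row, col) offset.

    image  : 2D integer array with values in [0, levels)
    offset : displacement from each reference pixel to its neighbor
    """
    img = np.asarray(image)
    dr, dc = offset
    rows, cols = img.shape
    # Reference pixels whose neighbor at (row + dr, col + dc) is inside the image.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = img[r0:r1, c0:c1]
    nb = img[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref.ravel(), nb.ravel()), 1.0)  # histogram of pairs
    if symmetric:
        counts = counts + counts.T
    return counts / counts.sum()


def glcm_contrast(p):
    """Contrast: sum over (i, j) of p(i, j) * (i - j)^2."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))


# Small hypothetical image with four gray levels.
patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
p = glcm(patch, levels=4, offset=(0, 1))
contrast = glcm_contrast(p)
```

In practice the image is usually quantized to a smaller number of gray levels before the matrix is built, and the choice of offset and quantization affects the resulting contrast value.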
Grayscale Analysis
Grayscale intensities of the B-mode images ranged from 0 to 255. Grayscale mean and standard deviation values were assessed on the same ROI as the GLCM analysis using the “regionprops” function in MATLAB (v2020b, The MathWorks, Natick, MA).
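The grayscale measurement reduces to the mean and standard deviation of the pixel intensities inside the ROI. A rough Python equivalent is sketched below for illustration; it is not the MATLAB code used in the study, the function name is hypothetical, and the use of the sample standard deviation is an assumption.

```python
import numpy as np


def roi_gray_stats(image, mask):
    """Mean and sample SD of grayscale intensities inside a boolean ROI mask."""
    values = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]
    return float(values.mean()), float(values.std(ddof=1))


# Hypothetical 8-bit image and a four-pixel ROI.
image = np.array([[10, 20, 200, 200],
                  [30, 40, 200, 200],
                  [200, 200, 200, 200]], dtype=np.uint8)
mask = np.zeros(image.shape, dtype=bool)
mask[0:2, 0:2] = True

gray_mean, gray_sd = roi_gray_stats(image, mask)
```

Because these values are computed on post-processed pixel intensities, any change in gain, TGC, or dynamic range shifts them directly, which is the sensitivity the study sets out to quantify.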
ROI Selection
RF and B-mode datasets were arbitrarily assigned to two scientists experienced with QUS processing. The beamformed RF data analysis was standardized using an estimator graphical user interface (GUI) for offline processing. The supraspinatus (Figure 2(A), (B), (E), and (F)) and infraspinatus (Figure 2(C), (D), (G), and (H)) muscle ROIs and overlying tissues for both RF and B-mode data were outlined by the two scientists under the guidance of a musculoskeletal radiologist. Specifically, the ROIs of the cuff muscles were delineated by locating the overlying trapezius and deltoid muscles and then enclosing the epimysium of the supraspinatus and infraspinatus muscles. The GUI selected sub-ROIs within the given ROI with 75% overlap. The sub-ROI dimensions were
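The 75% overlap between sub-ROIs can be illustrated with a simple sliding-window calculation. This is a sketch only: the window and ROI sizes below are hypothetical placeholders rather than the dimensions used in the study, and the function name is an assumption.

```python
from itertools import product


def sub_roi_starts(length, window, overlap=0.75):
    """Start indices of windows of size `window` along an axis of size `length`."""
    step = max(1, int(round(window * (1.0 - overlap))))
    return list(range(0, length - window + 1, step))


# Hypothetical ROI of 100 axial samples x 48 scan lines,
# tiled with 40 x 16 sub-ROIs at 75% overlap in both directions.
axial_starts = sub_roi_starts(100, 40)
lateral_starts = sub_roi_starts(48, 16)
sub_rois = list(product(axial_starts, lateral_starts))
```

With 75% overlap, each window advances by one quarter of its length, so neighboring sub-ROIs share most of their samples and the resulting estimates are correlated.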

Representative B-mode ultrasound (US) images of rotator cuff muscles with orange dashed lines marking the region of interest. (A) Image of the supraspinatus muscle in the healthy participant obtained with the 14L5 transducer. (B) Image of the supraspinatus muscle in the injured participant (chronic, massive rotator cuff tearing) obtained with the 14L5 transducer. (C) Image of the infraspinatus muscle in the healthy participant obtained with the 14L5 transducer. (D) Image of the infraspinatus muscle in the injured participant obtained with the 14L5 transducer. (E–H) B-mode images obtained in the same manner as (A–D) except with the UHF22 transducer. Note that the echogenicity of the muscles is not comparable between images as depth, focus, and gain have been independently adjusted.
Statistical Analysis
Statistical analysis was performed with IBM SPSS Statistics for Windows version 28.0 (IBM, Armonk, NY, USA). Descriptive statistics were summarized with mean ± standard deviation. QIBA endorsed guidelines were followed and a summary of the metrics used in this study is shown in Table 1.21,22
Summary of All QIBA Repeatability and Reproducibility Statistical Metrics.
RC = repeatability coefficient; RDC = reproducibility coefficient; ICC = intraclass correlation coefficient; SD = standard deviation.
To test the between-image repeatability, the dataset was divided into subgroups since factors (i.e., operators, left/right side, machines) may affect repeatability. The repeatability was assessed using a one-way random-effects model:
where
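To make the repeatability metrics concrete, the following sketch computes them from a one-way random-effects analysis of variance for one group of measurements (participants in rows, repeated images in columns). It assumes the standard model form in which each measurement is the sum of an overall mean, a random participant effect, and a random between-image error, with RC taken as 1.96 × √2 × the between-image SD. The function name and the example values are hypothetical, and this is not the SPSS procedure used in the study.

```python
import numpy as np


def one_way_repeatability(x):
    """Repeatability metrics from a (participants x repeats) array."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    subject_mean = x.mean(axis=1)
    grand_mean = x.mean()
    # Mean squares between participants and within participants (between images).
    ms_b = k * np.sum((subject_mean - grand_mean) ** 2) / (n - 1)
    ms_w = np.sum((x - subject_mean[:, None]) ** 2) / (n * (k - 1))
    # A negative variance component estimate is truncated to zero.
    var_b = max((ms_b - ms_w) / k, 0.0)
    w_sd = float(np.sqrt(ms_w))
    return {
        "between_participant_sd": float(np.sqrt(var_b)),
        "between_image_sd": w_sd,
        "rc": 1.96 * np.sqrt(2.0) * w_sd,
        "icc_1_1": (ms_b - ms_w) / (ms_b + (k - 1) * ms_w),
        "icc_1_k": (ms_b - ms_w) / ms_b,
    }


# Hypothetical group: two participants, three repeated images each.
group = [[1.0, 1.2, 0.8],
         [5.0, 5.5, 4.5]]
rep = one_way_repeatability(group)
```

With three images per participant, `icc_1_k` corresponds to ICC(1,3), the reliability of the average of the three repeated measurements.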
The reproducibility of operators/machines was assessed using a two-way random effects model:
where
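The reproducibility metrics can be sketched in the same way from a balanced two-way random-effects analysis of variance. The example assumes the standard model with random participant, operator (or machine), interaction, and error terms, a reproducibility SD that pools the operator, interaction, and error components, and RDC taken as 1.96 × √2 × the reproducibility SD. Negative component estimates are truncated to zero. The function name and data are hypothetical, and this is not the SPSS procedure used in the study.

```python
import numpy as np


def two_way_reproducibility(y):
    """Variance components from a (participants x operators x repeats) array."""
    y = np.asarray(y, dtype=float)
    n, m, k = y.shape
    grand = y.mean()
    cell = y.mean(axis=2)          # participant-by-operator cell means
    p_mean = y.mean(axis=(1, 2))   # participant means
    o_mean = y.mean(axis=(0, 2))   # operator (or machine) means

    ms_p = m * k * np.sum((p_mean - grand) ** 2) / (n - 1)
    ms_o = n * k * np.sum((o_mean - grand) ** 2) / (m - 1)
    interaction = cell - p_mean[:, None] - o_mean[None, :] + grand
    ms_po = k * np.sum(interaction ** 2) / ((n - 1) * (m - 1))
    ms_e = np.sum((y - cell[:, :, None]) ** 2) / (n * m * (k - 1))

    # Method-of-moments estimates; negative estimates are set to zero.
    var_e = ms_e
    var_po = max((ms_po - ms_e) / k, 0.0)
    var_o = max((ms_o - ms_po) / (n * k), 0.0)
    var_p = max((ms_p - ms_po) / (m * k), 0.0)
    repro_sd = float(np.sqrt(var_o + var_po + var_e))
    return {
        "between_participant_sd": float(np.sqrt(var_p)),
        "between_operator_sd": float(np.sqrt(var_o)),
        "between_image_sd": float(np.sqrt(var_e)),
        "reproducibility_sd": repro_sd,
        "rdc": 1.96 * np.sqrt(2.0) * repro_sd,
    }


# Hypothetical data: two participants, two operators with no systematic
# difference between them, three repeated images per cell.
data = [[[1.0, 1.2, 0.8], [1.0, 1.2, 0.8]],
        [[5.0, 5.5, 4.5], [5.0, 5.5, 4.5]]]
repro = two_way_reproducibility(data)
```

The same function applies to the between-machine analysis by placing machines, rather than operators, on the second axis.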
Results
In this study, we evaluated three sources of variability (inter-image, inter-operator, and inter-machine) for three outcome parameters (iBSC, B-mode grayscale intensity, and GLCM contrast) under the QIBA guidelines for technical performance assessments.
B-Mode, BSC, and GLCM Textural Measurement Results
Representative B-mode images from the participants are shown in Figure 2 with orange dashed lines highlighting the ROIs. The left column of images shows the supraspinatus and infraspinatus muscles acquired from the healthy volunteer (healthy participant). The right column of images was acquired from the patient with chronic, massive rotator cuff tearing (injured participant). B-mode images showed that the targeted muscles were deeper in the injured participant than in the healthy participant. Both radiologists independently observed that the rotator cuff muscles in the healthy participant were generally more hypoechoic with higher contrast compared to surrounding fat, whereas the internal architecture of the degenerated rotator cuff muscles in the injured participant was generally effaced.
Twelve single-image BSCs (three each for two sides by two operators) were computed for each participant. Figure 3(A) and (B) display the BSC(f) curves from the 14L5 probe on healthy and injured participants. The injured participant (red) demonstrated increased BSC values compared to the healthy participant (blue) for both supraspinatus and infraspinatus muscle groups. Figure 3(C) and (D) show BSC(f) curves from the UHF22 probe for both participants with a statistically significant difference between healthy and injured participants. Figure 3(E) and (F) summarize the iBSC outcomes across the two operators, left and right sides of the rotator cuff, and three repeated measurements. The injured participant demonstrated increased iBSC values compared with the healthy participant for both the infraspinatus and supraspinatus muscles with both probes.

Quantitative ultrasound outcomes from rotator cuff muscles. (A) Raw BSC data from two operators on the supraspinatus muscles from two participants using the 14L5 probe (healthy participant in blue, injured participant in red). (B) Raw BSC data from two operators on the infraspinatus muscles from two participants using the 14L5 probe (healthy participant in blue, injured participant in red). (C and D) Raw BSC data for supraspinatus and infraspinatus muscles using the UHF22 probe. (E and F) Bar plots of iBSC data for supraspinatus and infraspinatus muscles using the 14L5 and UHF22 probes.
Repeatability of Single-Image Measures, Between-Operator and Between-Machine Reproducibility of iBSC Parameter
The between-image repeatability was evaluated under specific conditions (two operators using two machines on each side of the two participants for two muscle groups), which resulted in six measurements in each individual group (three measurements each on the healthy participant and the injured participant). The descriptive statistics, between-participant standard deviation (SD), between-image SD, repeatability coefficient (RC), ICC(1,1), and ICC(1,3) for iBSC are summarized in Table 2. The descriptive statistics show that the data acquired using the 14L5 (5.21 ± 18.73) and UHF22 (2.95 ± 7.93) transducers have a similar order of magnitude of iBSC within the central frequency bandwidths (7–10 MHz for 14L5, 8–10 MHz for UHF22).
Between-Image Repeatability Estimate for the iBSC Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus).
In this scenario, each group contains six repeated measurements (from healthy and injured participants). iBSC = integrated backscatter coefficient; RC = repeatability coefficient; ICC = intraclass correlation coefficient; SD = standard deviation.
As shown in Table 2, the between-participant SD values are larger than the between-image SD values, indicating repeatability of the iBSC measurement. Our results also demonstrate generally high reliability (ICC values in six cases were classified excellent, six cases were good, and four cases were moderate). No iBSC measurements were classified as poor reliability. Moreover, ICC(1,3) for the three-image measures had 12 cases classified as excellent and four cases as good reliability. Overall excellent to good reliability was demonstrated for the BSC-based measurements for both US systems.
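The reliability categories used above can be expressed as a simple threshold rule. The cut-offs in this sketch (below 0.5 poor, 0.5 to 0.75 moderate, 0.75 to 0.9 good, above 0.9 excellent) are a widely used convention and are an assumption here, since the thresholds applied in this study are not restated in the text; the ICC values in the example are hypothetical.

```python
from collections import Counter


def classify_icc(icc):
    """Map an ICC value to a reliability category (assumed cut-offs)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.9:
        return "good"
    return "excellent"


# Hypothetical ICC values for four measurement groups.
example_iccs = [0.95, 0.82, 0.61, 0.30]
labels = [classify_icc(v) for v in example_iccs]
summary = Counter(labels)
```

Tallying the labels over all 16 groups gives counts of the kind reported for each parameter in the Results.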
The between-operator reproducibility was assessed using two-way random effect models with operators and participants as the main random effects. The iBSC statistical metrics, between-participant SD, between-operator SD, QIBA reproducibility SD, and between-image SD values without dividing by different operators, are summarized in Table 3. For each muscle, each side, and each machine (eight different groups), the between-operator and QIBA reproducibility SD values are smaller than between-participant SD values, indicating that less variability was introduced by the operators, interaction, and error terms. As a result, there is excellent reproducibility by the two operators for iBSC measurements.
Operator Reproducibility Estimate for Single-iBSC Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
iBSC = integrated backscatter coefficient; RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
The S2000 exhibited a higher overall mean iBSC (5.21 ± 18.73) than the Vevo MD (2.95 ± 7.93), potentially due to the different central frequency ranges of the two transducers. Table 4 demonstrates that iBSC values showed excellent reproducibility between machines since the between-machine SD values are lower than the between-participant SD values for every condition. QIBA-reproducibility SD values are also lower than the between-participant SD values for four cases. In comparison, the mean RDC for the between-operator test was 17.69 ± 12.25, which was lower than the mean RDC for the between-machine test at 29.72 ± 18.32. Overall, the iBSC parameter showed good repeatability and reproducibility in distinguishing healthy from injured rotator cuff muscles.
Machine Reproducibility Estimate for Single-iBSC Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
Repeatability of Single-Image Measures, Between-Operator and Between-Machine Reproducibility of Grayscale Mean Intensity Parameter
The repeatability metrics for the grayscale mean are summarized in Table 5. The 14L5 transducer acquired higher grayscale mean values than the UHF22 transducer, but the standard deviation values were lower than those from the UHF22. Poor repeatability was observed since the between-image SD values were equal to or larger than the between-participant SD. As a result, ICC(1,1) for grayscale mean had one case as excellent, six cases as moderate, and nine cases as poor reliability. Even though ICC(1,3) was for an average of three repeated measurements, only one case was considered excellent, six cases as good, two cases as moderate, and seven cases as poor.
Repeatability of Grayscale Mean Measurements on B-mode Images of Rotator Cuff Muscles.
RC = repeatability coefficient; ICC = intraclass correlation coefficient; SD = standard deviation.
A flagged entry means that the one-way random-effects method produced a negative or zero variance component estimate; the variance component is then taken to be zero.
Table 6 demonstrates the QIBA between-operator reproducibility metrics of the grayscale mean parameter. Between-operator SD values are larger than between-participant SD values for two cases, and QIBA-reproducibility SD values are larger than between-participant SD values for six cases. Consequently, the operator, interaction, and error terms introduced more variability into the grayscale mean parameter than into the iBSC parameter.
Operator Reproducibility Estimate of Single Grayscale Mean Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
Table 7 shows the poor reproducibility of grayscale mean measurements, with between-machine SD values larger than the between-participant SD values in every case. Moreover, between-operator variability is much smaller than between-machine variability: the mean RDC for the between-operator test is 29.65 ± 20.44, whereas the mean RDC for the between-machine test is 349.02 ± 12.66. The poor reproducibility between machines can be explained by differences in grayscale mean values between the S2000 (236.87 ± 5.86) and Vevo MD (59.88 ± 21.88) systems.
Machine Reproducibility Estimate of Single Grayscale Mean Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
Repeatability of Single-Image Measures, Between-Operator and Between-Machine Reproducibility of GLCM Contrast Parameter
The statistical metrics for GLCM contrast are summarized in Table 8. The descriptive statistics indicate that the supraspinatus muscle had a slightly lower mean GLCM contrast value than the infraspinatus muscle. Supraspinatus and infraspinatus muscles have similar standard deviation values across all individual groups. Poor repeatability was observed with GLCM contrast since the between-image SD values were generally equal to or larger than the between-participant SD. ICC(1,1) demonstrated one case as excellent, one case as good, one case as moderate, and thirteen cases as poor reliability. ICC(1,3) demonstrated two cases as excellent, one case as good, three cases as moderate, and ten cases as poor reliability.
GLCM Contrast Repeatability Measurements on Shoulder Muscle.
RC = repeatability coefficient; ICC = intraclass correlation coefficient; SD = standard deviation.
A flagged entry means that the one-way random-effects method produced a negative or zero variance component estimate.
Table 9 demonstrates that between-operator SD values are larger than between-participant SD values for four cases, and the QIBA reproducibility SD values are larger than between-participant SD values for seven cases, indicating that most of the variability was introduced by the operators, interactions, and error terms. Operators demonstrated poorer reproducibility for both grayscale means and GLCM contrast measurements compared with iBSC values.
Operator Reproducibility Estimate of Single-GLCM Contrast Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
Table 10 demonstrates that the mean GLCM contrast value for the S2000 (0.21 ± 0.03) is higher than that for the Vevo MD (0.18 ± 0.06). GLCM contrast also demonstrated poor between-machine reproducibility since the between-machine and QIBA reproducibility SD values are larger than the between-participant SD values for nearly every case, indicating that the variability introduced by the different machines, the interaction between machines and participants, and measurement errors is greater than the between-participant variability. Overall, iBSC measurements exhibited more repeatable and reproducible results than both the grayscale mean and GLCM contrast parameters.
Machine Reproducibility Estimate of Single GLCM Contrast Measures Under Various Conditions on Human Shoulder Muscles (Supraspinatus and Infraspinatus), Calculated Using the Two-Way Random-Effects Model.
RDC = reproducibility coefficient; SD = standard deviation.
A flagged entry means that the two-way random-effects method produced a negative or zero variance component estimate.
Discussion
This study investigated the repeatability and reproducibility of BSC, grayscale, and GLCM texture-based analyses in the healthy and degenerated rotator cuff muscles of two representative participants, where operators were free to adjust system settings for optimal image quality. Published data suggest that BSC-based measurements obtained using raw RF data and calibrated phantoms are system-independent, 34 and repeatability and reproducibility investigations have been performed in soft tissues such as the liver 27 and median nerve 28 with promising results. This is an early, if not the first, study to evaluate BSC using raw RF data in normal and pathologic rotator cuff musculature.
Quantification of skeletal muscle has previously been performed through grayscale14–17 and GLCM textural 18 analyses on uncalibrated B-mode images, though repeatability and reproducibility testing of these measurements is lacking, particularly under varying conditions in accordance with guidelines endorsed by QIBA. It has been suggested that grayscale and GLCM textural analyses are much more dependent on the settings adjusted by the operators19,35 than QUS features. However, image optimization is not only common practice, but important in musculoskeletal ultrasound, where higher frequency transducers are routinely utilized and structures can vary greatly in depth and attenuation. 36 Without image optimization on a per-patient basis, the visualization of tissue characteristics, boundaries, and ultimately diagnoses can be severely impaired. 20
General Overview of BSC-Based Outcomes
The participants were chosen to represent extreme ends of the range of rotator cuff pathology typically encountered in clinical practice. MRI, which is recognized as the gold standard,1,37 was used to determine muscle status, but it is much slower, less convenient, and less cost-effective than US. On the routine B-mode images, both radiologists in this study identified the fatty degeneration of the rotator cuff muscles in the injured participant. This is consistent with prior research showing moderate accuracy (72%–85%) of US for substantial fatty atrophy of rotator cuff muscles using qualitative B-mode image evaluation. 7 In the future, clinical evaluation of QUS compared to MRI in a sufficiently large sample would be an important next step.
In our study, we found that the magnitude of iBSC of the injured participant was higher than that of the control, which might be explained by more densely packed scatterers in the muscles of the injured participant.38,39 The increased scatterers in the rotator cuff muscle may be due to fat infiltration or fibrosis after the primary injury to the tendon, 40 somewhat analogous to elevated BSC values in the fatty liver.11,41
Repeatability and Reproducibility Metrics
Our results indicate that iBSC measurements in rotator cuff muscles are repeatable. Overall, the repeatability and reproducibility metrics (RC, RDC) are comparable to previous literature. 28 The good repeatability metrics demonstrate the capability of iBSC to distinguish between the healthy and injured rotator cuff muscles (particularly Goutallier grade 3 degenerated rotator cuff muscle) with varied US settings. The better repeatability performance of ICC(1,3) suggests that future clinical application of BSC-based measurements should obtain multiple repeated measurements. Moreover, the between-operator SD is low compared with the between-participant SD. Therefore, operators did not contribute significantly to the overall variability within this study. Table 4 also demonstrates excellent between-machine reproducibility with low variability across different US platforms.
In contrast, poor repeatability for grayscale mean and GLCM contrast parameters was shown since the between-participant SD values are comparable to or smaller than the between-image SD values for most groups. Most of the ICC results were classified as poor, consistent with poor repeatability of grayscale and GLCM analyses for the detection of rotator cuff muscle injury with varying operator presets.19,42 Similarly, the QIBA reproducibility SD values are greater than the between-participant SD values, consistent with poor reproducibility between operators. These results confirm the importance of maintaining US settings if grayscale mean and GLCM contrast are to be utilized. The between-machine variability is greater than the between-participant variability, so a clear distinction between healthy and injured rotator cuff muscles could not be made. The signal treatment methods, hardware and transducer (e.g., beam shape, element size, element spacing, and lens properties), and frequency bandwidths vary between the two US platforms. Among these sources of variability, the differences in signal treatment methods are especially important for altering the grayscale and textural outcomes. 42
In our study, iBSC reliability measurements of the rotator cuff muscles achieved an average ICC > 0.8, which compares favorably with other modalities and parameters. Specifically, an average ICC of 0.8 was reported for stiffness values of the supraspinatus muscle using magnetic resonance elastography, 43 and ICC values >0.75 were reported for shear modulus obtained with shear wave elastography. 44 However, the grayscale mean measurements (mean ICC = 0.35) and GLCM contrast measurements (mean ICC = 0.2) are not robust with varying US settings.
Limitations
First, our study only included one healthy volunteer and one patient with chronic rotator cuff disease. However, the intent of our study was to establish the repeatability and reproducibility of quantitative imaging across different operators and platforms in normal and pathological muscle; studies of this type often contain 1 to 2 volunteers.45,46 Still, future clinical studies should be conducted on more participants to formally compare control and disease groups and determine how patient-to-patient variability (e.g., gender, age, BMI, and various disease states) affects BSC-based outcomes in various muscles. Second, our study was not exhaustive with regard to the QUS imaging and analytical methods that could have been used, and only two US scanners were employed. For instance, others have previously modeled envelope statistics using uncalibrated RF data 47 or analyzed image intensity of B-mode images normalized with a reference phantom. 48 Therefore, future studies should be performed to determine the repeatability and reproducibility of these additional methods. Third, the accuracy of the QUS metrics in this study was not determined since it requires a reference confirmation (e.g., ex vivo histological samples). However, a separate modality, MRI, was used in this study to confirm the state of the participant muscles, and the MRI exams demonstrated that both subjects were symmetric from side to side. Fourth, we fixed the GLCM offsets (
Conclusion
QUS BSC-based measurements using the reference phantom method demonstrate good repeatability and reproducibility for rotator cuff muscle evaluation across two different operators and two US platforms. Furthermore, because of apparent system and settings independence, BSC-based measurements demonstrate greater repeatability and reproducibility compared with grayscale and GLCM textural-based analysis. Therefore, BSC-based measurements may be preferable to grayscale and textural analyses for the evaluation of rotator cuff muscle degeneration in real clinical settings. At present, however, BSC-based measurements require a US scanner research mode to acquire RF data, as well as additional offline processing.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: We gratefully acknowledge funding from the VA R&D service (I01CX001388, I01BX005952, and I01CX002118), NIH (R01AR075825 and K01AR080257), and the Department of Defense (W81XWH-20-1-0927).
