Abstract
Background:
Ultrasound imaging has become popular among physiotherapists for monitoring diastasis rectus abdominis postpartum, but its reliability requires further exploration.
Objectives:
To investigate the intra-tester, inter-tester, and test-retest reliability of physiotherapists’ inter-recti distance measurements using real-time ultrasound in a mixed sample of women.
Design:
Reliability study.
Methods:
Volunteers comprising nulliparous and parous women of different ages and body mass index participated. Five physiotherapists performed ultrasound measurements, following sonographic training. Four conditions were tested in supine: rest, curl-up, transversus abdominus activation, and transversus abdominus activation with curl-up. Three locations were measured in random order: umbilicus, 3 cm above the umbilicus, and halfway between the umbilicus and xiphoid process. For intra-tester reliability, each therapist undertook three repeated measurements. For inter-tester reliability, mean inter-recti distance measurements were compared across any two tester combinations within and across sessions. Test-retest reliability was explored by repeating measurements 5–8 days later. Data were analyzed with intraclass correlation coefficients (ICC2,1).
Results:
Fifty-four women (33.2 ± 15.2 years old; body mass index: 24.2 ± 3.7) participated, of whom 19 (35.2%) were parous. Intra-tester reliability for each physiotherapist was very good (intraclass correlation coefficients = 0.677–0.989). Within-session inter-tester reliability across any two testers yielded very good results (intraclass correlation coefficients = 0.76–0.92), whereas across-session inter-tester reliability was good (intraclass correlation coefficients >0.76) except for one condition (3 cm above the umbilicus in combined transversus abdominus and curl-up). Test-retest reliability was also very good (intraclass correlation coefficients = 0.78–0.96). Significant differences in inter-recti distance were found between nulliparous and parous women, with parous women showing consistently larger values (p < 0.05).
Conclusion:
Physiotherapists, following sonographic training, can reliably measure inter-recti distances in both nulliparous and parous women across active and resting tasks. Thus, ultrasound measurement of inter-recti distance is recommended in physiotherapy practice for monitoring diastasis rectus abdominis and assessing rehabilitation progress. However, sub-umbilical inter-recti distance measurements and the impact of co-contraction on reliability require further research.
Introduction
Diastasis rectus abdominis (DRA) is a connective tissue impairment resulting in the separation of the two rectus abdominis (RA) muscles. It is commonly seen in pregnant and postpartum women and causes physical, functional, and esthetic concerns in some of them, depending on its individual presentation and severity.1–3 Although natural resolution seems to occur for up to 6 months postpartum in some women, DRA is still evident after a year in approximately a third of them.4–7 Conservative management, and physiotherapy in particular, appears to be the recommended approach in such cases according to the latest literature and guidelines, focusing predominantly on a carefully prescribed progressive therapeutic exercise program for the abdominal and trunk musculature.8–12
DRA assessment and diagnosis are performed by measurement of inter-recti distance (IRD), preferably via ultrasound (US) or calipers.13–15 US imaging has recently become popular among physiotherapists (PTs) for the initial clinical assessment and quantification of IRD and/or for monitoring IRD changes as an outcome measure during rehabilitation of DRA in women.16–18 It offers numerous advantages over other diagnostic modalities, as it is radiation-free, cost-effective, portable, and non-invasive. 14 Its superior measurement consistency compared to other clinical measurement tools13–15 allows PTs to measure IRD in a standardized way, distinguishing between patients with the condition and those without. Repeatable IRD measurements are essential for reliable evaluation of DRA severity and for IRD tracking over time. Additionally, since reductions in IRD can be gradual, subtle, and often imperceptible through caliper, tape, or manual evaluation, US assessment provides a reliable method that can detect even subtle changes in IRD, enabling precise monitoring of rehabilitation interventions’ efficacy. 15 This quantitative approach strengthens evidence for physiotherapy practices and enables clinicians to monitor progress in their rehabilitation plans. It is thus important for PTs to be able to measure IRD reliably by US.
Some previous reports have investigated IRD reliability via US across PTs and other healthcare specialists, achieving good results in most cases.19–23 However, certain issues require further exploration. For example, in most cases, one16,19,21,22 or two PT testers (at most) have been used for IRD measurements,20,23 whereas novice examiners are scarcely used 23 across reliability studies. It would therefore be useful to explore whether more than one or two newly trained (in US imaging) PTs would measure reliably. This is particularly important, as it reflects real-world conditions, where more and more PTs with varying levels of expertise are gradually adopting US in clinical practice. In terms of sample, previous reliability studies were mostly limited to nulliparous women with small IRDs19,23 or early postpartum samples,21,22 with only two studies including more mixed population samples (young and older nulliparous and parous women).16,20 It is, however, important to explore reliability across a more diverse sample in terms of age, body mass index (BMI), and parity, as they can influence the variability of IRD due to differences in abdominal tissue and muscle characteristics. Including a more diverse participant sample ensures the findings are clinically applicable to a broader population. Additionally, the linea alba appears to be distorted in parous compared to nulliparous women, in older women, and in women with high BMI, increased abdominal skinfold thickness, and reduced passive tension or firmness of the abdominal wall and linea alba,24–26 and it would thus be crucial to determine the stability in IRD measurements across PT testers. Furthermore, most reliability studies have measured IRD either during rest,16,23 or during a single contraction, such as head lift, 27 abdominal crunch,19,20 abdominal drawing in, 19 or across different static postures. 22 However, as PTs during rehabilitation often use inner and outer core co-contractions, such as transversus abdominus (TrA) submaximal contractions together with RA contractions,25,28,29 it would be valuable to explore IRD behavior during co-contractions; and, in such cases, reliability across US measurements in co-contracting dynamic conditions is important.
Given the above, the aim of this study was to investigate the intra-tester, inter-tester, and test-retest reliability of IRD measurements performed by PTs utilizing real-time US under passive (at rest) and different dynamic (contracting and co-contracting) conditions, across a mixed women population sample. This study builds upon previous research by including (i) a diverse sample with a wider range of ages, BMI, and parity, (ii) novice (in US) PTs, and (iii) dynamic co-contraction tasks, which are commonly applied in rehabilitation settings. A secondary aim was to compare IRD measurements between nulliparous and parous women across different testing conditions.
Methods
Design and ethics
This was a reliability study. Ethical approval was granted by the Ethical Committee of the University of Patras, and the entire work was carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans. The study conforms to the Guidelines for Reporting Reliability and Agreement Studies. 30
Sample size calculation
To ensure adequate statistical power for assessing reliability, sample size calculations were conducted for the primary analyses of interest. For intra-tester reliability, a priori power analysis was performed using G*Power Software 3.1.9.7 (Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany) with a significance level of 0.05, a desired power of 80%, and an effect size of 0.25 based on previously reported values. 3 The analysis indicated that a minimum of 28 subjects were required to achieve sufficient power. To account for potential dropouts, a 20% increase was applied, resulting in an adjusted sample size of 34 subjects. Similarly, for inter-tester reliability, using the same parameters, the power analysis determined that at least 40 subjects were necessary. An additional 20% was added to account for potential dropouts, leading to a final sample size of 51 subjects.
Sample
A convenience sample of adult women from the broader university campus area was invited to participate in the study. Recruitment was achieved via advertisements and university-mediated means (email/e-platform university announcements, etc.). The inclusion criterion was being a woman aged 18 years or older. Exclusion criteria included severe low back pain at the time of the study, previous major abdominal surgery, neurological problems, and connective tissue diseases, which could affect muscle tone. All participants signed an informed consent form prior to participation.
Examiners and training procedure
Five PTs, with varying levels of clinical experience, ranging from 2 to 30 years in musculoskeletal physiotherapy, performed the US measurements. Two out of the five PTs had prior knowledge of rehabilitative US imaging (RUSI), whereas the remaining three were novices in RUSI. To ensure consistency in US measurements, a full-day training procedure was delivered to them by a consultant radiologist, who specialized in musculoskeletal sonographic imaging, followed by weekly independent and joint practice by the PTs. The training session combined theoretical components (US fundamentals, transducer handling, image optimization, key anatomical landmarks, etc.) and practical hands-on practice in measuring IRD. The latter included measurements under the radiologist’s supervision, with real-time feedback. Novice testers engaged in additional joint practice with experienced testers for several weeks to strengthen their skills. All testers were competent in capturing quality US images and in performing consistent measurements before participating in the study. As minor concerns can arise during routine practice, the radiologist’s supervision and feedback throughout the study helped resolve uncertainties and minimize inconsistencies.
US measurement of IRD and measurement locations
A General Electric US unit, Versana Active 2-D real-time US model was utilized with a 10-MHz linear transducer. Images were all taken in brightness mode (B-mode). Transducer placement was standardized by marking anatomical landmarks with a measuring tape (upper border of the umbilicus, 3 cm above umbilicus, and halfway between umbilicus and xiphoid process), and ensuring perpendicular alignment to the abdominal wall. Specifically, the transducer was placed transversely on the measurement point and perpendicular to the RA muscle fiber direction, until the two RA muscles were clearly visualized in the US. US measurements were taken at the end of expiration to reduce variability. Focus, contrast, depth, and convex view settings were independently manipulated by each PT to increase the clarity of the image, as previously described.18,19 Once each of the testing conditions was satisfactorily achieved, the image was frozen and IRD was measured using the system’s built-in digital measurement caliper. IRD was measured as the distance between the medial borders of the RA muscles by placing the two calipers at the end of the hyperechoic fascia and the beginning of the hypoechoic RA (Figure 1).

Figure 1. Ultrasound images of IRD measured in a nulliparous woman at rest (a) and during a curl-up (b), where the measurement point was halfway between the upper border of the umbilicus and the xiphoid process.
Three measurement locations were tested in random order as previously recommended,18,26,27,31 (i) upper border of the umbilicus, (ii) 3 cm above the umbilicus (upper border), and (iii) halfway between the umbilicus (upper border) and the xiphoid process (measured with a tape between the two marked anatomical points). IRD measurements below the umbilicus were excluded due to known challenges in achieving reliable results in this region,19,27,32,33 often attributed to subcutaneous fat and anatomical differences of the rectus sheath. This was also confirmed during pilot testing, where imaging was less consistent.
Each measurement was conducted independently, with both the examiners and subjects blinded to all prior measurements. An independent researcher, blinded to both subjects and examination procedures, extracted and documented all variables for each subject using spreadsheet software.
To minimize variability, efforts were made to measure all participants at approximately the same time of day. Participants were also instructed to avoid food or drink for at least 1 h before examination and to void their bowel/bladder prior to the assessment if necessary.
Testing conditions
Four conditions, simulating common resting and exercise tasks, were standardized and tested in a crook lying position, lying supine on an examination bed with hips and knees flexed to about 90° and hands resting by the side. The conditions (all measured at the end of expiration) were as follows: (i) relaxed position (REST), (ii) curl-up (CU), where a head lift until shoulder blades were off the bed was achieved, (iii) TrA activation, where IRD measurement was taken during a TrA contraction, and (iv) combined TrA and CU (TrA + CU), where IRD measurements were taken following a joint contraction, where TrA would first be activated and CU would follow. For the TrA contraction, prior to the initiation of the measurements, each subject was taught how to perform such a contraction according to specific verbal instructions, tactile feedback as well as US visualization according to previous studies.34–36
Reliability testing
For intra-tester reliability, each therapist (tester) undertook three repeated US measurements across all locations and conditions.
For inter-tester reliability, both within-session and across-session agreements were explored by the PTs. For within-session inter-tester reliability, IRD mean measurements (out of three) for each location and task were compared across any two testers within the same session. For this procedure, four pairs of testers were formed, where one of the testers (Tester 1) was always the same PT (A.S., acting as the “reference”) and the second tester was one of the other four PTs. The pairs alternated, so that each subsequent subject was measured by the next pair. For across-session reliability, IRD mean measurements (out of three) for each location and task were compared across two different sessions (conducted on different days) between any two testers. Based on this, 10 tester pairs were formed, ensuring that all possible paired combinations (out of the five PTs) contributed equally to the measurements.
Test-retest reliability was explored by repeating all US measurements 5–8 days later. Two PTs (one experienced and one novice) were involved in this procedure.
The rationale for using pairs of raters was to ensure consistency and control over the sample, thereby minimizing variability in data collection methods and conditions. By consistently assigning the same experienced tester to all paired combinations, we aimed to ensure uniformity and stability in measurement technique, enhance reliability, and detect any potential issues with reliability in advance. This approach provided a more thorough examination of a challenging and novel technique and strengthened the robustness of our findings.
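The two pairing schemes described above can be enumerated with a minimal sketch. The tester labels below are hypothetical placeholders (the study identifies testers only by initials), and treating the first label as the fixed "reference" tester is an assumption made for illustration.

```python
from itertools import combinations
from collections import Counter

# Hypothetical tester labels; not the identifiers used in the study.
testers = ["PT1", "PT2", "PT3", "PT4", "PT5"]
reference = testers[0]  # assumed stand-in for the fixed "reference" tester

# Within-session scheme: the reference tester is paired with each other PT.
within_session_pairs = [(reference, other) for other in testers[1:]]

# Across-session scheme: every unordered pair of the five PTs.
across_session_pairs = list(combinations(testers, 2))

# Number of across-session pairs in which each tester takes part.
appearances = Counter(t for pair in across_session_pairs for t in pair)

print(len(within_session_pairs))  # 4
print(len(across_session_pairs))  # 10
print(dict(appearances))
```

With five testers, the within-session scheme gives four pairs and the across-session scheme gives ten, with each tester taking part in four of the ten across-session pairs.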
Statistical analyses
Descriptive and inferential statistics were used for the analysis. Reliability was estimated utilizing an intraclass correlation coefficient (ICC) assessed with a two-way random effects model (ICC2,1) and 95% confidence intervals (CIs). ICCs were chosen as the standard measure for assessing both absolute agreement and consistency across repeated measurements. Mean measurements out of three repeated trials performed by each PT were used for the analysis across each location and condition. ICC values were interpreted based on established guidelines 37 ; values <0.40 indicate poor reliability, between 0.40 and 0.59 indicate fair reliability, between 0.60 and 0.74 indicate good reliability, and between 0.75 and 1.00 indicate excellent reliability. Standard error of measurement was calculated as an estimate of variability in data, with a lower value indicating higher precision and reliability, and a higher value indicating greater variability of measurement due to random error. Differences between nulliparous and parous women were also reported utilizing independent-samples t tests. Data were analyzed in SPSS (version 27 IBM Corp., Armonk, NY, USA).
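As an illustration of the statistics named above, the following sketch computes ICC(2,1) from the two-way ANOVA mean squares (the Shrout and Fleiss form for a two-way random effects, absolute agreement, single-measure coefficient), maps an ICC onto the interpretation bands quoted above, and derives a standard error of measurement. The study itself used SPSS; the SEM formula SD × √(1 − ICC) is a commonly used convention assumed here, since the text does not state which formula was applied, and the IRD values are invented for demonstration only.

```python
from math import sqrt
from statistics import mean, stdev


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(icc):
    """Bands quoted in the text: poor, fair, good, excellent."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


def sem(sd, icc):
    """Assumed convention: SEM = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1 - icc)


# Invented IRD values in cm (rows: subjects, columns: two testers).
ird_cm = [[2.1, 2.3], [1.4, 1.5], [3.0, 2.8], [1.9, 2.0], [2.6, 2.7]]
icc = icc_2_1(ird_cm)
pooled_sd = stdev(x for row in ird_cm for x in row)
print(round(icc, 3), interpret_icc(icc), round(sem(pooled_sd, icc), 3))
```

Applied to the two low across-session values reported in the Results (0.28 and 0.50), `interpret_icc` returns "poor" and "fair" under these bands.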
Results
Fifty-four women aged 33.2 ± 15.2 years old and having a BMI ranging from 17.6 to 35.4 participated in the study. Of these, 35 (64.8%) were nulliparous and 19 (35.2%) were parous (9 primiparous and 10 multiparous). All of them were Caucasian in race, of Greek ethnicity, and lived in the broader Achaia region (including Patras and Aegio). A total of 9264 images were captured and measured by the testers. The sample’s demographic profile is available in Table 4 in Supplementary Material.
Intra-tester reliability
For the intra-tester reliability, two of the testers measured the whole sample (n = 54) and the three other testers measured a smaller sample group (n = 15). Intra-tester reliability for each PT’s repeated measurements was very good, with ICCs ranging from 0.68 to 0.99 and small CIs across locations and tasks (Table 1). This indicates that novice RUSI users can reliably perform repeated IRD measurements within the same session.
Table 1. Intra-tester and within-session inter-tester reliability results.
CIs: confidence intervals; ICC: intraclass correlation coefficient; SEM: standard error of measurement; TrA: transversus abdominus; U: umbilicus (upper border); ½ XU: halfway between xiphoid process and umbilicus (upper border).
Inter-tester reliability
For within-session reliability agreements, four pairs performed the measurements, where Tester 1 was always the same across all paired combinations. Two pairs measured 14 subjects each, and the other two measured 13. Within-session reliability across any two testers yielded very good results across all combination pairs and across all locations and tasks with ICCs between 0.76 and 0.92 (Figure 2). This indicates that different novice RUSI users can achieve consistent IRD measurements during the same session.

Figure 2. Cumulative reliability graph employing minimum values of ICCs across locations and tasks.
Across-session inter-tester reliability was explored across 20 women from the sample (10 nulliparous and 10 parous). Overall, 10 pairs of testers were formed, accounting for all possible paired combinations and, to explore all of them evenly, each paired combination measured two women (a nulliparous and a parous one). This procedure overall yielded good reliability across positions and tasks, suggesting that novice RUSI users can achieve consistent IRD measurements during different testing sessions. However, one location and condition (3 cm above umbilicus in TrA + CU task) showed low and moderate reliability (ICCs of 0.28 and 0.50 between two pairs of testers; Table 2), indicating variability in the measurements at that site (Figure 2).
Table 2. Across-session inter-tester reliability and test-retest reliability results.
CIs: confidence intervals; ICC: intraclass correlation coefficient; SEM: standard error of measurement; TrA: transversus abdominus; U: umbilicus (upper border); ½ XU: halfway between xiphoid process and umbilicus (upper border).
Test-retest reliability
Test-retest reliability was explored in two testers in 18 women of the sample (nine nulliparous and nine parous), and results from both PTs were very good across all positions and tasks, with ICCs ranging from 0.84 to 0.96 (Table 2), indicating consistent IRD measurements over time across both testers (Figure 2).
As a secondary aim of the study, mean IRD measurements conducted by two testers (A.S. and T.-E.P.) across the entire sample of parous and nulliparous women were calculated and are presented in Table 3. Statistically significant differences in mean IRD between parous and nulliparous women were observed across most conditions and measurement locations (p < 0.05) for both testers.
Table 3. Mean IRDs across two testers in nulliparous (n = 35) and parous (n = 19) women.
p values were calculated via independent-samples t tests.
IRD: inter-recti distance; SD: standard deviation; TrA: transversus abdominus; U: umbilicus (upper border); ½ XU: halfway between the xiphoid process and umbilicus (upper border).
*p is significant at the 0.05 level. **p is highly significant (p < 0.001).
Discussion
Reliable measurement of IRD is essential for PTs to track changes during DRA rehabilitation to evaluate the efficacy of interventions, such as targeted abdominal exercises and to adjust treatment plans based on measurable outcomes. 12 This reliability study has shown that PTs, following sonographic training, can reliably measure IRD in a mixed women sample across a range of active and resting tasks. This study’s findings agree with most findings of previous reliability studies, yielding reliable intra-tester, inter-tester, and test-retest IRD measurements.16,19–23,26,27
Our reliability study’s strengths are the enrollment of a more diverse women sample, the utilization of a relatively bigger (compared to previous reliability studies) PT sample of novice and experienced in US PTs, as well as the measurement of IRD not only at rest but also across a range of dynamic contractions, which is more applicable within PT rehabilitation. In addition, a variety of reliability measurements (including intra-tester, test-retest, and within- and across-session inter-tester reliability) were explored in the study.
Our sample of women comprised a wide age range (20–65 years old) of nulliparous (64.8%), primiparous (16.7%), and multiparous (18.5%) women, with variability (and statistically significant differences) in IRDs between nulliparous and parous women (Table 3), variable BMI, ranging from underweight women (BMI of 17.6) to overweight ones (BMI of 35.4), as well as great variability across years post delivery among them (ranging from 6 months postpartum to 35 years; Table 4 in the Supplementary Material). This variability is clinically valuable in exploring the reliability of IRD measurements across a more diverse sample of women, including both lower-BMI subjects with small and potentially “easily visible” IRDs and those with higher BMI, where increased abdominal fat and greater distortion within the linea alba may pose imaging challenges (Figure 3 in the Supplementary Material). These parameters play an important role in image quality and, thus, may potentially affect the reliability of US imaging.24,26,32 Previous reliability studies have concentrated either on healthy subjects20,23 and/or on early postpartum women.16,19,22,27 However, women of wider anthropometric variability (such as variable BMI and age) are often referred for physiotherapy rehabilitation; therefore, such diversity is representative of everyday clinical practice.
The enrollment of five PTs, with a mixture of novice and more experienced in US imaging, was also desirable. Previous reliability studies have used one or two PTs and in most studies these were all well experienced with RUSI. However, it is important to establish whether both experienced as well as novice in US imaging PTs are capable of making reliable measurements. This, again, reflects current PT practice, where more PTs are starting to use and measure via RUSI,15,18,24 especially since ultrasonography is not typically part of PTs’ undergraduate curricula, and many clinicians start using it only after additional postgraduate training.
Intra-tester results across all five PTs and inter-tester reliability measurements were satisfactory. Especially intra-tester and within-session inter-tester reliability was very good to excellent (ICCs >0.8) in most cases. Across-session inter-tester measurements also yielded reliable results between paired combinations. However, two US measurements, both of which were measured 3 cm above the umbilicus in TrA + CU condition yielded moderate and moderate to low reliability. Clinically, this suggests that there might be significant variability in IRD measurements during these tasks, and it is important to consider this when interpreting patient measurements. While novice RUSI users might obtain reliable IRD measurements, certain conditions or locations may require more attention in measurement. Interestingly, PTs did not report difficulty in US measurements during this testing position. However, it could be that this high variability (accompanied by also large CIs) was attributed to the inherent variability in muscle activation patterns across different individuals during abdominal co-contraction, and also it is not unlikely that this TrA + CU contraction reduces stability across transducer positioning, due to increased abdominal wall movement or anatomical tension in this area during the contraction, thus limiting measurement reliability. Advanced assessment tools or methods could address these issues, such as surface electromyography to ensure consistent contractions, physical boundaries to restrict excessive subject movement, and specialized garments that stabilize the probe during complex movements. To our knowledge, reliability studies have not looked at IRD measurements in co-contractions; thus, further work is needed in this direction. Nevertheless, no other measurements reported low agreements.
Furthermore, test-retest reliability across two testers was also very good, although a smaller sample of women was involved. However, apart from the US training, which was provided by a highly specialized musculoskeletal medical sonographer, sufficient “hands-on” practice was important for establishing measurement stability, as commented by all PTs in our study, and as recommended in the literature.3,20,23,38
Apart from the IRD measurements during rest, abdominal contraction and co-contraction US measurements were explored. Linea alba behavior during tasks such as CUs, TrA activation, and combined maneuvers has been previously described to slightly alter IRD, 25 as also observed in our study. These maneuvers are routinely used in clinical practice to assess linea alba behavior and to train generation of linea alba tension, as well as to track DRA rehabilitation progress. 12 For this reason, we included them in our reliability testing to propose additional assessment conditions and provide measurable outcomes for clinicians. Although some reliability studies have explored IRD measurements during abdominal contraction tasks5,14,19,21,27 and none during co-contractions, such measurements are certainly more useful for PTs, as they can be directly applied during rehabilitation or, as recently named, in RUSI. Thus, taking reliable IRD measurements across contracting/co-contracting exercises is of great importance for PTs during rehabilitation, for establishing patient progress, evaluating the quality of performance of the exercise, using the IRD as an outcome and follow-up measure.
Interestingly, a recent scoping review, evaluating IRD US measurements obtained by PTs, has proposed specific recommendations, which are enlightening and helpful for reducing between-study variability. 18 In particular, standardization of the subject’s position, breathing phase during measurement, and number of measurements per location, as well as recommended locations, have been mentioned, all of which have been addressed in detail in the present study. Moreover, regarding measuring location, apart from the superior umbilical border (which was recommended and has been used in all studies), they propose three other sites, which take into account individual linea alba length, such as halfway or a quarter length between the xiphoid process and superior umbilical border. Indeed, the present study has measured halfway between the xiphoid and upper border of the umbilicus, thus taking individual anthropometric differences into account.
According to our clinical observations, in parous women, IRD tended to decrease during a CU maneuver, slightly increase during a TrA contraction, and remain unchanged during a TrA + CU maneuver, which is consistent with previous findings. 25 Interestingly, the TrA + CU maneuver seems to counteract distortion of the linea alba by providing tension, which may improve abdominal support, optimize force transfer across the abdominal wall, and enhance lumbopelvic function, 25 potentially contributing to long-term IRD reduction. This finding suggests that rehabilitation strategies for postpartum women with DRA should focus on exercises promoting linea alba tension to support recovery and prevent further IRD widening. Although the mechanisms behind the long-term reduction in IRD remain unclear, these exercises are clinically valuable. However, for nulliparous women, IRD changes during CU maneuvers were more variable, with some experiencing an increase and others a decrease. The effects of TrA and TrA + CU maneuvers on IRD were similar between nulliparous and parous women. However, further assessment of these IRD changes was beyond this study’s aims and scope. Future research should investigate IRD changes in nulliparous versus parous women using larger samples and potentially more diverse IRD values. Additionally, increased IRD might be only one aspect of postpartum abdominal wall insufficiency, and DRA might present with multiple structural and functional deficits of the trunk. Future research should consider additional factors such as linea alba laxity or stiffness, distortion patterns, and linea semilunaris width, which may influence abdominal wall function. Although these aspects were beyond the scope of this study, they represent important directions for future investigation to fully capture abdominal wall integrity using US.
Limitations
One of the major limitations of this study is the fact that IRDs below the umbilicus were not measured. This was partly attributed to the low levels of reliability sub-umbilically,19,27,32,33 and partly due to the difficulties encountered with this measurement during the pilot study. However, as larger IRDs are found above the umbilicus, rather than below,6,16,22,31 it was considered acceptable to measure from the umbilicus and above. Technical challenges also posed limitations, particularly regarding probe stabilization during complex maneuvers, which could have influenced measurement consistency. Future research should explore optimized imaging and standardization procedures to improve measurements below the umbilicus or during co-contraction maneuvers, enabling a more comprehensive understanding of IRD across the entire abdominal region. Another limitation is that within- and across-session IRD measurements were compared across PT pairs. Although it would have been desirable to explore all PTs at the same time in all locations and conditions, practical constraints led to evaluating PT agreement within- and across-session across combination pairs. In addition, as we were unsure of this study’s reliability outcome (especially the inter-tester results), we decided for the within-session inter-tester reliability to utilize the same tester as a “reference” point (A.S. acted as Tester 1) within all paired combinations. We felt that this could assist in detecting in advance any discrepancies in measurements; however, as within-session inter-tester reliability yielded good results across all tester pairs, we concluded that using the same tester throughout the session did not significantly impact the reliability of the measurements. Additionally, test-retest reliability was assessed over a 5–8-day period. Although evaluating long-term reliability would provide deeper insights and better reflect clinical practice, the lengthy assessments and the need for participants to commute to the assessment site increased the risk of dropouts. Thus, a 5–8-day interval was selected to balance data integrity with participant retention.
A further limitation is that all measurements were captured in lying, which is a non-weight-bearing position. Previous studies22,39 have explored weight-bearing positions, such as sitting, standing, and squatting; however, these measurements often involve experienced sonographers and may require strict standardization protocols to ensure consistent muscle activation and posture. Thus, we did not incorporate them in our study. Moreover, weight-bearing positions may not be practical for reliable outcome tracking, as increased muscle contractions and intra-abdominal pressure could introduce variability in IRD measurements. Such positions may also be unsuitable for initial assessments, as many women have limited control over their pelvic floor and TrA muscles in these positions. We therefore prioritized non-weight-bearing positions, as we believe establishing reliability under stable conditions is a necessary first step with novice sonographers. Future research could explore these more complex functional movements while ensuring consistent muscle activation to provide insights into reliability and IRD behavior under dynamic conditions, which may better reflect real-life functional demands and later stages in rehabilitation.
To minimize measurement variability, several preventive measures were implemented, including conducting assessments at approximately the same time of day for each participant, ensuring they refrained from food or drink for at least 1 h prior to the examination, and having them void their bowel and bladder before measurements. However, factors such as recent physical activity could not be controlled, which may have introduced variability in IRD measurements.
Future studies should also consider including testers with varying experience levels in RUSI to further investigate the impact of expertise on IRD reliability. Moreover, given the relatively small sample size in this study, larger and even more diverse samples are recommended to better understand how participant variability influences the reliability and generalizability of IRD measurements. Additionally, validity testing was not addressed in the current reliability study, as conducting a validity analysis was beyond its scope. Future research should consider assessing IRD measurement validity across novice ultrasonographers to further support the clinical utility of RUSI.
Nevertheless, all types of reliability examined were established in the current study, and its results are comparable with those of previous reliability studies, thus supporting the use of US measurements of IRD by PTs in clinical practice.
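The reliability types discussed in this paper were all analyzed with intraclass correlation coefficients (ICC2,1). As a rough illustration of what that coefficient captures for a single tester pair or a test-retest pair, the sketch below computes a two-way random-effects, absolute-agreement, single-measurement ICC from its ANOVA mean squares. The function name `icc_2_1` and the IRD values are hypothetical and are not taken from the study data; the software and settings actually used for the analysis are not described in this section.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per participant; each row holds one
    measurement per tester (or per session, for test-retest).
    """
    n = len(ratings)          # number of participants
    k = len(ratings[0])       # number of testers (or sessions)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for participants (rows), testers (columns), and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-participant mean square
    msc = ss_cols / (k - 1)                # between-tester mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical IRD values in mm (illustrative only, NOT study data):
    # column 0 = reference tester, column 1 = second tester.
    ird_pair = [
        [18.0, 19.0],
        [25.5, 24.0],
        [12.0, 13.5],
        [30.0, 31.0],
        [21.0, 20.0],
    ]
    print(f"ICC(2,1) for this tester pair: {icc_2_1(ird_pair):.3f}")
```

Because this form of the ICC reflects absolute agreement, a constant offset between two testers lowers the coefficient even when both testers rank the participants identically, which is one reason a shared reference tester can help expose systematic discrepancies between testers.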
Conclusions
Overall, the reliability of US measurements of IRD in a mixed population sample of women, as measured by five PTs who undertook ultrasonographic training combined with hands-on practice, was very good. Reliability in one resting and three abdominal contraction tasks across parous and nulliparous women of variable age and BMI was excellent for repeated measurements by the same PT (intra-tester), very good across raters (inter-tester) within the same and across different sessions, and excellent during re-testing (test-retest reliability). The study supports the use of RUSI by trained PTs to consistently monitor IRD changes during rehabilitation, or as a reliable outcome measure in women with DRA or in the postpartum period. However, as this was a reliability study, not a validation study, further studies comparing PT-acquired measurements with those of expert sonographers or established reference standards are needed to confirm measurement validity and inform broader clinical applicability.
Supplemental Material
Supplemental material, sj-docx-1-whe-10.1177_17455057251361999 for Physiotherapists’ reliability of inter-recti distance measurement with real-time ultrasound across a mixed women population sample by Evdokia Billis, Anastasia Skoura, Tatiana-Elena Papakonstantinou, Dimitra Tania Papanikolaou, Maria Tsekoura, Maria Andriopoulou, Charalampos Matzaroglou, Sofia Lampropoulou, Dimitra Koumoundourou, Eftichia Trachani, Theofani Bania and Elena Drakonaki in Women's Health
Footnotes
Acknowledgements
The publication fees of this article have been financed by the Research Council (ELKE) of the University of Patras.
Ethical considerations
The study was approved by the “Ethical Committee of the University of Patras,” University of Patras, Greece (Number 13890/22-6-2022).
Consent to participate
All patients/participants (examiners) provided written informed consent to participate in the study.
Consent for publication
All patients/participants provided written consent for publication of their data (e.g. images, clinical data) in this study.
Author contributions
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The article processing charges of this manuscript have been financed by the Research Council (ELKE) of the University of Patras. The publication of the article in Open Access mode was financially supported in part by HEAL-Link. The funders played no role in the design, conduct, or reporting of this study.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All data relevant to the study are included in the article or are available as supplementary files.
Supplemental material
Supplemental material for this article is available online.
References
