Sage Journals: Discover world-class research

Abstract

Background

Accurate organ measurement is essential in neonatal ultrasound in order to guide clinical decisions. However, interobserver variability remains a challenge, especially in preterm infants, where small organ dimensions and technical limitations can affect reproducibility. Consistent agreement between examiners is crucial to ensure reliable and standardized assessments. This prospective observational study evaluates interobserver agreement in ultrasound measurements of the liver, kidney, and spleen in preterm infants.

Methods

In this prospective observational study, a total of 74 ultrasound examinations were performed in 30 preterm infants, with seven infants being measured twice. The 37 paired assessments included independent measurements of the liver (midsternal line (MSL), midclavicular line (MCL), and anterior axillary line (AAL)), kidneys, and spleen by two experienced examiners. Statistical analyses included the Wilcoxon matched-pairs signed rank test, paired t-test, Pearson correlation coefficient (PCC), intraclass correlation coefficient (ICC), and Bland–Altman analysis.

Results

No significant differences were found between examiners (P > .05), except for liver length in the MCL, which is generally considered the least reproducible plane. PCC and ICC values for all measurements were at least 0.929, indicating excellent interobserver agreement.

Conclusion

Ultrasound investigations remain a reliable and reproducible tool for organ size assessment in neonatal care, even in extremely low birth weight infants with tiny anatomical structures. The strong interobserver agreement emphasizes the importance of standardized measurement protocols and ultrasound training, ensuring consistency in clinical practice.

Keywords

Neonatal ultrasound, interobserver, abdominal organs, preterm infants

Introduction

Abdominal sonography is an essential diagnostic tool in neonatal medicine, providing real-time, non-invasive assessment of parenchymal organs such as the liver, kidneys, and spleen. Accurate organ size measurements are needed to track growth and development and to detect early pathological changes in both preterm and term infants. An enlarged liver (hepatomegaly) can be associated with congestive heart failure, infections, metabolic disorders, liver tumors, or malformations.^1–4 Renal dimensions play a crucial role in diagnosing congenital anomalies of the kidney and urinary tract (CAKUT), for example, renal dysplasia and duplex kidney, or ciliopathies such as polycystic kidney disease.^5–7 Similarly, splenomegaly can indicate sepsis, portal hypertension, or hematological disorders such as leukemia.^8–10

Obtaining accurate and reproducible measurements in preterm infants presents specific challenges. Their small organ sizes, limited ultrasound windows, and frequent movement due to restlessness or respiratory effort contribute to measurement variability. These technical factors make interobserver agreement particularly important for diagnostic reliability. Although interobserver variability has been examined in neonatal brain^11–13 and lung ultrasound,^14 there is currently no published research evaluating agreement in abdominal organ measurements in preterm infants. Existing studies on sonographic interobserver reliability in abdominal imaging are limited overall and focus primarily on older children and adults.^15,16 Although abdominal ultrasound is widely used in extremely low birth weight infants, for example, to detect necrotizing enterocolitis,^17,18 no studies have evaluated the reproducibility of standardized liver, kidney, and spleen measurements in this population.

This study aims to evaluate the interobserver agreement between two experienced examiners in standardized ultrasound measurements of the liver, kidneys, and spleen in preterm infants, using established statistical approaches to assess reproducibility in clinical practice.

Material and Methods

Study Design and Setting

This prospective study was conducted at a level III neonatal intensive care unit (NICU) at the Hannover Medical School, Lower Saxony, Germany, between March 2024 and January 2025. The NICU provides care for approximately 100 very low birth weight infants (<1.5 kg) annually, forming a diverse patient cohort. Most patients are inborn, with a smaller number of outborn transfers.

All preterm infants admitted during the study period were eligible for inclusion if they were <37 0/7 weeks of gestational age at the time of ultrasound examination and could be independently examined by both investigators within a 24-h interval. Infants were excluded if they had congenital anomalies or conditions that could affect abdominal organ size (e.g., hepatic or abdominal tumors, metabolic hepatopathies, polysplenia, or polycystic kidney disease). Further exclusion criteria included scheduling-related limitations: if the investigators’ duty rosters did not overlap within the required 24-h timeframe, the infant could not be included. Lack of written informed consent from the parents or legal guardians also led to exclusion. In total, 254 infants were screened for eligibility during the study period. Of these, 154 were excluded due to scheduling constraints, 56 due to lack of parental consent, eight due to early postnatal death, and six due to congenital or organ-specific anomalies. The final study population consisted of 30 preterm infants (Figure 1).

This sample size was chosen based on feasibility and is in line with previously published interobserver agreement studies in neonatal ultrasound.^13,19 To evaluate its adequacy, a retrospective precision-based calculation was performed according to Bonett.^20 Assuming an expected intraclass correlation coefficient (ICC) of 0.90 and a two-rater design, this sample size yields a 95% confidence interval of approximately ±0.05 around the ICC estimate.

The study was approved by the local Ethics Committee, and written informed consent was obtained from all legal guardians prior to participation. The study was conducted in accordance with the ethical standards of the 1964 Declaration of Helsinki and its later amendments.

Figure 1.

Participant Flow Diagram: Overview of Screening, Exclusions, and Final Inclusion for Interobserver Ultrasound Analysis.

Variables

The primary outcome was the level of interobserver agreement in ultrasound measurements of the liver, kidneys, and spleen. No exposures or predictors were defined, as the study was designed to assess reproducibility rather than to examine causal relationships or prognostic factors. Potential confounding variables such as body weight, postmenstrual age, and sex were recorded during data collection but were not included in the statistical analysis, since adjustment was not applicable to the methodological objective of the study. No effect modifiers were considered. Diagnostic exclusion of congenital anomalies or organ-specific abnormalities was based on clinical evaluation and available imaging or medical record evidence, following standard NICU diagnostic practice.

Ultrasound Device and Examination Procedure

All ultrasound scans were taken using a GE Venue Go R4™ ultrasound scanner (GE HealthCare Technologies, Chicago, Illinois, USA) with a convex probe (8C). Infants were examined in a supine position to ensure consistency. In accordance with standard NICU care, neonates were neither woken nor sedated for the examination.

Ultrasound measurements were performed independently by two pediatricians with extensive experience in neonatal ultrasound: examiner 1 had 5.5 years and examiner 2 had 9 years of experience. Both held level 1 certification in pediatric sonography from the German Society for Ultrasound in Medicine (DEGUM). The time interval between the two examinations of the same infant was limited to a maximum of 24 h to minimize potential physiological variations. Both examiners used the same ultrasound device, probe, system presets, and patient positioning protocol to ensure technical consistency across all examinations. In addition, all measurements were performed in accordance with standardized DEGUM guidelines and current reference protocols.^21,22

The craniocaudal liver length was measured in three strictly sagittal planes: the midsternal line (MSL), using the aorta as a guide; the midclavicular line (MCL), using the gallbladder as a landmark; and the anterior axillary line (AAL), using the longest craniocaudal alignment. The spleen length was assessed in an oblique longitudinal scan from the left side, measuring from the upper to the lower pole. Renal volume was estimated using the ellipsoid formula: volume = length × width × depth × π/6, with the kidneys consistently measured from the ventral side. To prevent examiner bias, images and measurement data were stored on separate, independent servers, so that neither examiner had access to the other's results. This setup eliminated any potential influence of prior measurements on subsequent assessments.
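The ellipsoid formula above can be illustrated with a minimal Python sketch. The function name `ellipsoid_volume_ml` and the example dimensions are hypothetical and are not taken from the study data; the sketch assumes that all three dimensions are entered in centimeters, so that the result in cm³ equals the volume in mL.

```python
import math


def ellipsoid_volume_ml(length_cm: float, width_cm: float, depth_cm: float) -> float:
    """Estimate organ volume in mL using volume = length x width x depth x pi/6.

    Assumes all inputs are in centimeters (1 cm^3 = 1 mL).
    """
    for name, value in (("length", length_cm), ("width", width_cm), ("depth", depth_cm)):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return length_cm * width_cm * depth_cm * math.pi / 6


# Hypothetical kidney dimensions for illustration only (not study data).
volume = ellipsoid_volume_ml(3.5, 1.8, 1.6)
print(f"Estimated renal volume: {volume:.1f} mL")  # prints "Estimated renal volume: 5.3 mL"
```

Because the three dimensions are multiplied, a small relative error in any single diameter carries over proportionally into the volume estimate.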

Statistical Analysis

Data analyses were performed using GraphPad Prism version 10.4.1 (GraphPad Software, Boston, Massachusetts, USA). Data distribution was tested for normality using the Shapiro–Wilk test. Normally distributed data were presented as mean ± standard deviation (SD), while non-normally distributed data were given as median and interquartile range (IQR). Depending on the distribution, differences between examiners were analyzed using either a paired t-test for normally distributed data or the Wilcoxon matched-pairs signed rank test for non-normally distributed data. Since the primary outcome of this study was the level of agreement between the two examiners, interobserver reliability was assessed using the ICC and Pearson’s correlation coefficient (PCC). A Bland–Altman analysis was performed to evaluate systematic bias and limits of agreement (LoA). For normally distributed data, LoA was defined as the mean difference ± 1.96 SD, while for non-normally distributed data, the median difference and the 2.5th-97.5th percentile range were used. As seven infants underwent measurements twice, their values were averaged, resulting in 30 data points for all statistical analyses except for kidney measurements, since one infant had a horseshoe kidney, reducing the number of statistical comparisons to 29 data points. Apart from the excluded renal measurement, no missing data occurred. All other variables were fully complete across the entire study population. To preserve individual pairwise comparisons, the data from all the examinations were used for the Bland–Altman analysis.
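As a rough illustration of the two limits-of-agreement definitions described above, the following Python sketch computes both variants from paired measurements. It is not the GraphPad Prism procedure used in the study: the function names, the example values, and the linear-interpolation percentile rule are assumptions made for illustration, and other percentile conventions would give slightly different limits.

```python
import math
import statistics


def _differences(a, b):
    """Pairwise differences A - B; both examiners must have measured every case."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    return [x - y for x, y in zip(a, b)]


def loa_parametric(a, b):
    """Bias and limits of agreement as mean difference +/- 1.96 SD."""
    diffs = _differences(a, b)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def _percentile(sorted_values, p):
    """Percentile by linear interpolation (an assumed convention)."""
    k = (len(sorted_values) - 1) * p / 100
    lo = math.floor(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def loa_percentile(a, b):
    """Median difference with the 2.5th and 97.5th percentiles as limits."""
    diffs = sorted(_differences(a, b))
    return statistics.median(diffs), _percentile(diffs, 2.5), _percentile(diffs, 97.5)


# Hypothetical liver lengths in cm for illustration only (not study data).
examiner_1 = [3.6, 4.1, 3.2, 4.8, 3.9]
examiner_2 = [3.5, 4.2, 3.2, 4.6, 4.0]

bias, lower, upper = loa_parametric(examiner_1, examiner_2)
print(f"parametric: bias {bias:.2f} cm, LoA {lower:.2f} to {upper:.2f} cm")

median, p_low, p_high = loa_percentile(examiner_1, examiner_2)
print(f"percentile: median {median:.2f} cm, LoA {p_low:.2f} to {p_high:.2f} cm")
```

The parametric limits are symmetric around the bias by construction, whereas the percentile-based limits follow the observed distribution of differences and can be asymmetric.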

Results

Study Population

A total of 254 preterm infants were assessed for potential inclusion during the study period. Of these, 154 could not be enrolled due to scheduling constraints, 56 were not enrolled because parental consent was not obtained, eight died early postnatally, and six had congenital or organ-specific anomalies. This resulted in a final cohort of 30 infants. Seven of these infants were measured twice, giving 37 paired assessments, each performed independently by both examiners, and a total of 74 ultrasound examinations. Liver and spleen measurements were complete for all infants; one kidney measurement was excluded due to a horseshoe kidney, resulting in 29 complete renal datasets. The study cohort included 16 females (53%) and 14 males (47%). The mean gestational age at birth was 28 3/7 ± 3 3/7 weeks, with a median birth weight of 0.865 kg (IQR: 0.64–1.2) and a median birth length of 34 cm (IQR: 30.2–36.8). The first ultrasound examination was performed at a mean postnatal age of 42 ± 29 days (Table 1).

Table 1.

Demographic and Clinical Characteristics of the Study Population.

Characteristics	Value*
Number of preterm infants	30
Number of examinations	74 (37 from each examiner)
Infants with two examinations	7 (23%)
Sex	Female: 16 (53%); male: 14 (47%)
Gestational age (p.m., weeks)	28 3/7 ± 3 3/7; range: 23 4/7–36 6/7
Birth weight (kg)	0.865 (0.64–1.2); range: 0.5–3.56
Birth length (cm)	34 (30.2–36.8); range: 28.5–50
Age at first examination (days)	42 ± 29; range: 1–103
Weight at first examination (kg)	1.8 ± 0.69; range: 0.695–3.56
Length at first examination (cm)	40 ± 5; range: 33–55

Note: *Non-normally distributed data are shown as median (interquartile range, IQR), whereas normally distributed data are presented as mean ± standard deviation (SD).

Across all measured parameters, no significant differences were observed between examiners (P > .05), except for liver length in MCL, where a small but statistically significant difference was detected (P = .0369). Overall interobserver agreement was excellent, with ICC ranging from 0.929 to 0.964 and PCC ranging from 0.930 to 0.965, indicating strong reliability (Table 2 and Figure 2). The highest agreement was noted for spleen measurements (ICC = 0.964, PCC = 0.965), while the lowest agreement was found for liver length in the AAL (ICC = 0.929, PCC = 0.930). Bland–Altman analysis further confirmed the strong agreement between examiners.
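Because PCC captures only linear association whereas ICC also penalizes systematic offsets between examiners, it can help to see the two computed side by side. The sketch below is illustrative only: the source does not state which ICC form was used, so a two-way, absolute-agreement, single-measurement ICC (often written ICC(A,1) or ICC(2,1)) is assumed, and the function names and example values are hypothetical rather than study data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def icc_absolute_agreement(a, b):
    """Two-way, absolute-agreement, single-measurement ICC for two raters.

    Assumed form: (MSR - MSE) / (MSR + (k - 1) * MSE + k / n * (MSC - MSE)),
    with MSR, MSC, MSE the mean squares for subjects, raters, and residual.
    """
    n, k = len(a), 2
    grand = (sum(a) + sum(b)) / (n * k)
    row_means = [(x + y) / k for x, y in zip(a, b)]
    col_means = [sum(a) / n, sum(b) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(a) + list(b))
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k / n * (ms_cols - ms_error)
    )


# Hypothetical values: examiner 2 reads a constant 0.5 cm higher than examiner 1.
examiner_1 = [3.0, 3.5, 4.0, 4.5, 5.0]
examiner_2 = [x + 0.5 for x in examiner_1]

print(f"PCC = {pearson(examiner_1, examiner_2):.3f}")                 # prints "PCC = 1.000"
print(f"ICC = {icc_absolute_agreement(examiner_1, examiner_2):.3f}")  # prints "ICC = 0.833"
```

In this constructed example a constant offset leaves PCC at 1 but lowers the ICC, which is why near-identical PCC and ICC values, as reported here, suggest little systematic difference between examiners.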

Table 2.

Comparison and Interobserver Agreement of Measurement Results.

Organ	Examiner no. 1*	Examiner no. 2*	P Value**	ICC	PCC
Liver, MSL (cm), n = 37	3.61 ± 0.49; range: 2.8–4.9	3.62 ± 0.52; range: 2.8–4.8	.9168	0.945	0.945
Liver, MCL (cm), n = 37	4.3 (3.9–4.7); range: 3.5–6.7	4.5 (4.0–4.9); range: 3.5–6.7	.0369	0.955	0.960
Liver, AAL (cm), n = 37	4.6 (4.2–4.9); range: 3.5–6.9	4.6 (4.0–4.9); range: 3.5–6.7	.121	0.929	0.930
Spleen (cm), n = 37	3.46 ± 0.61; range: 2.2–4.6	3.44 ± 0.62; range: 1.9–4.4	.1841	0.964	0.965
Right kidney (mL), n = 35	8 (7–12.5); range: 3–16	8 (6–11.5); range: 3–17	.23	0.951	0.957
Left kidney (mL), n = 35	9.1 ± 3.42; range: 3–17	9.5 ± 3.62; range: 3–17	.0763	0.941	0.947

Notes: *Non-normally distributed data are shown as median (interquartile range, IQR), whereas normally distributed data are presented as mean ± standard deviation (SD). **P values, intraclass correlation coefficient (ICC), and Pearson correlation coefficient (PCC) were calculated for 30 measurements, except for renal measurements (n = 29) due to one infant with a horseshoe kidney. Since some infants were measured twice, their values were averaged, reducing the total number of measurements. A paired t-test was used for normally distributed data, while the Wilcoxon matched-pairs signed rank test was applied otherwise. P > .05 indicates no significant difference between examiners. AAL, anterior axillary line; MCL, midclavicular line; MSL, midsternal line.

Figure 2.

Note: Analysis is based on 30 independent data points, except for renal measurements (n = 29) due to one infant with a horseshoe kidney. Larger points indicate overlapping data from identical measurements.

The mean/median differences were small for all parameters, with the 95% LoA falling within clinically acceptable ranges (Figure 3). Example images of the MSL, AAL, spleen, and kidney measurements are shown in Figures 4 and 5.

Figure 3.

Note: Each point represents a pair of measurements, where the x-axis shows the mean of the paired measurements from examiner 1 (A) and examiner 2 (B), and the y-axis shows their difference (A − B). The middle line shows the mean/median difference, and the dotted lines indicate the limits of agreement (LoA). For normally distributed data, the LoA were defined as the mean difference ± 1.96 standard deviations (SD), while non-normally distributed data are presented using the median difference, with LoA based on the 2.5th–97.5th percentiles. Larger points indicate overlapping data from identical measurements.

Figure 4.

Note: (a) MSL measurement by examiner 1, (b) MSL measurement by examiner 2, (c) AAL measurement by examiner 1, (d) AAL measurement by examiner 2. AoD, descending aorta; Li, liver; rKd, right kidney; Vb, vertebral body.

Figure 5.

Note: (a) Spleen measurement by examiner 1, (b) spleen measurement by examiner 2, (c) right kidney length by examiner 1, (d) right kidney length by examiner 2. Li, liver; rKd, right kidney; Sp, spleen; Sto, stomach.

Discussion

This study represents the first systematic interobserver analysis of abdominal organ measurements in preterm infants, offering valuable insights into the reproducibility of neonatal ultrasound. While previous studies have examined interobserver variability in individual organ measurements, particularly in older children and adults,^23–26 comprehensive data on liver, kidney, and spleen measurements in preterm infants have been lacking.

Our findings demonstrate high interobserver agreement across all parameters, with no statistically significant differences between examiners for almost all measurements. The only exception was liver length in the MCL, which showed a small but statistically significant difference (P = .0369). However, this variation appears to be of limited clinical relevance, as the Bland–Altman analysis showed narrow LoA within acceptable ranges. One possible explanation for the higher variability in MCL measurements is the use of the gallbladder as a landmark, which is prone to positional shifts due to respiratory movement and changes in its filling state. A fully distended gallbladder can be visible across a wide area of the abdomen, which permits variable transducer positioning and thereby limits the reproducibility of the measurement. Equally important, the craniocaudal liver length physiologically changes markedly within the MCL region, so even small lateral displacements of the sagittally oriented transducer can produce substantial differences in the measured length. In contrast, MSL and AAL appear to be more stable reference points, potentially leading to greater measurement consistency. Opinions regarding the selection of the MCL as a valid and reproducible measurement point for craniocaudal liver length have differed in past studies.^27–29 The inclusion of MCL measurements in the present study follows current recommendations,^22 whereas DEGUM only endorses MSL and AAL measurements.^21 Given the frequent use of the MCL in liver assessments, further refinement of MCL measurement protocols and examiner training could help minimize variability and improve reliability.

Clinical Implications and Perspectives

The high interobserver reliability in this study affirms the clinical value of sonography as a precise and reproducible imaging tool in neonates, including extremely low birth weight infants with very small anatomical structures. Accurate organ measurement is essential for the early detection of hepatosplenomegaly, renal growth abnormalities, and other congenital anomalies, allowing for timely intervention and monitoring. Liver size assessment is particularly relevant for detecting conditions such as infections, metabolic hepatopathies, or congestion due to cardiac failure.^4,30,31 Even small inconsistencies in measurement technique could influence clinical decision-making and interindividual follow-up, underscoring the need for consistent transducer positioning and respiratory phase control during imaging. In this context, artificial intelligence (AI) is increasingly being explored as a tool to improve measurement accuracy. AI has shown potential in reducing variability in sonographic imaging and in clinical decision-making,^32–34 but its role in neonatal abdominal ultrasound remains limited. Future research could explore whether AI-assisted approaches can further improve measurement accuracy. Nevertheless, the experience of the examiner remains indispensable.

Strengths and Limitations

A major strength of this study is its prospective design, ensuring systematic data collection and analysis. The inclusion of a well-defined cohort of preterm infants enhances the clinical relevance of the findings, especially for NICU settings. Additionally, this study provides a comprehensive evaluation of interobserver agreement across multiple abdominal organs, using a standardized measurement protocol. The strict blinding of examiners further strengthens the study, ensuring that measurements were performed independently and without access to each other’s results, minimizing potential bias.

However, some limitations must be acknowledged. First, while a sample size of 30 infants (74 examinations) is appropriate for an interobserver study, a larger cohort could provide more robust data on measurement variability. Second, as a single-center study, our findings may have limited generalizability to other clinical settings, particularly those with less experienced examiners, different ultrasound equipment, or non-standardized measurement protocols. This restricts the external validity of our results and suggests that similar levels of agreement may not be universally achievable. Third, a potential limitation of the study lies in the timing of ultrasound examinations. While no infant was excluded due to clinical instability, some neonates were not examined during acute critical phases, and measurements were instead performed in a later, more stable condition. This approach reflects routine NICU practice, but it may limit the generalizability of interobserver agreement to clinically unstable patients. Fourth, intraobserver variability was not assessed, which may have provided additional insights into the overall reproducibility of the measurements. Finally, repeated measurements in seven infants were averaged for analysis, which may have reduced visible variance and influenced the observed agreement metrics.

Conclusion

This study confirms that ultrasound-based abdominal organ measurements in preterm infants are highly reproducible, supporting their continued use as a reliable bedside imaging tool. However, liver length in the MCL was the only plane with a statistically significant difference between the two examiners, likely due to the gallbladder’s variable position depending on its filling state and the physiological increase of liver length in this area. These factors make MCL, in contrast to MSL and AAL, a less reliable reference for liver length measurement. The observed variability in MCL measurements aligns with clinical experience and previous studies.

Footnotes

Acknowledgments

The authors would like to express their sincere gratitude to the infants and parents for taking part in the study. Parts of this work were translated from German into English using DeepL Translator version 25.1.11615133 (DeepL SE, Cologne, Germany).

Data Availability Statement

The datasets generated and analyzed during this study are not publicly available due to patient confidentiality regulations but are available from the corresponding author upon reasonable request.

Declaration of Conflicting Interests

The authors declared no conflict of interest with respect to the research, authorship, and/or publication of this article.

Ethical Approval and Informed Consent

The Ethics Committee of Hannover Medical School approved the study (09.04.2024, No. 11351_BO_K_2024), and informed consent was obtained from all legal guardians.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Alexandros Rahn

References

1. Clark BJ 3rd. Treatment of heart failure in infants and children. Heart Dis. 2000;2:354–361.
2. Fernandez-Pineda, Cabello-Laureano. Differential diagnosis and management of liver tumors in infants. World J Hepatol. 2014;6:486–495. doi:10.4254/wjh.v6.i7.486
3. Moreira-Silva, Maio, Bandeira, et al. Metabolic liver diseases presenting with neonatal cholestasis: At the crossroad between old and new paradigms. Eur J Pediatr. 2019;178:515–523. doi:10.1007/s00431-019-03328-5
4. Evans, Siew SM. Neonatal liver disease. J Paediatr Child Health. 2020;56:1760–1768. doi:10.1111/jpc.15064
5. Davidovits, Eisenstein, Ziv, et al. Unilateral duplicated system: Comparative length and function of the kidneys. Clin Nucl Med. 2004;29:99–102. doi:10.1097/01.rlu.0000109331.37224.5f
6. Verghese, Miyashita. Neonatal polycystic kidney disease. Clin Perinatol. 2014;41:543–560. doi:10.1016/j.clp.2014.05.005
7. Phua, Ho. Renal dysplasia in the neonate. Curr Opin Pediatr. 2016;28:209–215. doi:10.1097/MOP.0000000000000324
8. Pinkerton, Holcomb Jr, Foster JH. Portal hypertension in childhood. Ann Surg. 1972;175:870–886. doi:10.1097/00000658-197206010-00007
9. Moreira, Bras, Goncalves, et al. Fetal splenomegaly: A review. Ultrasound Q. 2018;34:32–33. doi:10.1097/RUQ.0000000000000335
10. Suttorp, Classen CF. Splenomegaly in children and adolescents. Front Pediatr. 2021;9:704635. doi:10.3389/fped.2021.704635
11. Pinto, Paneth, Kazam, et al. Interobserver variability in neonatal cranial ultrasonography. Paediatr Perinat Epidemiol. 1988;2:43–58. doi:10.1111/j.1365-3016.1988.tb00179.x
12. Corbett, Rosenfeld, Laptook, et al. Intraobserver and interobserver reliability in assessment of neonatal cranial ultrasounds. Early Hum Dev. 1991;27:9–17. doi:10.1016/0378-3782(91)90023-v
13. Resch, Kaltenberger, Resch, et al. Interobserver reliability of neonatal cranial ultrasound scanning regarding white matter disease. Pediatr Neonatol. 2013;54:214–215. doi:10.1016/j.pedneo.2013.01.020
14. Brusa, Savoia, Vergine, et al. Neonatal lung sonography: Interobserver agreement between physician interpreters with varying levels of experience. J Ultrasound Med. 2015;34:1549–1554. doi:10.7863/ultra.15.14.08016
15. Schlesinger, Hernandez, Zerin, et al. Interobserver and intraobserver variations in sonographic renal length measurements in children. Am J Roentgenol. 1991;156:1029–1032. doi:10.2214/ajr.156.5.2017927
16. Lee, Roberts, Chen, et al. Estimation of spleen size with hand-carried ultrasound. J Ultrasound Med. 2014;33:1225–1230. doi:10.7863/ultra.33.7.1225
17. Deeg KH. Sonographic and Doppler sonographic diagnosis of necrotizing enterocolitis in preterm infants and newborns. Ultraschall Med. 2019;40:292–318. doi:10.1055/a-0879-8110
18. Hwang, Tierradentro-Garcia, Dennis, et al. The role of ultrasound in necrotizing enterocolitis. Pediatr Radiol. 2022;52:702–715. doi:10.1007/s00247-021-05187-5
19. Vathana, Rust, Mills, et al. Intraobserver and interobserver reliability of two ultrasound measures of humeral head position in infants with neonatal brachial plexus palsy. J Bone Joint Surg. 2007;89:1710–1715. doi:10.2106/JBJS.F.01263
20. Bonett DG. Sample size requirements for estimating intraclass correlations with desired precision. Stat Med. 2002;21:1331–1335. doi:10.1002/sim.1108
21. Riccabona, Schweintzger, Leidig, et al. Dokumentationsempfehlung: Standarddokumentation der Sonografie des kindlichen Abdomens. ÖGUM, DEGUM; 2006.
22. Deeg, Hofmann, Hoyer. Ultraschalldiagnostik in Pädiatrie und Kinderchirurgie. Stuttgart: Georg Thieme Verlag KG; 2014. doi:10.1055/b-003-106490
23. Ablett, Coulthard, Lee, et al. How reliable are ultrasound measurements of renal length in adults. Br J Radiol. 1995;68:1087–1089. doi:10.1259/0007-1285-68-814-1087
24. Emamian, Nielsen, Pedersen JF. Intraobserver and interobserver variations in sonographic measurements of kidney size in adult volunteers. A comparison of linear measurements and volumetric estimates. Acta Radiol. 1995;36:399–401.
25. …, Ying, Chan, et al. The reproducibility and short-term and long-term repeatability of sonographic measurement of splenic length. Ultrasound Med Biol. 2004;30:861–866. doi:10.1016/j.ultrasmedbio.2004.05.012
26. Hajibonabi, Riedesel, Taylor, et al. Ultrasound-estimated hepatorenal index: Diagnostic performance and interobserver agreement for pediatric liver fat quantification. Pediatr Radiol. 2024;54:1653–1660. doi:10.1007/s00247-024-06021-4
27. Borchert, Schuler, Muche, et al. Comparison of panorama ultrasonography, conventional B-mode ultrasonography, and computed tomography for measuring liver size. Ultraschall Med. 2010;31:31–36. doi:10.1055/s-2008-1109309
28. Patzak, Porzner, Oeztuerk, et al. Assessment of liver size by ultrasonography. J Clin Ultrasound. 2014;42:399–404. doi:10.1002/jcu.22151
29. Riestra-Candelaria, Rodriguez-Mojica, Vazquez-Quinones, et al. Ultrasound accuracy of liver length measurement with cadaveric specimens. J Diagn Med Sonogr. 2016;32:12–19. doi:10.1177/8756479315621287
30. Lees MH. Heart failure in the newborn infant. Recognition and management. J Pediatr. 1969;75:139–152. doi:10.1016/s0022-3476(69)80116-2
31. Ferreira, Cassiman, Blau. Clinical and biochemical footprints of inherited metabolic diseases. II. Metabolic liver diseases. Mol Genet Metab. 2019;127:117–121. doi:10.1016/j.ymgme.2019.04.002
32. Perri, Sbordone, Patti, et al. The future of neonatal lung ultrasound: Validation of an artificial intelligence model for interpreting lung scans. A multicentre prospective diagnostic study. Pediatr Pulmonol. 2023;58:2610–2618. doi:10.1002/ppul.26563
33. Cai, Pfob. Artificial intelligence in abdominal and pelvic ultrasound imaging: Current applications. Abdom Radiol (NY). 2024. doi:10.1007/s00261-024-04640-x
34. Siemens Healthineers. Siemens Healthineers Introduces Industry-first AI Abdomen as Part of ACUSON Sequoia 3.5 Ultrasound Release. 2024. Accessed February 11, 2025.

Tiny Torsos, Tight Consensus: A Prospective Interobserver Study on Abdominal Ultrasound in Preterm Infants
