Abstract
Background
Right iliac fossa (RIF) pain is a frequent and challenging presenting complaint in emergency departments, encompassing a wide spectrum of acute and chronic conditions.
Purpose
To compare the effectiveness of ultrasound with that of initial clinical and laboratory assessment for diagnosing acute appendicitis in patients with RIF pain, and to evaluate the impact of ultrasound operator experience and the use of a portable ultrasound system.
Materials and methods
This retrospective study included 525 patients (aged ≥15 years) presenting with acute RIF pain to three emergency departments in Thi-Qar Governorate, Iraq (January 2024–January 2025). Sensitivity, specificity, predictive values (PPV and NPV), and accuracy for diagnosing acute appendicitis were calculated. Multivariable logistic regression identified independent predictors of diagnostic accuracy for both modalities.
Results
Among 525 patients, appendicitis was the final diagnosis in 273 (52.00%). For diagnosing acute appendicitis, ultrasound demonstrated significantly higher sensitivity (89.7% vs 67.4%), specificity (67.1% vs 46.4%), and overall accuracy (78.9% vs 57.3%) compared to clinical-laboratory assessment. Independent predictors of higher ultrasound accuracy included US operator experience (Senior EM Physician vs. Resident: aOR 3.15, 95% CI: 1.80–5.52) and presence of rebound tenderness (aOR 2.40, 95% CI: 1.35–4.27). For clinical-laboratory assessment, ED physician experience (Senior vs. Resident: aOR 1.48, 95% CI: 1.15–2.41) was one of the independent predictors of higher accuracy.
Conclusion
Ultrasound significantly outperforms initial clinical-laboratory assessment in diagnosing acute appendicitis among patients with RIF pain in this setting. US operator experience is a key determinant of ultrasound accuracy. Our findings support the effective use of portable ultrasound systems in the emergency setting.
Introduction
Acute right iliac fossa (RIF) pain is a frequent cause for emergency department visits, with acute appendicitis being the most common surgical emergency requiring prompt diagnosis to prevent complications like perforation.1,2 While generalized abdominal pain can be difficult to localize, RIF pain often points towards specific underlying pathologies, with acute appendicitis being one of the most common and urgent considerations, accounting for a significant proportion of surgical emergencies.3,4 However, the differential diagnosis is broad, including gastrointestinal, gynecological (especially in women of reproductive age), and urological disorders, all of which can mimic appendicitis and complicate accurate diagnosis.1,5 This diagnostic uncertainty, particularly in females where conditions like ovarian cysts or ectopic pregnancy are prevalent, often leads to higher rates of negative appendectomy and of diagnostic delay.1,6
Clinical assessment, often supplemented by laboratory investigations, forms the initial step in evaluation. However, clinical signs and symptoms alone can have variable sensitivity and specificity.2 Scoring systems like the Alvarado or Appendicitis Inflammatory Response score have been developed to aid diagnosis, but their performance can vary, and they may not encompass the full range of RIF pathologies.1 Consequently, imaging plays a pivotal role in improving diagnostic accuracy, reducing negative appendectomy rates, and guiding appropriate management.4,7
Ultrasound (US) is often advocated as the first-line imaging modality, especially in younger patients and pregnant women, due to its non-invasive nature, lack of ionizing radiation, and ability to visualize gynecological structures.1,4 Standardized US criteria for appendicitis are well-defined.4,8 However, the visualization of the appendix can be challenging due to factors like bowel gas or patient habitus, leading to variable reported sensitivities and specificities.1,2,9 While computed tomography offers high diagnostic accuracy, its use involves radiation exposure and cost, necessitating judicious application.
In Iraq, as in many regions, the optimal diagnostic pathway for RIF pain, integrating clinical judgment with imaging, continues to be an area of investigation. Ultrasound is heavily relied upon as the first-line imaging modality because of its high accessibility, safety profile, and lower cost compared with CT. The recent introduction of high-performance, lightweight, laptop-sized ultrasound machines has made such machines widely available in the emergency department. An ultrasound machine that is always within easy reach also facilitates teaching and training, and connection through a high-capacity wireless network allows rapid communication with hospital electronic records. Patients with a negative or equivocal ultrasound may proceed to CT scanning, though this is not universally applied and depends on the persistence of clinical suspicion. There is a need to understand the comparative effectiveness of readily available diagnostic tools to refine local protocols. This study, therefore, aimed to evaluate the diagnostic performance of ultrasound compared to initial clinical-laboratory assessment in patients presenting with acute RIF pain in a multi-center setting in Iraq, with a specific focus on the diagnosis of acute appendicitis and the application of a portable point-of-care system in this setting, and to identify factors influencing diagnostic accuracy.
Materials and methods
Study design and setting
This retrospective study was carried out in the emergency departments (EDs) of three hospitals in Thi-Qar Governorate (X-Y-Z). Because the study analyzed existing clinical records without patient intervention or contact, it was exempted from formal ethical approval. A waiver was granted by the Thi-Qar Health Directorate, the central authority overseeing the three involved institutions, under reference TQH-25918b on November 15, 2023, confirming compliance with local regulations on research ethics, confidentiality, and patient data protection. As data were collected retrospectively, patient consent was not required. The study covered data from January 2024 to January 2025.
Patients
During the study period, a total of 756 patients who presented to the emergency department with acute abdominal pain and tenderness of the right lower quadrant (iliac fossa) region were initially screened. The inclusion criteria specified that patients must be 15 years or older, present to the ED with RIF pain, and undergo an abdominal ultrasound examination. Exclusion criteria were applied in a tiered manner (Fig. 1). First, patients under 15 years of age were excluded. Of the remaining patients, 95 were excluded because they did not receive an abdominal ultrasound examination upon arrival. This was primarily because the presenting clinical signs (e.g., overt peritonitis) were deemed sufficient for an immediate surgical consultation, bypassing imaging. A further 49 patients were excluded due to incomplete or duplicate medical records. Ultimately, a total of 525 patients were included in this study (Fig. 1).
Fig. 1. The flowchart of the study (Author).
Data collection
All data were abstracted from hospital electronic medical records. Patient demographic information included age in years and gender.
Clinical presentation was documented in detail. Recorded symptoms comprised loss of appetite, nausea, vomiting, fever, diarrhea, and migration of pain. The duration in hours from symptom onset to emergency department presentation was also documented. The physical signs documented were RIF tenderness, rebound tenderness, Rovsing’s sign, guarding, and obturator sign. The patient’s temperature on presentation (°C) was also recorded. Relevant laboratory findings included the white blood cell (WBC) count (×10⁹/L), percentage of neutrophils (%), and urinalysis (normal or abnormal). The initial clinical diagnosis, defined as the admitting ED physician’s presumptive diagnosis, was also recorded. The diagnostic categories were appendicitis, right lower ureteric stone, ileitis, ovarian cyst and torsion, ectopic pregnancy, and an “others” category, which could contain conditions such as caecal tumor, appendicular mass, or intussusception. The ED physician’s level of experience (resident or senior emergency medicine physician) was also noted.
Point-of-care ultrasound (POCUS) is widely practiced in our emergency departments, and most examinations are performed by emergency physicians. US findings were carefully documented. While a single standardized protocol was not used across all three institutions, all sonographic examinations were expected to follow established best practices for evaluating the right lower quadrant for appendicitis, including the use of graded compression with a high-frequency linear transducer. In our hospital systems, residents and less experienced staff have 24/7 access to a senior radiologist for consultation and second opinions, which could be sought at the discretion of the operator. The US diagnosis, made with an M-Turbo small-size ultrasound device (FUJIFILM SONOSITE©, Bothell, WA, USA), was categorized in the same way as the original clinical diagnoses (i.e., appendicitis, right lower quadrant pathology, ileitis, ovarian cyst, ectopic pregnancy, or others). Examples of abnormal ultrasound images relevant to the study are shown in Fig. 2.
Fig. 2. Examples of ultrasound images. (a) Distended, fluid-filled appendix approximately 6.6 mm in diameter, (b) appendix with a thick wall, and (c) right ovarian cyst measuring 77 × 68 mm.
The final diagnosis was categorized using the same categories (appendicitis, right lower ureteric stone, ileitis, ovarian cyst, ectopic pregnancy, and others). It was confirmed either by intraoperative findings in patients who underwent surgery (e.g., appendectomy or gynecologic surgery) or by documented clinical follow-up and symptom resolution in patients who were managed conservatively with observation.
Operator and physician experience
The level of experience for both the assessing ED physician and the US operator was categorized. For ED physicians, “Resident” referred to a physician in their residency training program, while “Senior” referred to a board-certified emergency medicine physician. For US operators, the categories were defined as follows: “Resident” (a radiology or emergency medicine resident in years 1–3 of training), “Senior Radiologist” (a board-certified radiologist with >5 years of experience), and “Senior EM Physician” (a senior emergency medicine physician credentialed in point-of-care ultrasound).
Outcome measures and definitions
The primary outcome was the diagnostic accuracy of initial clinical assessment and ultrasound examination. The final diagnosis served as the reference standard. For the evaluation of a diagnosis:
• True positive (TP): Clinical assessment or US positive for a diagnosis, and final diagnosis confirmed the same diagnosis.
• False positive (FP): Clinical assessment or US positive for a diagnosis, but final diagnosis was not the same diagnosis.
• True negative (TN): Clinical assessment or US negative for a diagnosis, and final diagnosis was not the same diagnosis.
• False negative (FN): Clinical assessment or US negative for a diagnosis, but final diagnosis confirmed the same diagnosis.
Based on these, the following diagnostic performance metrics were calculated for both clinical assessment and ultrasound, as presented by Bourcier et al.2 and Nowikiewicz et al.9:
• Sensitivity: TP/(TP + FN)
• Specificity: TN/(TN + FP)
• Positive predictive value (PPV): TP/(TP + FP)
• Negative predictive value (NPV): TN/(TN + FN)
• Accuracy: (TP + TN)/(TP + FP + TN + FN)
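As a minimal sketch of how the five formulas above can be applied, the following Python snippet computes them from a 2 × 2 table. The cell counts are assumptions: they were back-calculated from the percentages reported in this article (273 appendicitis and 252 non-appendicitis cases) and are not taken from the study tables. The names `Counts` and `diagnostic_metrics` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Cells of a 2x2 table against the reference standard."""
    tp: int
    fp: int
    tn: int
    fn: int


def diagnostic_metrics(c: Counts) -> dict[str, float]:
    """Apply the five formulas listed above to one 2x2 table."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Assumed counts, back-calculated from the reported percentages;
# they are illustrative and not the authors' published table cells.
ULTRASOUND = Counts(tp=245, fp=83, tn=169, fn=28)
CLINICAL_LAB = Counts(tp=184, fp=135, tn=117, fn=89)

for name, counts in (("ultrasound", ULTRASOUND), ("clinical-lab", CLINICAL_LAB)):
    as_percent = {k: round(100 * v, 1) for k, v in diagnostic_metrics(counts).items()}
    print(name, as_percent)
```

Under these assumed counts, the rounded outputs agree with the percentages reported in the Results, which is the only reason these particular values were chosen.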
Statistical analysis
Descriptive statistics were used to summarize patient characteristics, clinical findings, laboratory results, and diagnostic categories. Frequencies (F) and percentages (%) were calculated for categorical variables. Means and standard deviations (Mean ± SD) were calculated for continuous variables.
Sensitivity, specificity, PPV, NPV, and accuracy, along with their corresponding 95% confidence intervals (CIs), were calculated for both clinical + laboratory assessment and ultrasound in diagnosis, using the final diagnosis as the gold standard. The relative accuracies of the ultrasound and clinical + laboratory diagnosis were compared using McNemar’s test, with a threshold for significance set at p < 0.05.
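The two calculations named in this paragraph can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure run in SPSS: the article does not specify which confidence interval method was used, so the Wilson score interval is chosen here as one common option, and the McNemar statistic uses the continuity-corrected chi-square form. The discordant-pair counts in the example are hypothetical; only their difference (414 − 301 = 113 correct diagnoses) follows from the reported accuracies.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wilson_ci(successes: int, n: int, z: float = Z_95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """Continuity-corrected McNemar chi-square statistic and p-value (1 df).

    b: patients diagnosed correctly by the first test only.
    c: patients diagnosed correctly by the second test only.
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Sensitivity example: 245 of 273 is an assumed count consistent with 89.7%.
low, high = wilson_ci(245, 273)
print(f"sensitivity 95% CI: {100 * low:.1f}%-{100 * high:.1f}%")

# Hypothetical discordant pairs (b - c = 113); the true split is not reported.
stat, p_value = mcnemar(b=150, c=37)
print(f"McNemar chi-square = {stat:.2f}, p < 0.05: {p_value < 0.05}")
```

McNemar's test uses only the discordant pairs because both modalities were applied to the same patients, so concordant results carry no information about which modality is more accurate.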
Univariable analysis was performed to identify factors associated with a correct versus incorrect diagnosis of appendicitis for both ultrasound and clinical + laboratory assessment. For categorical variables, the Chi-square test or Fisher’s exact test was used as appropriate. For continuous variables, independent samples t-tests or Mann–Whitney U tests were used to compare means/medians between accurate and inaccurate diagnosis groups. A p-value <0.05 was considered statistically significant for univariable analyses.
To identify independent predictors of diagnostic accuracy for appendicitis, multivariable binary logistic regression models were developed. Separate models were created for ultrasound diagnostic accuracy and clinical + laboratory diagnostic accuracy. Variables found to be statistically significant (p < 0.05) in the univariable analysis were considered for inclusion in the multivariable models. For categorical predictors with two or more levels (US operator experience and ED physician experience), the category deemed most appropriate as baseline (Resident) was used as the reference. Adjusted odds ratios (aORs) and their 95% CIs were calculated. Potential collinearity between predictor variables was assessed. A p-value <0.05 was considered statistically significant for the multivariable models. All data were managed and analyzed using SPSS.
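To make the relationship between a logistic regression coefficient and an adjusted odds ratio concrete, the sketch below converts between the two, assuming Wald-type confidence intervals (the article does not state the interval type). It uses the reported ultrasound result for Senior EM Physician versus Resident (aOR 3.15, 95% CI: 1.80–5.52) as the worked example; the function names are illustrative and the recovered coefficient and standard error are approximations, not values published by the authors.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta: float, se: float, z: float = Z_95) -> tuple[float, float, float]:
    """Convert a logistic coefficient and its SE into (aOR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_ci(aor: float, lower: float, upper: float,
                        z: float = Z_95) -> tuple[float, float]:
    """Approximate (beta, SE) from a reported aOR and an assumed Wald CI."""
    beta = math.log(aor)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported result: Senior EM Physician vs Resident (reference category).
beta, se = coefficient_from_ci(3.15, 1.80, 5.52)
aor, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f} SE={se:.3f} aOR={aor:.2f} ({lower:.2f}-{upper:.2f})")
```

An aOR above 1 with a confidence interval that excludes 1 indicates higher odds of a correct diagnosis relative to the reference category, after adjustment for the other predictors in the model.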
Results
Patient characteristics (N = 525).
a: Mean ± SD.
The mean age of the patients was 28.4 ± 13.19 years, and a slight predominance of females was observed (289, 55.05%). The majority of patients were initially assessed by resident emergency medicine physicians (316, 60.19%).
Regarding clinical presentation, loss of appetite (499, 95.05%) and nausea (473, 90.10%) were the most frequently reported symptoms. Pain migration was present in 263 (50.10%) patients. The mean symptom duration was 34.8 ± 9.15 h. On physical examination, tenderness in the right iliac fossa was nearly universal (473, 90.10%), followed by rebound tenderness (378, 72.00%) and guarding/rigidity (365, 69.52%). The mean temperature on admission was 37.92 ± 0.79°C.
Laboratory findings indicated a mean WBC count of 11.6 ± 4.01 ×10⁹/L and a mean neutrophil percentage of 73.2 ± 5.89%. Urinalysis was abnormal in 171 (32.57%) patients.
The initial clinical diagnosis by the ED physician most frequently suggested appendicitis (289, 55.05%), followed by ovarian pathology (110, 20.95%). Ultrasound examinations were most commonly performed by residents (257, 48.95%). Ultrasound diagnosed appendicitis in 315 (60.00%) cases.
The final diagnosis confirmed appendicitis in 273 (52.00%) patients. Other common final diagnoses included ovarian cyst and torsion (115, 21.90%) and right lower ureteric stone (69, 13.14%). The final diagnosis was confirmed by surgery and pathology in 403 (76.76%) cases, while the remainder were confirmed through observation and follow-up.
Diagnostic performance of ultrasound versus clinical assessment for acute appendicitis (N = 525).
CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value.
Ultrasound demonstrated markedly higher sensitivity compared to clinical + laboratory assessment (89.7%, 95% CI: 85.6%–93.0% vs 67.4%, 95% CI: 61.6%–72.8%). Specificity was also superior for ultrasound (67.1%, 95% CI: 61.0%–72.7%) compared to clinical + laboratory assessment (46.4%, 95% CI: 40.2%–52.8%).
Reflecting these differences, the PPV for ultrasound was 74.7% (95% CI: 69.8%–79.1%), substantially higher than the PPV for clinical + laboratory assessment (57.7%, 95% CI: 52.2%–63.0%). Similarly, the NPV was notably better for ultrasound (85.8%, 95% CI: 80.3%–90.1%) than for clinical + laboratory assessment (56.8%, 95% CI: 50.1%–63.3%).
Overall, the diagnostic accuracy of ultrasound was 78.9% (95% CI: 75.1%–82.2%), significantly outperforming the accuracy of clinical + laboratory assessment, which was 57.3% (95% CI: 53.0%–61.6%). The non-overlapping confidence intervals for all metrics suggest statistically significant differences in diagnostic performance, favoring ultrasound.
The dominant final diagnosis was appendicitis (273, 52%). The overall diagnostic accuracy for appendicitis using ultrasound was 78.9% (414/525 correct diagnoses), while the overall accuracy of the initial clinical + laboratory assessment was 57.3% (301/525 correct diagnoses). Univariable analysis was performed to identify patient demographics, clinical presentation features, and laboratory findings associated with the accuracy of ultrasound diagnosis and clinical-laboratory assessment.
Univariable analysis of factors associated with clinical-laboratory diagnostic accuracy for appendicitis (N = 525).
Bold values indicate significant p-values.
Patient age showed a borderline significant association, with younger patients having slightly less accurate clinical-laboratory diagnoses (mean 32.5 ± 13.0 years for correct vs 29.0 ± 10.8 years for incorrect, p = 0.048). Gender, symptom duration, and other individual symptoms like loss of appetite or nausea did not demonstrate a significant association with clinical-laboratory accuracy (p > 0.05 for all).
Univariable analysis of factors associated with ultrasound diagnostic accuracy for appendicitis.
Bold values indicate significant p-values.
EM: emergency medicine.
No significant difference was found in mean patient age between those correctly diagnosed by ultrasound (mean 30.8 ± 11.5 years) and those incorrectly diagnosed (mean 31.5 ± 12.9 years; t = −0.85, p = 0.39). Similarly, gender, symptom duration, and the other symptoms did not show a statistically significant association with ultrasound accuracy in this univariable analysis (p > 0.05 for all).
To identify independent predictors of diagnostic accuracy for appendicitis, multivariable logistic regression models were developed. Separate models were created for ultrasound diagnostic accuracy and clinical-laboratory diagnostic accuracy. Variables that were found to be statistically significant (p < 0.05) in the univariable analysis were considered for inclusion in the multivariable models.
For the clinical-laboratory diagnostic accuracy model of appendicitis, predictors included age, ED experience, presence of pain migration, rebound tenderness, Rovsing’s sign, temperature, WBC count, and neutrophil percentage.
Multivariable logistic regression analysis of independent predictors for clinical-laboratory diagnostic accuracy.
Bold values indicate significant p-values.
aOR: adjusted odds ratio; CI: confidence interval; WBC: white blood cell; ED: emergency department.
The multivariable logistic regression model for ultrasound diagnostic accuracy included US operator experience, presence of rebound tenderness, WBC count, and neutrophil percentage as potential predictors.
Multivariable logistic regression analysis of independent predictors for ultrasound diagnostic accuracy.
Bold values indicate significant p-values.
aOR: adjusted odds ratio; CI: confidence interval; EM: emergency medicine; WBC: white blood cell.
Discussion
The findings of this large retrospective study involving 525 patients underscore the superior diagnostic utility of ultrasound compared to initial clinical-laboratory assessment alone for acute appendicitis in patients presenting with RIF pain in our Iraqi emergency department settings. With an overall accuracy of 78.9%, ultrasound significantly outperformed clinical-laboratory assessment (57.3%), primarily driven by its markedly higher sensitivity (89.7% vs 67.4%) and better specificity (67.1% vs 46.4%). These results align with literature suggesting that while clinical evaluation is fundamental, imaging significantly enhances diagnostic precision for RIF pain, particularly for appendicitis. 2
A key strength of ultrasound identified in our study is its high sensitivity. This is crucial in an emergency setting to correctly identify patients requiring surgical intervention, thereby minimizing delays in treatment for true appendicitis. The NPV of 85.8% for ultrasound, while good, indicates that a negative ultrasound does not entirely rule out appendicitis, a finding consistent with other studies acknowledging the limitations of US, such as operator variability or difficult visualization.1,9 Our multivariable analysis further highlighted that US operator experience was an independent predictor of ultrasound accuracy, with Senior EM Physicians achieving higher accuracy than residents. This finding emphasizes the importance of structured training and experience in optimizing US performance, a well-recognized factor in sonography.2,4 The presence of rebound tenderness also independently predicted higher US accuracy, suggesting that focused scanning in patients with clear peritoneal signs might yield more definitive results.
A key finding of our study is the successful application of a portable, point-of-care ultrasound (POCUS) device (SonoSite M-Turbo). This small-size system demonstrated diagnostic accuracy on par with that reported in studies using traditional, stand-alone machines.10,11 The portability and accessibility of such devices are crucial in a busy emergency department, allowing for rapid bedside assessment. Our results, which show even residents achieved a 73.9% accuracy rate, support the growing trend of integrating POCUS into the standard workflow of emergency physicians for evaluating RIF pain, which can expedite diagnosis and patient disposition.
The relatively lower specificity of clinical-laboratory assessment (46.4%) in our cohort indicates a higher rate of false positives when relying on these initial findings alone, potentially leading to unnecessary further investigations or even surgery. While ultrasound’s specificity (67.1%) was better, it still implies a considerable number of false positives. This can be attributed to the wide range of pathologies presenting with RIF pain, such as ovarian cysts (a common final diagnosis in 21.90% of our cohort) or ureteric stones, which can mimic appendicitis both clinically and sometimes sonographically.1,4 Indeed, ovarian pathology was the second most common initial clinical diagnosis after appendicitis, and also a frequent ultrasound and final diagnosis, particularly relevant given that 55.05% of our cohort were female.
Our study also explored factors influencing the accuracy of the initial clinical-laboratory assessment. The multivariable analysis showed that ED physician experience, presence of pain migration, rebound tenderness, higher WBC count, and higher neutrophil percentage were independent predictors of accurate clinical-laboratory diagnosis. This suggests that experienced clinicians, guided by classic signs and supportive laboratory markers, can achieve better diagnostic accuracy even before imaging. However, the overall lower performance compared to US suggests that these factors alone are insufficient for definitive diagnosis in a substantial proportion of cases.
The finding that residents performed the largest share of the ultrasound scans (48.95%) and also conducted most initial ED assessments (60.19% by resident emergency medicine physicians) is an important contextual factor. While their US accuracy was lower than that of Senior EM Physicians, providing access to US performed by residents, potentially as a point-of-care tool under appropriate supervision or with clear protocols, could still be beneficial compared to relying solely on clinical-laboratory assessment, as indicated by the overall better performance of US. This aligns with studies exploring point-of-care ultrasound by emergency physicians.2
This study’s use of a large sample size from three hospitals enhances the generalizability of our findings within the Iraqi context. The retrospective design is a limitation, as it relies on recorded data that may be subject to documentation variability; however, data were systematically collected using a standardized form. Furthermore, the lack of a single, standardized ultrasound protocol across the three hospitals is a limitation, although accepted clinical guidelines were followed. The exclusion of 95 patients who did not undergo ultrasound, mostly due to proceeding directly to surgery, introduces a potential selection bias, as this cohort may have had a higher pre-test probability of appendicitis. Of the patients with a negative or equivocal ultrasound in our study, medical records indicated that only a small fraction (approximately 8%) proceeded to have a CT scan, reflecting the role of US as a definitive primary imaging tool in our setting. The definition of “clinical + laboratory assessment” encompassed the initial ED physician diagnosis, which inherently includes their interpretation of symptoms, signs, and basic labs; future studies could attempt to model these elements separately.
The current study did not evaluate CT scanning in direct comparison for the entire cohort, which is known to have high accuracy for appendicitis. However, our focus was on the initial workup comparing US with clinical assessment, as US is often the first-line imaging due to accessibility and safety. The relatively high rate of surgical confirmation (76.76%) provides a robust reference standard for a large portion of the cohort.
The challenges in diagnosing RIF pain, particularly in differentiating appendicitis from gynecological conditions in females, as highlighted by Ahmed et al.1 and Standing et al.,6 are reflected in our data. Ultrasound, with its ability to visualize pelvic organs, proved superior to clinical assessment alone in this diverse patient group.
In conclusion, this study demonstrates that ultrasound has significantly better diagnostic performance than initial clinical-laboratory assessment for acute appendicitis in patients presenting with RIF pain in the studied Iraqi emergency settings. The higher sensitivity, specificity, predictive values, and overall accuracy of ultrasound underscore its role as a critical diagnostic tool. Factors such as US operator experience influence its accuracy, highlighting the need for continuous training and quality assurance. While clinical acumen, particularly with experienced ED physicians, remains vital, integrating ultrasound early in the diagnostic pathway for RIF pain can lead to more accurate and timely diagnoses, potentially reducing diagnostic uncertainty and improving patient management. Further research could explore standardized ultrasound protocols and the impact of integrating US findings with clinical scoring systems in this specific population to optimize diagnostic pathways.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical considerations
Institutional review board approval was waived at all institutions to access the hospital electronic medical records on November 15, 2023 (Approval Number: O-2-2023).
Consent to participate
Informed verbal consent was obtained from all subjects involved in the study.
Data Availability Statement
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
