Abstract
Artificial intelligence (AI) is a rapidly growing field with significant implications for radiology. Acute abdominal pain is a common clinical presentation that can range from benign conditions to life-threatening emergencies. The critical nature of these situations renders emergent abdominal imaging an ideal candidate for AI applications. CT, radiographs, and ultrasound are the most common modalities for imaging evaluation of these patients. For each modality, numerous studies have assessed the performance of AI models for detecting common pathologies, such as appendicitis, bowel obstruction, and cholecystitis. The capabilities of these models range from simple classification to detailed severity assessment. This narrative review explores the evolution, trends, and challenges in AI applications for evaluating acute abdominal pathologies. We review implementations of AI for non-traumatic and traumatic abdominal pathologies, with discussion of potential clinical impact, challenges, and future directions for the technology.
[Graphical abstract: a visual representation of the abstract.]
Introduction
Radiology is experiencing a transformative shift due to advances in artificial intelligence (AI). AI algorithms have been employed for diverse tasks ranging from automated image interpretation to report generation. 1 The unique challenges and opportunities associated with acute abdominal pathologies, which often require rapid and accurate diagnosis, make them an ideal area for the integration of AI technologies. This narrative review aims to discuss the current landscape, emerging trends, and challenges of AI applications in acute abdominal imaging.
AI refers to the creation of algorithms and computational models that perform tasks typically requiring human intelligence, such as pattern recognition, decision making, and language understanding (Table 1). Deep learning is a subtype of AI involving neural networks which can automatically learn and improve without being explicitly programmed, based on multi-layer data structures that resemble the functioning of the human brain. In radiology, convolutional neural networks (CNNs) are the most commonly used architecture, designed primarily to process image data. These networks extract hierarchical image features to identify patterns, enabling tasks relevant to radiology like image classification and segmentation. 2
Table 1. Definitions of Commonly Used Terms Related to AI.
Acute abdominal pain is a common yet complex clinical presentation, encompassing a wide spectrum of conditions ranging from benign to life-threatening. Current diagnostic approaches rely heavily on imaging including computed tomography (CT), ultrasound, and radiographs. The interpretation of these imaging modalities is time-intensive and dependent on the expertise of radiologists. In this context, AI, particularly deep learning, has emerged as a transformative tool, offering the potential to augment diagnostic accuracy, reduce time to diagnosis, and alleviate the workload of radiologists and referring providers.3,4
This review explores the role of AI in enhancing the interpretation of acute abdominal imaging, discussing how these technologies assist in the detection and characterization of acute abdominal conditions. Key areas of focus include the use of AI in improving the diagnostic accuracy of appendicitis, bowel obstruction, cholecystitis, and other emergent abdominal pathologies. Additionally, we explore the use of artificial intelligence in the assessment of abdominal trauma, including identification and grading of injuries. The review also discusses several potential clinical applications of AI in workflow optimization, including automated triaging of cases and use as a clinical decision-making aid. The challenges and limitations inherent in the adoption of AI in clinical practice, such as the need for large, annotated datasets, are also addressed.
Clinical Applications
Non-Traumatic Pathologies
Non-traumatic abdominal conditions, such as appendicitis, cholecystitis, and bowel obstructions, present numerous diagnostic challenges. These conditions often manifest with similar symptoms and physical examination findings, making accurate diagnosis based solely on clinical assessment difficult. Imaging, therefore, plays a crucial role in differential diagnosis. AI-assisted image analysis can help expedite the diagnostic process, potentially leading to quicker and more accurate decision-making and reduced time to treatment (Table 2). Most AI models are designed to assess a single pathology; however, several recent models have attempted to integrate diagnosis of multiple pathologies, a significant step toward creating a more comprehensive diagnostic system.
Table 2. Summary of Selected AI Models for Acute Abdominal Imaging.
Note. Results reported with 95% confidence intervals or standard deviation (±) when available. AUC = area under the receiver operating characteristic curve; AAA = abdominal aortic aneurysm; CNN = convolutional neural network; CT = computed tomography; CV = cross-validation; DSC = dice similarity coefficient; EVAR = endovascular aneurysm repair; FAST = focused assessment with sonography in trauma; RNN = recurrent neural network; SVM = support vector machine; US = ultrasound; XR = radiography.
Pneumoperitoneum
The diagnosis of acute bowel pathologies, such as bowel perforation, obstruction, and inflammatory conditions, is a complex task that requires careful evaluation of imaging studies by radiologists. AI algorithms, particularly those based on deep learning, have shown promising results in identifying these conditions with high accuracy.
Abdominal radiographs are commonly used as a first-line imaging modality for screening of acute abdominal pathology, such as perforation. Pneumoperitoneum is a critical finding on abdominal radiographs that may indicate a perforated viscus, often requiring urgent surgical intervention. 5 Early detection of pneumoperitoneum is essential for prompt management, which can significantly reduce the risk of complications such as sepsis and peritonitis. 6 However, in cases where patients cannot stand due to pain, supine radiographs are used, albeit with reduced sensitivity for detecting small amounts of free air. Park et al developed a deep-learning model for identifying pneumoperitoneum which demonstrated high accuracy on both erect and supine radiographs compared to radiologists. 7 For supine radiographs, sensitivity and specificity on an external dataset were 84.9% (95% CI: 77.5%-90.7%) and 74.0% (95% CI: 68.4%-79.1%), respectively. For erect radiographs, sensitivity and specificity were 83.3% (95% CI: 75.7%-89.4%) and 93.4% (95% CI: 89.8%-95.1%), respectively.
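Sensitivity and specificity figures such as those above are derived from a confusion matrix, and the accompanying confidence intervals depend on the number of cases in each group. The following is a minimal sketch of that calculation. The counts are hypothetical and chosen only for illustration, and the Wilson score interval is an assumption, since the cited studies do not necessarily use this method.

```python
import math

def rate_with_wilson_ci(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half

def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return {
        "sensitivity": rate_with_wilson_ci(tp, tp + fn),
        "specificity": rate_with_wilson_ci(tn, tn + fp),
    }

# Hypothetical counts, not taken from any study in this review.
result = sensitivity_specificity(tp=85, fn=15, tn=180, fp=20)
for name, (p, lo, hi) in result.items():
    print(f"{name}: {p:.1%} (95% CI: {lo:.1%}-{hi:.1%})")
```

Because the interval width shrinks with sample size, two models with the same point estimate can differ substantially in how precisely their performance is known.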
CT is a more sensitive modality for detecting pneumoperitoneum and is used when clinical suspicion is high. Brejnebøl et al examined the diagnostic performance of an AI algorithm for detecting pneumoperitoneum on CT for patients presenting with acute abdominal pain. 8 Their model demonstrated high specificity (99%), but only moderate sensitivity (52%), with most false-negative cases having small volumes of free air (<0.25 mL). Because of the moderate sensitivity and potential critical implications of false negatives, the authors concluded that their algorithm was not suitable for stand-alone screening. However, due to its high specificity, the model could instead have applications for ruling in pneumoperitoneum, with potential for integration into existing workflows to expedite treatment for acutely ill patients.
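The rule-in versus rule-out reasoning above can be made concrete with likelihood ratios. The sketch below uses the reported sensitivity (52%) and specificity (99%); the pre-test probability of 5% is an assumption chosen purely for illustration and does not come from the study.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

def post_test_probability(pre_test: float, lr: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

lr_pos, lr_neg = likelihood_ratios(sensitivity=0.52, specificity=0.99)

pre_test = 0.05  # assumed prevalence, for illustration only
after_positive = post_test_probability(pre_test, lr_pos)
after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability after a positive result: {after_positive:.1%}")
print(f"Post-test probability after a negative result: {after_negative:.1%}")
```

A positive likelihood ratio of about 52 means a positive result raises the probability of disease substantially, whereas a negative likelihood ratio of about 0.48 means a negative result only roughly halves the odds, which is consistent with the authors' conclusion that the algorithm is unsuitable for stand-alone screening.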
Bowel Obstruction
Small bowel obstructions are another cause of acute abdominal pain, presenting with symptoms of abdominal distention and obstipation. In emergency settings, abdominal radiographs and CT scans are utilized to diagnose these obstructions. However, distinguishing them from non-surgical conditions such as ileus, which may appear similar on imaging, can be challenging due to overlapping radiologic features. Recent studies have explored the use of deep learning to identify small bowel obstruction on abdominal radiograph9-11 and CT.12-14 The models demonstrated overall high diagnostic accuracy, with sensitivity and specificity ranging from 83.8% to 91.4% and 68.1% to 93.0% respectively, for radiographs. For CTs, sensitivity and specificity ranged from 83.0% to 98.0% and 76.0% to 90.0% respectively. An important aspect in evaluating small bowel obstruction is identifying transition zones, which can indicate the obstruction’s cause, differentiate between open-loop and closed-loop patterns, and aid in surgical planning. However, locating these transition zones can be time-consuming and subject to inter-reader variability. 15 Vanderbecq et al developed a deep learning model to help radiologists identify small bowel transition zones. 13 The model successfully highlighted the region containing the transition zone in 92% of cases, indicating its potential to reduce search time and pave the way for fully automated localization algorithms in the future.
Acute Appendicitis
For diagnosing acute appendicitis, CT and ultrasound are typically the first-line imaging modalities used. AI-assisted diagnosis of acute appendicitis has been extensively researched: a systematic review identified 22 models to diagnose the condition. 16 However, of these, only 3 models used imaging as the input modality, with the remainder using demographic factors, clinical observations, laboratory data, or a combination thereof. The models that did use imaging data (CT scans) generally showed strong performance, with sensitivities ranging from 78.4% to 90.2% and specificities between 66.7% and 96.0%.17-19 One of the top performing models employed a 3D CNN algorithm to diagnose appendicitis on a manually extracted 4 cm³ appendix region from CT scans, demonstrating a sensitivity and specificity of 90.2% and 92.0%, respectively. 17 The model demonstrated comparable accuracy when tested with data from external institutions, indicating its potential generalizability.
Ultrasound is a frequently used tool for diagnosing acute appendicitis in paediatric patients, but the effectiveness of AI in this context is less studied due to the dynamic and operator-dependent nature of the modality. Hayashi et al assessed the performance of a model to identify the appendix on cine-images of paediatric appendicitis cases using a U-Net-based deep learning architecture. 20 Although this study was conducted on a relatively small dataset of 70 patients (50 for training, 20 for testing), the inflamed appendix was at least partially identified in 70% of cases. The AI assistance was particularly beneficial for ultrasounds with shallower scan areas, with performance declining when the scan depth exceeded 8 cm. The study also investigated how AI assistance impacted the diagnostic confidence of clinical staff. Notably, when the AI failed to accurately detect the appendix, it negatively affected the evaluators’ confidence in their diagnosis. This parallels findings in breast imaging, where computer-aided diagnosis (CAD) has shown varied impacts on reader confidence. 21 These results suggest that while AI can be valuable for assisted scanning techniques, its integration requires consideration of its limitations and strategies to minimize potential negative effects on clinical decision-making.
Necrotizing Enterocolitis and Intussusception
Deep learning methods have also been developed for diagnosis of paediatric-specific acute abdominal pathologies such as necrotizing enterocolitis (NEC) and intussusception.22-25 Due to the radiation dose, abdominal radiographs and ultrasound are preferentially used compared to CT. NEC is a critical diagnosis in neonatal infants given the high morbidity and mortality associated with the condition. 26 Gao et al used deep learning to assess abdominal radiographs for NEC, with sensitivity and specificity of 85.4% (95% CI: 81.0%-89.8%) and 80.7% (95% CI: 75.8%-95.6%), respectively. 22 When combined with clinical features, sensitivity and specificity increased to 94.3% (95% CI: 91.4%-96.5%) and 82.5% (95% CI: 77.7%-87.1%), respectively, underscoring the potential of AI to integrate multimodal data to enhance diagnostic accuracy. When compared to clinicians, the multimodal model was found to be equivalent to senior clinicians and superior to junior clinicians.
Ileocolic intussusception is an important cause of acute abdominal pain in younger paediatric patients, given its association with complications such as bowel ischaemia and perforation if left untreated. 27 While ultrasound is superior to radiography for detection of intussusception, radiographs are commonly used as an initial modality despite fairly low sensitivity for intussusception (45%). 27 Kim et al evaluated the performance of a deep learning model for identification of ileocolic intussusception on abdominal radiographs. 23 When compared to the average performance of two radiologists, the model sensitivity was significantly higher (76% vs 46%, P = .013), with comparable specificity (96% vs 92%, P = .32). The results suggest the potential for AI to be used as a support tool, such as in settings with limited access to specialized paediatric radiologists.
There are additional examples of applications for other acute bowel pathologies, such as colitis 28 and acute diverticulitis,18,29 highlighting the diversity in pathologies and clinical presentations. Additionally, combined applications to identify and differentiate multiple acute abdominal pathologies have been proposed,18,30 demonstrating the potential for future integrated diagnostic models.
Cholecystitis
The applications of AI extend beyond identifying bowel pathologies to a range of other non-traumatic acute abdominal conditions. These include pathologies such as cholecystitis, pancreatitis, renal colic, and aortic aneurysms. Each of these conditions presents its own unique set of challenges for model development and implementation.
Cholecystitis is a common cause of acute abdominal pain. Delayed diagnosis can lead to complications including gangrenous cholecystitis, gallbladder perforation and hepatic abscess, 31 making timely diagnosis essential. Ultrasound and CT are the modalities of choice for diagnosis. On ultrasound, Yu et al demonstrated a CNN model for detection of cholelithiasis and cholecystitis with performance comparable to human readers even in cases where only two-thirds of the gallbladder was visible. 32 This could be applied to point-of-care ultrasound, where the model could assist emergency physicians with patient triage or determining the need for a diagnostic exam. Identifying complicated cholecystitis is also important for guiding appropriate surgical intervention and patient prognostication. A preliminary study assessing the accuracy of a CNN for differentiating pathologically proven gangrenous cholecystitis versus non-complicated cholecystitis on CT demonstrated sensitivity and specificity of 70% (95% CI: 44%-87%) and 93% (95% CI: 88%-96%) respectively, surpassing performance of experienced radiologist and surgeon reviewers. 33 These findings indicate that AI is useful not only in identifying diseases, but also in characterizing key clinical findings such as gallstones and gangrene, which are crucial for making informed treatment decisions.
Pancreatitis
Pancreatitis is another area where imaging plays an important role in diagnosis and assessing disease severity. However, imaging features for pancreatitis are varied and patients may have a normal-appearing pancreas on imaging, despite having positive biochemical and clinical features. 34 Radiomics, an emerging field in AI, leverages quantitative analysis of imaging features, potentially revealing details that may not be discernible to the human eye. 35 A study by Mashayekhi et al used a radiomics model to differentiate between functional abdominal pain, recurrent acute pancreatitis, and chronic pancreatitis, based on a small dataset of 56 CT studies. 36 Overall model accuracy was 82.1%, with sensitivity and specificity for the recurrent acute pancreatitis group of 95% and 78%, respectively. These findings underscore the potential for radiomics to enhance diagnostic precision beyond traditional methods. Pancreatitis complications such as necrosis are an important imaging finding with management and prognostic implications. A study by Lin and Lin estimated severity of complicated pancreatitis by segmenting the volume of peripancreatic fluid relative to normal parenchyma. 37 The model identified peripancreatic fluid collections with 89.6% accuracy. Such segmentation models are valuable for quantitative assessment and may aid in severity grading and improved prognostication. 38
Renal Colic
Renal colic is a common reason for presentation to the emergency department (ED). Several models have been developed for identification of ureteric calculi on CT, with sensitivity ranging from 88.0% to 97.0% and specificity from 91.0% to 98.9%.39-44 Parakh et al developed a cascading CNN model for detecting urinary stones on unenhanced CT images. 39 The model involved 2 CNNs, the first to identify relevant CT image slices containing the urinary tract, and the second to identify the presence of stones within the selected slices. The sensitivity and specificity of the model were 94.0% (95% CI: 87.4%-100%) and 96.0% (95% CI: 90.6%-100%) respectively, with calculi correctly identified in all cases where obstructive uropathy was present. Notably, models were able to correctly identify calculi irrespective of the presence of phleboliths, which can pose a diagnostic challenge. 45
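The cascading design described above, in which one model selects relevant slices and a second model classifies only those slices, can be sketched as follows. This is an illustrative outline rather than the authors' implementation: the two networks are represented by hypothetical callables (`is_urinary_tract`, `stone_probability`), and the 0.5 decision threshold is an assumption.

```python
from typing import Callable, Dict, List, Sequence

# Placeholder type for one axial CT slice; a real pipeline would use an image array.
Slice = Sequence[float]

def cascade_detect(
    slices: List[Slice],
    is_urinary_tract: Callable[[Slice], bool],    # stage 1: slice selector (hypothetical)
    stone_probability: Callable[[Slice], float],  # stage 2: stone classifier (hypothetical)
    threshold: float = 0.5,                       # assumed decision threshold
) -> Dict[str, object]:
    """Run a two-stage cascade: select relevant slices, then classify them."""
    # Stage 1: keep only the indices of slices that contain the urinary tract.
    relevant = [i for i, s in enumerate(slices) if is_urinary_tract(s)]
    # Stage 2: evaluate the stone classifier on the selected slices only.
    positive = [i for i in relevant if stone_probability(slices[i]) >= threshold]
    return {
        "relevant_slices": relevant,
        "stone_slices": positive,
        "stone_present": bool(positive),
    }

# Toy stand-ins: each "slice" is (tract_flag, stone_score) instead of pixel data.
toy_slices = [(0, 0.9), (1, 0.2), (1, 0.8)]
report = cascade_detect(
    toy_slices,
    is_urinary_tract=lambda s: s[0] > 0,
    stone_probability=lambda s: s[1],
)
print(report)
```

One consequence of this design is that the second stage never sees slices rejected by the first, so a high score outside the urinary tract (the first toy slice here) cannot produce a false positive, but a stage 1 miss cannot be recovered downstream.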
Acute Aortic Syndrome
Acute aortic pathologies encompass a range of disorders, among which are critical conditions like ruptured aortic aneurysms, requiring immediate diagnosis and treatment. CT angiography is the first-line imaging modality for diagnosis. Multiple studies have evaluated automated assessment of the aorta on CT, most commonly for assessing abdominal aortic aneurysms (AAAs), 46 with one top-performing model demonstrating sensitivity and specificity for detecting AAA of 98% and 96%, respectively. 47 There is additionally one commercially available algorithm for AAA detection which recently received FDA approval. 48 Particularly high-risk patients are those who have undergone endovascular aneurysm repair (EVAR), a common treatment for aortic aneurysms. Endoleaks, which may occur due to incomplete contact of the stent graft with the aortic wall or ruptures in the graft material, can cause the aneurysm sac to keep growing, thereby increasing rupture risk. 49 Talebi et al evaluated a model to detect endoleaks, achieving 90% precision and 100% recall, outperforming 3 general diagnostic radiologists and performing comparably to a cardiovascular subspecialty radiologist’s assessment. 50 The size of the aneurysm sac is also a crucial indicator of patient outcomes in post-EVAR patients, with aneurysm expansion being a potential marker for increased rupture risk. 51 Hahn et al developed a model for automatically calculating the volume of post-EVAR aneurysm sacs with an average segmentation accuracy of 91% ± 5%. 52 Use of automated volume estimation may improve accuracy compared to traditional diameter-based sizing, 53 allowing for more sensitive detection of smaller leaks.
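To illustrate how a segmentation output translates into a volume, and why volume may respond more strongly than diameter to sac growth, the sketch below counts voxels in a binary mask and compares it with an idealized sphere. The mask layout, voxel spacing, and spherical geometry are all assumptions for illustration; real aneurysm sacs are not spherical and this is not the method of the cited studies.

```python
import math
from typing import List, Tuple

Mask3D = List[List[List[int]]]  # [slice][row][column], values 0 or 1

def mask_volume_ml(mask: Mask3D, spacing_mm: Tuple[float, float, float]) -> float:
    """Volume of a binary segmentation mask in millilitres.

    spacing_mm is (slice thickness, row spacing, column spacing) in mm.
    """
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    n_voxels = sum(v for sl in mask for row in sl for v in row)
    return n_voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3

def sphere_volume_ml(diameter_mm: float) -> float:
    """Volume of an idealized sphere, used only to relate diameter to volume."""
    return math.pi * diameter_mm**3 / 6 / 1000.0

# Tiny hypothetical mask: 5 labelled voxels, each 2.0 x 1.0 x 1.0 mm.
toy_mask = [[[1, 1], [1, 0]], [[0, 0], [1, 1]]]
toy_volume = mask_volume_ml(toy_mask, spacing_mm=(2.0, 1.0, 1.0))

# Under the spherical assumption, a 5% diameter increase (50 mm to 52.5 mm)
# corresponds to a relative volume increase of 1.05**3 - 1.
relative_growth = sphere_volume_ml(52.5) / sphere_volume_ml(50.0) - 1
print(f"Toy mask volume: {toy_volume:.3f} mL")
print(f"Relative volume change for +5% diameter: {relative_growth:.1%}")
```

Because volume scales with the cube of linear size, a 5% change in diameter corresponds to roughly a 16% change in volume in this idealized case, which suggests why volumetric follow-up may reveal growth that diameter measurements miss.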
Traumatic Pathologies
In the setting of abdominal trauma, imaging is essential for rapid identification of intra-abdominal injury to direct resuscitation and management. Blunt or penetrating abdominal trauma can produce varying injury patterns, such as organ laceration, bowel perforation, and hemoperitoneum. Numerous AI models have been developed to identify injuries at multiple steps along the trauma imaging pathway, with a recent scoping review showing 212 algorithm prototypes and 10 FDA-approved tools. 54
Hemoperitoneum
Focused assessment with sonography in trauma (FAST) is a widely used initial diagnostic tool in clinical settings for injury screening. FAST exams are quick, non-invasive, and provide immediate insights at the bedside into the presence of organ injury or secondary signs like hemoperitoneum. However, a notable limitation of FAST is that it is operator dependent, leading to varying interpretation accuracy. AI may assist in standardizing FAST performance by offering a more reliable interpretation, reducing variability caused by different levels of operator expertise. Cheng et al developed a CNN model for identifying free fluid in Morison’s pouch, 55 a common finding indicative of intraabdominal injury. The study, which used a training dataset of 324 patients and a test dataset of 36 patients, achieved sensitivity of 97.6% and specificity of 94.7%, comparable to or exceeding that of emergency medicine residents. 56 Other studies have achieved similarly strong results for identifying splenic trauma and hemoperitoneum on FAST exams.57-59
On CT, several models have also been developed for identification of hemoperitoneum.60,61 Dreizin et al developed a method for detection and quantification of hemoperitoneum using an attention-based CNN. 61 Model performance for predicting clinically significant hemoperitoneum, defined using a composite outcome of haemostatic intervention, transfusion and in-hospital mortality, was significantly better than volume estimation using traditional subjective methods, with a sensitivity of 82% and specificity of 93%. 61 Quantitative assessment is an area where artificial intelligence has the potential to improve on current methods, with enhanced decision-making processes based on predictive modelling.
Solid Organ Injury
CT is the workhorse for imaging of acute abdominal trauma, with diagnostic imaging integrated into most trauma workflows. Several models have been developed for automated identification of acute abdominal traumatic injury, including solid organ injury to the liver, spleen, and kidneys.62-64 Several of these models have additionally evaluated automated injury grading. Hamghalam et al evaluated use of a 3D CNN for automatically grading the severity of splenic injuries, based on the American Association for the Surgery of Trauma (AAST) spleen injury scale. 64 Their model demonstrated high sensitivity (91.1% ± 2.3%) and specificity (94.7% ± 1.8%) for high-grade injuries with vascular involvement (AAST grade IV and V).
The Radiological Society of North America (RSNA) hosted an Abdominal Trauma Detection AI Challenge in 2023. 65 Researchers were tasked with developing models to detect and classify traumatic injuries across multiple organs, including the liver, spleen, kidneys, and bowel. The winning solution used a multi-step approach involving 3D segmentation for organ masking and a combination of 2D convolutional and recurrent neural network techniques. 66 Code competitions represent a novel approach to innovation, where large, publicly available datasets allow community members to create tools to advance the specialty.
Discussion
Clinical applications of AI in acute abdominal imaging are rapidly evolving, with potential to aid in diagnosis of various pathologies. AI algorithms have shown strong accuracy in identifying and characterizing acute traumatic and non-traumatic abdominal pathologies. These technologies have the potential to improve diagnostic accuracy, expedite ED turnaround and offer clinical decision support beyond currently used pathways.
The development of advanced image interpretation models, driven largely by the popularity of deep learning and CNNs, will likely have significant implications for clinical practice. Current evidence supports the view that AI-assisted radiologists can operate more efficiently, including in emergency settings where AI can enhance workflow and consequently reduce reading time in certain applications. 67 There are already a large number of commercially available, Health Canada and FDA-approved algorithms for radiology, for tasks including lesion detection, organ segmentation, and classification of findings. 46 However, few of these algorithms are directly related to tasks involving acute abdominopelvic imaging, possibly due to the heterogeneity in image appearances across patients and organ systems. The continual evolution of model architectures and training techniques will likely facilitate the development of more diverse clinical applications in abdominal imaging.
Clinical decision support models which combine imaging features, laboratory results, and clinical symptoms may also assist in providing a comprehensive assessment of a patient’s condition by integrating multiple data sources. For instance, Reismann et al developed a diagnostic algorithm for predicting complicated appendicitis based on a combination of US, laboratory, and clinical findings. 68 Other clinical decision algorithms for ovarian torsion, 69 bowel obstruction, 70 and abdominal trauma 71 have demonstrated accurate performance for both diagnostic and prognostic applications. These integrated models may offer individualized data beyond the capabilities of current ED pathways.
Beyond image interpretation, AI may also assist in the emergency radiologist’s workflow in tasks such as study protocolling and worklist prioritization. 72 Wang et al found that the CT workflow interval, defined as the time from initial CT request to provision of first report, accounted for 29% of patients’ average total ED length of stay. 73 Of this, radiology turnaround time, defined as the time from CT completion to provision of first report, accounted for 32% of the CT workflow interval. 73 There has been emerging research exploring how natural language processing models, which enable computers to understand human language, can be used for automated triage and protocolling of radiology studies.74,75 By integrating AI into non-interpretative workflows, patient flow through the ED can be improved through quicker protocolling and interpretation of critical studies.
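The two proportions reported by Wang et al can be combined to estimate the share of total ED length of stay attributable to radiology turnaround. The short calculation below shows this arithmetic. The 300-minute length of stay and the 20% turnaround reduction are hypothetical values used only to show the scale of the effect.

```python
def radiology_share_of_los(ct_share_of_los: float, radiology_share_of_ct: float) -> float:
    """Fraction of total ED length of stay spent in radiology turnaround."""
    return ct_share_of_los * radiology_share_of_ct

def minutes_saved(total_los_min: float, share: float, relative_reduction: float) -> float:
    """Minutes saved if turnaround time falls by the given relative reduction."""
    return total_los_min * share * relative_reduction

# Reported proportions: CT workflow interval is 29% of ED length of stay,
# and radiology turnaround is 32% of the CT workflow interval.
share = radiology_share_of_los(ct_share_of_los=0.29, radiology_share_of_ct=0.32)

# Hypothetical scenario: 300-minute stay, 20% faster turnaround.
saved = minutes_saved(total_los_min=300.0, share=share, relative_reduction=0.20)

print(f"Radiology turnaround share of ED length of stay: {share:.1%}")
print(f"Minutes saved in the hypothetical scenario: {saved:.1f}")
```

On these figures, radiology turnaround represents about 9% of total ED length of stay, so gains from faster interpretation alone are bounded, and improvements to the remainder of the CT workflow interval (request, protocolling, and acquisition) may matter at least as much.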
Despite these advancements, several challenges limit the broader adoption of AI in clinical practice for acute abdominal imaging. There is significant variability in the size and quality of datasets used for model development (Table 2), which leads to challenges when evaluating model performance in real-world scenarios. Large, multi-institutional, annotated datasets are required to develop robust and generalizable models. While there are several publicly available image datasets for other radiology subspecialties such as chest and musculoskeletal imaging, 76 the availability of high-quality public data in abdominal radiology is limited. Furthermore, the lack of standardization and interoperability across different healthcare systems poses challenges for model implementation. There is additionally skepticism amongst radiologists about the utility and reliability of AI, in part due to the complexity of the models and lack of transparency. 77 While some of these reservations may be addressed as more clinically validated models become available, ongoing work to increase model interpretability is needed to improve trust in the results. 78
In conclusion, the application of AI in acute abdominal imaging is an emerging area of innovation with significant potential to transform diagnostic processes and patient care. Although there are ongoing challenges, the combined efforts of the medical and technological sectors are progressing toward a future where AI can improve image interpretation and clinical decision-making in acute care environments.
Footnotes
Abbreviations
AAA Abdominal aortic aneurysm
AI Artificial intelligence
CAD Computer-aided diagnosis
CNN Convolutional neural network
CT Computed tomography
ED Emergency department
EVAR Endovascular aneurysm repair
FAST Focused assessment with sonography in trauma
NEC Necrotizing enterocolitis
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
