Abstract
Artificial intelligence (AI) is increasingly integrated into point-of-care ultrasound (POCUS) to enhance its utility in critical care settings. This manuscript explores the current state of AI applications in POCUS, focusing on key domains such as image acquisition, image interpretation, education, task automation, procedural guidance, program development, and quality assurance. AI-driven tools can potentially improve image quality, provide real-time feedback, and assist in the interpretation of ultrasound images, thereby democratizing the use of POCUS across varying levels of operator expertise. This narrative review highlights relevant studies demonstrating the clinical utility of AI in POCUS, discusses the challenges that remain, and provides insights into future developments. The goal is to equip intensivists with a comprehensive understanding of how AI can support POCUS practice today and what advancements are on the horizon.
Introduction
It's in your phone, your car, your thermostat, and now your ultrasound machine. You didn’t ask for it, but there it is - primed and ready to opine on whether your patient really needs that extra bolus of fluid. Artificial intelligence (AI) is quietly permeating our everyday lives, and the Intensive Care Unit (ICU) is not excluded. Point-of-care ultrasound (POCUS) is an indispensable tool to the modern-day intensivist, bringing real-time diagnosis and monitoring to the bedside. However, ultrasound quality and interpretation are highly operator-dependent, posing challenges in the ICU where operator experience varies widely. 1 AI is positioned to bridge this expertise gap and democratize the use of POCUS, with handheld and cart-based devices now routinely incorporating AI-driven features for image optimization, anatomical labelling, and automated measurements.
AI is a term that broadly describes any task performed by a computer meant to mimic human intelligence. 2 Machine learning (ML) is a more defined subset of AI that mimics not only human intelligence, but also human learning. Recall the last time you were studying for an exam using a bank of practice questions. You would make an educated guess, then flip to the back of the textbook to check the answer; if incorrect, you would update your understanding of the concept being tested, and only after the exam would you know whether the question bank was worth the cost you paid. In this analogy, the practice questions are the training data that engineers use to build an ML model. The exam itself is a subset of data, set aside and never seen by the model, that is used to test it. Whether or not the question bank was worth it is referred to as the ‘fit’ of the model - a good fit means the question bank was reflective of the content in the exam. Finally, imagine going to another school to write their version of the exam. How well you perform here represents the ‘generalizability’ of the knowledge you gained from your practice questions. 2
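For readers who prefer to see the analogy in code, the following is a minimal sketch in Python and is not drawn from any of the cited studies. It assumes a deliberately simple synthetic dataset with one numeric "image feature" per case, and the names `make_dataset`, `fit_threshold`, and `accuracy` are hypothetical. The model studies the training data (practice questions), is scored on held-out data from the same source (the exam), and is then scored on data from a shifted source (another school's exam).

```python
import random

def make_dataset(n, shift, rng):
    """Synthetic (feature, label) pairs; label 1 = abnormal, 0 = normal.

    `shift` moves every feature value, a stand-in for another hospital's
    machines, probes, and presets (a purely illustrative assumption).
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feature = rng.gauss(2.0 * label + shift, 0.7)
        data.append((feature, label))
    return data

def fit_threshold(train):
    """Learn a cut-off halfway between the mean feature of each class."""
    normal = [f for f, y in train if y == 0]
    abnormal = [f for f, y in train if y == 1]
    return (sum(normal) / len(normal) + sum(abnormal) / len(abnormal)) / 2

def accuracy(threshold, data):
    """Fraction of cases where 'feature above threshold' matches the label."""
    return sum((f > threshold) == (y == 1) for f, y in data) / len(data)

rng = random.Random(0)
home = make_dataset(1000, shift=0.0, rng=rng)
train, test = home[:800], home[800:]               # practice questions vs. the exam
external = make_dataset(1000, shift=1.0, rng=rng)  # another school's exam

threshold = fit_threshold(train)
print(f"training accuracy: {accuracy(threshold, train):.2f}")
print(f"held-out test accuracy (fit): {accuracy(threshold, test):.2f}")
print(f"external accuracy (generalizability): {accuracy(threshold, external):.2f}")
```

In this toy setting the held-out test accuracy stays close to the training accuracy, while accuracy on the shifted external data falls, which is the pattern the exam analogy describes.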
AI is presently able to enhance POCUS across a spectrum of functional domains relevant to critical care. This paper will examine the current state of AI for critical care ultrasound, organized by key domains: image acquisition, image interpretation, education, task automation, procedural guidance, program development, and quality assurance. We highlight current capabilities, relevant studies demonstrating clinical utility, and remaining challenges. 3 We aim to provide intensivists with a comprehensive understanding of how AI can support POCUS practice today, and what future developments are on the near horizon.
Applications of Artificial Intelligence in Point-of-Care Ultrasound
Image Acquisition
We all remember struggling to obtain respectable cardiac windows, and AI-guided image acquisition tries to help us forget by providing live feedback on transducer movement and orientation. 4 Deep learning systems for cardiac ultrasound currently exist, recognizing anatomical structures in real time and displaying on-screen prompts to help align the probe for standard echocardiographic views. 5 Beyond this particular tool, numerous studies of novice scanners in domains spanning echocardiography, deep vein thrombosis (DVT) protocols, and Focused Assessment with Sonography for Trauma (FAST) have shown improved image acquisition and, subsequently, increased use of POCUS for routine clinical assessments.6–10 These systems are timely, as critical care fellowships are increasingly expected to include ultrasound training while faculty resources remain limited. For better or for worse, the development of AI applications for POCUS is quickly outpacing the establishment of much-needed formal POCUS educational programs.11–13
These studies underscore that AI guidance can “skill-enable” clinicians, extending ultrasound to situations when human experts are not available. Early evidence supports AI functioning as a capable bedside coach, helping novice users to acquire high-quality images.
Image Interpretation
The most active area of research for AI use in POCUS is image interpretation, and we highlight specific examples by organ system. The selected examples described below are promising tools that can help automate interpretation of POCUS images, whether as a primary assessment for novice scanners or a second pair of eyes for those more experienced. The benefit of consistent performance by AI models cannot be overstated: they are immune to fatigue and are not subject to various cognitive biases. However, overreliance on AI tools must be avoided, just as overreliance on POCUS findings without clinical integration is dangerous. 14 As these models continue to mature with more diverse datasets, they exemplify how pattern recognition by AI can flag critical findings to expedite additional investigations and management.
Cardiovascular Ultrasound
AI models have been calculating left ventricular ejection fraction (EF) since at least 2020, and have been shown to agree within 5% of experts in most cases. 15 Such technology has been incorporated into POCUS devices, with handheld models from different companies demonstrating expert-level EF predictions even with images obtained by non-experts. 16 Beyond EF estimation, AI algorithms are interpreting other cardiac findings including pericardial effusions, valvular abnormalities, and diastolic dysfunction.17–19 AI also has the potential to automate and standardize pulse checks during Advanced Cardiovascular Life Support (ACLS). Given the unreliability of manual pulse checks among healthcare providers, 20 a model was trained to use carotid artery compressibility to predict return of spontaneous circulation (ROSC) 21 and was found to be 96% accurate.
Lung Ultrasound
The COVID-19 pandemic was an impetus to develop automated lung ultrasound (LUS) solutions. B-line detection was an early candidate given its visually simple appearance, with highly accurate models cropping up that performed well on multi-institutional data, and some even able to differentiate between COVID and non-COVID pneumonia with an area under the curve (AUC) of >0.9.22,23 Identifying ultrasonic ‘fingerprints’ to differentiate causes of respiratory failure can serve as a low-cost, low-risk, rapid screening tool in any setting, but the jury is out on the utility and accuracy of these models given extensive variability in agreement scores when compared to experts.24,25 The heterogeneity of model performance using similar AI tools highlights the need for highly diverse, multi-center data to produce more generalizable models.
For pneumothorax detection, one LUS model demonstrated similar diagnostic accuracy to chest x-ray (CXR) but shorter time to diagnosis, providing valuable lead time to intervention. 26 Another pneumothorax detection model was validated prospectively on ICU patients with sensitivity and specificity of 92% and 80%, respectively, 27 further pushing the envelope to forgo burdensome x-ray machines during times where rapid diagnosis is needed.
With POCUS being the superior tool for effusion and consolidation diagnosis, 28 automating detection can provide an accurate, low-cost solution. Existing models are already able to identify and localize both consolidations and effusions with accuracies approaching 90%,29,30 and a future state combining these kinds of models with guided image acquisition algorithms (as discussed above) would make accurate LUS a reality in both resource-limited and expertise-limited settings.
Trauma Ultrasound
A positive FAST exam in the correct clinical context can mobilize resources like few other imaging results, and AI can assist by highlighting subtle anechoic fluid slices that a rushed human may overlook, with 98% accuracy. 31 The Extended FAST (eFAST) exam is a multi-organ trauma evaluation that adds pneumothorax and pericardial effusion assessment, both of which are currently being studied from an AI perspective. A future state combining these models would result in an all-in-one trauma assessment AI suite for rapid and automated identification of life-threatening injuries.
Vascular Ultrasound
In the undifferentiated patient in shock or respiratory failure, the diagnosis of pulmonary embolism is never far from the mind. Investigation of deep vein thrombosis (DVT) should be routine in these situations, 32 and manual (non-AI) 2-point POCUS DVT studies are a cheap and non-invasive screening method with excellent performance characteristics. 33 With respect to AI options, one automated DVT detection algorithm using images obtained by non-experts achieved sensitivities and specificities as high as 96% and 85%, respectively. 34 Again, consider the power of guided image acquisition with automated interpretation to democratize DVT exams for non-experts or in areas where radiology expertise is not available.
Neurological Ultrasound
Intensivists caring for patients with acute neurologic injuries must maintain a high degree of suspicion for elevated intracranial pressures (ICP). 35 Aside from the gold standard ICP monitors, clinicians rely on bedside examination and computerized tomography (CT) imaging as surrogate markers for ICP, neither of which is sensitive or specific. 36 Optic nerve sheath diameter (ONSD), measured using POCUS, is a non-invasive method of evaluating elevated ICP which has significant correlation with invasive measurements. 37 Automating ONSD measurement could be a helpful tool to screen for elevated ICPs without need for transport or radiation, and one model demonstrated significant correlation (r = 0.7) with expert measurements. 38
Task Automation
Deriving quantitative measurements from POCUS can potentially enrich understanding of deranged physiologic states. Tracing Doppler waveforms and measuring distances are subjective skills: the operator must decide which part of the envelope to include and where best to position linear measurements. Additionally, measurement errors are compounded by formulae that double or square the measured values. AI can potentially reduce intra- and inter-operator variability, while also saving time.
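The compounding of measurement error can be made concrete with a short worked example. The sketch below assumes the commonly used Doppler stroke volume estimate, in which the left ventricular outflow tract (LVOT) cross-sectional area, π(d/2)², is multiplied by the velocity time integral (VTI). The formula is not stated in this review and the input values are illustrative only; the function names are hypothetical.

```python
import math

def lvot_area_cm2(diameter_cm):
    """Cross-sectional area of the LVOT, assuming a circular outflow tract."""
    return math.pi * (diameter_cm / 2) ** 2

def stroke_volume_ml(diameter_cm, vti_cm):
    """Stroke volume estimate: LVOT area (cm^2) times VTI (cm)."""
    return lvot_area_cm2(diameter_cm) * vti_cm

def relative_error(measured, reference):
    """Signed error of `measured` as a fraction of `reference`."""
    return (measured - reference) / reference

# Illustrative "true" values (assumptions, not patient data).
true_diameter_cm, true_vti_cm = 2.0, 20.0
sv_reference = stroke_volume_ml(true_diameter_cm, true_vti_cm)

# A 10% overestimate of the diameter is squared by the area formula.
sv_diameter_error = stroke_volume_ml(true_diameter_cm * 1.10, true_vti_cm)
# A 10% overestimate of the VTI passes through linearly.
sv_vti_error = stroke_volume_ml(true_diameter_cm, true_vti_cm * 1.10)

print(f"error from diameter: {relative_error(sv_diameter_error, sv_reference):.0%}")
print(f"error from VTI:      {relative_error(sv_vti_error, sv_reference):.0%}")
```

Under these assumptions, a 10% overestimate of the diameter inflates the stroke volume estimate by about 21% (1.1² = 1.21), whereas the same relative error in VTI inflates it by only 10%.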
Non-invasive hemodynamic measurements are increasingly used for personalized shock and volume management, 39 and ultrasound manufacturers have responded by offering automated measurement tools, such as auto-velocity time integral (VTI) and auto-inferior vena cava (IVC). Accurate and consistent measurement of the left ventricular outflow tract (LVOT) VTI is crucial, especially if it is being serially trended to assess volume responsiveness (VR). 40 Several studies have demonstrated good agreement between these tools and expert users (κ = 0.498-0.655);41,42 however, conflicting studies suggest poor agreement and systematic underestimation of measurements.42,43
Intensivists are ever in pursuit of an accurate measure of VR. Although fraught with challenges, measuring IVC collapsibility and distensibility is a commonly used method,44,45 but despite its seeming simplicity, the inter-rater reliability is poor. 46 Early attempts to automate this process with AI suggest excellent agreement with manual expert measurements, in addition to near perfect intra- and inter-observer reliability (ICC 0.96-0.99), 47 but conflicting studies also exist. 41
The dizzying amount of variability in these highlighted studies that test the same automated measurement tool suggests there is: 1) room for improvement in these FDA-approved devices and 2) still a degree of operator dependency that requires solid fundamental image acquisition. When paired with automated image guidance algorithms as above, these task automation tools have the potential to minimize variability in measurements while saving time. It also opens the door to those less familiar with quantitative assessments. Clinicians must still be vigilant about the caveats that come with quantitative assessments. At the same time there are other quantitative POCUS techniques that could benefit from AI automation, including the Venous Excess Ultrasound Score (VExUS) that is gaining popularity to assess solid organ congestion and fluid tolerance. 48
Procedural Guidance
POCUS has tremendous safety benefits related to performing procedures in the ICU.49,50 AI can potentially act as a heads-up display, highlighting pertinent anatomic structures on a live ultrasound image, and such algorithms can successfully identify vessels, bones, tendons, and nerves on upper extremity ultrasound clips for vessel cannulation, for example. 51 Handheld devices also offer visualization tools that improve operator awareness during vascular access, 52 and AI systems have been shown to improve cannulation time in difficult targets. 53
Other procedural applications include localizing safe pockets for tube thoracostomy or paracentesis. Algorithms have already been developed to map normal anatomy in LUS images, such as rib shadows and pleurae in addition to pathologic findings such as consolidations.29,54 This concept can be easily applied for locating safe ascitic pockets for paracentesis, with proof-of-concept models already developed. 55
At the time of this review, most assistive features keep clinicians “in the loop”, meaning they help the proceduralist rather than fully automating the action. As these AI tools become more rigorously vetted against diverse patient populations, they provide an opportunity for safer, more accessible avenues for procedures in the ICU. As with diagnostic interpretation, rigorous clinical trials and careful implementation will determine how quickly and broadly these tools are adopted.
Program Development and Quality Assurance
The ongoing establishment of POCUS use has necessitated reliable archiving, quality assurance, and reporting. 56 An expert who oversees a POCUS program will assume responsibility for reviewing the quality of images, confirming appropriate interpretation, and ensuring accurate documentation of findings. AI can improve the structure and delivery of POCUS programs by automating existing workflows. Many models have been developed to accurately identify various organs and specific views in scanning protocols.57–59 Automated view classification not only enables powerful filtering of a POCUS archiving database for educational or research purposes, but can also improve AI workflow efficiencies. 58 This has the potential to accelerate data labelling in AI research by providing clinicians with preliminary labels, similar to an ECG computer interpretation, that are to be reviewed and adjusted if necessary.
The various applications of AI in POCUS discussed up to this point set the stage for automated report generation. A preliminary report that includes image quality, image interpretation, and automated measurements for expert review can improve the efficiency of the documentation process. Automated report generation has been developed for CXRs, with up to 77% of one AI model's reports deemed equivalent or preferable to clinician reports, increasing to 94% for cases without abnormal findings. 60 These models can also be deployed as triage tools; for example, in a prospective study a CXR AI model triaged urgent cases with 99% specificity and reduced turnaround times by 77%. 61 Automating report generation also has the added benefit of reducing the administrative burden for billing. 62 Given the multi-organ nature of POCUS, automated report generation can weave together insightful cardio-pulmonary data - combining information about cardiac function, pulmonary pathology, and VExUS scores to present an ultrasonic profile of a patient that can guide resuscitative measures.
AI can improve workflow efficiencies in existing POCUS program infrastructure by intelligently archiving clips, identifying quality issues, generating preliminary reports, triaging abnormal scans, and improving documentation and billing. Offloading these tasks from POCUS program leaders can free them to coach scanning at the bedside and continue innovating their programs. This technology could also facilitate the development of remote POCUS programs in settings without local expertise.
Discussion and Future Directions
With the potential to revolutionize POCUS, AI technologies must undergo an important vetting process prior to real-world deployment. The power of AI algorithms depends on the quality of the data they are trained on and, just as with clinical research, a diverse representation of patients in training data is crucial for AI models to perform without bias. The generalizability (or external validity) of an AI model refers to how well it can perform on data it has never seen before, typically from different institutions. 63 This is particularly important for POCUS given a high degree of variability in images owing to a wide range of ultrasound machines, probes, image presets, and operator scanning preferences. 64 Only 6–7% of studies investigating AI algorithms for conventional radiology imaging have validated their models on external datasets,65,66 of which 81% report a performance degradation. 67 Some AI algorithms for POCUS have already been pressure tested against multi-center data, 68 but this remains the exception rather than the rule. Multi-center validation and fine-tuning of lung ultrasound models have also been reported.69,70
Public datasets of deidentified medical imaging from Picture Archiving and Communication Systems (PACS) have accelerated AI development by democratizing data for algorithm development.69–72 With POCUS archiving solutions being inconsistent or absent across hospital systems, 73 public datasets have been crucial to crowdsource precious images that capture the variability inherent in point-of-care imaging. The COVID-19 pandemic ignited efforts to gather lung ultrasound data for development of AI screening tools.70,74 Ongoing contribution to these public datasets for POCUS images will be central to developing highly generalizable AI models.
Using patient data to build algorithms or databases requires attention to data privacy. A major step in data preparation for AI research is deidentification of images: removing sensitive information including name, medical record number, and date of birth. 75 National policies, such as the U.S. Health Insurance Portability and Accountability Act (HIPAA), outline identifiers that must be removed to preserve patient privacy. Imaging formats must also be considered, as they may carry embedded patient information; one example is the Digital Imaging and Communications in Medicine (DICOM) standard in which ultrasound images may be stored. 76 Deidentification pipelines for ultrasound DICOM images already exist; however, a significant bottleneck in AI research arises when investigators must navigate data sharing agreements requiring legal and IT expertise.77,78 A novel solution that maintains patient privacy by removing the need for inter-institutional data transfer is federated learning. Instead of multiple hospitals sending data to a single institution for AI development, an AI model is sent to participating hospitals to be trained on local data, after which the individual models are amalgamated into a final model. 79 Although this requires local IT expertise to implement, this decentralized method of AI training can avoid both the sharing of patient data and the time delays associated with data sharing agreements, ultimately accelerating AI research.
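The federated workflow described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the method of any cited study: the "model" is a two-parameter linear fit, each hospital's data are synthetic, and amalgamation is done by sample-size-weighted averaging of the parameters (one common choice, often called federated averaging). The names `local_update`, `federated_average`, and `make_site` are hypothetical.

```python
import random

def local_update(weights, data, lr=0.3, epochs=5):
    """Train on one hospital's local data with plain gradient descent.

    Only the updated weights are returned; the data never leave this function.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = (w * x + b) - y
            grad_w += 2 * error * x
            grad_b += 2 * error
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)

def federated_average(local_weights, sizes):
    """Amalgamate local models, weighting each by its number of samples."""
    total = sum(sizes)
    w = sum(lw[0] * n for lw, n in zip(local_weights, sizes)) / total
    b = sum(lw[1] * n for lw, n in zip(local_weights, sizes)) / total
    return (w, b)

rng = random.Random(0)

def make_site(n):
    """Synthetic local dataset following y = 3x + 1 plus noise (assumption)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in xs]

hospitals = [make_site(50), make_site(200), make_site(120)]
sizes = [len(site) for site in hospitals]

global_weights = (0.0, 0.0)
for _ in range(20):  # each round: send model out, train locally, amalgamate
    local_models = [local_update(global_weights, site) for site in hospitals]
    global_weights = federated_average(local_models, sizes)

print(f"final model: w={global_weights[0]:.2f}, b={global_weights[1]:.2f}")
```

The design point is that only model parameters cross institutional boundaries in each round. Real deployments add considerations this sketch omits, such as secure communication and sites whose data follow different distributions.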
The performance of AI algorithms inevitably degrades over time as the data they see change, for example with the introduction of new ultrasound machines, probes, and software preprocessing, or with changes in disease patterns. This change in the input data is referred to as data drift. 80 It was illustrated by an analysis of CXRs before and after the emergence of COVID-19, which detected global, measurable changes in CXR features. 81 Anticipating and defending against data drift requires surveilling model performance and routinely retraining AI models. Machine learning operations (MLOps) is an existing framework that defines a continual learning pipeline for AI models, providing guiding principles for model monitoring, data management, model versioning, and deployment. 82 Just as clinicians must keep up to date with new evidence and guidelines, AI models must keep up with the constant evolution of data.
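As a rough illustration of what surveillance for data drift might look like, the sketch below compares a single summary feature of recent scans against a reference window and raises a flag when the mean has shifted by more than chance would explain. Everything here is an assumption for illustration: the feature (imagined as mean pixel intensity per clip), the synthetic values, the threshold, and the function names `drift_z_score` and `has_drifted`. Production monitoring would typically track many features as well as model performance itself.

```python
import random
import statistics

def drift_z_score(reference, recent):
    """How many standard errors the recent mean sits from the reference mean."""
    reference_mean = statistics.fmean(reference)
    reference_sd = statistics.stdev(reference)
    standard_error = reference_sd / len(recent) ** 0.5
    return (statistics.fmean(recent) - reference_mean) / standard_error

def has_drifted(reference, recent, z_threshold=5.0):
    """Flag drift when the shift exceeds a (deliberately strict) threshold."""
    return abs(drift_z_score(reference, recent)) > z_threshold

rng = random.Random(0)
# Hypothetical per-clip feature values from the data used to train the model.
reference = [rng.gauss(100, 15) for _ in range(2000)]
# Recent clips from the same machines and presets.
same_machine = [rng.gauss(100, 15) for _ in range(200)]
# Recent clips after a new machine with brighter default presets is introduced.
new_machine = [rng.gauss(115, 15) for _ in range(200)]

print("same machine drifted:", has_drifted(reference, same_machine))
print("new machine drifted: ", has_drifted(reference, new_machine))
```

A flag of this kind does not by itself show that the model performs worse, only that the incoming data no longer resemble the training data and that performance should be rechecked.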
An emerging technique called multi-modal AI addresses the old adage of ‘clinical correlation required’. These models ingest multiple types of data (imaging, text, laboratory, waveform, etc.) to generate conclusions, 83 just as an intensivist would evaluate a patient in shock with ultrasound, perfusion markers, and vital signs. One multi-modal model, for example, was able to differentiate between Alzheimer's disease, mild cognitive impairment, and normal controls using MRI, clinical notes, patient demographic data, and clinical prediction scores. 84 Compared to unimodal models that use only images or text, superior performance was demonstrated across five separate public datasets. 84 Development of multi-modal models that include POCUS will help intensivists navigate complex multi-organ interdependencies when assessing shock states or volume status.
Despite the bright future of AI, it is not meant to replace clinical reasoning. A misinterpretation of AI's role could lead to overreliance, and clinicians must be able to contextualize AI findings using their domain expertise. Thus, there will still be value in knowing the principles and caveats behind POCUS applications. The value of these algorithms lies not only within the technology itself but also in deliberate evaluation led by POCUS experts and educators to ensure clinically relevant and safe application. In critical care, the stakes are high, so dual validation (AI and clinician) is likely to remain the standard approach.
Conclusion
We appear to be at an inflection point where AI in POCUS transitions from novelty to routine assistant. In the ICU of tomorrow, AI may guide a resident's hand to get the ultrasound view, analyze the image, fill the report, and even suggest the next intervention – but the skilled intensivist will always be at the helm, interpreting those suggestions in light of the whole patient. The synergy of clinician expertise and AI precision holds great promise for improving critical care outcomes. The path forward will require continuing to rigorously evaluate these technologies and ensuring they are implemented in a way that maximizes benefit and minimizes risk. With careful oversight, AI will become an integral part of the critical care ultrasound landscape, helping fulfill the potential of POCUS as a tool for faster, smarter, and more equitable care at the bedside.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
