Abstract
Gastroscopy, a critical tool for the diagnosis of upper gastrointestinal diseases, has recently incorporated artificial intelligence (AI) technology to alleviate the challenges involved in endoscopic diagnosis of some lesions, thereby enhancing diagnostic accuracy. This narrative review covers the current status of research concerning various applications of AI technology to gastroscopy, then discusses future research directions. By providing this review, we hope to promote the integration of gastroscopy and AI technology, with long-term clinical applications that can assist patients.
Introduction
Gastroscopy, in combination with pathology, is widely recognized as the most reliable method for diagnosing upper gastrointestinal diseases. Its clinical applications have been steadily expanding, and it has even been included in the national cancer screening programs of regions with high rates of gastric cancer (e.g., Japan and Korea). 1 However, the increased number of gastroscopic examinations has led to a corresponding increase in the number of adverse events, including incomplete examinations, inadequate observations, missed diagnoses, and misdiagnoses. Unfortunately, attempts to mitigate these issues have been hindered by the large number of examinations, the diversity of examination protocols, and the shortage of highly qualified endoscopists. 2 Nevertheless, the rapid development of artificial intelligence (AI) technology in recent years offers hope for resolving these challenges. Previous studies have demonstrated that the utilization of AI technology can help to standardize endoscopic operations and facilitate endoscopic diagnosis. 3 Before exploring research concerning AI-assisted gastroscopy, a thorough review of the current state of the field is necessary. This narrative review aims to provide an overview of existing research, assess the limitations of previous studies, and explore promising directions for future investigations.
Search strategy
The search strategy used the following combinations of keywords: artificial intelligence, machine learning, deep learning, artificial neural network, computer-assisted, digestive endoscopy, endoscopy, esophagogastroduodenoscopy, quality control, diagnostic imaging, diagnosis, detection, target detection, and image classification/recognition. Articles were retrieved from PubMed, the Cochrane Library, Web of Science, Embase, the China National Knowledge Infrastructure (CNKI), and the Wanfang Database, with a cutoff date of 15 November 2023. In total, 338 articles were obtained across all databases. After removal of duplicates, reviews, comments, author responses, and articles with irrelevant content, 52 articles were included in the review. Articles published in non-medical journals that focused on algorithm design were retained or discarded according to the strength of their clinical relevance.
Applications of AI in Gastroscopy Quality Control
The anatomical structure of the upper gastrointestinal tract varies considerably, and observation sites are frequently overlooked during gastroscopy. 4 To improve the quality of gastroscopy, AI technologies must be capable of accurately distinguishing among the various regions of the upper gastrointestinal tract. Relevant studies typically require the collection of representative endoscopic images of the gastroscopy procedure, including areas susceptible to disease. These images are categorized and annotated by experts, then used to train the AI model’s recognition capabilities. With the exception of the study conducted by Takiyama et al., 5 which aimed to train an AI model to recognize anatomical regions of the upper gastrointestinal tract, most studies have focused on exploring the potential for AI to reduce blind spots during examinations and improve examination quality. Choi et al. 1 trained an AI model to recognize endoscopic images of eight specific regions of the upper gastrointestinal tract: proximal esophagus, gastroesophageal junction, cardia and fundus, gastric body, gastric angulus, gastric antrum, duodenal bulb, and descending duodenum. In that study, an examination was considered complete, with its blind spots eliminated, when the captured images fully covered the eight regions recognized by the model. The test results indicated that the model could accurately identify 97.58% of images; examination completeness accuracy was 89.20%. An AI model developed by Wu et al. 6 recognizes 26 types of gastroscopy images. Additionally, the model was trained to automatically retain images of the target area and time point according to clinical needs. The results of parallel and multi-group tests 7 demonstrated a significant reduction in the blind spot rate of the AI-assisted gastroscopy group, compared with the control group.
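The site-coverage logic underlying these studies can be illustrated with a short sketch. The eight-region list below follows Choi et al., but the site names, the `blind_spot_report` helper, and the simulated per-frame predictions are illustrative assumptions rather than the published implementation; a real system would obtain the per-frame labels from a trained image classifier.

```python
# Illustrative sketch: tracking which anatomical sites were observed during
# an examination. `SITES` approximates the eight regions in Choi et al.;
# frame labels here are simulated rather than produced by a real classifier.

SITES = [
    "proximal esophagus", "gastroesophageal junction", "cardia and fundus",
    "gastric body", "gastric angulus", "gastric antrum",
    "duodenal bulb", "descending duodenum",
]

def blind_spot_report(predicted_labels):
    """Given per-frame site predictions, return covered sites, missed sites,
    and the blind-spot rate (fraction of sites never observed)."""
    covered = {label for label in predicted_labels if label in SITES}
    missed = [site for site in SITES if site not in covered]
    return covered, missed, len(missed) / len(SITES)

# Example: an examination whose frames covered six of the eight sites.
frames = ["gastric body", "gastric antrum", "gastric angulus",
          "duodenal bulb", "cardia and fundus", "gastric body",
          "proximal esophagus"]
covered, missed, rate = blind_spot_report(frames)
print(missed)  # the sites never observed in this examination
print(rate)    # 0.25
```

An examination would be flagged as complete only once the blind-spot rate reaches zero, mirroring the completeness criterion described above.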
Advances in research are allowing neural networks to be trained with increasing numbers of images, enabling recognition of a wider range of sites and thereby enhancing visual field coverage. 8 Importantly, endoscopy quality control involves three key indicators: the structural composition of the endoscopy center (e.g., personnel and equipment), the examination process, and the examination results. 9 Quality control studies should investigate methods that use AI technology to optimize all three indicators, rather than focusing narrowly on quality control during the examination process alone. Careful analysis is needed concerning the pre-examination assessment of anesthesia risk,10,11 as well as lesion detection rates during the examination and adverse reaction rates after the examination. 12 Exclusive reliance on manual methods to review data and conduct statistical analyses is inefficient. AI could instead be used to construct an organized template system for digestive endoscopy that selects relevant text based on predetermined keywords, automatically conducts statistical analysis in the background, and transmits the outcomes to the management platform; such a system would streamline analysis and, through the robust computing power of AI, produce more precise scientific findings. 9
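As a rough illustration of this template-driven reporting idea (not any published system), the sketch below selects report fragments from predetermined keywords and aggregates a lesion detection rate automatically in the background; all template text, keywords, and function names are hypothetical.

```python
# Hypothetical sketch of keyword-driven report templating plus automatic
# quality-indicator aggregation. Templates and keywords are invented for
# illustration; a real system would use a clinically validated lexicon.

TEMPLATES = {
    "atrophy": "Mucosal atrophy noted; recommend surveillance per guidelines.",
    "ulcer": "Ulcer identified; biopsy obtained for histopathology.",
    "normal": "No abnormality detected on this examination.",
}

def build_report(keywords):
    """Assemble report text from fragments matching the detected keywords."""
    lines = [TEMPLATES[k] for k in keywords if k in TEMPLATES]
    return lines or [TEMPLATES["normal"]]

def lesion_detection_rate(exams):
    """Fraction of examinations containing at least one lesion keyword."""
    positive = sum(1 for kws in exams if any(k != "normal" for k in kws))
    return positive / len(exams)

# Example: four examinations, two with lesion findings.
exams = [["atrophy"], ["normal"], ["ulcer", "atrophy"], ["normal"]]
print(build_report(exams[2]))
print(lesion_detection_rate(exams))  # 0.5
```

The point of such automation is that indicator statistics are computed as reports are written, rather than reconstructed later by manual review.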
Applications of AI in Upper Gastrointestinal Tract Lesion Detection and Disease Diagnosis
Applications of AI in the Detection of Early Esophageal Cancer and Assessment of Its Infiltration Depth
Upper gastrointestinal malignant tumors constitute a widespread and serious public health problem. These tumors include esophageal and gastric cancers, which together are newly diagnosed in more than 1.6 million people annually worldwide. 13 The prognosis for patients with advanced esophageal cancer is considerably worse than for patients with early-stage disease; early diagnosis and treatment are therefore essential to improve prognosis. Currently accepted diagnostic criteria for esophageal malignancies comprise endoscopic manifestations of typical or atypical lesions combined with characteristic histopathologic findings. However, the endoscopic detection of early malignant esophageal lesions can be challenging; it demands a high level of expertise and extensive clinical experience. In some studies, AI technologies have assisted with the detection of malignant esophageal lesions. Cai et al. 14 used AI technology to facilitate the detection of esophageal squamous cell carcinoma (ESCC). They trained a deep neural network with 1332 images of ESCC captured under ordinary white light endoscopy. This training enabled the network to identify and localize cancerous areas in images; the trained model outperformed the control group of experts in accurately identifying ESCC. Some AI studies have focused on Barrett’s esophagus, in which abnormal columnar epithelial hyperplasia is associated with the development of esophageal adenocarcinoma. In these studies, AI models have been trained to assist with the identification of high-grade dysplasia and/or carcinoma in situ in Barrett’s esophagus. The training images have consisted of white light images acquired in normal mode, along with images obtained in modes such as narrow-band imaging and magnification.15–19 Hashimoto et al. 15 used images acquired through various gastroscopic modes (white light, narrow-band, and confocal) to train an AI model.
They demonstrated sensitivity, specificity, and accuracy of 96.4%, 94.2%, and 95.4%, respectively, in the detection of early esophageal cancerous lesions. de Groof et al. 17 established a control group of 53 specialist endoscopists across four countries. Their analysis indicated that an AI-assisted detection system trained by the research team achieved greater accuracy in the detection of early esophageal cancer than any endoscopist in the control group. Both of the aforementioned studies used image datasets alone. By contrast, using dynamic videos, Ebigbo et al. constructed an AI-assisted system with 89.9% accuracy in the detection of early esophageal adenocarcinoma. 19
The endoscopic phenotypes of esophageal malignancies are generally assumed to correspond to their infiltration depth, enabling AI to aid in the assessment of tumor infiltration depth. AI models trained by the research teams of Shimamoto and Nakagawa achieved accuracies of 89.2% and 91.0%, respectively, when predicting the infiltration depth of ESCC (e.g., SM1 vs. SM2/3).20,21 Although Nakagawa et al. achieved slightly better results in tests involving high-quality image sets, Shimamoto et al. expanded their model to enable detection using dynamic video. Notably, most similar studies have focused on ESCC because of its high incidence. The training phases of AI models often require large numbers of samples with similar characteristics; low sample numbers can hinder such studies because AI models trained on small datasets may overfit, weakening the reliability of the findings.
Applications of AI in the Detection of Early Gastric Cancer and Assessment of Its Infiltration Depth
Endoscopic manifestations of early gastric cancer are challenging to identify because of their atypical nature. Several studies have revealed that routine white light endoscopic screening for gastric cancer has a miss rate of approximately 4.6% to 25.8%. As with the detection of esophageal malignancies, applications of AI may help to reduce the miss rate of early gastric cancer and thereby improve the lesion detection rate. For example, Wu et al. 22 used more than 9000 gastroscopic images of early gastric cancer to train an AI model intended to recognize lesions at a preliminary stage. Their results demonstrated that the trained AI model achieved up to 92.5% accuracy in the detection of early gastric cancer (with 94.0% sensitivity and 91.0% specificity). The model’s performance was superior to that of all physicians in the control group. Gastric cancer may resemble an ulcer and is therefore liable to be misdiagnosed in clinical practice. Lee et al. 23 sought to train AI models to differentiate among normal gastric mucosa, mucosal ulcers, and cancerous mucosa. Their ResNet-50 model achieved 90.0% accuracy when distinguishing normal gastric mucosa from mucosal ulcers and normal gastric mucosa from cancerous mucosa. Although the model demonstrated lower accuracy when distinguishing between mucosal ulcers and cancerous mucosa, the findings highlight the potential for AI to assist endoscopists with the identification and diagnosis of complex gastric diseases.
For patients with early gastric cancer, in which lesions are limited to the mucosa (M) or infiltrate the submucosa to a depth of less than 500 μm (SM1), endoscopic treatments (e.g., endoscopic submucosal dissection) can be curative. However, it remains challenging to accurately determine the depth of lesion infiltration before treatment. An AI model developed by Zhu et al. 24 exhibited 89.16% accuracy when differentiating stage M/SM1 gastric cancer from gastric cancers with deeper infiltration, significantly outperforming the endoscopists in the study. Furthermore, Nagao et al. 25 trained an AI model using images of gastric cancers captured under white light, narrow-band imaging, and indigo carmine staining. Their model demonstrated >90% accuracy in predicting the depth of gastric cancer infiltration. Additionally, they showed that when gastroscopy images from different modalities were used to train the AI model, depth prediction accuracy did not considerably differ. Thus, AI technologies can aid in the detection of gastric cancer lesions and assist with the selection of treatment options, thereby decreasing the likelihood of unnecessary gastrectomy procedures and enhancing patients’ quality of life.
Applications of AI in the Detection of Helicobacter pylori Infection
Applications of AI in the Detection of Pre-Cancerous Gastric Conditions
The development of gastric cancer is believed to be a long, multi-step process. Most gastric cancers develop through a series of histological lesions known as the Correa cascade: non-atrophic gastritis, chronic atrophic gastritis (CAG), intestinal metaplasia (IM), and dysplasia. CAG, IM, and dysplasia are pre-cancerous conditions associated with progressively increasing risks of gastric cancer development. 35 Early detection and proper clinical treatment of these pre-cancerous conditions can effectively reduce the incidence of gastric cancer, but examiners must possess theoretical expertise and extensive clinical experience. There is evidence that AI can assist with the identification of pre-cancerous gastric conditions.
On endoscopy, CAG typically manifests as gastric mucosal thinning, visible submucosal vessels, and diminished color. In some countries, endoscopic diagnosis of CAG exhibits only 42% sensitivity. 36 In their efforts to establish an AI model for atrophic gastritis detection, Guimaraes et al. 37 achieved 93.0% accuracy by training the model with 200 images. Lin et al. 38 applied a similar approach with greater success, achieving 96.4% accuracy in the detection of atrophic gastritis; they also detected intestinal metaplasia with 97.6% accuracy. Furthermore, researchers in China demonstrated that a trained AI model could review dynamic video and detect CAG with 92.37% accuracy; 39 the key contribution of their work was its demonstration that an AI model could identify CAG by assessing clinical videos. The extent of mucosal atrophy can indicate lesion severity and potentially influence the physician’s clinical judgment. Thus far, no studies have focused on identifying the boundaries of mucosal atrophy. If clinical experts were convened to outline atrophic regions in CAG images and those labeled images were used for AI training, the resulting models could better delineate atrophic boundaries during the recognition of CAG.
Gastric mucosal IM is a pre-cancerous condition that is more severe than atrophic gastritis, arising from prolonged exposure to various harmful factors. Several studies have shown that deep learning models can achieve 87% to 99.18% accuracy in the recognition of IM.38,40,41 As noted above, most of this research has focused on image recognition; the clinical value of existing models remains unclear. A few test videos have shown that the models require stringent imaging conditions to achieve the desired recognition performance. 40 Feature extraction and matching are essential for AI image recognition; any factor affecting the clinical presentation, including lighting, lens angle, and the distance from the lens to the gastric mucosa, will influence AI recognition performance. Rather than expanding current datasets, it may be more helpful to optimize model algorithms, with a focus on enhancing recognition performance in unseen environments.
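One common way to address sensitivity to lighting, lens angle, and lens-to-mucosa distance, beyond simply collecting more data, is training-time augmentation. The numpy sketch below is an illustration of that general idea, not a method drawn from the cited studies; the perturbation types and parameter ranges are arbitrary assumptions.

```python
import numpy as np

# Illustrative augmentation sketch: simulate lighting and viewing-angle
# variability by randomly perturbing the brightness and orientation of an
# image array, so a model trains on the variability it will meet clinically.

rng = np.random.default_rng(0)

def augment(image):
    """Return a randomly brightened, mirrored, and rotated copy of a square
    grayscale image (values in [0, 1])."""
    out = image.astype(np.float64)
    out *= rng.uniform(0.7, 1.3)               # global illumination change
    if rng.random() < 0.5:
        out = np.fliplr(out)                   # mirror, mimicking lens pose
    out = np.rot90(out, k=rng.integers(0, 4))  # coarse rotation
    return np.clip(out, 0.0, 1.0)              # keep valid intensity range

# Example: generate four perturbed variants of one synthetic 8x8 image.
image = rng.uniform(0.0, 1.0, size=(8, 8))
augmented = [augment(image) for _ in range(4)]
print(augmented[0].shape)  # (8, 8)
```

In practice such transforms would be applied on the fly during each training epoch, so the model rarely sees the same rendering of a lesion twice.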
The management of pre-cancerous gastric conditions requires measures tailored to each patient’s characteristics; thus, research should not be limited to image recognition. Multimodal learning strategies can allow other patient-related information (e.g., sex and age) to be integrated into the gastroscopy process; the “intelligence” aspect of AI can be more fully realized by combining image and text information to facilitate comprehensive analysis.
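A minimal sketch of such a multimodal strategy is late fusion: concatenating an image-derived feature vector with encoded patient attributes before a final classifier. Everything below (feature sizes, encodings, and weights) is an invented illustration of the concept, not a model from the literature.

```python
import numpy as np

# Illustrative late-fusion sketch: join an image feature vector (e.g., from a
# CNN encoder) with encoded patient attributes, then score the fused vector.

def encode_patient(age, sex):
    """Encode age (scaled to [0, 1]) and sex ('F'/'M') as a numeric vector."""
    return np.array([age / 100.0, 1.0 if sex == "F" else 0.0])

def fuse(image_features, patient_vector):
    """Late fusion by concatenation, ahead of a joint classifier head."""
    return np.concatenate([image_features, patient_vector])

def risk_score(fused, weights):
    """Logistic score over the fused representation (weights are arbitrary)."""
    return 1.0 / (1.0 + np.exp(-fused @ weights))

image_features = np.array([0.2, -0.5, 0.9, 0.1])  # stand-in CNN features
fused = fuse(image_features, encode_patient(age=62, sex="M"))
weights = np.array([0.3, -0.2, 0.5, 0.1, 0.8, 0.4])
print(fused.shape)  # (6,)
print(risk_score(fused, weights))
```

In a trained system the weights (or a deeper fusion head) would be learned jointly, so patient context can shift the interpretation of an ambiguous image.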
Applications of AI in the Detection of Submucosal Tumors in the Upper Gastrointestinal Tract
Submucosal tumors of the upper gastrointestinal tract comprise various conditions, such as gastric mesenchymal tumors, smooth muscle tumors, nerve sheath tumors, neuroendocrine tumors, and ectopic pancreas. Endoscopic ultrasound plays a key role in identifying these conditions; however, diagnostic accuracy strongly depends on the examiner’s clinical experience and knowledge. Among submucosal lesions, gastric mesenchymal tumors are relatively prevalent and can develop malignant characteristics. 42 The accurate identification of gastric mesenchymal tumors is therefore important in clinical practice, but these tumors are frequently misdiagnosed because of their similarity to other conditions. Diagnostic accuracy can be increased by training AI models to assist physicians during diagnosis.43–47 Hirai et al. 44 acquired 16,110 endoscopic ultrasound images of the five above-mentioned submucosal tumor types, which they used to train an AI model. The trained model was superior to clinical endoscopists in tumor recognition and classification; for example, its submucosal tumor classification accuracy was 86.1%. Using an AI model trained with endoscopic ultrasound images, Kim et al. 46 achieved 79.2% accuracy when distinguishing gastric mesenchymal tumors from non-mesenchymal tumors. Additionally, their model could be used in the diagnosis and differentiation of smooth muscle tumors and nerve sheath tumors. Another study showed that physicians could improve their ability to distinguish gastric mesenchymal tumors from smooth muscle tumors by approximately 15% through the use of AI. 47
Applications of AI in the Detection of Gastroesophageal Varices
Two multicenter studies in China investigated the value of AI in the detection of gastroesophageal varices.48,49 Their results showed that AI and physicians demonstrated comparable accuracy in detecting gastroesophageal varices. Furthermore, in the diagnosis of bleeding risk factors such as the red color sign, AI demonstrated diagnostic ability superior to that of physicians. Finally, the AI models provided appropriate treatment recommendations based on the test results.48,49
Conclusion
As research concerning AI-assisted gastroscopy continues to emerge, it is clear that AI is enhancing conventional gastroscopy in multiple ways. Overall, the use of AI technology by endoscopists can compensate for limited stamina, thereby reducing missed diagnoses and mitigating fatigue-related misdiagnosis. Additionally, AI models trained with large numbers of characteristic samples can assist physicians with diagnoses during gastroscopy procedures. Because of variations in clinical experience and education, some regions and countries have few expert endoscopists. Additionally, some regions have high numbers of patients requiring gastroscopy and limited time for each procedure; the resulting high-intensity, repetitive work over short intervals can cause visual fatigue. All of these factors have led to blind spots, missed diagnoses, misdiagnoses, and other unfavorable outcomes in conventional gastroscopy. The implementation of AI technology can help to alleviate some of these issues.
Thus far, studies of AI-assisted gastroscopy have provided preliminary evidence that the use of AI technology can improve examinations in various ways. Rapid advances in computer technology continuously yield new models with improved performance and smaller size. Thus, it is important to consider whether AI intervention is suitable for clinical implementation in gastroscopy procedures. We argue that although applications of AI in clinical gastroscopy are plausible, some issues must be resolved before the technology can be fully embraced. First, existing studies have used datasets from gastroscopy image databases stored in individual research centers; because data cannot be shared among centers, the screening process is time-consuming and labor-intensive. AI model training typically requires a large number of characteristic samples, and insufficient sample size can lead to model overfitting. In the absence of industry standards and guidelines, the criteria for inclusion and exclusion of research data currently vary among organizations. Additionally, frequent updates have led to substantial generational discrepancies in AI models among studies. These factors greatly limit the generalizability of the findings. The construction of an extensive open-source endoscopic image database, “EndoNet,” could potentially alleviate the data acquisition problem; 3 indeed, a public image database is already available for intestinal polyps. However, the development of an image database covering all clinical conditions of the upper gastrointestinal tract could require extensive effort and time because of the inherent diversity of these conditions. Second, most current research has focused solely on classifying and identifying image data without considering the ongoing evolution of clinical examination processes. Tools that lack real-time analytical capabilities may perform well in some contexts, but they have limited clinical value.
The recognition of videos, rather than images, requires comprehensive model optimization and meticulous training dataset design. Furthermore, the implementation of AI technologies in endoscopy workflows will entail careful consideration of numerous aspects, such as patient safety and efficient integration into operating procedures.
Finally, advances in this field should focus on evolving from the recognition of single images/videos to the analysis of multimodal data. For example, some studies have integrated imaging data, such as chest computed tomography scans, with laboratory findings. However, no studies have integrated endoscopic images with other clinical parameters.
Each new technology experiences a long journey from its theoretical conceptualization to its eventual clinical application. We are encouraged by the collaborative endeavors of scientists from various disciplines, which are accelerating this process. Despite the challenges mentioned above, we believe that AI-assisted gastroscopy will soon become prevalent, allowing physicians to more efficiently manage their patients’ conditions.
Footnotes
Acknowledgements
We thank Haihan Zhang from the Department of Gastroenterology at the Affiliated Municipal Hospital of Xuzhou Medical University for providing valuable feedback concerning the structure of the manuscript. We also thank Professor Zhaolin Lu from the School of Information and Control Engineering at China University of Mining and Technology who shared theoretical knowledge regarding artificial intelligence.
Author contributions
Conceptualization: HC, SL, and GC; Investigation: HC and GC; Data Curation: SH and ML; Writing – Original Draft Preparation: HC, SL, and SH; Writing – Review & Editing: GC; Supervision: GC; Funding Acquisition: GC and SL.
Data availability statement
The authors confirm that all data supporting the findings of this study are available within the article.
Declaration of conflicting interests
The authors declare that there is no conflict of interest.
Funding
This study was supported by the Xuzhou Municipal Health and Health Commission Medical Leading Talents Training Program (No. XWRCHT20210025), the Xuzhou Key R&D Program (Social Development) Project (No. KC22095), and the Xuzhou Health and Health Commission Youth Innovation Science and Technology Project (No. XWKYHT20220073).
