Abstract
The gap between endoscopy and histology is narrowing with the introduction of sophisticated endoscopic technologies. Furthermore, unprecedented advances in artificial intelligence (AI) have enabled objective assessment of endoscopy and digital pathology, providing accurate, consistent, and reproducible evaluations of endoscopic appearance and histologic activity. These advancements result in improved disease management by predicting treatment response and long-term outcomes. AI will also support endoscopy in raising the standard of clinical trial study design by facilitating patient recruitment and improving the validity of endoscopic readings and endoscopy quality, thus overcoming the subjective variability in scoring. Accordingly, AI will be an ideal adjunct tool for enhancing, complementing, and improving our understanding of the course of ulcerative colitis. This review explores promising AI applications enabled by endoscopy and histology techniques. We further discuss future directions, envisioning a bright future where AI technology extends the frontiers beyond human limits and boundaries.
Introduction
Endoscopic remission remains a primary endpoint in the management of inflammatory bowel disease (IBD), but it is not enough, as a remarkable proportion of patients continue to relapse despite a normal-appearing mucosa. This implies that endoscopy may underestimate the true activity of the disease, especially in ulcerative colitis (UC), because subtle microscopic alterations in the context of apparently normal mucosa are associated with an increased risk of clinical relapse in up to 30% of patients. Conversely, patients who achieve histological remission have a greater likelihood of improved long-term clinical outcomes.1,2
Accordingly, treatment goals for UC are evolving toward a deeper level of healing that incorporates both endoscopic and histologic healing. 3 Hence, a composite endpoint of endo-histologic mucosal healing (MH), encompassing both endoscopic remission/improvement and histologic remission, has recently been proposed as a more “complete” and stringent outcome measure in clinical trials.4–6 A recent meta-analysis has further confirmed the magnitude of the potential benefit of the composite endpoint of endo-histologic MH, which is associated with a significantly reduced risk of clinical relapse. 7 However, significant barriers and limitations remain, including the heterogeneity of histologic definitions in the various published studies, the absence of a standardized endoscopic and histologic scoring system, variability in biopsy interpretation, and differences in the duration of follow-up.
Emerging advanced endoscopic techniques have reduced the discrepancies between endoscopy and histology, moving toward a deep definition of MH aiming to restore the function of the colon’s mucosal barrier and disease clearance.8,9 A recent international multicenter study showed that a single measure of endoscopic remission defined using electronic virtual chromoendoscopy performed similarly to combined endoscopic and histologic remission for predicting specified clinical outcomes at 12 months. 3 Hence, electronic virtual chromoendoscopy can aid in simplifying the assessment of histo-endoscopic remission, facilitating better understanding and reducing complexity in evaluation.
Furthermore, advancements in artificial intelligence (AI) technologies hold promise for enhancing endoscopic image interpretation and aiding in the quantification of vascular and mucosal patterns to provide a deeper definition of remission. AI moves beyond human vision of endoscopy and ultimately assists in real-time histologic assessment during endoscopy. 10 Remarkably, utilization of AI as an ideal adjunct to existing tools can revolutionize the way therapeutic decisions are made by supporting diagnosis, monitoring, and predicting the course of disease in patients with UC. 8
In this current narrative review, we explore promising AI applications in IBD endoscopy and histology. We further discuss future directions, envisioning a bright future in which AI technology adds value to precision medicine.
AI for endoscopic scoring of disease activity and prediction of clinical outcomes
AI for scoring disease activity
The Mayo endoscopic subscore (MES) and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS) are the most widely used endoscopic scoring systems for disease activity assessment in clinical trials and in practice.11,12 However, both are subject to variability among examiners, which imposes significant costs on image interpretation for clinical trials. 13 Additionally, in the context of the treat-to-target (T2T) approach,11,12 which has recently gained traction, the assessment of endoscopic remission as a treatment goal varies among examiners and institutions, thereby posing a challenge in establishing a common treatment objective. To the best of our knowledge, the first physician-initiated study of a computer-aided diagnosis (CAD) system in patients with UC was conducted by Sasaki et al. 14 in 2003. Based on Bayesian decision theory, they established a computer-aided grading system for endoscopic severity (Matts grade).
Convolutional neural network (CNN) and machine learning (ML) models have attracted much attention for their ability to learn patterns effectively from extensive image datasets, automatically identify abnormal regions, and provide corresponding scores.15–18 This technology is expected to streamline the labor-intensive and costly process of central image interpretation in clinical trials, 19 diminish variations in interpretation among examiners, and promote more consistent evaluations.
In 2019, Ozawa et al. 15 succeeded in using a CNN powered by deep learning to develop a CAD system. Using expert diagnosis as the gold standard, their retrospective analysis showed that their CAD system correctly classified 73% of the MES 0 images, 70% of the MES 1 images, and 63% of the MES 2–3 images with an appropriate MES. At the same time, Stidham et al. 16 also developed a CNN model to output a 4-class prediction (MES 0–3). In a retrospective comparison with experts’ diagnosis, their CAD system had good predictive value for differentiating an MES of 0–1 from an MES of 2–3, as shown by a sensitivity of 83%, specificity of 96%, positive predictive value of 86%, and negative predictive value of 94%.
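The four measures quoted above all derive from a single 2 × 2 confusion table. The following is a minimal sketch of that relationship, assuming MES 2–3 is treated as the positive class (an assumption made here for illustration). The counts are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary task; positive = MES 2-3 (assumed), negative = MES 0-1."""
    tp: int  # CAD positive, expert positive
    fp: int  # CAD positive, expert negative
    tn: int  # CAD negative, expert negative
    fn: int  # CAD negative, expert positive


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts, for illustration only.
counts = ConfusionCounts(tp=80, fp=10, tn=190, fn=20)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the positive and negative predictive values depend on how common each class is in the validation set, they are less transferable between cohorts than sensitivity and specificity.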
Further, Takenaka et al. 17 constructed a deep neural network for evaluation of UC (DNUC) algorithm, which was designed to extract features from conventional white-light still images and predict endoscopic remission (defined as a UCEIS of 0) with 90% accuracy.
However, these were validations for still images, and there was concern about selection bias in selecting images for validation. To address some of these issues, video instead of still images is recommended for assessments to reduce subjectivity related to image acquisition sites.20,21
Subsequently, Yao et al., 22 from the same group as Stidham, investigated the application of their CAD system to unaltered full-motion videos and still images, automatically outputting a summary MES for each colonoscopy. Notably, their new algorithm automatically excluded inadequate frames (including those too close to the mucosa and those with light reflection, debris, or blurring, including motion blur) from the analysis before producing a final score of endoscopic severity.
Likewise, Takenaka et al. 21 updated their DNUC for application to video-based analysis. In their multicenter cross-sectional study, the DNUC correctly evaluated the presence or absence of histological inflammation in 81% of cases. They also showed that the DNUC result was strongly correlated with the UCEIS provided by central readers, with an intraclass correlation coefficient of 0.93, and had good accuracy in determining endoscopic remission, defined as a UCEIS of 0 or 1 (sensitivity of 82% and specificity of 95%). Notably, through this study, they found that the main reasons for a discrepancy of two or more points between the DNUC and the central evaluator were poor bowel preparation and the presence of inflammatory polyps. Further improvement in accuracy is expected by incorporating an algorithm that automatically excludes such images.
Since then, several AI models that can automatically exclude inappropriate images from the analysis have been reported.20,23 However, the accuracy with which inappropriate images are identified has not yet been verified. Furthermore, the extent to which these additional functions ameliorate differences in performance among examiners remains to be tested.
AI has been expected to address the issue of requiring human central readers for endoscopic disease activity scoring, which makes clinical trials costly and time-consuming. Gottlieb et al. 19 used the data set from a phase II mirikizumab clinical trial 24 for the training and validation of a CAD system. Their CAD system output a final score representing the endoscopic severity (MES of 0–3 and UCEIS of 0–8) for each full-length video in a fully autonomous manner. In their retrospective analysis, the CAD system achieved excellent accuracy of 96% for endoscopic remission when defined as an MES of 0 and 97% when defined as a UCEIS of 0.
AI-enabled assessment of inflammatory distribution
AI-assisted colonoscopy has also been used to visualize and evaluate the severity and distribution of inflammation more accurately in the entire colon. Currently, endoscopic scores, such as MES and UCEIS, evaluate the grade of inflammation in the most severe segment; the distribution of inflammation is not considered, even though it also significantly impacts drug selection. While there have been efforts to develop scoring systems that account for the extent of inflammation, they have not gained widespread adoption in clinical practice.25,26
Takabayashi et al. 27 developed a user interface in which the results of the Ulcerative Colitis Endoscopic Gradation Scale, diagnosed using a ranking-CNN, were incorporated into a schema of the colon. Visualization in their novel user interface facilitates the comprehension of the extent and grading of inflammation throughout the colon. Fan et al. 28 have introduced a new AI-driven scoring system that incorporates the distribution of inflammation in the output obtained from video-based AI-driven scoring. They divided the colon into a fixed number of “areas” (cecum, 20; transverse colon, 20; descending colon, 20; sigmoid colon, 15; rectum, 10). The scoring system automatically assessed the inflammatory severity of 85 areas from each video and generated a visualized result depicting the full-length intestinal inflammatory activity.
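To make the fixed-area idea concrete, the sketch below partitions a sequence of 85 per-area scores into the five segments listed above and averages them per segment. It is an illustration only: the 0–3 severity scale, the cecum-to-rectum ordering, and the use of a mean are assumptions made here, not details reported for the published system.

```python
# Number of "areas" per segment as listed in the text (20+20+20+15+10 = 85).
AREAS_PER_SEGMENT = {
    "cecum": 20,
    "transverse colon": 20,
    "descending colon": 20,
    "sigmoid colon": 15,
    "rectum": 10,
}
TOTAL_AREAS = sum(AREAS_PER_SEGMENT.values())


def segment_means(area_scores):
    """Average per-area severity within each segment.

    area_scores: sequence of TOTAL_AREAS numbers, ordered from cecum to rectum
    (ordering and 0-3 scale are hypothetical choices for this sketch).
    """
    if len(area_scores) != TOTAL_AREAS:
        raise ValueError(f"expected {TOTAL_AREAS} area scores, got {len(area_scores)}")
    means = {}
    start = 0
    for segment, n_areas in AREAS_PER_SEGMENT.items():
        chunk = area_scores[start:start + n_areas]
        means[segment] = sum(chunk) / n_areas
        start += n_areas
    return means


# Hypothetical example: inflammation confined to the sigmoid colon and rectum.
scores = [0] * 60 + [1] * 15 + [3] * 10
profile = segment_means(scores)
for segment, mean_score in profile.items():
    print(f"{segment}: {mean_score:.2f}")
```

A per-segment profile of this kind conveys both extent and severity, which a single worst-segment score cannot.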
Most recently, Stidham et al. 29 introduced the automated Cumulative Disease Score (CDS) system, which sums the MES-squared values for all 50 evenly spaced increments in the left colon, including the rectum. Notably, the CDS focuses solely on the left-sided colon and does not provide an evaluation of the entire colon. In a clinical trial cohort including 748 induction and 348 maintenance patients (UNIFI study 5 ), the CDS exhibited a significant correlation with the MES (p < 0.001) and all clinical components of the partial Mayo score (p < 0.001). When stratified by pretreatment CDS, ustekinumab was more effective than placebo (p < 0.001), with a more pronounced effect in severe versus mild disease (−85.0 vs −55.4; p < 0.001). Furthermore, compared to the MES, the CDS revealed greater endoscopic differences between ustekinumab and placebo (Hedges’ g, 0.743 vs 0.460) and demonstrated greater sensitivity to change, requiring 50% fewer participants to show endoscopic differences between ustekinumab and placebo.
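The CDS definition given above (the sum of squared MES values over 50 increments) can be written out directly. The sketch below follows only that description; the function name, input format, and example patients are hypothetical, and the published pipeline that assigns an MES to each increment from video is not reproduced here.

```python
N_INCREMENTS = 50  # evenly spaced increments in the left colon, per the text


def cumulative_disease_score(mes_per_increment):
    """Sum of squared MES values (each 0-3) over the 50 increments."""
    if len(mes_per_increment) != N_INCREMENTS:
        raise ValueError(f"expected {N_INCREMENTS} MES values")
    if any(m not in (0, 1, 2, 3) for m in mes_per_increment):
        raise ValueError("each MES must be 0, 1, 2, or 3")
    return sum(m * m for m in mes_per_increment)


# Two hypothetical patients with the same worst-segment MES (3)
# but very different extents of inflammation.
focal = [3] * 5 + [0] * 45
extensive = [3] * 40 + [0] * 10

cds_focal = cumulative_disease_score(focal)
cds_extensive = cumulative_disease_score(extensive)
print(cds_focal, cds_extensive)
```

Under this definition the score ranges from 0 to 450 (50 × 3²), and squaring weights severe increments more heavily than mild ones, so both extent and severity contribute to the total.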
The AI-assisted assessment of inflammation distribution may offer more comprehensive descriptions of UC activity, contributing to a detailed evaluation of treatment effectiveness. This new concept enhances, complements, and improves our understanding of the disease.
AI-enabled advanced endoscopic technology
Another highlight in this field is the integration of AI and advanced endoscopic technology.8,10,30,31 Specifically, the use of advanced endoscopic technology to predict histologic remission is anticipated to offer a more precise assessment of inflammation and activity, addressing the unmet need for biopsies. Currently, histology stands as the most reliable predictor of sustained clinical remission in UC. However, it has not been widely applied in daily practice 32 because of the burden of taking biopsies, the additional time required for a pathological diagnosis, and the increased cost.
Bossuyt et al. utilized a red density algorithm (Pentax Medical, HOYA Corp., Tokyo, Japan) to predict histological activity. When the red density score cutoff was ⩽60, this approach exhibited a 96% sensitivity and 80% specificity in predicting histologic remission, defined as a Robarts histopathology index (RHI) of ⩽6. 33
Likewise, the same group explored, in a pilot study, a CAD system that utilized single short-wavelength monochromatic light-emitting diode (LED) illumination (Fujifilm, Tokyo, Japan). 34 This system enabled real-time evaluation of mucosal architecture, including the crypts, peri-cryptal capillaries, and bleeding. The algorithm identified histologic remission, defined as a Geboes score of <2B.1, with a sensitivity of 79% and a specificity of 90%.
Similarly, Iacucci et al.35,36 have developed an AI tool capable of generating the Paddington International Virtual Chromoendoscopy Score using iSCAN (Pentax) short-length videos. They conducted their analysis based on histologic remission criteria, specifically RHI ⩽ 3, Nancy histological index (NHI) ⩽ 1, and PICaSSO Histological Remission Index (PHRI) = 0. The accuracy values for these criteria were reported as 83%, 81%, and 83%, respectively. 37
Furthermore, a CAD system was developed to predict histological healing using images from a commercially available 520-fold ultra-magnifying contact microscope (Endocyto: CF-H290EC; Olympus Corp., Tokyo, Japan), which allows real-time evaluation of the microvessels, crypts, and goblet cells when used with narrow-band imaging.38–42 When histological remission was defined as a Geboes score of <3.1, the CAD system had a sensitivity of 74%, a specificity of 97%, and an accuracy of 91%. 43 The updated version of this CAD system received regulatory approval in Japan and has been on the market since February 2021 (EndoBRAIN-UC; Cybernet Systems Corp., Tokyo, Japan). It is the only commercially available AI-powered CAD system for use during colonoscopy in patients with UC. However, it is not widely used because it requires specific expertise in operating the ultra-high-magnification colonoscope. To address this issue, the same team developed another system that can be applied using a wide range of endoscopes to provide an objective two-class diagnosis of “AI-based vascular-healing” versus “AI-based vascular-active.” 44
Most recently, Akiyama et al. 45 reported a novel hypoxia imaging algorithm (Fujifilm), which measures colonic tissue oxygen saturation and correlates with clinical, endoscopic, and histological activities. Notably, they have shown that rectal oxygen saturation levels correlate with bowel urgency, which may open the door to a new, objective endoscopic assessment of bowel dysfunction in patients with UC.
AI-assisted endoscopy to predict clinical outcomes
AI-enabled precise scoring has also been demonstrated to predict future relapse. Stratifying relapse risk through objective measures is anticipated to facilitate early therapeutic intervention and assist patients in sustaining long-term remission.
Takenaka et al. 17 conducted a prospective follow-up study involving 875 patients, as outlined in their initial research. The patients were categorized into groups diagnosed by AI: the AI-diagnosed mucosal activity group and the AI-diagnosed MH group. The hazard ratios yielded by their algorithm for various outcomes, such as hospitalization, colectomy, steroid use, and relapse (defined as a partial Mayo score of ⩾3 and C-reactive protein/calprotectin positivity (⩾3 mg/L/⩾250 μg/g)), were 48.4, 46.4, 10.2, and 8.8, respectively. 46
Moreover, Maeda et al. performed colonoscopy using real-time AI in patients with UC who were in clinical remission. They prospectively monitored these patients (n = 135) for 12 months following colonoscopy. Based on the AI outputs, the patients were categorized into the AI-active and AI-healing groups. Clinical relapse, defined as a partial Mayo score of ⩾3, occurred in 28.4% (21/74) of the patients in the AI-active group and in 4.9% (3/61) in the AI-healing group. 47 Encouraging results were also obtained with their recently developed CAD system, which identifies “vascular healing” using commercially available magnifying colonoscopy. In that study, the clinical relapse rate during the 12 months after colonoscopy was significantly higher in the AI-based vascular-active group (23.9% (16/67)) than in the AI-based vascular-healing group (3.0% (1/33)) (p = 0.01). 44
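The relapse percentages quoted above follow directly from the reported counts, as the short sketch below shows. The crude risk ratio at the end is an illustrative calculation added here; it is not a statistic reported in the cited studies and is no substitute for a time-to-event analysis.

```python
def relapse_percent(relapses, patients):
    """Relapse rate as a percentage, rounded to one decimal place."""
    return round(100 * relapses / patients, 1)


# (relapses, patients) per group, as quoted in the text.
groups = {
    "AI-active": (21, 74),
    "AI-healing": (3, 61),
    "AI-based vascular-active": (16, 67),
    "AI-based vascular-healing": (1, 33),
}
percentages = {name: relapse_percent(*counts) for name, counts in groups.items()}


def crude_risk_ratio(exposed, reference):
    """Ratio of two simple proportions (illustrative; not from the source)."""
    (a, n1), (b, n0) = exposed, reference
    return (a / n1) / (b / n0)


rr_real_time = crude_risk_ratio(groups["AI-active"], groups["AI-healing"])
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"crude risk ratio (AI-active vs AI-healing): {rr_real_time:.2f}")
```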
Similarly, Iacucci et al. 37 followed up with 232 patients after AI-assisted colonoscopy and showed that the hazard ratios of their AI-assisted algorithms for adverse clinical outcomes (UC-related hospitalization, colectomy, and change in UC treatment owing to relapse) were 2.9 for the high-definition white-light endoscopy model and 4.0 for the iSCAN model (Table 1).
Table 1. Summary of artificial intelligence for endoscopy in ulcerative colitis.
AI, artificial intelligence; AUC, area under the curve; CAD, computer-aided diagnosis; DNUC, deep neural network for evaluation of ulcerative colitis; ER, endoscopic remission; HR, histological remission; IBD, inflammatory bowel disease; LED, light emitting diode; MES, Mayo endoscopic subscore; NHI, Nancy histological index; PHRI, PICaSSO Histological Remission Index; PICaSSO, Paddington International Virtual Chromoendoscopy Score; pts, patients; RHI, Robarts histopathology index; UC, ulcerative colitis; UCEIS, ulcerative colitis endoscopic index of severity; WLE, white light endoscopy.
In conclusion, AI-assisted endoscopic scoring can offer not only endoscopist-independent objective analysis but also automatic exclusion of inadequate frames, assessment of the distribution of inflammation, optical prediction of histological remission, and on-site prediction of clinical prognosis (Figure 1). Offering objective and precise treatment targets would pave the way for personalized medicine, as well as enhance the efficiency of clinical trials.

Figure 1. The role and potential of artificial intelligence for endoscopy scoring disease activity.
AI systems for digital histological assessment of UC and therapeutic response
A potential application of AI is to assist in real-time histological evaluation. ML approaches are increasingly being developed to aid pathologists in accurately and reproducibly scoring histology, enabling precise quantification of clinically relevant features. Furthermore, promising applications in digital pathology can include the quantification of the expression of molecular targets.
The advent of digital pathology with the integration of digital slides into whole-slide images (WSIs) has extended the frontiers of the pathologist’s view beyond a microscopic slide and enabled the true utilization and integration of knowledge that in the future could surpass human limits and boundaries. 48 Notably, the application of computer vision methods involves converting glass slides containing tissue samples into high-resolution digital images that can be viewed and analyzed using computer software. However, in the field of UC pathology, few AI models have been developed so far. 48
In a pioneering study, Vande Casteele et al. 49 developed the first deep learning algorithm designed to identify eosinophils as markers of active UC. Eosinophils (N = 3480) were manually annotated in digitized images of colonic biopsies to train a CNN. When applied to colonic biopsies from 88 patients with active UC, this algorithm exhibited high agreement with manual eosinophil counts performed by pathologists (intraclass correlation coefficients = 0.81–0.92). Nevertheless, eosinophil density did not show a good correlation with histological activity.
However, several papers have proposed that goblet cells play important roles in the onset and progression of UC. In a retrospective cohort study, Ohara et al. 50 developed a deep learning-based model to quantitatively assess goblet cell mucus in patients with UC who were in clinical and endoscopic remission. This model was designed to detect goblet cell mucus and epithelial cells in digital histological images of biopsy specimens and calculate their area ratio. Remarkably, rectal goblet cell ratio could be a valuable marker for assessing UC activity, and the use of deep learning-based models to quantify goblet cell mucus area holds promise for predicting the future risk of clinical relapse.
In a recent study, Najdawi et al. 51 developed an innovative ML-based strategy employing CNN models for segmenting tissue and cells, aimed at analyzing histologic aspects of UC from hematoxylin and eosin-stained WSIs. The features extracted by these models closely mirrored pathologist annotations and correlated strongly with both disease severity and pathologist-assigned NHI scores. Using a random forest classifier on 13 human-interpretable features (HIFs) extracted from these models, the NHI score was accurately predicted, with substantial agreement with pathologist consensus scores. Of significance, certain HIFs directly measured neutrophil features, acknowledged as pivotal indicators of disease activity, while the absence of neutrophil infiltration defined histologic remission.
Correspondingly, the Paddington International Virtual Chromoendoscopy Score (PICaSSO) team pioneered a multiple instance learning approach employing CNNs to identify neutrophils in WSIs and classify them as indicative of either histological remission or predictive of adverse outcomes. 52 Subsequently, the same group developed and validated an AI-based CAD system, trained and tested on a large set of digitized biopsies, to detect UC disease activity as defined by different histologic indices and to predict prognosis. 53 In this study, the CAD system showed strong diagnostic performance in detecting disease activity as assessed by the PICaSSO Histologic Remission Index (PHRI), 54 based on the presence of neutrophils from all areas (PHRI > 0), with an overall area under the receiver operating characteristic curve of 0.87, a sensitivity of 89%, and a specificity of 84%. Thus, this new system exhibited high sensitivity and specificity in distinguishing histologic activity from remission, and it stratified the risk of flare at 12 months similarly to human pathologists (hazard ratio 4.6 vs 3.6, respectively). 53
Furthermore, Rymarczyk et al. 55 developed AI/ML models designed to automate the assessment of histological disease activity in UC. These models were trained using a vast collection of imaging data obtained from a company’s late-stage clinical trial. They were specifically trained on scanned histology slides taken from the intestinal mucosa of adult patients and are capable of predicting the Geboes histopathology score for UC. Despite certain limitations, such as imbalanced training data for certain Geboes subscores and variations in the quality of histopathology slides, the models performed well in predicting Geboes scores.
More recently, Liu et al. 56 trained a model to predict treatment response in pediatric UC using 187,571 informative patches from rectal hematoxylin and eosin biopsy samples from 292 treatment-naïve pediatric patients with UC in a multicenter inception cohort (PROTECT2) study. The external validation cohort included 113 pediatric patients followed in the Canadian Children Inflammatory Bowel Disease Network inception cohort study at the Hospital for Sick Children (SickKids). The authors identified a set of 18 histomic features that, when incorporated into an ML model applied to digital histopathology, can be used to identify, before the initiation of treatment, pediatric patients who will not achieve steroid-free long-term remission with mesalamine alone. This model represents a promising approach to integrating digital histopathology-based histomic features and ML algorithms with the aim of identifying therapeutic responses (Table 2).
Table 2. Summary of artificial intelligence for histology in ulcerative colitis.
AUROC, area under the receiver operating characteristic curve; CI, confidence interval; HR, histological remission; ICC, intraclass correlation coefficient; NHI, Nancy histological index; PHRI, PICaSSO Histological Remission Index; PICaSSO, Paddington International Virtual Chromoendoscopy Score; pts, patients; RHI, Robarts histopathology index; WSI, whole-slide images.
In conclusion, AI-assisted histologic analysis is poised to become an integral component in advancing precision medicine for patients with IBD. In the near future, AI can serve as an assisting tool, holding the potential to facilitate decentralized trials, similar to its demonstrated utility in the evaluation of endoscopic images. We are approaching a new era in which biopsy collection and histologic read-out are routinely incorporated into most UC drug development programs (Figure 2).

Figure 2. The role and potential of artificial intelligence for digital histological assessment.
The role of AI in clinical trials
Histo-endoscopic mucosal improvement and remission are becoming crucial therapeutic endpoints in UC trials since they may provide a more accurate and deeper evaluation of disease healing. Yet, factors such as interobserver variability, endoscopist inexperience, and the variance in definitions can contribute to misjudging disease severity, potentially leading to inappropriate patient selection or incorrect assignment to treatment groups. Consequently, recruitment challenges may arise, potentially diminishing the pool of eligible participants and necessitating larger cohorts, thereby elevating the costs of clinical trials. 57
One potential strategy to break the therapeutic ceiling and reach the 100% target could be to change the way clinical trials in IBD are performed. 58 In this context, AI is expected to raise the bar on study design, including patient selection and enrollment, improving endoscopy quality through validated site reads, and facilitating stratification and re-randomization. 58
Firstly, central reading of endoscopic video plays a critical role in IBD clinical trials in overcoming the subjective variability in endoscopic scoring. Indeed, local readers tend to overscore the screening endoscopy and underscore the outcome endoscopy. 57 Feagan et al. 59 investigated the impact of centralized image review on mitigating discrepancies in placebo response rates in UC trials. Interestingly, 31% of participants initially deemed eligible by site investigators based on a UCDAI sigmoidoscopy score ⩾2 were subsequently deemed ineligible by the central reader’s independent assessment owing to insufficient endoscopic disease. Removing these patients, who had been entered based on discrepantly high sigmoidoscopic scores, not only increased the difference between remission rates for mesalamine 4.8 g/day and placebo (29% and 13.8%, respectively, a widened difference of 15%; 95% confidence interval, 3.5%–26.0%; p = 0.011), but also led to the trial achieving its primary and all secondary endpoints.
Subsequently, Gottlieb et al. 19 assessed the ability of a recurrent neural network (RNN) to predict the MES and UCEIS scores assigned by central readers. Remarkably, they revealed that an RNN algorithm, trained on full-length videos from an industry-sponsored phase II clinical trial in UC in which a human central reader assigned only one video-level MES and UCEIS score, achieved an accuracy of 97.0% for UCEIS and 95.5% for MES.
Hence, AI has the potential to transform the assessment of clinical trials by providing more reliable and efficient endpoint assessment through image-based endpoint detection. By implementing standardized imaging reading and scoring methods, AI can significantly enhance the speed and accuracy of disease assessment, offering a more precise identification of treatment response and non-response. Additionally, AI-assisted disease activity and remission assessment are expected to decrease variability and minimize the need for second reader/adjudication. Of note, cost savings are a further consideration to bear in mind, with AI potentially reducing central reading costs, a major component of trial budgets. 57
In conclusion, AI has the potential to revolutionize various aspects of IBD clinical trials, from expediting central reading to improving endpoint assessment and defining remission, ultimately leading to more effective decision-making and efficient management of IBD. Integration of data from multiple sources, including clinical symptoms, endoscopic read-outs, histopathology, and gene expression values, through AI can hold potential in predicting treatment outcomes and informing prognosis and therapeutic response.
The role of AI in clinical practice
The role of AI in clinical practice is expected to encompass education, quality improvement, and visualization. With the global rise in the number of patients with UC, colonoscopists are not always sufficiently equipped with expertise in UC. Furthermore, given the evolution of new endoscopic scoring systems and the growing sophistication of endoscopic technology, the proficiency expected from endoscopists has also increased. Histological scoring is in a similar situation. As the concept of T2T becomes more prevalent in routine practice, the existence of inter-examiner differences in endoscopic and pathological scores highlights the necessity of establishing definitive treatment goals. To tackle these issues, AI is anticipated to play a pivotal role as an educator, helping to improve physicians’ ability.
For CAD systems for colorectal neoplasia, which are already undergoing clinical implementation, reports have shed both positive and negative light on the use of AI in characterization,60–62 and no conclusions have yet been reached. Unfortunately, the effectiveness of AI as an educator in the UC field has not yet been tested.
Next, AI is anticipated to improve the quality of colonoscopy. 63 Even if the AI serves as an accurate image reader, its capabilities may not be effectively demonstrated if the images acquired by the endoscopist are of low quality. The function of automatically excluding images that are ineligible for scoring has already been reported in several papers. Whether real-time use of that function improves the quality of colonoscopy still needs to be verified.
Lastly, regarding the visualization of endoscopic examinations: in conventional clinical decisions, physicians typically review reports provided by endoscopists, mentally visualize the distribution of inflammation, and formulate a treatment plan. However, there is a risk that both parties’ interpretations may not always align. If computer vision, facilitated by AI, enables the visualization of the degree and distribution of inflammation, it will not only simplify the comprehension of pathophysiology but also serve as a robust tool for sharing the disease status with the entire medical team.
In conclusion, AI holds the potential to address several unmet needs in UC clinical practice, resulting in increased efficacy and objective, precise management of UC. Moreover, by integrating endoscopic, pathological, and clinical data through AI, it is anticipated that more detailed predictions of future relapse and assessments of treatment efficacy will become feasible. There is a strong hope for the expeditious launch of a product suitable for clinical practice.
Current limitations and challenges of AI
Although numerous reports exist in the research field, only a few AI systems have received regulatory approval, resulting in a limited level of clinical implementation. Consequently, real-time validation of AI is scarce, and it remains unclear whether AI can achieve the anticipated accuracy in actual clinical practice. Furthermore, the value that AI adds to clinician diagnoses has not been substantiated. For instance, it remains to be determined whether employing pathology AI enhances the accuracy of pathologists’ scoring. Additionally, the impact of AI on the judgments of clinicians, who bear ultimate responsibility, has not been verified.
Crucially, it is imperative to ascertain whether AI contributes to improvements in patients’ quality of life and long-term prognosis and whether it possesses the potential to influence the standard of care.
The primary concern associated with the utilization of AI in endoscopy and histology for UC lies in the interpretation of AI’s diagnostic output by physicians. There is apprehension that physicians lacking an integrated understanding of various factors, including clinical symptoms, laboratory results, and patient history, may overly depend on AI’s diagnostic output, potentially resulting in inadequate care. While AI has the potential to assist physicians in offering a more standardized diagnosis, especially in the case of UC patients, shared decision-making tailored to individual preferences and lifestyle habits is crucial. AI-based practices, masquerading under the banner of standardization, need to be vigilantly monitored to mitigate the risk of interference with optimal personalized medicine. Therefore, striking a balance between leveraging AI for uniform diagnoses and ensuring personalized care for UC patients remains a critical consideration.
In conclusion, the utilization of AI in the management of UC patients is an expanding field of research with high expectations. However, it is crucial to objectively acknowledge both the positive and negative implications that AI may introduce to UC management.
Future directions and conclusion: new algorithms of AI endoscopy and histology to assess and manage UC
Automating the extraction of endoscopic disease activity, histologic indices, and disease phenotyping using ML image analysis applications is expected to revolutionize the standard of care for UC in the near future. AI can play a crucial role in defining remission by providing comprehensive assessments beyond human capabilities, such as quantifying vascular and mucosal patterns and subtle details. It can also assist in real-time histological evaluation, predicting histological remission, with positive implications for clinical trials and practice. Therefore, integrating AI with endoscopy and histology extends beyond diagnosis to predicting disease trajectories and clinical outcomes in colitis, and offers the promise of identifying new treatment targets leading to better long-term outcomes for patients.
Most studies employing AI in UC have been post hoc evaluations or single-center, prospective studies. One exception is the PICaSSO multicenter, international, prospective study, in which the PICaSSO endoscopy and histology AI were developed from numerous endoscopy videos and histological slides in a large patient cohort.53 In the future, the developed AI needs to be used in real time in actual clinical practice, and randomized controlled trials are required to better clarify its benefits and limitations. In the context of clinical trials, an AI/ML algorithm could be used to select patients accurately and thus reduce reliance on human central readers. However, since the training of CNNs for specific findings depends on human judgments, AI currently cannot surpass humans, as humans serve as the point of reference. Moreover, AI systems that integrate disease activity beyond human cognitive limitations would need validation against short-, mid-, and long-term disease outcomes. While AI has the potential to assist physicians in offering a more standardized diagnosis, especially in the case of UC patients, shared decision-making tailored to individual preferences and lifestyle habits remains crucial. Integrating large datasets from multiple sources, including clinical symptoms, endoscopic read-outs, histopathology, and gene expression values, through AI can provide further insight into IBD and lead to the discovery of novel biomarkers. This fusion of AI with new advanced endoscopy technologies and histology has the potential to redefine our understanding of disease progression, response assessment, and treatment planning. This will require the cooperation and commitment of endoscopists, specialists, and societies.
Thus, in the future, new AI algorithms that integrate innovative endoscopic tools and molecular endoscopy into the clinical management of IBD patients hold the potential to deliver precision medicine in IBD, turning what was once a vision into an achievable reality.
