Artificial intelligence to revolutionize IBD clinical trials: a comprehensive review

Abstract

Integrating artificial intelligence (AI) into clinical trials for inflammatory bowel disease (IBD) has the potential to transform the field. This article explores how AI-driven technologies, including machine learning (ML), natural language processing, and predictive analytics, could enhance important aspects of IBD trials, from patient recruitment and trial design to data analysis and personalized treatment strategies. As AI advances, it could help address long-standing challenges in trial efficiency, accuracy, and personalization, with the goal of accelerating the discovery of novel therapies and improving outcomes for people living with IBD. AI can streamline multiple trial phases, from target identification and patient recruitment to data analysis and monitoring. By integrating multi-omics data, electronic health records, and imaging repositories, AI can uncover molecular targets and personalize trial strategies, ultimately expediting drug development. However, the adoption of AI in IBD clinical trials encounters significant challenges. These include technical barriers in data integration, ethical concerns regarding patient privacy, and regulatory issues related to AI validation standards. Additionally, AI models risk producing biased outcomes if training datasets lack diversity, potentially impacting underrepresented populations in clinical trials. Addressing these limitations requires standardized data formats, interdisciplinary collaboration, and robust ethical frameworks to ensure inclusivity and accuracy. Continued partnerships among clinicians, researchers, data scientists, and regulators will be essential to establish transparent, patient-centered AI frameworks. By overcoming these obstacles, AI has the potential to enhance the efficiency, equity, and efficacy of IBD clinical trials, ultimately benefiting patient care.

Plain language summary

Artificial intelligence in IBD clinical trials

Inflammatory Bowel Disease (IBD), including Crohn’s disease and ulcerative colitis, poses significant challenges for clinical trials, such as difficulties in recruiting participants, variations in disease presentation, and inconsistent treatment responses. Artificial intelligence (AI) is increasingly recognized as a solution to these challenges, improving recruitment, data analysis, personalized care, and trial design. AI can enhance recruitment by analyzing medical records to match patients to trials efficiently. AI tools can automate this process, improving both efficiency and diversity. Additionally, AI can predict dropout risks, helping researchers plan better and maintain trial integrity. IBD trials generate complex datasets that require advanced analysis. AI can process these large datasets to identify patterns in disease progression and treatment efficacy, also improving the accuracy of endoscopic and histological assessments, providing deeper insights into the disease. AI can enable personalized treatments by predicting responses based on genetics, biomarkers, and medical history. Real-time monitoring through wearable devices supports early interventions, improving patient outcomes and disease management. Adaptive trial designs might also benefit from AI, allowing protocols to adjust based on interim results. This enhances trial efficiency, ethical standards, and participant safety, while ensuring accurate data collection. However, implementing AI requires addressing data privacy, fairness, and regulatory compliance. Transparent, secure, and inclusive AI models are essential to build trust and ensure equitable benefits across all patient populations. AI is transforming IBD clinical trials by streamlining recruitment, improving data analysis, personalizing care, and optimizing trial design. By addressing challenges proactively, we can unlock AI’s full potential, leading to more efficient trials and better outcomes for patients.

Keywords

artificial intelligence (AI), clinical trials, inflammatory bowel disease, machine learning (ML), patient recruitment

Introduction

Inflammatory bowel disease (IBD), encompassing Crohn’s disease (CD) and ulcerative colitis (UC), is a chronic and relapsing condition that presents significant challenges for patient management and for conducting clinical research efficiently.^1,2 Recruitment rates for pharmaceutical IBD clinical trials continue to decline, with average global recruitment rates of 0.1 patient/center/trial. The reasons for poor recruitment are complex and multifactorial and include factors related to patient characteristics, overengineered protocols, use of placebo, and the increasing availability of commercial agents. In addition, heterogeneity in disease presentation, variability in treatment response, and the requirement for long-term management necessitate innovative approaches in clinical trials. Traditional methodologies have often failed to address these complexities, highlighting the need for advanced technologies to provide more precise and efficient solutions at all stages of a clinical trial, including patient recruitment, endpoint assessment, and trial monitoring.

In recent years, artificial intelligence (AI) and machine learning (ML) have emerged as powerful medical tools, offering unprecedented data analysis, pattern recognition, and predictive modeling capabilities. These technologies have the potential to enhance aspects of IBD clinical trials and may provide solutions to challenges that have historically impeded progress in this field.³ However, the promise of AI can only be fully realized if these technologies are scientifically rigorous, clinically valid, and aligned with regulatory standards to ensure patient safety and improve outcomes.^4,5

This review explores the potential impact that AI could have on IBD clinical trials, focusing on key areas such as patient recruitment, data analysis, personalized medicine, and trial design. Additionally, we discuss the ethical, regulatory, and practical considerations that must be addressed to ensure the responsible integration of AI in clinical trials for IBD.

A summary of definitions of commonly used terminology is included in Table 1.

Table 1.

Summary of definitions of commonly used terminology.

Term	Abbreviation	Definition	Ref
Artificial intelligence	AI	A field of computer science that involves the simulation of human intelligence in machines, enabling them to perform tasks that typically require human cognition, such as learning, reasoning, and decision-making.	6
Machine learning	ML	A subtype of AI that focuses on developing algorithms that allow computers to learn from and make decisions based on data without being explicitly programmed for specific tasks.	6,7
Deep learning	DL	A subtype of ML that uses neural networks with multiple layers (hence “deep”) to analyze large datasets and extract high-level features for tasks such as image and speech recognition.	6,7
Neural networks	NN	Computational models inspired by the human brain that are composed of interconnected nodes (neurons) and are used in ML and DL to identify patterns and make predictions.	7
Natural language processing	NLP	A branch of AI that enables computers to understand, interpret, and generate human language, often used for tasks like text analysis, language translation, and chatbot interactions.	8
Large language models	LLMs	A type of AI model trained on vast amounts of text data to understand and generate human-like language, often used for tasks like summarization, translation, and conversational AI.	9,10
Shapley additive explanations	SHAP	A framework used in ML to explain the output of predictive models by assigning importance values to each input feature, helping users understand how a model arrives at its predictions.	11
Explainable artificial intelligence	XAI	AI systems designed to provide clear, understandable explanations for their predictions and decisions, improving trust and accountability in their applications.	12
Artificial intelligence fairness 360	AIF360	Toolkit developed by IBM to detect and mitigate bias in AI models, ensuring equitable and inclusive outcomes in AI-driven decision-making processes, such as clinical trial recruitment.	13

AI in patient recruitment: Streamlining and enhancing recruitment for IBD clinical trials

Recruiting patients for IBD clinical trials remains a significant challenge.¹⁴ Despite some advances, issues such as overestimating eligible populations, limited patient awareness, logistical difficulties, and competition for participants persist.¹⁵ Traditional recruitment approaches are often resource-heavy and struggle to efficiently identify and enroll appropriate candidates. AI offers promising solutions by analyzing large datasets, including electronic health records (EHRs) and patient-reported outcomes. These technologies may help to match eligible participants with trial criteria more efficiently and accurately, improving recruitment efficiency.^16,17

Screening activities

AI-driven tools like natural language processing (NLP) are improving clinical trial recruitment by analyzing clinical notes and unstructured data to identify potential participants that might be missed by traditional screening methods. Leveraging NLP and ML, generative AI can automate the evaluation of eligibility criteria against medical histories, drastically reducing the need for manual reviews. This allows researchers and clinical staff to potentially pre-screen hundreds of candidates in just minutes, speeding up pre-screening activities. One recent advancement in this field is TrialGPT, a model designed to improve patient-trial matching.¹⁸ Using large language models, TrialGPT analyzes patient medical records and compares them with trial eligibility criteria. Trained on data from 184 patients with complex conditions—predominantly cancer and other chronic diseases such as cardiovascular disease, diabetes, and rare genetic conditions—and 18,238 annotated clinical trials, TrialGPT not only determines patient suitability but also provides detailed explanations for its decision.¹⁸ When tested on a larger dataset of patients across oncology and various chronic disease populations, TrialGPT demonstrated strong performance. Its explanations aligned closely with those of human experts, effectively ranking trials and excluding those for which patients were ineligible. However, some errors were noted due to limitations in the underlying language models.

Generative AI, using tools such as chatbots and virtual assistants, can also reduce the screening burden on clinical trial sites by handling initial participant screening and communication. AI-enabled platforms such as myTrialsConnect enhance participant interactions, gather trial-specific data, and even schedule appointments, thus improving accessibility and workload management.¹⁹

Enhancing diversity in enrollment

In IBD clinical trials, recruitment bias can lead to the underrepresentation of minority populations, which impacts the generalizability of findings. AI Fairness 360 (AIF360) developed by IBM Research might tackle this issue by ensuring that AI-driven recruitment algorithms do not disproportionately exclude these groups, fostering more inclusive and equitable study populations.¹³ By addressing bias, the toolkit allows researchers to increase the likelihood of fair and representative samples that reflect the diversity of the population affected by the disease being studied, and to ensure that eligibility criteria are applied equitably to all groups, avoiding bias that could exclude minority populations.
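The kind of check that such a toolkit performs can be illustrated without any library. The sketch below is a minimal, hypothetical example and does not use the AIF360 API. It computes the disparate impact ratio, that is, the rate at which a pre-screening algorithm flags patients from an underrepresented group as eligible, divided by the rate for the reference group. The records, the group labels `A` and `B`, and the 0.8 threshold (a commonly cited rule of thumb) are assumptions made for illustration.

```python
from collections import Counter


def selection_rates(records):
    """Return {group: fraction flagged eligible} for (group, flagged) pairs."""
    totals, selected = Counter(), Counter()
    for group, flagged in records:
        totals[group] += 1
        selected[group] += int(flagged)
    return {g: selected[g] / totals[g] for g in totals}


def disparate_impact(records, unprivileged, privileged):
    """Ratio of selection rates; values far below 1 suggest possible bias."""
    rates = selection_rates(records)
    if rates[privileged] == 0:
        raise ValueError("reference group has no flagged patients")
    return rates[unprivileged] / rates[privileged]


# Hypothetical pre-screening output: (group, flagged_as_eligible)
records = (
    [("A", True)] * 40 + [("A", False)] * 60
    + [("B", True)] * 20 + [("B", False)] * 80
)

ratio = disparate_impact(records, unprivileged="B", privileged="A")
needs_review = ratio < 0.8  # assumed threshold, not a regulatory standard
print(f"disparate impact = {ratio:.2f}, needs review: {needs_review}")
```

A ratio well below 1 does not by itself prove that the algorithm is unfair, since eligibility may legitimately differ between groups, but it indicates where the recruitment pipeline deserves closer inspection.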

Furthermore, AI can predict patient dropout rates and adherence to trial protocols, enabling proactive management of these issues, which is crucial for maintaining trial integrity.^20,21

Enhancing data analysis through AI: Utilizing ML to analyze complex IBD trial outcome data

IBD clinical trials generate large and complex datasets, including clinical, imaging, biomarker, and genomic data. AI, particularly ML and deep learning algorithms, can process high-dimensional data to uncover hidden patterns or correlations that are critical for understanding disease progression, treatment responses, and patient subgroups.^22,23

EHR data analysis using ML methods

A key application of AI in IBD research involves leveraging ML techniques to analyze EHR-derived data. These methods allow for integrating patient demographics, physiological measurements, disease history, clinical questionnaires, histology, serum biomarkers, and drug exposure to uncover insights that traditional analyses may overlook. ML-based models, such as XGBoost and deep learning approaches, can identify complex, nonlinear relationships that influence disease progression and therapeutic outcomes.

Predicting response to therapy

A recent study by Harun et al. shed light on the role of AI in shaping future clinical trial designs, particularly by identifying stratification factors that can optimize treatment effectiveness and improve patient outcomes.²⁴ The authors conducted a post hoc analysis of four randomized controlled trials (RCTs) of etrolizumab in patients with UC, using advanced ML techniques to assess which patient factors impact remission. XGBoost ML models were used to evaluate the effect of various patient-level data on the likelihood of achieving remission. To interpret the complex predictions, the SHAP (SHapley Additive exPlanations) framework clarified which factors were most influential. The data analyzed included demographics, physiological measurements, disease history, clinical questionnaires, histology, serum biomarkers, and drug exposure. The models performed well, achieving an area under the receiver operating characteristic curve (AUROC) of 0.74 ± 0.03 for induction and 0.75 ± 0.06 for maintenance. By using AI techniques, the study was able to analyze a large, complex dataset and reveal nonlinear relationships and interactions that traditional methods might miss. The use of XGBoost improved the predictive accuracy of remission based on diverse variables, offering deeper insights into patient outcomes. The SHAP framework further enhanced understanding by identifying key factors influencing remission, aiding in patient stratification and optimizing treatment strategies for future trials.
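Two quantities reported in this study can be made concrete with a small, self-contained sketch. The first function computes the AUROC as the probability that a randomly chosen remitter receives a higher predicted score than a randomly chosen non-remitter. The second computes exact Shapley values by enumerating all feature subsets, which is the quantity that the SHAP framework estimates efficiently for real models. The labels, scores, feature names (`calprotectin`, `prior_anti_tnf`, `age`), and `toy_model` are hypothetical and are not taken from the etrolizumab analysis; the code does not use the XGBoost or SHAP libraries.

```python
from itertools import combinations
from math import factorial


def auroc(labels, scores):
    """Probability that a positive case outscores a negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(value, features):
    """Exact Shapley value of each feature for a set function `value`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_model(known):
    """Hypothetical remission score given the set of features the model can see."""
    score = 0.0
    if "calprotectin" in known:
        score += 0.2
    if "prior_anti_tnf" in known:
        score += 0.1
    if "calprotectin" in known and "prior_anti_tnf" in known:
        score += 0.1  # interaction term; Shapley splits it equally between the two
    return score


# Hypothetical outcomes (1 = remission) and predicted scores
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.5, 0.3, 0.2]
features = ["calprotectin", "prior_anti_tnf", "age"]

print(auroc(labels, scores))
print(shapley_values(toy_model, features))
```

The Shapley values sum to the difference between the score with all features and the score with none, which is the property that makes them suitable for attributing a single prediction to its inputs.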

Endoscopy

Endoscopic assessment is the cornerstone to establish patient eligibility for IBD trial participation and to estimate the efficacy of trial interventions. Blinded central endoscopic reading is the current standard which, compared to local endoscopic reading, increases objectivity, minimizes variability, reduces placebo rates, and consequently maximizes effect sizes.^25–29 Nonetheless, even agreement between expert central readers is imperfect,^25,30 and several scoring conventions were added to existing evaluative indices with the purpose of harmonizing scores.³¹ Disagreement between central readers further complicates the scoring process by introducing the need for outcome adjudication, where multiple read algorithms to resolve disagreement and assign a final score are possible, each with their advantages and disadvantages.^32,33
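To illustrate how read algorithms can differ, the sketch below implements two hypothetical adjudication rules for a Mayo endoscopic score (0 to 3). In the first, a third reader is consulted only when two readers disagree; in the second, three readers always score and the median is taken. The function names and rules are assumptions chosen for illustration and do not describe the paradigm of any specific trial.

```python
from statistics import median


def final_score_2plus1(reader1, reader2, adjudicator=None):
    """Two readers; an adjudicator's score is used only when they disagree."""
    if reader1 == reader2:
        return reader1
    if adjudicator is None:
        raise ValueError("readers disagree and no adjudicator score was supplied")
    return adjudicator


def final_score_median(reader1, reader2, reader3):
    """Three readers always score; the final score is the median of the three."""
    return median([reader1, reader2, reader3])


# Hypothetical reads of the same endoscopy video
print(final_score_2plus1(2, 2))      # agreement, no adjudication needed
print(final_score_2plus1(1, 2, 2))   # disagreement resolved by the adjudicator
print(final_score_median(1, 2, 3))   # median of three independent reads
```

The first rule needs fewer reads when agreement is high, whereas the second gives every video the same number of reads; trade-offs of this kind are what the choice between read algorithms involves.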

Several AI models have been developed to deliver reliable and accurate readings of endoscopic videos in UC. A recently published meta-analysis included 12 studies, of which 9 evaluated the Mayo endoscopic score (MES)³⁴ and 3 the Ulcerative Colitis Endoscopic Index of Severity (UCEIS)³⁵ as the reference standard.³⁶ Overall, the sensitivity and specificity of AI for endoscopic assessment were high, both for still images (sensitivity 91%, specificity 89%) and videos (sensitivity 86%, specificity 91%). A notable finding was the high heterogeneity between studies, with I² values exceeding 90%.
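As a reminder of how these two metrics are obtained, the sketch below dichotomizes Mayo endoscopic scores into inactive (0 or 1) and active (2 or 3) disease and compares AI reads against a human reference. The scores are hypothetical, and the cut-off is an assumption chosen for illustration.

```python
def is_active(mes, threshold=2):
    """Dichotomize a Mayo endoscopic score: True if MES >= threshold."""
    return mes >= threshold


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of `predicted` against `reference` MES reads."""
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        ref_pos, pred_pos = is_active(ref), is_active(pred)
        if ref_pos and pred_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif pred_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical reads of eight videos
human_mes = [0, 1, 2, 3, 2, 1, 0, 3]
ai_mes = [0, 2, 2, 3, 1, 1, 0, 3]

sens, spec = sensitivity_specificity(human_mes, ai_mes)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Because the scores are collapsed into two classes, a disagreement between MES 2 and MES 3, or between MES 0 and MES 1, is invisible to these metrics.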

Several aspects of study design should, however, be considered to correctly contextualize these findings and identify future research priorities. All studies used a human expert reader as the reference standard. Because the training of convolutional neural networks, the AI tool used in endoscopy, depends on the human reader reference, these models by definition cannot yet surpass human performance in terms of accuracy, although gains in efficiency and throughput may be considerable. A further potential use of AI is to screen, by default, all endoscopies performed at a given site, with the purpose of identifying patients with endoscopically active disease (MES 2 or 3) and flagging them for potential inclusion in trials. Studies included in the meta-analysis all used dichotomized outcomes, for example, MES 0–1 versus 2–3. Whilst endoscopic remission was defined as a MES 0–1 in the past,³⁷ the two scores are no longer conflated in contemporary trials: a score of 0 denotes endoscopic remission and a score of 1 endoscopic improvement, reflecting two different outcomes.³⁸ The distinction between a score of 2 and 3 is also not insignificant, as baseline endoscopic activity may serve as a stratification factor for randomization. It should be acknowledged that only three of the studies supported their models with external validation cohorts.^39–41 Finally, the high heterogeneity persisted even in sensitivity analyses separating studies based on still image versus video assessment and based on the numbers of images evaluated. The high variability between studies could therefore be the result of discrepancies in image annotation, image pre-processing, and training algorithms.⁴² Developing AI algorithms for endoscopic assessment of CD remains an unmet research need; studies thus far have focused on video capsule endoscopy, which does not feature in regulatory clinical trials.

Using AI-based algorithms in clinical trials has been shown to be feasible: a recurrent neural network model performed favorably compared to human readers for the evaluation of full-length endoscopy videos in a phase II trial of mirikizumab.⁴³ It should be noted, however, that this trial utilized a single central reader paradigm, and it remains unknown how best to integrate AI algorithms in multiple central reader paradigms, which require outcome adjudication. Currently, studies to inform the optimum positioning of AI-based algorithms in reading paradigms are lacking: it is unknown whether the algorithm should replace the local reader, the central reader, or perhaps even both readers. An often-cited limitation of the MES is that it defaults to the worst affected area of the colon visualized, regardless of potential changes in disease extent.⁴⁴ An AI-based solution integrating both endoscopic disease severity and disease extent is the cumulative disease score (CDS).⁴⁵ A notable advantage of this system, tested on the ustekinumab trial dataset, is its superior ability to discriminate between the ustekinumab arm and the placebo arm: a simulated sample size calculation indicated that 50% fewer patients would be needed to demonstrate a difference with CDS compared to the MES. Although the latter remains the regulatory standard, CDS could be used in early drug development programs to detect between-arm differences with smaller numbers of patients. AI could conceivably recognize endoscopic lesions and patterns that are not part of established endoscopic indices and therefore be potentially more sensitive to change. This could be particularly helpful in early drug development to guide decisions on whether to continue the clinical program.
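The link between a more discriminating endpoint and a smaller trial can be illustrated with the textbook normal-approximation formula for comparing two means, in which the number of patients per arm is proportional to the inverse square of the standardized effect size. The sketch below is an assumption-laden illustration, not the simulation used in the cited study: the effect sizes, the two-sided alpha of 0.05, and the power of 80% are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Patients per arm for a two-arm comparison of means (normal approximation)."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / effect_size ** 2)


baseline = n_per_arm(0.5)             # hypothetical effect size on a conventional score
improved = n_per_arm(0.5 * sqrt(2))   # endpoint whose effect size is sqrt(2) times larger

print(baseline, improved)
```

Under these assumptions, an endpoint that separates the arms by a standardized effect roughly 1.4 times larger requires about half as many patients, which is the order of reduction described above.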

Histology

Histological remission is currently understood as an adjunct to endoscopic remission indicating a deeper level of healing.⁴⁶ It is not yet considered a treatment target, but its potential role is being evaluated in a randomized trial.⁴⁷ Similarly to endoscopy, histological assessment depends on scoring indices, which face challenges similar to those of endoscopic indices: inter-rater reliability is imperfect, scoring can be time-consuming and requires expertise.⁴⁸ In UC histology, AI models have been used to develop novel scoring indices,^49,50 to replace human readers for established indices,^51,52 and to evaluate individual histological features, such as the presence of eosinophils⁵³ and basal cell plasmacytosis.⁵⁴

Overall, models developed to evaluate biopsies using existing indices have shown encouraging sensitivity and specificity to detect histological remission.^51,52 One of the systems was developed using clinical trial data, demonstrating feasibility in this setting.⁵¹ Analogously to AI models for endoscopic assessment, currently developed algorithms predict histological remission as a binary outcome, but cannot provide grading of inflammatory activity. Arguably, this is less of a limitation for histology than it is for endoscopy as precise grading of histologically active disease is less relevant and has little impact on the interpretation of clinical trial results.

A further area of AI development is the deployment of algorithms to help human pathologists identify the main areas of interest within a given biopsy fragment.⁵¹ Novel algorithms also promise to detect histological features beyond those included in established histological scoring indices, which could be more informative for predicting subsequent treatment outcomes. In a study of 114 patients with UC achieving endoscopic improvement (MES ⩽1), a deep learning model successfully quantified the ratio between the goblet cell mucus area and epithelial cells; a lower ratio was associated with an increased rate of disease relapse within the subsequent 12 months.⁵⁵ Even more impressively, an ML-based algorithm identified 18 histomic features that predicted which patients with pediatric UC would not respond to treatment with mesalamine alone.⁵⁶ These features, discovered in an inception cohort of 292 patients, were later tested in an external validation cohort with almost identical performance (AUROC 0.89 in the development cohort and 0.88 in the validation cohort). More recently, Ohara et al.⁵⁷ developed an advanced AI system incorporating semantic segmentation and object detection models to identify neutrophils in hematoxylin and eosin-stained whole-slide images (WSIs). This system not only detects neutrophils in the epithelium and lamina propria but also predicts components of the Nancy Histological Index and the PICaSSO Histologic Remission Index.⁴¹ Notably, the AI-predicted histological scores correlated well with pathologists’ assessments (Spearman’s ρ = 0.68–0.80; p < 0.05).
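Spearman's ρ, the agreement statistic quoted above, is the Pearson correlation computed on ranks rather than on raw values, which makes it suitable for ordinal grades. The sketch below is a minimal pure-Python version with average ranks for ties; the pathologist and AI grades are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical histological grades for five biopsies
pathologist = [0, 1, 2, 3, 4]
ai_model = [0, 1, 3, 2, 4]

rho = spearman_rho(pathologist, ai_model)
print(f"Spearman's rho = {rho:.2f}")
```

Because only the ordering matters, ρ rewards a model that ranks biopsies in the same order as the pathologist even if its absolute grades are shifted.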

In another study, Peyrin-Biroulet et al.⁵⁸ utilized automated image analysis combined with ML to evaluate histological disease activity based on the Nancy index in 200 histological images from UC patients. The AI system’s performance was compared to assessments by four independent histopathologists. Despite limitations due to the small annotated dataset required for AI training,⁵⁹ the study reported high correlations both among histopathologists (89.33) and between the AI system and histopathologists (87.20).

Radiology

Radiological assessment is likely to have an increasingly prominent role in clinical trials in IBD. Transmural healing is defined as an adjunct to endoscopic remission, reflecting a deeper level of healing in CD;⁴⁶ its potential advantage over contemporary treatment goals is under evaluation in an ongoing randomized trial (NCT06257706). Fibrostenosing CD is an area of unmet therapeutic need, and the development of potential antifibrotic agents is a research priority.⁶⁰ The recent publication of dedicated indices^61,62 extends the role of cross-sectional imaging beyond evaluating inflammation.

Detection and characterization of strictures is an area well-suited to AI models. Good concordance has been shown between (semi)-automated measurements and expert radiologist assessment for key elements, such as bowel wall thickness, pre-stenotic dilation, and minimum luminal diameter.^63,64 Notably, AI was able to quantify intestinal fibrosis with an AUROC exceeding 0.800 compared to the reference standard of histopathological assessment of the resection specimen.⁶⁴ Automated assessment was non-inferior to expert radiological assessment and considerably faster. AI has not yet been evaluated for radiological assessment of perianal fistulizing CD, which is expected to be quite challenging given the complex morphology and heterogeneity of this disease phenotype.

Emerging research also indicates that convolutional neural networks are able to accurately identify abnormal bowel wall thickening on images obtained with intestinal ultrasound, although this remains to be proven in studies with larger sample sizes and also using cine loops as opposed to still images.^65,66

A summary of endoscopic, histologic and radiology AI modalities can be found in Table 2.

Table 2.

Summary of endoscopic, histologic, and radiology AI modalities.

Modality	Advantages	Disadvantages	Examples of use
Endoscopy	Reliable and accurate AI scoring of endoscopic disease activity in ulcerative colitis compared to human readers; Potentially more efficient than human central readers	High heterogeneity between studies; Not available for Crohn’s disease; Unknown optimal reading paradigm (local reader vs central reader vs AI); Most algorithms provide only binary outcomes (remission vs no remission)	Use of the Cumulative Disease Score in the ustekinumab development program: appears more sensitive in detecting differences between the drug arm and placebo arm, potentially requiring smaller sample sizes⁴⁵; An AI model performed well on full-length trial videos from the mirikizumab program, both for the Mayo score and the Ulcerative Colitis Endoscopic Index of Severity⁴³
Histology	Good concordance between AI scoring and human pathologists; Supporting human reading by highlighting regions of interest; Potential for discovering novel histological features, which are not part of established indices, but are potentially associated with subsequent outcomes	Algorithms provide only binary outcomes (remission vs no remission)	Feasibility demonstrated on trial datasets for both Crohn’s disease and ulcerative colitis⁵¹
Radiology	Encouraging initial results for characterizing strictures in Crohn’s disease, compared to human readers; Potential for use in intestinal ultrasound and perianal fistulizing Crohn’s disease	Limited research available	Good concordance between AI algorithms and human readers to characterize strictures^63,64
Multimodal trial data	Potential to overcome limitations of traditional analytical approaches to better predict treatment outcomes; Potential to analyze large biomarker datasets and data from wearable devices	Algorithmic bias	Predicting response to treatment with infliximab based on an array of 92 proteins⁶⁷; Integrating clinical and microbiome data to predict clinical remission with vedolizumab in Crohn’s disease⁶⁸

AI, artificial intelligence.

Synthesizing multimodal clinical trial data

Possibly the greatest opportunity for harnessing the power of AI in clinical trials lies in the analysis of complex multimodal data to predict outcomes. Existing endoscopic indices have limited predictive capability for subsequent disease evolution, and histological indices focus, perhaps unduly, on neutrophils. An algorithm quantifying red pixels in endoscopy videos of UC, the red-density index, performed acceptably for predicting 5-year clinical remission.^69,70 An algorithm based on endoscopy, supported by endocytoscopy, classified patients by risk of clinical relapse in real time.⁷¹ The AI-based PICaSSO Histologic Remission Index successfully estimated the likelihood of a flare of UC at 1 year.⁵² AI approaches are also well suited to the analysis of complex proteomic and microbiomic data. An AI algorithm supported classification of patients based on the relative abundance of 92 inflammatory proteins, which was associated with subsequent response to treatment with infliximab.⁶⁷ A neural network algorithm integrating clinical and microbiome data demonstrated an acceptable predictive capability for clinical remission after 14 weeks of treatment with vedolizumab in CD.⁶⁸ The use of wearable devices, such as smart watches, for monitoring IBD and predicting subsequent disease flares is under active investigation. The abundance of clinical and biomarker data gathered through these devices could optimally be analyzed using AI-based methods to develop novel digital outcomes and predict future disease evolution.^72,73

Additionally, AI could integrate multi-omics data to identify novel biomarkers, which can serve as surrogate endpoints in clinical trials, thereby accelerating drug development.⁷⁴ AI-driven analytics could also enhance the understanding of patient heterogeneity in IBD, enabling the identification of distinct disease subtypes. This stratification informs the development of targeted therapies, ultimately leading to more personalized treatment approaches.⁷⁵

Personalized medicine and AI: Tailoring treatment strategies to individual patients

The current management of IBD is challenging, as the disease varies widely in severity and response to treatment among patients. With the availability of several classes of advanced therapies, choosing a suitable therapeutic agent which would result in optimal response is challenging. Currently there are no tools that can accurately predict response to any given agent. Traditional treatment protocols often involve the best clinical judgment based on available evidence, clinical records, social factors, and local institutional policies. Personalized medicine is an emerging approach that moves away from a one-size-fits-all paradigm in healthcare, instead tailoring medical treatment to the individual characteristics of each patient. For chronic, heterogeneous conditions such as IBD, this approach holds significant potential. Patients with IBD differ widely in their genetic profiles, disease severity, and response to medications.

The advent of AI presents a revolutionary opportunity to optimize and personalize treatment strategies for IBD, tailoring care to individual patient characteristics. AI can analyze large, complex datasets that include genetic information, clinical records, environmental factors, and treatment responses, enabling clinicians to make more precise and effective decisions in IBD management.^76,77 This process significantly reduces the time spent on trial-and-error treatments, improves clinical outcomes, and minimizes the risk of adverse effects from ineffective therapies.

AI models can continuously learn and adapt based on new data, enabling real-time personalization of treatment strategies.⁷⁸ These adaptive strategies could be particularly beneficial in managing IBD, where disease activity can fluctuate over time.^79–81 AI can assist physicians at several stages of the management of patients with IBD, including choosing an appropriate agent early at the time of diagnosis and predicting disease progression and exacerbations, allowing early intervention.

AI in choosing appropriate therapeutic agent

There are several factors that determine response to a given agent. AI has been shown to be useful in predicting treatment response to various drugs in cancer therapy and antibacterial therapy.^82,83 Similarly, AI can predict which patients are more likely to respond to certain biologic therapies based on their genetic and microbiome composition.^77,84 Several studies demonstrated the ability of AI models to predict response to various advanced therapies, such as anti-TNFs, vedolizumab, and ustekinumab, in patients with CD.^85–88 Some of these studies used clinical and laboratory data, while others used genotype data. Recent research has identified hundreds of genetic loci associated with IBD, yet these genetic insights have not been fully integrated into clinical practice due to their complexity. AI can synthesize this genomic information with other patient-specific data to create predictive models that anticipate how a patient’s disease will progress and how they will respond to different treatments. AI models could therefore potentially predict the best treatment option for a given patient. In the coming years, AI-based histopathological studies are expected to make significant contributions to IBD management. Preliminary data⁸⁹ showed that computational pathology algorithms can infer cytokine signaling activity, such as that of IL-23, from H&E images. This could significantly enhance our understanding of disease mechanisms and optimize treatment options for patients with IBD.

AI in predicting disease progression

The unpredictable course of IBD is one of the most challenging aspects of managing the disease. Patients often alternate between periods of active inflammation and remission, with some experiencing frequent complications, such as fistulas, strictures, or the need for surgery. Predicting when a patient will experience a disease exacerbation or develop complications is essential for timely intervention and disease management. ML algorithms can be trained on large datasets that include clinical records, laboratory results, imaging studies, and lifestyle factors to identify patterns and predictors of disease progression.⁹⁰ These algorithms can then generate risk profiles for individual patients, estimating the likelihood of a flare-up, complication, or need for surgery. For example, AI can assess inflammatory biomarkers, such as C-reactive protein or fecal calprotectin, alongside clinical symptoms and patient-reported outcomes, to predict when a patient is at high risk of non-response to therapy.⁹¹ Such predictive models allow physicians to modify treatments proactively, such as increasing medication doses or initiating new therapies before the patient experiences a relapse. In a study from Korea, an ML model for predicting IBD-related outcomes at 5 years after diagnosis yielded an area under the curve of 0.86 (95% CI: 0.82–0.92). This model performed consistently across a range of other datasets, enabling physicians to perform close follow-up based on the patient’s risk level.⁹² A novel ML model based on data from 20,368 Veterans Health Administration patients substantially improved the ability to predict future IBD-related hospitalization and steroid use.⁹³ Furthermore, AI can predict long-term outcomes in IBD patients, guiding decisions about the intensity of treatment. Patients at high risk of developing complications might benefit from early, aggressive therapy with biologics or immunosuppressants, while those with a lower risk profile could be managed with less intensive treatments. This individualized approach to disease management can reduce overtreatment, minimize side effects, and improve patient quality of life. Recently, Wang et al.⁹⁴ developed a deep learning framework aimed at predicting postoperative recurrence in CD. The model automatically analyzed the muscular layer and myenteric plexus, integrating clinical data to evaluate myenteric plexitis severity and recurrence risk. This approach sheds light on the mechanisms underlying postoperative recurrence and offers potential for enhancing long-term disease management.
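The discrimination of risk models such as those described above is usually summarized by the area under the receiver operating characteristic curve (AUC). As a minimal sketch, the AUC can be computed as the probability that a randomly chosen patient who experienced the event received a higher predicted risk than a randomly chosen patient who did not. The helper name `roc_auc` and all risk scores and outcomes below are hypothetical toy values, not data from any cited study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: share of (event, non-event) pairs in which the event
    case has the higher predicted risk; tied pairs count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values (assumed): predicted flare risk and observed flare (1) or no flare (0).
predicted_risk = [0.91, 0.80, 0.35, 0.62, 0.15, 0.48]
observed_flare = [1, 0, 0, 1, 0, 1]
auc = roc_auc(predicted_risk, observed_flare)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so a sort-based implementation would be preferable for large cohorts.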

Wearable devices, mobile health applications, and home-based diagnostic tools can collect continuous data on a patient’s symptoms, biometrics, and lifestyle factors. AI can analyze these data streams in real time, identifying subtle changes that may indicate an impending flare-up or treatment failure. For instance, fluctuations in inflammatory markers detected through home-based stool tests or blood samples can signal worsening disease activity. By integrating these data with AI algorithms, clinicians can be alerted to intervene early, preventing a full-scale relapse or the need for hospitalization. AI-powered applications can provide patients with personalized feedback based on their symptoms and their responses to questionnaires. These applications can remind patients to take their medications, track their symptoms, and alert them to seek medical attention if necessary. AI can further enhance these platforms by analyzing patterns in patient-reported outcomes, detecting early warning signs of non-adherence or treatment failure, and suggesting adjustments to the treatment plan. Real-time monitoring supported by AI not only improves disease management but also empowers patients to take a more active role in their care. This proactive approach has the potential to reduce the burden of IBD, both in terms of physical symptoms and the psychological toll of living with a chronic illness.

Utilizing multi-omics data and AI to aid personalized medicine

In recent years, omics data analysis has gained importance and has improved understanding of the pathogenesis of IBD. The integration of multi-omics and clinical data has advanced, leading to breakthroughs in disease diagnosis, drug discovery, and precision medicine. ML-based methods offer significant advantages in handling large-scale datasets and can reveal patterns among a large number of features that traditional methods may fail to identify. Multi-omics analysis can provide a holistic view of pathogenesis, including simultaneous changes in the microbiome and in biological processes, which in turn can help researchers identify potential targets. For instance, in a study by Lloyd-Price et al., extensive multi-omics molecular profiling was performed on 132 IBD patients.⁹⁵ The authors observed significant alterations in microbiota composition and function based on disease activity states. In another study, remission-associated multi-omic profiles were unique to each therapeutic class.⁹⁶ Recently, integration of endoscopic and histological data with multi-omics has also been proposed.⁹⁷ Moreover, AI models offer the opportunity to identify pioneering gut barrier-protective agents for IBD and to forecast the potential success of candidate agents in phase III trials. Sahoo et al. developed an AI-assisted approach for target identification and validation. This ML approach has demonstrated the ability to predict epithelial barrier-related genes, such as PRKAB1, which encodes the β1 subunit of the metabolic master regulator AMPK and might represent a novel target for gut barrier-protective therapies.⁹⁸

However, the application of AI to multi-omics is still in its infancy, and a major challenge of multi-omics AI models is their lack of generalization when applied to independent validation cohorts. The primary limitation on the clinical applicability of AI lies precisely in the high heterogeneity of the disease and its variation over time, which leads to inadequate reproducibility and generalizability of predictive results and a possible overestimation of prediction accuracy. In the near future, AI approaches are expected to be especially valuable in classifying already diagnosed patients into disease sub-phenotypes, predicting disease progression, and evaluating response to treatment.

AI in trial design and monitoring: Enhancing adaptive trial designs and real-time participant monitoring

The FDA has highlighted the importance of integrating AI and ML into drug and biological product development, particularly in clinical trial designs. Their discussion paper highlights the potential of AI to streamline the development process by enhancing the design and execution of clinical trials through adaptive methodologies, real-time monitoring, and predictive modeling.^4,5,99 This framework not only enhances the efficiency and accuracy of clinical trials but also aligns with regulatory requirements to ensure patient safety and data integrity.¹⁰⁰

One particular advance is the use of AI in adaptive trial designs, which allow for modifications based on interim data and have the potential to improve trial efficiency and ethical integrity by minimizing the probability of being randomized to potentially less effective or safe therapies.¹⁰¹ AI-driven adaptive trials can lead to more flexible studies by enabling real-time adjustments based on patient responses and emerging data.^102–104 For instance, whereas traditional frequentist trial designs have pre-specified endpoints at which formal hypothesis testing is carried out to evaluate efficacy and/or safety, AI-driven adaptive designs can potentially identify earlier signals of treatment efficacy or safety, enabling researchers to adjust dosing regimens, modify inclusion criteria, or, in the most extreme situation, terminate trials early if necessary.¹⁰⁵ AI can be applied in adaptive trial designs in multiple ways. For example, AI-driven predictive analytics or simulation can be used for unbiased interim data analysis, AI-driven outcome prediction modeling can be used for sample size estimation (or re-estimation), AI-driven ML models can inform covariate- or response-adaptive randomization processes, and AI-driven models can be used to generate valid external control arms to help reduce the likelihood of placebo randomization.
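To illustrate the idea of response-adaptive randomization mentioned above, the following minimal sketch uses a simple Bayesian scheme (Thompson-style sampling with Beta priors) to estimate the posterior probability that each arm has the highest response rate. That probability can then inform allocation of the next participants. This is one of many possible schemes and is not taken from any cited trial. The function name, the uniform Beta(1, 1) prior, and the interim counts are assumptions chosen for illustration.

```python
import random
from typing import List


def allocation_probabilities(successes: List[int], failures: List[int],
                             draws: int = 5000, seed: int = 0) -> List[float]:
    """Monte Carlo estimate of the posterior probability that each arm has
    the highest response rate, under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    best_counts = [0] * len(successes)
    for _ in range(draws):
        # One posterior draw of the response rate for every arm.
        sampled = [rng.betavariate(1 + s, 1 + f)
                   for s, f in zip(successes, failures)]
        best_counts[sampled.index(max(sampled))] += 1
    return [count / draws for count in best_counts]


# Hypothetical interim data: responders and non-responders per arm.
interim_successes = [6, 14]   # arm 0 = control, arm 1 = investigational
interim_failures = [24, 16]
probs = allocation_probabilities(interim_successes, interim_failures)
print([round(p, 3) for p in probs])
```

In practice, allocation probabilities are usually bounded (so that no arm is starved of participants) and the adaptation rules are pre-specified in the protocol. Both refinements are omitted here for brevity.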

In addition to trial design, AI may also facilitate more efficient and actionable real-time monitoring of participants to ensure safety in trial participation. Wearable devices and mobile apps integrated with AI can continuously collect and analyze patient data, providing insights into adherence, disease progression, and adverse events. This real-time monitoring could be essential for proactively managing patient safety and trial integrity.^106,107

Moreover, AI has the potential to improve risk-based monitoring (RBM). Historically, RBM has involved a multifaceted approach for identifying, assessing, monitoring, and subsequently mitigating risks that pose threats to quality or safety in an RCT. Given the tremendous amount of data collected within a clinical trial, AI-driven data monitoring systems can detect subtle changes in patient or site-level data that might indicate an adverse event, lack of efficacy, protocol deviations, or potentially site-related concerns that can prompt timely interventions. These systems have the potential to integrate much more data than centralized human monitors and can also help manage large-scale, decentralized trials by coordinating data from multiple sites and ensuring consistency in trial conduct.^107,108
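As a simplified illustration of centralized, data-driven monitoring, the sketch below flags sites whose adverse event rate deviates strongly from the other sites, using a robust modified z-score based on the median and the median absolute deviation (MAD). This is a basic statistical screen, not an AI system. The site names, rates, and the 3.5 cut-off are assumptions for illustration only.

```python
from statistics import median
from typing import Dict, List


def flag_outlier_sites(rates: Dict[str, float], threshold: float = 3.5) -> List[str]:
    """Return sites whose rate has a modified z-score above the threshold."""
    values = list(rates.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread across sites, nothing to flag
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return sorted(site for site, v in rates.items()
                  if abs(0.6745 * (v - center) / mad) > threshold)


# Hypothetical adverse events per enrolled participant at each site.
site_rates = {
    "site_A": 0.10, "site_B": 0.12, "site_C": 0.11,
    "site_D": 0.09, "site_E": 0.45, "site_F": 0.13,
}
flagged = flag_outlier_sites(site_rates)
print(flagged)
```

A flag of this kind would only prompt human review of the site. It does not by itself distinguish a true safety signal from differences in case mix or reporting practice.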

Challenges and ethical considerations: Addressing ethical, regulatory, and practical challenges

The integration of AI in IBD clinical trials offers transformative possibilities but also presents significant ethical, regulatory, and practical challenges that must be addressed proactively. One primary ethical concern relates to data privacy, especially given the sensitive nature of the health data used in AI models, which include genetic, biomarker, and multi-omic information. To protect patient confidentiality while enabling AI models to function effectively, robust data security measures such as encryption, anonymization, and strict access protocols must be rigorously implemented. These measures should comply with regulatory standards such as the General Data Protection Regulation (GDPR) or the Health Insurance Portability and Accountability Act (HIPAA), which are essential for safeguarding patient information throughout AI/ML-driven clinical trials.¹⁰⁹

Transparency and interpretability of AI algorithms are other critical ethical priorities. Many AI models, particularly deep learning systems, operate as “black boxes,” making predictions without easily explainable reasoning.¹¹⁰ This lack of interpretability can hinder clinicians’ trust in AI-driven recommendations, posing risks of automated decisions that may not align with the best interests of patients. Ensuring transparency and accountability in AI decision-making processes is essential.^111,112 To address this, ongoing research and model development should focus on explainable AI frameworks and interpretable ML techniques, which can help clinicians understand and evaluate how models arrive at their conclusions, thereby enhancing clinical justification and maintaining clinician accountability.^113,114

Attention-based models and post hoc explainability methods such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) are active areas of research aimed at improving the transparency of AI-driven decisions in IBD trials.¹¹⁵

Algorithmic bias is another significant concern. AI models trained on historically imbalanced datasets may inadvertently perpetuate existing biases, leading to the underrepresentation or misrepresentation of specific patient groups, such as racial or ethnic minorities, in clinical trials. Studies have shown that IBD trials have historically underrepresented minorities, which can limit the effectiveness of AI models in identifying eligible participants from diverse racial or ethnic backgrounds.^116–118 Furthermore, this bias can also affect underrepresented subgroups of IBD, such as individuals with fistulizing CD, pouchitis, and extraintestinal manifestations, among other underserved conditions. This further marginalizes these populations and restricts our understanding of these complex and less common disease presentations. Addressing bias requires both methodological and policy-level interventions, such as diversifying training datasets and implementing bias-detection frameworks, adversarial debiasing methods, and fairness-aware ML algorithms.^119,120 These strategies can help ensure that AI recruitment algorithms equitably represent the diversity of the IBD patient population, promoting more inclusive and equitable clinical trials.^116–118,121 Foundation models, which leverage vast pre-trained datasets, have also been explored as a means to mitigate bias by improving generalizability and reducing the impact of imbalanced training samples.¹²²
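A basic bias-detection check of the kind referred to above can be sketched by comparing how often a recruitment algorithm flags patients as eligible in each demographic group. The example below computes per-group selection rates and their ratio (often called the disparate impact ratio). It is written in plain Python and does not reproduce the API of any fairness toolkit. The group labels and counts are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def selection_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Fraction of patients in each group that the algorithm flagged as eligible."""
    flagged: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for group, is_flagged in records:
        total[group] += 1
        flagged[group] += int(is_flagged)
    return {g: flagged[g] / total[g] for g in total}


def disparate_impact_ratio(rates: Dict[str, float]) -> float:
    """Lowest group selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical screening output: (group, flagged as eligible by the algorithm).
records = ([("group_a", True)] * 8 + [("group_a", False)] * 2
           + [("group_b", True)] * 4 + [("group_b", False)] * 6)
rates = selection_rates(records)
ratio = disparate_impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 indicates that one group is flagged far less often than another. Whether that reflects algorithmic bias or genuine differences in eligibility still requires clinical and statistical review.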

On the regulatory front, the FDA and other regulatory agencies have issued preliminary guidelines on the use of AI in clinical research; however, comprehensive regulations are still evolving.^4,5 The lack of standardized regulatory frameworks creates uncertainties for AI developers and researchers, especially concerning model validation, continuous learning protocols, and real-time adjustments in adaptive trials.¹²³ To navigate these complexities, ongoing collaboration among industry stakeholders, researchers, and regulatory agencies is crucial for establishing clear guidelines on the development, deployment, and monitoring of AI in clinical trials.¹²⁴ Regulatory bodies, including the FDA, emphasize the importance of model validation and explainability, calling for regular algorithmic validation, retraining, and the establishment of processes to monitor updates and improvements in AI systems throughout the course of a trial.^4,5,102 The emergence of foundation models has further encouraged regulatory discussions, as their broad pre-training across multiple domains can reduce the frequency of retraining and improve transferability across diverse patient populations.¹²⁵

From a practical standpoint, implementing AI-driven technologies in clinical trials requires substantial resources, including technical infrastructure, specialized personnel, and ongoing oversight to maintain model performance and data integrity.¹²⁶ For example, effective implementation in decentralized trials requires robust digital platforms for seamless data collection and integration across multiple sites.¹²⁷ Integrating AI models into clinical workflows also presents logistical challenges; clinicians and trial coordinators need training to understand, trust, and effectively utilize these tools. Practical guidelines for AI implementation should emphasize user-friendly interfaces, interoperability with existing systems, and consistent support to facilitate successful integration into clinical practice.^4,126

Lastly, as AI increasingly plays a role in real-time patient monitoring through wearable devices and remote data collection,^128,129 patient autonomy and informed consent become critical ethical considerations.¹³⁰ Patients should be fully informed about how their data will be used, including the role of AI in monitoring their health and influencing treatment pathways. Informed consent processes must adapt to address AI-driven data analysis, providing clear assurances regarding data security, the purpose of data collection, and the limits of AI’s predictive capabilities.¹³¹ By prioritizing patient autonomy, transparency, and robust security measures, researchers can foster trust in AI-enabled IBD trials, thereby promoting equitable and ethical innovation in clinical research.

Discussion

AI represents a paradigm shift in the landscape of IBD clinical trials and has the potential to expedite the drug development process, thereby making safe and effective drugs available to patients faster. AI can help at several stages of the drug development process, including but not limited to identifying potential molecular targets by integrating and analyzing multi-omics data, streamlining patient recruitment, enhancing data analysis, personalizing strategies, optimizing trial design and monitoring,^107,132 and post-marketing surveillance.

Realizing the potential of AI in clinical trials requires careful consideration of the ethical, regulatory, and practical challenges associated with its integration.¹²³ Despite its promise, AI technology is still in its early stages, and several obstacles must be overcome before its widespread implementation in drug development processes. Integrating data across various systems, including EHRs, laboratory databases, and imaging repositories, can be a challenging task. Moreover, variability in data formats, terminologies, and quality across institutions can lead to inaccuracies in AI predictions. Additionally, privacy concerns and proprietary restrictions often limit access to the data needed to train and validate AI models.

Bias in AI models is another critical issue. If the datasets used to train AI systems are not representative of diverse populations, AI algorithms may produce biased predictions, leading to inaccurate outcomes.^133,134 For example, underrepresentation of minority groups or women could result in skewed trial results, while AI designs and implementations may unintentionally favor certain demographics, contributing to inequities in clinical trial recruitment and treatment recommendations. The development of AI-driven tools for IBD must prioritize inclusivity, ensuring that these technologies benefit all patient populations, particularly those that have been historically underrepresented in clinical research.^81,116 By addressing these challenges, AI can fulfill its promise of transforming IBD clinical research, leading to more effective and equitable healthcare outcomes. Furthermore, the use of AI systems requires careful attention to ethical considerations around sensitive personal data. Ensuring robust informed consent processes that address how AI technologies are employed is essential. Moreover, anonymizing and securing data while maintaining its utility for AI presents considerable challenges.

AI in clinical trials is still relatively novel, and regulatory frameworks remain under development. There is currently no universally accepted standard for validating AI tools in clinical trials, making it difficult to guarantee their reliability and reproducibility. Integrating AI into clinical trials necessitates significant adjustments to established workflows. Clinicians and trial staff often lack the technical expertise needed to effectively operate and interpret AI tools. This requires extensive training and collaboration with data scientists. Additionally, resistance to AI adoption may arise from stakeholders skeptical about the accuracy of these systems or concerned about job displacement. The development and implementation of AI systems in clinical trials can be resource-intensive, with high initial investment requirements. Running advanced AI algorithms demands substantial computing infrastructure, which may not be available in all clinical trial settings. Additionally, continuous updates, validation, and monitoring of AI systems are necessary to ensure their ongoing accuracy and relevance, contributing to long-term costs. AI models also require rigorous validation to ensure they can be applied effectively across diverse trial populations. However, many models are tested on limited datasets, raising concerns about their generalizability. Differences in data sources, trial protocols, and patient demographics can lead to inconsistent results when applying the same AI model across multiple settings.

While AI holds tremendous potential to transform clinical trials, its implementation faces numerous challenges. Overcoming these obstacles requires a multidisciplinary approach that involves collaboration between researchers, clinicians, data scientists, ethicists, and regulators. Key strategies include enhancing data standardization, ensuring transparency in AI systems, establishing clear regulatory guidelines, and fostering education and collaboration among stakeholders. Addressing these challenges will enable AI to become a powerful tool for making clinical trials more efficient, equitable, and effective. The FDA’s guidance provides a comprehensive framework for the responsible use of AI in drug and biological product development, emphasizing the need for transparency, validation, and patient-centric approaches.^4,5,135 A recent report from the World Health Organization also sets out ethical considerations and recommendations for the governance of AI in health care.¹³⁶ As the field continues to evolve, ongoing collaboration between researchers, regulators, and industry stakeholders will be essential to realize AI’s full potential in IBD research.

Conclusion

The integration of AI into clinical trials for IBD represents a significant advancement in gastroenterology research and patient care, ushering in a new era of precision medicine. AI technologies, including ML and predictive analytics, are revolutionizing trial design, patient recruitment, endpoint assessment, data analyses, personalized treatment strategies, and the monitoring and prediction of treatment responses. The ability to assess deeper levels of healing, such as barrier healing, will enhance therapeutic strategies and, potentially, organ-sparing approaches.

By leveraging large datasets, AI enhances the accuracy and diversity of participant selection while providing deeper insights into disease mechanisms. Its ability to customize treatment plans for individual patients promises improved outcomes and reduced side effects, which is crucial in IBD management due to the variability in patient responses. However, despite its potential, the adoption of AI presents critical challenges that require careful consideration. For instance, AI systems built on supervised learning inherently depend on the assumption that the input data, often derived from physician diagnoses or clinical observations, are accurate. This reliance underscores the importance of high-quality, well-annotated datasets to minimize errors and biases. In the context of IBD, where diagnosis and disease characterization can be complex, ensuring the reliability of input data is essential to avoid perpetuating inaccuracies through AI-driven analyses.

Another important consideration is the cost associated with incorporating AI into clinical trials. Developing and maintaining AI systems requires significant investment in infrastructure, including high-performance computing capabilities, data integration platforms, and skilled personnel such as data scientists and bioinformaticians. Moreover, the ongoing need for algorithm validation, retraining, and compliance with regulatory standards adds to the financial burden. While these costs may be prohibitive for some institutions, they must be weighed against the potential long-term benefits, such as more efficient trials, personalized treatments, and reduced healthcare expenditures resulting from improved disease management.

In conclusion, while AI represents a paradigm shift in IBD research and clinical trials, its successful implementation will depend on addressing these foundational challenges. Collaboration between researchers, clinicians, regulatory authorities, and industry stakeholders will be crucial to ensuring that AI technologies are accurate, transparent, and accessible. By prioritizing data quality, cost-effectiveness, and ethical standards, the integration of AI has the potential to significantly enhance treatment outcomes and advance the fields of IBD and gastroenterology.

Footnotes

Acknowledgements

None.

Declarations

ORCID iDs

Rocio Sedano

Virginia Solitano

Christopher Ma

References

1. Le Berre, Honap, Peyrin-Biroulet. Ulcerative colitis. Lancet 2023; 402: 571–584.

2. Dolinger, Torres, Vermeire. Crohn’s disease. Lancet 2024; 403: 1177–1191.

3. Solitano, Zilli, Franchellucci, et al. Artificial endoscopy and inflammatory bowel disease: welcome to the future. J Clin Med 2022; 11: 569.

4. FDA. U.S. Food and Drug Administration. Artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) action plan, vol. 2021. U.S. Food and Drug Administration, https://www.fda.gov/media/145022/download (2021, accessed 19 October 2024).

5. FDA. U.S. Food and Drug Administration. Using artificial intelligence & machine learning in the development of drug & biological products: discussion paper and request for feedback. FDA, https://www.fda.gov/media/167973/download (2021, accessed 19 October 2024).

6. Soori, Arezoo, Dastres. Artificial intelligence, machine learning and deep learning in advanced robotics, a review. Cogn Robot 2023; 3: 54–70.

7. Kufel, Bargiel-Laczek, Kocot, et al. What is machine learning, artificial neural networks and deep learning?—examples of practical applications in medicine. Diagnostics (Basel) 2023; 13: 2582.

8. Soysal, Wang, Jiang, et al. CLAMP—a toolkit for efficiently building customized clinical natural language processing pipelines. J Am Med Inform Assoc 2018; 25: 331–336.

9. Zhang, Yang, Wang, et al. Artificial intelligence in drug development. Nat Med 2025; 31: 45–59.

10. Ntinopoulos, Rodriguez Cetina Biefer, Tudorache, et al. Large language models for data extraction from unstructured and semi-structured electronic health records: a multiple model performance evaluation. BMJ Health Care Inform 2025; 32: e101139.

11. Rodriguez-Perez, Bajorath. Interpretation of compound activity predictions from complex machine learning models using local approximations and shapley values. J Med Chem 2020; 63: 8761–8777.

12. Ali, Abuhmed, El-Sappagh, et al. Explainable Artificial Intelligence (XAI): what we know and what is left to attain trustworthy artificial intelligence. Inf Fusion 2023; 99: 101805.

13. Bellamy RKE, Dey, Hind, et al. AI Fairness 360: an extensible toolkit for detecting and mitigating algorithmic bias. IBM J Res Dev 2019; 63(4): 1–4:15.

14. Honap, Jairath, Danese, et al. Navigating the complexities of drug development for inflammatory bowel disease. Nat Rev Drug Discov 2024; 23: 546–562.

15. Solitano, Prins, Archer, et al. Toward patient centricity: why do patients with inflammatory bowel disease participate in pharmaceutical clinical trials? A mixed-methods exploration of study participants. Crohns Colitis 360 2024; 6: otae019.

16. Ahmad, East, Panaccione, et al. Artificial intelligence in inflammatory bowel disease endoscopy: implications for clinical trials. J Crohns Colitis 2023; 17: 1342–1353.

17. Ismail, Al-Zoubi, El Naqa, et al. The role of artificial intelligence in hastening time to recruitment in clinical trials. BJR Open 2023; 5: 20220023.

18. Jin, Wang, Floudas, et al. Matching patients to clinical trials with large language models. ArXiv 2024.

19. ClinicalTrialsArena.com. Elligo and Avallano launch AI-powered clinical trials platform, vol. 2024, https://www.clinicaltrialsarena.com/news/elligo-and-avallano-launch-ai-powered-clinical-trials-platform/ (2023, accessed 20 October 2024).

20. Obermeyer, Emanuel. Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med 2016; 375: 1216–1219.

21. Yin, Ngiam, Teo. Role of artificial intelligence applications in real-life clinical practice: systematic review. J Med Internet Res 2021; 23: e25759.

22. Char, Shah, Magnus. Implementing machine learning in health care—addressing ethical challenges. N Engl J Med 2018; 378: 981–983.

23. Stafford, Gosink, Mossotto, et al. A systematic review of artificial intelligence and machine learning applications to inflammatory bowel disease, with practical guidelines for interpretation. Inflamm Bowel Dis 2022; 28: 1573–1583.

24. Harun, Kassir, et al. Machine learning-based quantification of patient factors impacting remission in patients with ulcerative colitis: insights from Etrolizumab Phase III Clinical Trials. Clin Pharmacol Ther 2024; 115: 815–824.

25. Feagan, Sandborn, D'Haens, et al. The role of centralized reading of endoscopy in a randomized controlled trial of mesalamine for ulcerative colitis. Gastroenterology 2013; 145: 149–157.e2.

26. Duijvestein, Jeyarajah, Guizzetti, et al. Response to placebo, measured by endoscopic evaluation of Crohn’s disease activity, in a pooled analysis of data from 5 randomized controlled induction trials. Clin Gastroenterol Hepatol 2020; 18: 1121–1132.e2.

27. Almradi, Sedano, Hogan, et al. Clinical, endoscopic, and safety placebo rates in induction and maintenance trials of Crohn’s disease: meta-analysis of randomised controlled trials. J Crohns Colitis 2022; 16: 717–736.

28. Sedano, Hogan, Nguyen, et al. Systematic review and meta-analysis: clinical, endoscopic, histological and safety placebo rates in induction and maintenance trials of ulcerative colitis. J Crohns Colitis 2022; 16: 224–243.

29. Solitano, Hogan, Singh, et al. Placebo rates in Crohn’s disease randomized clinical trials: an individual patient data meta-analysis. Gastroenterology 2025; 168(2): 344–356.

30. Khanna, Zou, D'Haens, et al. Reliability among central readers in the evaluation of endoscopic findings from patients with Crohn’s disease. Gut 2016; 65: 1119–1125.

31. Khanna, Hogan, et al. Standardizing scoring conventions for Crohn’s disease endoscopy: an international RAND/UCLA Appropriateness Study. Clin Gastroenterol Hepatol 2023; 21: 2938–2950.e6.

32. Reinisch, Mishkin, et al. Impact of various central endoscopy reading models on treatment outcome in Crohn’s disease using data from the randomized, controlled, exploratory cohort arm of the BERGAMOT trial. Gastrointest Endosc 2021; 93: 174–182.e2.

33. Jairath, Zou, et al. Routine incorporation of the local read in Crohn’s disease clinical trials? Not so fast. Gastrointest Endosc 2021; 93: 183–186.

34. Schroeder, Tremaine, Ilstrup. Coated oral 5-aminosalicylic acid therapy for mildly to moderately active ulcerative colitis. A randomized study. N Engl J Med 1987; 317: 1625–1629.

35. Travis, Schnell, Krzeski, et al. Reliability and initial validation of the ulcerative colitis endoscopic index of severity. Gastroenterology 2013; 145: 987–995.

36. Rimondi, Gottlieb, Despott, et al. Can artificial intelligence replace endoscopists when assessing mucosal healing in ulcerative colitis? A systematic review and diagnostic test accuracy meta-analysis. Dig Liver Dis 2024; 56: 1164–1172.

37. Feagan, Rutgeerts, Sands, et al. Vedolizumab as induction and maintenance therapy for ulcerative colitis. N Engl J Med 2013; 369: 699–710.

38. CORE-IBD Collaborators, Hanzel, et al. CORE-IBD: A multidisciplinary international consensus initiative to develop a core outcome set for randomized controlled trials in inflammatory bowel disease. Gastroenterology 2022; 163: 950–964.

39. Yao, Najarian, Gryak, et al. Fully automated endoscopic disease activity assessment in ulcerative colitis. Gastrointest Endosc 2021; 93: 728–736.e1.

40. Liu, Bendtsen, et al. High accuracy in classifying endoscopic severity in ulcerative colitis using convolutional neural network. Am J Gastroenterol 2022; 117: 1648–1654.

41. Iacucci, Cannatelli, Parigi, et al. A virtual chromoendoscopy artificial intelligence system to detect endoscopic and histologic activity/remission and predict clinical outcomes in ulcerative colitis. Endoscopy 2022; 55: 332–341.

42. Wang, Jin, et al. A comprehensive survey on deep active learning in medical image analysis. Med Image Anal 2024; 95: 103201.

43. Gottlieb, Requa, Karnes, et al. Central reading of ulcerative colitis clinical trial videos using neural networks. Gastroenterology 2021; 160: 710–719.e2.

44. Lobaton, Bessissow, De Hertogh, et al. The Modified Mayo Endoscopic Score (MMES): a new index for the assessment of extension and severity of endoscopic activity in ulcerative colitis patients. J Crohns Colitis 2015; 9: 846–852.

45. Stidham, Cai, Cheng, et al. Using computer vision to improve endoscopic disease quantification in therapeutic clinical trials of ulcerative colitis. Gastroenterology 2024; 166: 155–167.e2.

46. Turner, Ricciuto, Lewis, et al. STRIDE-II: an update on the Selecting Therapeutic Targets in Inflammatory Bowel Disease (STRIDE) Initiative of the International Organization for the Study of IBD (IOIBD): determining therapeutic goals for treat-to-target strategies in IBD. Gastroenterology 2021; 160: 1570–1583.

47. Jairath, Zou, Wang, et al. Determining the optimal treatment target in patients with ulcerative colitis: rationale, design, protocol and interim analysis for the randomised controlled VERDICT trial. BMJ Open Gastroenterol 2024; 11: e001218.

48. Sedano, Almradi, et al. An International Consensus to standardize integration of histopathology in ulcerative colitis clinical trials. Gastroenterology 2021; 160: 2291–2302.

49. Gui, Bazarova, Del Amor, et al. PICaSSO Histologic Remission Index (PHRI) in ulcerative colitis: development of a novel simplified histological score for monitoring mucosal healing and predicting clinical outcomes and its applicability in an artificial intelligence system. Gut 2022; 71: 889–898.

50. Najdawi, Sucipto, Mistry, et al. Artificial intelligence enables quantitative assessment of ulcerative colitis histology. Mod Pathol 2023; 36: 100124.

51. Rymarczyk, Schultz, Borowa, et al. Deep learning models capture histological disease activity in Crohn’s disease and ulcerative colitis with high fidelity. J Crohns Colitis 2024; 18: 604–614.

52.

Iacucci

Parigi

Del Amor

, et al. Artificial intelligence enabled histological prediction of remission or activity and clinical outcomes in ulcerative colitis. Gastroenterology 2023; 164: 1180–1188 e2.

53.

Vande Casteele

Leighton

Pasha

, et al. Utilizing deep learning to analyze whole slide images of colonic biopsies for associations between eosinophil density and clinicopathologic features in active ulcerative colitis. Inflamm Bowel Dis 2022; 28: 539–546.

54.

Furlanello

Bussola

Merzi

, et al. The development of artificial intelligence in the histological diagnosis of Inflammatory Bowel Disease (IBD-AI). Dig Liver Dis 2025; 57: 184–189.

55.

Ohara

Nemoto

Maeda

, et al. Deep learning-based automated quantification of goblet cell mucus using histological images as a predictor of clinical relapse of ulcerative colitis with endoscopic remission. J Gastroenterol 2022; 57: 962–970.

56.

Liu

Prasath

Siddiqui

, et al. Machine learning-based prediction of pediatric ulcerative colitis treatment response using diagnostic histopathology. Gastroenterology 2024; 166: 921–924 e4.

57.

Ohara

Maeda

Ogata

, et al. Automated neutrophil quantification and histological score estimation in ulcerative colitis. Clin Gastroenterol Hepatol 2024: S1542-3565(24)00668-2.

58.

Peyrin-Biroulet

Adsul

Stancati

, et al. An artificial intelligence-driven scoring system to measure histological disease activity in ulcerative colitis. United European Gastroenterol J 2024; 12: 1028–1033.

59.

Iacucci

Maeda

Ghosh

. Artificial intelligence enabled histological scoring in ulcerative colitis: are we ready yet? United European Gastroenterol J 2024; 12: 1000–1001.

60.

Bettenworth

Baker

Fletcher

, et al. A global consensus on the definitions, diagnosis and management of fibrostenosing small bowel Crohn’s disease in clinical practice. Nat Rev Gastroenterol Hepatol 2024; 21(8): 572–584.

61.

Rieder

Hanzel

, et al. Reliability of CT enterography for describing fibrostenosing Crohn disease. Radiology 2024; 312: e233038.

62.

Rieder

Baker

Bruining

, et al. Reliability of MR enterography features for describing fibrostenosing Crohn disease. Radiology 2024; 312: e233039.

63.

Stidham

Enchakalody

Waljee

, et al. Assessing small bowel stricturing and morphology in Crohn’s disease using semi-automated image analysis. Inflamm Bowel Dis 2020; 26: 734–742.

64.

Meng

Luo

Chen

, et al. Intestinal fibrosis classification in patients with Crohn’s disease using CT enterography-based deep learning: comparisons with radiomics and radiologists. Eur Radiol 2022; 32: 8692–8705.

65.

Chang

Carter

, et al. Radiomics-based analysis of intestinal ultrasound images for inflammatory bowel disease: a feasibility study. Crohns Colitis 360 2024; 6: otae034.

66.

Carter

Albshesh

Shimon

, et al. Automatized detection of Crohn’s disease in intestinal ultrasound using convolutional neural network. Inflamm Bowel Dis 2023; 29: 1901–1906.

67.

Jongsma

MME

Costes

LMM

Tindemans

, et al. Serum immune profiling in paediatric Crohn’s disease demonstrates stronger immune modulation with first-line infliximab than conventional therapy and pre-treatment profiles predict clinical response to both treatments. J Crohns Colitis 2023; 17: 1262–1277.

68.

Ananthakrishnan

Luo

Yajnik

, et al. Gut microbiome function predicts response to anti-integrin biologic therapy in inflammatory bowel diseases. Cell Host Microbe 2017; 21: 603–610.e3.

69.

Sinonquel

Bossuyt

Sabino

JPG

, et al. Long-term follow-up of the red density pilot trial: a basis for long-term prediction of sustained clinical remission in ulcerative colitis? Endosc Int Open 2023; 11: E880–E884.

70.

Bossuyt

Nakase

Vermeire

, et al. Automatic, computer-aided determination of endoscopic and histological inflammation in patients with mild to moderate ulcerative colitis based on red density. Gut 2020; 69: 1778–1786.

71.

Maeda

Kudo

Ogata

, et al. Evaluation in real-time use of artificial intelligence during colonoscopy to predict relapse of ulcerative colitis: a prospective study. Gastrointest Endosc 2022; 95: 747–756 e2.

72.

Hirten

Lin

Whang

, et al. Longitudinal monitoring of IL-6 and CRP in inflammatory bowel disease using IBD-AWARE. Biosens Bioelectron X 2024; 16: 100435.

73.

Hirten

Lin

Whang

, et al. Longitudinal assessment of sweat-based TNF-alpha in inflammatory bowel disease using a wearable device. Sci Rep 2024; 14: 2833.

74.

Ivanisevic

Sewduth

. Multi-omics integration for the design of novel therapies and the identification of novel biomarkers. Proteomes 2023; 11(4): 34.

75.

Christou

Tsoulfas

. Challenges involved in the application of artificial intelligence in gastroenterology: the race is on! World J Gastroenterol 2023; 29: 6168–6178.

76.

Hinton

. Deep learning—a technology with the potential to transform health care. JAMA 2018; 320: 1101–1102.

77.

Cohen-Mekelburg

Berry

Stidham

, et al. Clinical applications of artificial intelligence and machine learning-based methods in inflammatory bowel disease. J Gastroenterol Hepatol 2021; 36: 279–285.

78.

Rajkomar

Dean

Kohane

. Machine learning in medicine. N Engl J Med 2019; 380: 1347–1358.

79.

Gubatan

Levitte

Patel

, et al. Artificial intelligence applications in inflammatory bowel disease: Emerging technologies and future directions. World J Gastroenterol 2021; 27: 1920–1935.

80.

D’Amico

Danese

Peyrin-Biroulet

. Adaptive designs: lessons for inflammatory bowel disease trials. J Clin Med 2020; 9: 2350.

81.

Ahmad

East

Panaccione

, et al. Artificial intelligence in inflammatory bowel disease: implications for clinical practice and future directions. Intest Res 2023; 21: 283–294.

82.

Dercle

Fronheiser

, et al. Identification of non-small cell lung cancer sensitive to systemic cancer therapies using radiomics. Clin Cancer Res 2020; 26: 2151–2162.

83.

Corbin

Sung

Chattopadhyay

, et al. Personalized antibiograms for machine learning driven antibiotic selection. Commun Med (Lond) 2022; 2: 38.

84.

Kroner

Engels

Glicksberg

, et al. Artificial intelligence in gastroenterology: a state-of-the-art review. World J Gastroenterol 2021;27: 6794–6824.

85.

Tang

, et al. Machine learning gene expression predicting model for ustekinumab response in patients with Crohn’s disease. Immun Inflamm Dis 2021; 9: 1529–1540.

86.

Park

Kim

, et al. Development of a machine learning model to predict non-durable response to anti-TNF therapy in Crohn’s disease using transcriptome imputed from genotypes. J Pers Med 2022; 12: 947.

87.

Waljee

Liu

Sauder

, et al. Predicting corticosteroid-free biologic remission with vedolizumab in Crohn’s disease. Inflamm Bowel Dis 2018; 24: 1185–1192.

88.

Con

van Langenberg

Vasudevan

. Deep learning vs conventional learning algorithms for clinical prediction in Crohn’s disease: a proof-of-concept study. World J Gastroenterol 2021; 27: 6476–6488.

89.

Qaiser

Hamidinekoo

Daniel

, et al. Artificial intelligence to predict interleukin-23 signalling activity from H&E stained whole slide images of inflammatory bowel disease. J Crohn’s Colitis 2024; 18: i453.

90.

Takenaka

Fujii

Kawamoto

, et al. Deep neural network for video colonoscopy of ulcerative colitis: a cross-sectional study. Lancet Gastroenterol Hepatol 2022; 7: 230–237.

91.

Waljee

Wallace

Cohen-Mekelburg

, et al. Development and validation of machine learning models in prediction of remission in patients with moderate to severe Crohn disease. JAMA Netw Open 2019; 2: e193721.

92.

Choi

Park

Chung

, et al. Development of machine learning model to predict the 5-year risk of starting biologic agents in patients with inflammatory bowel disease (IBD): K-CDM Network Study. J Clin Med 2020; 9: 3427.

93.

Waljee

Lipson

Wiitala

, et al. Predicting hospitalization and outpatient corticosteroid use in inflammatory bowel disease patients using machine learning. Inflamm Bowel Dis 2017; 24: 45–53.

94.

Wang

Yao

, et al. Histological image-based ensemble model to identify myenteric plexitis and predict endoscopic postoperative recurrence in Crohn’s disease: a multicentre, retrospective study. J Crohns Colitis 2024; 18: 727 EP–737.

95.

Lloyd-Price

Arze

Ananthakrishnan

, et al. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature 2019; 569: 655–662.

96.

Lee

JWJ

Plichta

Hogstrom

, et al. Multi-omics reveal microbial determinants impacting responses to biologic therapies in inflammatory bowel disease. Cell Host Microbe 2021; 29: 1294–1304 e4.

97.

Iacucci

Santacroce

Zammarchi

, et al. Artificial intelligence and endo-histo-omics: new dimensions of precision endoscopy and histology in inflammatory bowel disease. Lancet Gastroenterol Hepatol 2024; 9: 758–772.

98.

Sahoo

Swanson

Sayed

, et al. Artificial intelligence guided discovery of a barrier-protective therapy in inflammatory bowel disease. Nat Commun 2021; 12: 4246.

99.

Maddox

Rumsfeld

Payne

PRO

. Questions for artificial intelligence in health care. JAMA 2019; 321: 31–32.

100.

Zhou

Huang

, et al. Application of artificial intelligence in gastrointestinal disease: a narrative review. Ann Transl Med 2021; 9: 1188.

101.

Prepared by AAITF, Parasa

Berzin

, et al. Consensus statements on the current landscape of artificial intelligence applications in endoscopy, addressing roadblocks, and advancing artificial intelligence in gastroenterology. Gastrointest Endosc 2025; 101: 2–9.e1.

102.

FDA. U.S. Food and Drug Administration. Evaluate application of artificial intelligence to adaptive enrichment clinical trials, https://www.fda.gov/science-research/advancing-regulatory-science/evaluate-application-artificial-intelligence-adaptive-enrichment-clinical-trials (2021, accessed 20 October 2024).

103.

Zhang

Chen

, et al. Harnessing artificial intelligence to improve clinical trial design. Commun Med (Lond) 2023; 3: 191.

104.

Zhu

Wong

. An overview of adaptive designs and some of their challenges, benefits, and innovative applications. J Med Internet Res 2023; 25: e44171.

105.

Ravi

Wong

Deligianni

, et al. Deep learning for health informatics. IEEE J Biomed Health Inform 2017; 21: 4–21.

106.

Shajari

Kuruvinashetti

Komeili

, et al. The emergence of AI-based wearable sensors for digital health technology: a review. Sensors (Basel) 2023; 23: 9498.

107.

Chopra

Annu Shin

, et al. Revolutionizing clinical trials: the role of AI in accelerating medical breakthroughs. Int J Surg 2023; 109: 4211–4220.

108.

Vampana

Jayanthi

ESS

Mary

, et al. Artificial intelligence-driven patient monitoring for adverse event detection in clinical trials. Int J Basic Clin Pharmacol 2024; 13: 543–550.

109.

Haug

Drazen

. Artificial intelligence and machine learning in Clinical Medicine, 2023. N Engl J Med 2023; 388: 1201–1208.

110.

Yang

Xia

. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: a mini-review, two showcases and beyond. Inf Fusion 2022; 77: 29–52.

111.

Ali

Akhlaq

Imran

, et al. The enlightening role of explainable artificial intelligence in medical & healthcare domains: a systematic literature review. Comput Biol Med 2023; 166: 107555.

112.

Amann

Vetter

Blomberg

, et al. To explain or not to explain?—artificial intelligence explainability in clinical decision support systems. PLoS Digit Health 2022; 1: e0000016.

113.

Abgrall

Holder

Chelly Dagdia

, et al. Should AI models be explainable to clinicians? Crit Care 2024; 28: 301.

114.

Labkoff

Oladimeji

Kannry

, et al. Toward a responsible future: recommendations for AI-enabled clinical decision support. J Am Med Inform Assoc 2024; 31: 2730–2739.

115.

Sadeghi

Alizadehsani

Cifci

, et al. A review of explainable artificial intelligence in healthcare. Comput Electr Eng 2024; 118: 109370.

116.

Sedano

Hogan

McDonald

, et al. Underrepresentation of minorities and lack of race reporting in ulcerative colitis drug development clinical trials. Inflamm Bowel Dis 2022; 28: 1293–1295.

117.

Sedano

Hogan

McDonald

, et al. Underrepresentation of minorities and underreporting of race and ethnicity in Crohn’s disease clinical trials. Gastroenterology 2022; 162: 338–340.e2.

118.

Shah

Shillington

Kabagambe

, et al. Racial and ethnic disparities in patients with inflammatory bowel disease: an online survey. Inflamm Bowel Dis 2024; 30: 1467–1474.

119.

Nazer

Zatarah

Waldrip

, et al. Bias in artificial intelligence algorithms and recommendations for mitigation. PLoS Digit Health 2023; 2: e0000278.

120.

Arora

Alderman

Palmer

, et al. The value of standards for health datasets in artificial intelligence-based applications. Nat Med 2023; 29: 2929–2938.

121.

Rana

Azizul

Awan

. A step toward building a unified framework for managing AI bias. Peer J Comput Sci 2023; 9: e1630.

122.

Yang

Soltan

AAS

Eyre

, et al. Algorithmic fairness and bias mitigation for clinical machine learning with deep reinforcement learning. Nat Mach Intell 2023; 5: 884–894.

123.

Mennella

Maniscalco

De Pietro

, et al. Ethical and regulatory challenges of AI technologies in healthcare: A narrative review. Heliyon 2024;10:e26297.

124.

Reddy

. Navigating the AI Revolution: the case for precise regulation in health care. J Med Internet Res 2023; 25: e49989.

125.

Zhao

Alzubaidi

Zhang

, et al. A comparison review of transfer learning and self-supervised learning: definitions, applications, advantages and limitations. Expert Syst Appl 2024; 242: 122807.

126.

Alowais

Alghamdi

Alsuhebany

, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ 2023; 23: 689.

127.

Hanley

Jr Bernard

Wilkins

, et al. Decentralized clinical trials in the trial innovation network: value, strategies, and lessons learned. J Clin Transl Sci 2023; 7: e170.

128.

Daly

Brawley

Gospodarowicz

, et al. Remote monitoring and data collection for decentralized clinical trials. JAMA Netw Open 2024; 7: e246228.

129.

Sharma

Badea

Tiwari

, et al. Wearable biosensors: an alternative and practical approach in healthcare and disease monitoring. Molecules 2021; 26.

130.

Gelinas

Morrell

White

, et al. Navigating the ethics of remote research data collection. Clin Trials 2021; 18: 606–614.

131.

Gerke

Minssen

Cohen

. Ethical and legal challenges of artificial intelligence-driven healthcare. In Bohr

Memarzadeh

(eds) Artificial intelligence in healthcare, Denmark: Elsevier, 2020, pp.295–336.

132.

Chen

, et al. Artificial intelligence tools for optimising recruitment and retention in clinical trials: a scoping review protocol. BMJ Open 2024; 14: e080032.

133.

Celi

Cellini

Charpignon

, et al. Sources of bias in artificial intelligence that perpetuate healthcare disparities—a global review. PLoS Digit Health 2022; 1: e0000022.

134.

Noseworthy

Attia

Brewer

, et al. Assessing and mitigating bias in medical artificial intelligence: the effects of race and ethnicity on a deep learning model for ECG analysis. Circ Arrhythm Electrophysiol 2020; 13: e007988.

135.

Aung

YYM

Wong

DCS

Ting

DSW

. The promise of artificial intelligence: a review of the opportunities and challenges of artificial intelligence in healthcare. Br Med Bull 2021; 139: 4–15.

136.

WHO. WHO guidance. Ethics and governance of artificial intelligence for health, vol. 2024, https://www.who.int/publications/i/item/9789240029200 (2021, accessed 20 October 2024).