Abstract
Integrating artificial intelligence (AI) into clinical trials for inflammatory bowel disease (IBD) has the potential to transform the field. This article explores how AI-driven technologies, including machine learning (ML), natural language processing, and predictive analytics, could enhance important aspects of IBD trials, from patient recruitment and trial design to data analysis and personalized treatment strategies. As AI advances, it may help address long-standing challenges in trial efficiency, accuracy, and personalization, with the goal of accelerating the discovery of novel therapies and improving outcomes for people living with IBD. AI can streamline multiple trial phases, from target identification and patient recruitment to data analysis and monitoring. By integrating multi-omics data, electronic health records, and imaging repositories, AI can uncover molecular targets and personalize trial strategies, ultimately expediting drug development. However, the adoption of AI in IBD clinical trials encounters significant challenges. These include technical barriers in data integration, ethical concerns regarding patient privacy, and regulatory issues related to AI validation standards. Additionally, AI models risk producing biased outcomes if training datasets lack diversity, potentially impacting underrepresented populations in clinical trials. Addressing these limitations requires standardized data formats, interdisciplinary collaboration, and robust ethical frameworks to ensure inclusivity and accuracy. Continued partnerships among clinicians, researchers, data scientists, and regulators will be essential to establish transparent, patient-centered AI frameworks. By overcoming these obstacles, AI has the potential to enhance the efficiency, equity, and efficacy of IBD clinical trials, ultimately benefiting patient care.
Plain language summary
Inflammatory Bowel Disease (IBD), including Crohn’s disease and ulcerative colitis, poses significant challenges for clinical trials, such as difficulties in recruiting participants, variations in disease presentation, and inconsistent treatment responses. Artificial intelligence (AI) is increasingly recognized as a solution to these challenges, improving recruitment, data analysis, personalized care, and trial design. AI can enhance recruitment by analyzing medical records to match patients to trials efficiently. AI tools can automate this process, improving both efficiency and diversity. Additionally, AI can predict dropout risks, helping researchers plan better and maintain trial integrity. IBD trials generate complex datasets that require advanced analysis. AI can process these large datasets to identify patterns in disease progression and treatment efficacy, also improving the accuracy of endoscopic and histological assessments, providing deeper insights into the disease. AI can enable personalized treatments by predicting responses based on genetics, biomarkers, and medical history. Real-time monitoring through wearable devices supports early interventions, improving patient outcomes and disease management. Adaptive trial designs might also benefit from AI, allowing protocols to adjust based on interim results. This enhances trial efficiency, ethical standards, and participant safety, while ensuring accurate data collection. However, implementing AI requires addressing data privacy, fairness, and regulatory compliance. Transparent, secure, and inclusive AI models are essential to build trust and ensure equitable benefits across all patient populations. AI is transforming IBD clinical trials by streamlining recruitment, improving data analysis, personalizing care, and optimizing trial design. By addressing challenges proactively, we can unlock AI’s full potential, leading to more efficient trials and better outcomes for patients.
Introduction
Inflammatory bowel disease (IBD), encompassing Crohn’s disease (CD) and ulcerative colitis (UC), is a chronic and relapsing condition that presents significant challenges for patient management and for conducting clinical research efficiently.1,2 Recruitment rates for pharmaceutical IBD clinical trials continue to decline, with average global recruitment rates of 0.1 patient/center/trial. The reasons for poor recruitment are complex and multifactorial and include patient characteristics, overengineered protocols, use of placebo, and increasing availability of commercial agents. In addition, heterogeneity in disease presentation, variability in treatment response, and the requirement for long-term management necessitate innovative approaches in clinical trials. Traditional methodologies have often failed to address these complexities, highlighting the need for advanced technologies to provide more precise and efficient solutions at all stages of a clinical trial, including patient recruitment, endpoint assessment, and trial monitoring.
In recent years, artificial intelligence (AI) and machine learning (ML) have emerged as powerful medical tools, offering unprecedented data analysis, pattern recognition, and predictive modeling capabilities. These technologies have the potential to enhance aspects of IBD clinical trials and may provide solutions to challenges that have historically impeded progress in this field. 3 However, the promise of AI can only be fully realized if these technologies are scientifically rigorous, clinically valid, and aligned with regulatory standards to ensure patient safety and improve outcomes.4,5
This review explores the potential impact that AI could have on IBD clinical trials, focusing on key areas such as patient recruitment, data analysis, personalized medicine, and trial design. Additionally, we discuss the ethical, regulatory, and practical considerations that must be addressed to ensure the responsible integration of AI in clinical trials for IBD.
A summary of definitions of commonly used terminology is included in Table 1.
Table 1. Summary of definitions of commonly used terminology.
AI in patient recruitment: Streamlining and enhancing recruitment for IBD clinical trials
Recruiting patients for IBD clinical trials remains a significant challenge. 14 Despite some advances, issues such as overestimating eligible populations, limited patient awareness, logistical difficulties, and competition for participants persist. 15 Traditional recruitment approaches are often resource-heavy and struggle to efficiently identify and enroll appropriate candidates. AI offers promising solutions by analyzing large datasets, including electronic health records (EHRs) and patient-reported outcomes. These technologies may help to match eligible participants with trial criteria more efficiently and accurately, improving recruitment.16,17
Screening activities
AI-driven tools like natural language processing (NLP) are improving clinical trial recruitment by analyzing clinical notes and unstructured data to identify potential participants that might be missed by traditional screening methods. Leveraging NLP and ML, generative AI can automate the evaluation of eligibility criteria against medical histories, drastically reducing the need for manual reviews. This allows researchers and clinical staff to potentially pre-screen hundreds of candidates in just minutes, speeding up pre-screening activities. One recent advancement in this field is TrialGPT, a model designed to improve patient-trial matching. 18 Using large language models, TrialGPT analyzes patient medical records and compares them with trial eligibility criteria. Trained on data from 184 patients with complex conditions—predominantly cancer and other chronic diseases such as cardiovascular disease, diabetes, and rare genetic conditions—and 18,238 annotated clinical trials, TrialGPT not only determines patient suitability but also provides detailed explanations for its decision. 18 When tested on a larger dataset of patients across oncology and various chronic disease populations, TrialGPT demonstrated strong performance. Its explanations aligned closely with those of human experts, effectively ranking trials and excluding those for which patients were ineligible. However, some errors were noted due to limitations in the underlying language models.
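The pre-screening step described above can be pictured with a minimal rule-based sketch. This is not TrialGPT and does not use a language model; it only illustrates the idea of checking a structured patient record against inclusion and exclusion criteria and returning an explanation for each decision. The criteria, field names, and patient records are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    kind: str  # "inclusion" or "exclusion"
    test: Callable[[dict], bool]  # True when the criterion's condition holds


def prescreen(patient: dict, criteria: list[Criterion]) -> tuple[bool, list[str]]:
    """Return (eligible, reasons) for one structured patient record."""
    reasons = []
    eligible = True
    for c in criteria:
        met = c.test(patient)
        if c.kind == "inclusion" and not met:
            eligible = False
            reasons.append(f"fails inclusion: {c.name}")
        elif c.kind == "exclusion" and met:
            eligible = False
            reasons.append(f"meets exclusion: {c.name}")
    if eligible:
        reasons.append("all criteria satisfied")
    return eligible, reasons


# Hypothetical criteria for an illustrative UC trial (not taken from any protocol).
CRITERIA = [
    Criterion("age 18-80", "inclusion", lambda p: 18 <= p["age"] <= 80),
    Criterion("diagnosis of UC", "inclusion", lambda p: p["diagnosis"] == "UC"),
    Criterion("Mayo endoscopic score >= 2", "inclusion", lambda p: p["mes"] >= 2),
    Criterion("prior colectomy", "exclusion", lambda p: p["prior_colectomy"]),
]

# Hypothetical patient records.
patient_ok = {"age": 34, "diagnosis": "UC", "mes": 2, "prior_colectomy": False}
patient_bad = {"age": 17, "diagnosis": "UC", "mes": 1, "prior_colectomy": True}

for label, patient in [("ok", patient_ok), ("bad", patient_bad)]:
    eligible, reasons = prescreen(patient, CRITERIA)
    print(label, eligible, reasons)
```

In practice the hard part is the step this sketch skips: turning free-text notes and free-text eligibility criteria into structured fields, which is where NLP and large language models are applied.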
Generative AI, using tools such as chatbots and virtual assistants, can also reduce the screening burden on clinical trial sites by handling initial participant screening and communication. AI-enabled platforms such as myTrialsConnect enhance participant interactions, gather trial-specific data, and even schedule appointments, thus improving accessibility and workload management. 19
Enhancing diversity in enrollment
In IBD clinical trials, recruitment bias can lead to the underrepresentation of minority populations, which impacts the generalizability of findings. AI Fairness 360 (AIF360) developed by IBM Research might tackle this issue by ensuring that AI-driven recruitment algorithms do not disproportionately exclude these groups, fostering more inclusive and equitable study populations. 13 By addressing bias, the toolkit allows researchers to increase the likelihood of fair and representative samples that reflect the diversity of the population affected by the disease being studied, and to ensure that eligibility criteria are applied equitably to all groups, avoiding bias that could exclude minority populations.
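As a rough illustration of what auditing a recruitment algorithm for bias involves, the sketch below computes two common group-fairness measures by hand: the disparate impact ratio and the statistical parity difference of selection rates. It does not call the AIF360 library; the group labels and counts are hypothetical.

```python
from collections import defaultdict


def selection_rates(records):
    """records: iterable of (group, selected) pairs -> {group: fraction selected}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [selected, total]
    for group, selected in records:
        counts[group][0] += int(selected)
        counts[group][1] += 1
    return {g: s / n for g, (s, n) in counts.items()}


def disparate_impact(rates, unprivileged, privileged):
    """Ratio of selection rates; 1.0 means both groups are selected equally often."""
    return rates[unprivileged] / rates[privileged]


def statistical_parity_difference(rates, unprivileged, privileged):
    """Difference of selection rates; 0.0 means both groups are selected equally often."""
    return rates[unprivileged] - rates[privileged]


# Hypothetical pre-screening output: group "A" 40/100 flagged eligible, group "B" 20/100.
records = (
    [("A", True)] * 40 + [("A", False)] * 60
    + [("B", True)] * 20 + [("B", False)] * 80
)

rates = selection_rates(records)
print(rates)
print(disparate_impact(rates, unprivileged="B", privileged="A"))
print(statistical_parity_difference(rates, unprivileged="B", privileged="A"))
```

A ratio well below 1 (here, group B is flagged half as often as group A) does not by itself prove unfair treatment, since true eligibility may differ between groups, but it indicates where eligibility criteria and model behavior deserve closer review.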
Furthermore, AI can predict patient dropout rates and adherence to trial protocols, enabling proactive management of these issues, which is crucial for maintaining trial integrity.20,21
Enhancing data analysis through AI: Utilizing ML to analyze complex IBD trial outcome data
IBD clinical trials generate large and complex datasets, including clinical, imaging, biomarker, and genomic data. AI, particularly ML and deep learning algorithms, can process high-dimensional data to uncover hidden patterns or correlations critical for understanding disease progression, treatment responses, and patient subgroups.22,23
EHR data analysis using ML methods
A key application of AI in IBD research involves leveraging ML techniques to analyze EHR-derived data. These methods allow for integrating patient demographics, physiological measurements, disease history, clinical questionnaires, histology, serum biomarkers, and drug exposure to uncover insights that traditional analyses may overlook. ML-based models, such as XGBoost and deep learning approaches, can identify complex, nonlinear relationships that influence disease progression and therapeutic outcomes.
Predicting response to therapy
A recent study by Harun et al. shed light on the role of AI in shaping future clinical trial designs, particularly by identifying stratification factors that can optimize treatment effectiveness and improve patient outcomes. 24 The authors conducted a post hoc analysis of four randomized controlled trials (RCTs) of etrolizumab in patients with UC, using advanced ML techniques to assess which patient factors impact remission. XGBoost ML models were used to evaluate the effect of various patient-level data on the likelihood of achieving remission. To interpret the complex predictions, the SHAP (SHapley Additive exPlanations) framework clarified which factors were most influential. The data analyzed included demographics, physiological measurements, disease history, clinical questionnaires, histology, serum biomarkers, and drug exposure. The models performed well, achieving an area under the receiver operating characteristic curve (AUROC) of 0.74 ± 0.03 for induction and 0.75 ± 0.06 for maintenance. By using AI techniques, the study was able to analyze a large, complex dataset and reveal nonlinear relationships and interactions that traditional methods might miss. The use of XGBoost improved the predictive accuracy of remission based on diverse variables, offering deeper insights into patient outcomes. The SHAP framework further enhanced understanding by identifying key factors influencing remission, aiding in patient stratification and optimizing treatment strategies for future trials.
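To make the reported AUROC values easier to interpret, the following minimal sketch computes AUROC directly from its definition: the probability that a randomly chosen patient who achieved remission receives a higher predicted score than a randomly chosen patient who did not. The labels and scores are hypothetical and unrelated to the etrolizumab data; the cited study used XGBoost models, which are not reproduced here.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = remission achieved, 0 = not achieved.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.6, 0.8]

print(auroc(labels, scores))  # 14 of 16 positive-negative pairs are ordered correctly
```

On this reading, an AUROC of 0.74 means that the model ranks a remitter above a non-remitter in roughly three of four such pairs, while 0.5 corresponds to chance.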
Endoscopy
Endoscopic assessment is the cornerstone to establish patient eligibility for IBD trial participation and to estimate the efficacy of trial interventions. Blinded central endoscopic reading is the current standard which, compared to local endoscopic reading, increases objectivity, minimizes variability, reduces placebo rates, and consequently maximizes effect sizes.25–29 Nonetheless, even agreement between expert central readers is imperfect,25,30 and several scoring conventions were added to existing evaluative indices with the purpose of harmonizing scores. 31 Disagreement between central readers further complicates the scoring process by introducing the need for outcome adjudication, where multiple read algorithms to resolve disagreement and assign a final score are possible, each with their advantages and disadvantages.32,33
Several AI models have been developed to deliver reliable and accurate readings of endoscopic videos in UC. In a recently published meta-analysis, 12 studies were included, with 9 studies evaluating the Mayo endoscopic score (MES) 34 and 3 the Ulcerative Colitis Endoscopic Index of Severity (UCEIS) 35 as the reference standard. 36 Overall, the sensitivity and specificity of AI for endoscopic assessment were high for both still images (sensitivity 91%, specificity 89%) and videos (sensitivity 86%, specificity 91%). A notable finding was the high heterogeneity between studies, with I² values exceeding 90%.
Several aspects of study design should, however, be considered to correctly contextualize these findings and identify future research priorities. All studies used a human expert reader as the reference standard. Because the training of convolutional neural networks, the AI tool used in endoscopy, depends on the human reader reference, these models cannot, by definition, yet surpass human performance in terms of accuracy, whereas gains in efficiency and throughput may be considerable. A further potential use of AI is to screen, by default, all endoscopies performed at a given site with the purpose of identifying patients with endoscopically active disease (MES 2 or 3) and flagging them for potential inclusion in trials. Studies included in the meta-analysis all used dichotomized outcomes, for example, MES 0–1 versus 2–3. Whilst endoscopic remission was defined as a MES 0–1 in the past, 37 the two scores are no longer conflated in contemporary trials: a score of 0 denotes endoscopic remission and a score of 1 endoscopic improvement, reflecting two different outcomes. 38 The distinction between a score of 2 and 3 is also important, as baseline endoscopic activity may serve as a stratification factor for randomization. It should be acknowledged that only three of the studies supported their models with external validation cohorts.39–41 Finally, the high heterogeneity persisted even in sensitivity analyses separating studies based on still image versus video assessment and based on the numbers of images evaluated. The high variability between studies could therefore be the result of discrepancies in image annotation, image pre-processing, and training algorithms. 42 Developing AI algorithms for endoscopic assessment of CD remains an unmet research need; studies thus far have focused on video capsule endoscopy, which does not feature in regulatory clinical trials.
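The consequence of dichotomizing the MES can be illustrated with a small sketch. It computes sensitivity and specificity for active disease (MES 2–3 versus 0–1) and compares them with exact agreement on the full four-level score. The paired readings are hypothetical and are not drawn from any study in the meta-analysis.

```python
def is_active(mes, cutoff=2):
    """Dichotomize a Mayo endoscopic score (0-3): active disease if mes >= cutoff."""
    if mes not in (0, 1, 2, 3):
        raise ValueError(f"invalid MES: {mes}")
    return mes >= cutoff


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of dichotomized predictions against a reference."""
    ref = [is_active(m) for m in reference]
    pred = [is_active(m) for m in predicted]
    tp = sum(r and p for r, p in zip(ref, pred))
    fn = sum(r and not p for r, p in zip(ref, pred))
    tn = sum(not r and not p for r, p in zip(ref, pred))
    fp = sum(not r and p for r, p in zip(ref, pred))
    return tp / (tp + fn), tn / (tn + fp)


def exact_agreement(reference, predicted):
    """Fraction of cases where both readers assign the same score on the 0-3 scale."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


# Hypothetical paired readings for ten videos.
human_mes = [0, 1, 1, 2, 2, 3, 3, 0, 2, 1]
model_mes = [1, 1, 2, 2, 3, 3, 2, 0, 1, 0]

sens, spec = sensitivity_specificity(human_mes, model_mes)
print(sens, spec)                             # agreement on active vs inactive disease
print(exact_agreement(human_mes, model_mes))  # agreement on the full 0-3 scale
```

In this toy example the dichotomized metrics look favorable (0.8 each) while the readers agree on the exact score in only 4 of 10 cases, which is why models validated only on MES 0–1 versus 2–3 cannot be assumed to separate remission (0) from improvement (1), or a score of 2 from 3.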
Using AI-based algorithms in clinical trials has been shown to be feasible: a recurrent neural network model performed favorably compared to human readers for the evaluation of full-length endoscopy videos in a phase II trial of mirikizumab. 43 It should be noted, however, that this trial utilized a single central reader paradigm, and it remains unknown how best to integrate AI algorithms in multiple central reader paradigms, which require outcome adjudication. Studies to inform the optimal positioning of AI-based algorithms in reading paradigms are currently lacking: it is unknown whether the algorithm should replace the local reader, the central reader, or perhaps even both readers. An often-cited limitation of the MES is that it defaults to the worst affected area of the colon visualized, regardless of potential changes in disease extent. 44 An AI-based solution integrating both endoscopic disease severity and disease extent is the cumulative disease score (CDS). 45 A notable advantage of this system, tested on the ustekinumab trial dataset, is its superior ability to discriminate between the ustekinumab arm and the placebo arm: a simulated sample size calculation indicated that 50% fewer patients would be needed to demonstrate a difference with CDS compared to the MES. Although the latter remains the regulatory standard, CDS could be used in early drug development programs to detect between-arm differences with smaller numbers of patients. AI could conceivably recognize endoscopic lesions and patterns that are not part of established endoscopic indices and therefore be potentially more sensitive to change. This could be particularly helpful in early drug development to guide decisions on whether to continue the clinical program.
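The link between an endpoint's ability to discriminate between arms and the required sample size can be sketched with the standard normal-approximation formula for comparing two means, n per arm = 2(z₁₋α/₂ + z_power)² / d², where d is the standardized effect size. This is a generic textbook calculation, not the simulation used in the cited CDS study, and the effect sizes below are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided, two-sample comparison of means
    (normal approximation): n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical standardized effect sizes for a coarser and a more sensitive endpoint.
d_coarse = 0.40
d_sensitive = d_coarse * sqrt(2)  # about 41% larger standardized effect

n_coarse = n_per_arm(d_coarse)
n_sensitive = n_per_arm(d_sensitive)
print(n_coarse, n_sensitive, n_sensitive / n_coarse)
```

Because the required sample size scales with 1/d², an endpoint that increases the standardized between-arm difference by a factor of about 1.4 roughly halves the number of patients needed, which is the order of saving described for CDS relative to the MES.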
Histology
Histological remission is currently understood as an adjunct to endoscopic remission indicating a deeper level of healing. 46 It is not yet considered a treatment target, but its potential role is being evaluated in a randomized trial. 47 Similarly to endoscopy, histological assessment depends on scoring indices, which face challenges similar to those of endoscopic indices: inter-rater reliability is imperfect, scoring can be time-consuming and requires expertise. 48 In UC histology, AI models have been used to develop novel scoring indices,49,50 to replace human readers for established indices,51,52 and to evaluate individual histological features, such as the presence of eosinophils 53 and basal cell plasmacytosis. 54
Overall, models developed to evaluate biopsies using existing indices have shown encouraging sensitivity and specificity to detect histological remission.51,52 One of the systems was developed using clinical trial data, demonstrating feasibility in this setting. 51 Analogously to AI models for endoscopic assessment, currently developed algorithms predict histological remission as a binary outcome, but cannot provide grading of inflammatory activity. Arguably, this is less of a limitation for histology than it is for endoscopy as precise grading of histologically active disease is less relevant and has little impact on the interpretation of clinical trial results.
A further area of AI development is the deployment of algorithms to help guide human pathologists to the main areas of interest within a given biopsy fragment. 51 Novel algorithms also promise to detect histological features beyond those included in established histological scoring indices, which could be more informative for predicting subsequent treatment outcomes. In a study of 114 patients with UC achieving endoscopic improvement (MES ⩽1), a deep learning model successfully quantified the ratio between the goblet cell mucus area and epithelial cells; a lower ratio was associated with an increased rate of disease relapse within the subsequent 12 months. 55 Even more impressively, an ML-based algorithm identified 18 histomic features that predicted which patients with pediatric UC would not respond to treatment with mesalamine alone. 56 These features, discovered in an inception cohort of 292 patients, were later tested in an external validation cohort with almost identical performance (AUROC 0.89 in the development cohort and 0.88 in the validation cohort). More recently, Ohara et al. 57 developed an advanced AI system incorporating semantic segmentation and object detection models to identify neutrophils in hematoxylin and eosin-stained whole-slide images (WSIs). This system not only detects neutrophils in the epithelium and lamina propria but also predicts components of the Nancy Histological Index and the PICaSSO Histologic Remission Index. 41 Notably, the AI-predicted histological scores correlated well with pathologists’ assessments (Spearman’s ρ = 0.68–0.80; p < 0.05).
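Agreement between AI-predicted and pathologist-assigned scores, as reported above, is typically summarized with Spearman's rank correlation. The sketch below computes ρ from scratch (average ranks for ties, then Pearson correlation on the ranks). The paired scores are hypothetical and do not come from the cited study.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical histological activity scores for five biopsies.
pathologist = [1, 2, 3, 4, 5]
model = [2, 1, 4, 3, 5]

print(spearman_rho(pathologist, model))
```

Because ρ depends only on rank order, it rewards a model that orders biopsies by severity in the same way as pathologists, even if the absolute scores differ.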
In another study, Peyrin-Biroulet et al. 58 utilized automated image analysis combined with ML to evaluate histological disease activity based on the Nancy index in 200 histological images from UC patients. The AI system’s performance was compared to assessments by four independent histopathologists. Despite limitations due to the small annotated dataset required for AI training, 59 the study reported high correlations both among histopathologists (89.33) and between the AI system and histopathologists (87.20).
Radiology
Radiological assessment is likely to have an increasingly prominent role in clinical trials in IBD. Transmural healing is defined as an adjunct to endoscopic remission, reflecting a deeper level of healing in CD; 46 its potential advantage over contemporary treatment goals is under evaluation in an ongoing randomized trial (NCT06257706). Fibrostenosing CD is an area of unmet therapeutic need, and the development of potential antifibrotic agents is a research priority. 60 The recent publication of dedicated indices61,62 extends the role of cross-sectional imaging beyond evaluating inflammation.
Detection and characterization of strictures is an area well-suited to AI models. Good concordance has been shown between (semi)-automated measurements and expert radiologist assessment for key elements, such as bowel wall thickness, pre-stenotic dilation, and minimum luminal diameter.63,64 Notably, AI was able to quantify intestinal fibrosis with an AUROC exceeding 0.800 compared to the reference standard of histopathological assessment of the resection specimen. 64 Automated assessment was non-inferior to expert radiological assessment and considerably faster. AI has not yet been evaluated for radiological assessment of perianal fistulizing CD, which is expected to be quite challenging given the complex morphology and heterogeneity of this disease phenotype.
Emerging research also indicates that convolutional neural networks are able to accurately identify abnormal bowel wall thickening on images obtained with intestinal ultrasound, although this remains to be proven in studies with larger sample sizes and also using cine loops as opposed to still images.65,66
A summary of endoscopic, histologic and radiology AI modalities can be found in Table 2.
Table 2. Summary of endoscopic, histologic, and radiology AI modalities.
AI, artificial intelligence.
Synthesizing multimodal clinical trial data
Possibly the greatest opportunity for harnessing the power of AI in clinical trials lies in the analysis of complex multimodal data to predict outcomes. Existing endoscopic indices have limited predictive capability for subsequent disease evolution, and histological indices focus, perhaps unduly, on neutrophils. An algorithm quantifying red pixels in endoscopy videos of UC, the red-density index, performed acceptably for predicting 5-year clinical remission.69,70 An algorithm based on endoscopy, supported by endocytoscopy, classified patients by risk of clinical relapse in real time. 71 The AI-based PICaSSO Histologic Remission Index successfully estimated the likelihood of a flare of UC at 1 year. 52 AI approaches are also well suited to the analysis of complex proteomic and microbiomic data. An AI algorithm supported classification of patients based on the relative abundance of 92 inflammatory proteins, which was associated with subsequent response to treatment with infliximab. 67 A neural network algorithm integrating clinical and microbiome data demonstrated an acceptable predictive capability for clinical remission after 14 weeks of treatment with vedolizumab in CD. 68 The use of wearable devices, such as smart watches, for monitoring IBD and predicting subsequent disease flares is under active investigation. The abundance of clinical and biomarker data gathered through these devices could be analyzed using AI-based methods to develop novel digital outcomes and predict future disease evolution.72,73
Additionally, AI could integrate multi-omics data to identify novel biomarkers, which can serve as surrogate endpoints in clinical trials, thereby accelerating drug development. 74 AI-driven analytics could also enhance the understanding of patient heterogeneity in IBD, enabling the identification of distinct disease subtypes. This stratification informs the development of targeted therapies, ultimately leading to more personalized treatment approaches. 75
Personalized medicine and AI: Tailoring treatment strategies to individual patients
The current management of IBD is challenging, as the disease varies widely in severity and response to treatment among patients. With the availability of several classes of advanced therapies, choosing a suitable therapeutic agent which would result in optimal response is challenging. Currently there are no tools that can accurately predict response to any given agent. Traditional treatment protocols often involve the best clinical judgment based on available evidence, clinical records, social factors, and local institutional policies. Personalized medicine is an emerging approach that moves away from a one-size-fits-all paradigm in healthcare, instead tailoring medical treatment to the individual characteristics of each patient. For chronic, heterogeneous conditions such as IBD, this approach holds significant potential. Patients with IBD differ widely in their genetic profiles, disease severity, and response to medications.
The advent of AI presents a revolutionary opportunity to optimize and personalize treatment strategies for IBD, tailoring care to individual patient characteristics. AI can analyze large, complex datasets that include genetic information, clinical records, environmental factors, and treatment responses, enabling clinicians to make more precise and effective decisions in IBD management.76,77 This process significantly reduces the time spent on trial-and-error treatments, improves clinical outcomes, and minimizes the risk of adverse effects from ineffective therapies.
AI models can continuously learn and adapt based on new data, enabling real-time personalization of treatment strategies. 78 These adaptive strategies could be particularly beneficial in managing IBD, where disease activity can fluctuate over time.79–81 AI can assist physicians at several stages of the management of patients with IBD, including choosing an appropriate agent early, at the time of diagnosis, and predicting disease progression and exacerbations to allow early intervention.
AI in choosing appropriate therapeutic agent
Several factors determine response to a given agent. AI has been shown to be useful in predicting treatment response to various drugs in cancer therapy and antibacterial therapy.82,83 Similarly, AI can predict which patients are more likely to respond to certain biologic therapies based on their genetic and microbiome composition.77,84 Several studies have demonstrated the ability of AI models to predict response to various advanced therapies, such as anti-TNFs, vedolizumab, and ustekinumab, in patients with CD.85–88 Some of these studies used clinical and laboratory data, while others used genotype data. Recent research has identified hundreds of genetic loci associated with IBD, yet these genetic insights have not been fully integrated into clinical practice due to their complexity. AI can synthesize this genomic information with other patient-specific data to create predictive models that anticipate how a patient’s disease will progress and how they will respond to different treatments. AI models could therefore potentially predict the best treatment option for a given patient. In the coming years, AI-based histopathological studies are expected to make significant contributions to IBD management. Preliminary data 89 showed that computational pathology algorithms can identify cytokine activity, such as IL-23 signaling, from H&E images. This could significantly enhance our understanding of disease mechanisms and optimize treatment options for patients with IBD.
AI in predicting disease progression
The unpredictable course of IBD is one of the most challenging aspects of managing the disease. Patients often alternate between periods of active inflammation and remission, with some experiencing frequent complications, such as fistulas, strictures, or the need for surgery. Predicting when a patient will experience a disease exacerbation or develop complications is essential for timely intervention and disease management. ML algorithms can be trained on large datasets that include clinical records, laboratory results, imaging studies, and lifestyle factors to identify patterns and predictors of disease progression. 90 These algorithms can then generate risk profiles for individual patients, estimating the likelihood of a flare-up, complication, or need for surgery. For example, AI can assess inflammatory biomarkers, such as C-reactive protein or fecal calprotectin, alongside clinical symptoms and patient-reported outcomes, to predict when a patient is at high risk of non-response to therapy. 91 Such predictive models allow physicians to modify treatments proactively, such as by increasing medication doses or initiating new therapies before the patient experiences a relapse. In a study from Korea, an ML model for prediction of IBD-related outcomes at 5 years after diagnosis yielded an area under the curve of 0.86 (95% CI: 0.82–0.92). This model performed consistently across a range of other datasets, enabling physicians to perform close follow-up based on the patient’s risk level. 92 A novel ML model based on data from 20,368 Veterans Health Administration patients substantially improved the ability to predict future IBD-related hospitalization and steroid use. 93 Furthermore, AI can predict long-term outcomes in IBD patients, guiding decisions about the intensity of treatment.
For example, patients at high risk of developing complications might benefit from early, aggressive therapy with biologics or immunosuppressants, while those with a lower risk profile could be managed with less intensive treatments. This individualized approach to disease management can reduce overtreatment, minimize side effects, and improve patient quality of life. Recently, Wang et al. 94 developed a deep learning framework aimed at predicting postoperative recurrence in CD. The model automatically analyzed the muscular layer and myenteric plexus, integrating clinical data to evaluate myenteric plexitis severity and recurrence risk. This approach sheds light on the mechanisms underlying postoperative recurrence and offers potential for enhancing long-term disease management.
Wearable devices, mobile health applications, and home-based diagnostic tools can collect continuous data on a patient’s symptoms, biometrics, and lifestyle factors. AI can analyze these data streams in real time, identifying subtle changes that may indicate an impending flare-up or treatment failure. For instance, fluctuations in inflammatory markers detected through home-based stool tests or blood samples can signal worsening disease activity. By integrating these data with AI algorithms, clinicians can be alerted to intervene early, preventing a full-scale relapse or the need for hospitalization. AI-powered applications can provide patients with personalized feedback based on their symptoms and questionnaire responses. These applications can remind patients to take their medications, track their symptoms, and alert them to seek medical attention if necessary. AI can further enhance these platforms by analyzing patterns in patient-reported outcomes, detecting early warning signs of non-adherence or treatment failure, and suggesting adjustments to the treatment plan. Real-time monitoring supported by AI not only improves disease management but also empowers patients to take a more active role in their care. This proactive approach has the potential to reduce the burden of IBD, both in terms of physical symptoms and the psychological toll of living with a chronic illness.
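The kind of early-warning logic described above can be sketched as a simple rule on serial home-test readings: flag a reading when it exceeds an absolute threshold or rises sharply relative to the patient's own recent baseline. This is a deliberately simple, non-ML illustration; the window, ratio, threshold, and readings are hypothetical and are not clinical guidance.

```python
from statistics import median


def flag_readings(readings, window=3, ratio=2.0, absolute=250.0):
    """Return indices of readings that warrant review.

    A reading is flagged if it is >= `absolute`, or if it is >= `ratio` times
    the median of up to `window` preceding readings (the patient's own baseline).
    All thresholds here are illustrative assumptions.
    """
    flagged = []
    for i, value in enumerate(readings):
        history = readings[max(0, i - window):i]
        baseline = median(history) if history else None
        rising = baseline is not None and value >= ratio * baseline
        if value >= absolute or rising:
            flagged.append(i)
    return flagged


# Hypothetical serial fecal calprotectin readings (µg/g) from a home test.
calprotectin = [40, 55, 50, 60, 130, 300]

print(flag_readings(calprotectin))  # indices of readings that would trigger an alert
```

A learned model would replace the fixed rule with patient-specific predictions from multiple data streams, but the output is the same in kind: a flag that prompts clinical review before symptoms escalate.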
Utilizing multi-omics data and AI to aid personalized medicine
In recent years, omics data analysis has gained importance and has advanced our understanding of the pathogenesis of IBD. Improved integration of multi-omics and clinical data has led to breakthroughs in disease diagnosis, drug discovery, and precision medicine. ML-based methods offer significant advantages in handling large-scale datasets and can reveal patterns among a large number of features that traditional methods may fail to identify. Multi-omics analysis can provide a holistic view of pathogenesis, including simultaneous changes in the microbiome and in biological processes, which in turn can help researchers identify potential targets. For instance, in a study by Lloyd-Price et al., extensive multi-omics molecular profiling was performed on 132 participants. 95 The authors observed significant alterations in microbiota composition and function based on disease activity states. In another study, remission-associated multi-omic profiles were unique to each therapeutic class. 96 Recently, integration of endoscopic and histological data with multi-omics has also been proposed. 97 Moreover, AI models offer the opportunity to identify pioneering gut barrier-protective agents for IBD and to forecast the potential success of candidate agents in phase III trials. Sahoo et al. developed an AI-assisted approach for target identification and validation. This ML approach has demonstrated the ability to predict epithelial barrier-related genes, such as PRKAB1, which encodes the β1 subunit of the metabolic master regulator AMPK and might represent a novel target for gut barrier-protective therapies. 98
However, the application of AI to multi-omics is still in its infancy, and a major challenge for multi-omics AI models is their lack of generalization when applied to independent validation cohorts. The primary limitation on the clinical applicability of AI lies in the high heterogeneity of the disease and its variation over time, which leads to inadequate reproducibility and generalizability of predictive results and a possible overestimation of prediction accuracy. In the near future, AI approaches are expected to be especially valuable in classifying already diagnosed patients into disease sub-phenotypes, predicting disease progression, and evaluating response to treatment.
AI in trial design and monitoring: Enhancing adaptive trial designs and real-time participant monitoring
The FDA has highlighted the importance of integrating AI and ML into drug and biological product development, particularly in clinical trial designs. Its discussion paper describes the potential of AI to streamline the development process by enhancing the design and execution of clinical trials through adaptive methodologies, real-time monitoring, and predictive modeling.4,5,99 This framework not only enhances the efficiency and accuracy of clinical trials but also aligns with regulatory requirements to ensure patient safety and data integrity. 100
One particular advance is the use of AI in adaptive trial designs, which allow for modifications based on interim data and have the potential to improve trial efficiency and ethical integrity by minimizing the probability of being randomized to potentially less effective or less safe therapies. 101 AI-driven adaptive trials can lead to more flexible studies by enabling real-time adjustments based on patient responses and emerging data.102–104 For instance, whereas traditional frequentist trial designs have pre-specified endpoints where formalized hypothesis testing is carried out to evaluate efficacy and/or safety, AI-driven adaptive designs can potentially identify earlier signals of treatment efficacy or safety, enabling researchers to adjust dosing regimens, modify inclusion criteria, or, in the most extreme situation, terminate trials early. 105 Multiple applications of AI in adaptive trial designs can be implemented. For example, AI-driven predictive analytics or simulation can be used for unbiased interim data analysis, AI-driven outcome prediction modeling can be used for sample size estimation (or re-estimation), AI-driven ML models can inform covariate- or response-adaptive randomization processes, and AI-driven models can be used to generate valid external control arms to help reduce the likelihood of placebo randomization.
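To make response-adaptive randomization concrete, the following Python sketch simulates a two-arm trial with a binary outcome in which each new participant is allocated by Thompson sampling from Beta posteriors. Thompson sampling is chosen here only as a well-known example and is not a method attributed to the cited studies. The function names, response rates, and sample size are hypothetical, and a real adaptive trial would add pre-specified rules such as burn-in periods, allocation caps, and type I error control.

```python
import random

def thompson_allocate(successes, failures, rng):
    """Sample each arm's Beta(successes + 1, failures + 1) posterior and
    return the index of the arm with the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return draws.index(max(draws))

def simulate_trial(true_rates, n_patients, seed=0):
    """Simulate sequential enrollment; return participants per arm.
    `true_rates` are assumed response probabilities (unknown in practice)."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(n_patients):
        arm = thompson_allocate(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]

# Hypothetical response rates: arm 0 = 20%, arm 1 = 60%.
print(simulate_trial([0.2, 0.6], n_patients=500, seed=1))
```

As responses accumulate, allocation drifts toward the arm that appears to perform better, which is the mechanism by which such designs reduce the chance of assignment to a less effective therapy.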
In addition to trial design, AI may also facilitate more efficient and actionable real-time monitoring of participants to ensure safety in trial participation. Wearable devices and mobile apps integrated with AI can continuously collect and analyze patient data, providing insights into adherence, disease progression, and adverse events. This real-time monitoring could be essential for proactively managing patient safety and trial integrity.106,107
Moreover, AI has the potential to improve risk-based monitoring (RBM). Historically, RBM has involved a multifaceted approach for identifying, assessing, monitoring, and subsequently mitigating risks that pose threats to quality or safety in an RCT. Given the tremendous amount of data collected within a clinical trial, AI-driven data monitoring systems can detect subtle changes in patient or site-level data that might indicate an adverse event, lack of efficacy, protocol deviations, or potentially site-related concerns that can prompt timely interventions. These systems have the potential to integrate much more data than centralized human monitors and can also help manage large-scale, decentralized trials by coordinating data from multiple sites and ensuring consistency in trial conduct.107,108
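As a simplified illustration of site-level signal detection in RBM, the Python sketch below flags sites whose adverse-event reporting rate deviates strongly from the other sites, using a robust z-score based on the median and the median absolute deviation. This statistical rule stands in for an AI-driven monitoring system. The site names, rates, and the 3.5 cut-off are assumptions made for the example.

```python
from statistics import median

def flag_outlier_sites(site_rates, threshold=3.5):
    """Return sites whose rate has a robust z-score beyond `threshold`.
    `site_rates` maps a site name to events per 100 patient-visits."""
    values = list(site_rates.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread across sites, so nothing can be ranked
    flagged = []
    for site, rate in site_rates.items():
        robust_z = 0.6745 * (rate - med) / mad
        if abs(robust_z) > threshold:
            flagged.append(site)
    return flagged

# Hypothetical adverse-event reports per 100 patient-visits at five sites.
rates = {"site_A": 10, "site_B": 12, "site_C": 11, "site_D": 9, "site_E": 1}
print(flag_outlier_sites(rates))  # prints ['site_E']
```

In this made-up example the flagged site reports far fewer adverse events than its peers, a pattern that could prompt a check for under-reporting rather than indicating a safer site.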
Challenges and ethical considerations: Addressing ethical, regulatory, and practical challenges
The integration of AI in IBD clinical trials offers transformative possibilities but also presents significant ethical, regulatory, and practical challenges that must be addressed proactively. One primary ethical concern relates to data privacy, especially given the sensitive nature of health data used in AI models, which includes genetic, biomarker, and multi-omic information. To protect patient confidentiality while enabling AI models to function effectively, robust data security measures such as encryption, anonymization, and strict access protocols must be rigorously implemented. These measures should comply with regulatory standards like General Data Protection Regulation (GDPR) or the Health Insurance Portability and Accountability Act (HIPAA), which are essential for safeguarding patient information throughout AI/ML-driven clinical trials. 109
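One of the safeguards mentioned above, replacing direct identifiers before data reach a model, can be sketched in Python with a keyed hash (HMAC-SHA256) from the standard library. The function name, identifier format, and key are hypothetical. This is pseudonymization rather than full anonymization: anyone holding the key can re-link records, and pseudonymized data generally remain personal data under GDPR, so key management and access control are still required.

```python
import hashlib
import hmac

def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Map a patient identifier to a stable pseudonym using HMAC-SHA256.
    The same ID and key always yield the same pseudonym."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# Hypothetical key; in practice it would be stored in a secrets manager.
key = b"example-key-do-not-use"
record = {"patient_id": "MRN-000123", "marker_value": 140}
safe_record = {**record, "patient_id": pseudonymize(record["patient_id"], key)}
print(safe_record["patient_id"][:12])
```

A keyed hash is used instead of a plain hash so that pseudonyms cannot be reproduced by simply hashing a list of known identifiers.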
Transparency and interpretability of AI algorithms are other critical ethical priorities. Many AI models, particularly deep learning systems, operate as “black boxes,” making predictions without easily explainable reasoning. 110 This lack of interpretability can hinder clinicians’ trust in AI-driven recommendations, posing risks of automated decisions that may not align with the best interests of patients. Ensuring transparency and accountability in AI decision-making processes is essential.111,112 To address this, ongoing research and model development should focus on explainable AI frameworks and interpretable ML techniques, which can help clinicians understand and evaluate how models arrive at their conclusions, thereby enhancing clinical justification and maintaining clinician accountability.113,114
Attention-based models and post hoc explainability methods, such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), are active areas of research aimed at improving the transparency of AI-driven decisions in IBD trials. 115
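SHAP attributes a prediction to individual input features using Shapley values. The Python sketch below computes exact Shapley values for a toy model by enumerating every feature coalition, which is feasible only for a handful of features. It does not use the SHAP library, and it makes the simplifying assumption that an absent feature is replaced by a fixed baseline value. The toy risk model and its coefficients are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).
    Features outside a coalition take their baseline value (assumption)."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = set(coalition)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi

def toy_risk(features):
    """Hypothetical risk score with an interaction between two markers."""
    a, b = features
    return 0.4 * a + 0.1 * b + 0.5 * a * b

contributions = shapley_values(toy_risk, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(contributions)
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, a property that makes the attribution straightforward for a clinician to check against the model output.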
Algorithmic bias is another significant concern. AI models trained on historically imbalanced datasets may inadvertently perpetuate existing biases, leading to the underrepresentation or misrepresentation of specific patient groups, such as racial or ethnic minorities, in clinical trials. Studies have shown that IBD trials have historically underrepresented minorities, which can limit the model’s effectiveness in identifying eligible participants from diverse racial or ethnic backgrounds.116–118 Furthermore, this bias can also affect underrepresented subgroups of IBD, such as individuals with fistulizing CD, pouchitis, and extraintestinal manifestations, among other underserved conditions. This further marginalizes these populations and restricts our understanding of these complex and less common disease presentations. Addressing bias requires both methodological and policy-level interventions, such as diversifying training datasets and implementing bias-detection frameworks, adversarial debiasing methods, and fairness-aware ML algorithms.119,120 These strategies can help ensure that AI recruitment algorithms equitably represent the diversity of the IBD patient population, promoting more inclusive and equitable clinical trials.116–118,121 Foundation models, which leverage vast pre-trained datasets, have also been explored as a means to mitigate bias by improving generalizability and reducing the impact of imbalanced training samples. 122
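A basic bias check of the kind described above can be sketched in Python: compare how often a recruitment algorithm marks patients as eligible in each demographic group, then summarize the gap as the ratio of the lowest to the highest selection rate. The group labels and counts are made up, and this single metric is only a starting point compared with the bias-detection frameworks and fairness-aware methods cited in the text.

```python
from collections import defaultdict

def selection_rates(records):
    """Compute the fraction selected per group.
    `records` is an iterable of (group_label, was_selected) pairs."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}

def selection_rate_ratio(rates):
    """Ratio of the lowest to the highest group selection rate
    (1.0 means equal rates; returns 1.0 if no group has selections)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0

# Hypothetical screening output: group A 40/100 flagged eligible, group B 20/100.
records = ([("A", True)] * 40 + [("A", False)] * 60
           + [("B", True)] * 20 + [("B", False)] * 80)
rates = selection_rates(records)
print(rates, selection_rate_ratio(rates))
```

A ratio well below 1.0, as in this invented example, would be a signal to examine the training data and eligibility features before the tool is used for recruitment.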
On the regulatory front, the FDA and other regulatory agencies have issued preliminary guidelines on the use of AI in clinical research; however, comprehensive regulations are still evolving.4,5 The lack of standardized regulatory frameworks creates uncertainties for AI developers and researchers, especially concerning model validation, continuous learning protocols, and real-time adjustments in adaptive trials. 123 To navigate these complexities, ongoing collaboration among industry stakeholders, researchers, and regulatory agencies is crucial for establishing clear guidelines on the development, deployment, and monitoring of AI in clinical trials. 124 Regulatory bodies, including the FDA, emphasize the importance of model validation and explainability, calling for regular algorithmic validation, retraining, and the establishment of processes to monitor updates and improvements in AI systems throughout the course of a trial.4,5,102 The emergence of foundation models has further encouraged regulatory discussions, as their broad pre-training across multiple domains can reduce the frequency of retraining and improve transferability across diverse patient populations. 125
From a practical standpoint, implementing AI-driven technologies in clinical trials requires substantial resources, including technical infrastructure, specialized personnel, and ongoing oversight to maintain model performance and data integrity. 126 For example, effective implementation in decentralized trials requires robust digital platforms for seamless data collection and integration across multiple sites. 127 Integrating AI models into clinical workflows also presents logistical challenges; clinicians and trial coordinators need training to understand, trust, and effectively utilize these tools. Practical guidelines for AI implementation should emphasize user-friendly interfaces, interoperability with existing systems, and consistent support to facilitate successful integration into clinical practice.4,126
Lastly, as AI increasingly plays a role in real-time patient monitoring through wearable devices and remote data collection,128,129 patient autonomy and informed consent become critical ethical considerations. 130 Patients should be fully informed about how their data will be used, including the role of AI in monitoring their health and influencing treatment pathways. Informed consent processes must adapt to address AI-driven data analysis, providing clear assurances regarding data security, the purpose of data collection, and the limits of AI’s predictive capabilities. 131 By prioritizing patient autonomy, transparency, and robust security measures, researchers can foster trust in AI-enabled IBD trials, thereby promoting equitable and ethical innovation in clinical research.
Discussion
AI represents a paradigm shift in the landscape of IBD clinical trials and has the potential to expedite the drug development process, thereby making safe and effective drugs available to patients faster. AI can help at several stages of the drug development process, including but not limited to identifying potential molecular targets by integrating and analyzing multi-omics data, streamlining patient recruitment, enhancing data analysis, personalizing strategies, optimizing trial design and monitoring,107,132 and supporting post-marketing surveillance.
Realizing the potential of AI in clinical trials requires careful consideration of the ethical, regulatory, and practical challenges associated with its integration. 123 Despite its promise, AI technology is still in the early stages, and several obstacles must be overcome before its widespread implementation in drug development processes. Integrating data across various systems, including EHRs, laboratory databases, and imaging repositories, can be a challenging task. Moreover, variability in data formats, terminologies, and quality across institutions can lead to inaccuracies in AI predictions. Additionally, privacy concerns and proprietary restrictions often limit access to the data needed to train and validate AI models.
Bias in AI models is another critical issue. If the datasets used to train AI systems are not representative of diverse populations, AI algorithms may produce biased predictions, leading to inaccurate outcomes.133,134 For example, underrepresentation of minority groups or women could result in skewed trial results, while AI designs and implementations may unintentionally favor certain demographics, contributing to inequities in clinical trial recruitment and treatment recommendations. The development of AI-driven tools for IBD must prioritize inclusivity, ensuring that these technologies benefit all patient populations, particularly those that have been historically underrepresented in clinical research.81,116 Furthermore, the use of AI systems requires careful attention to ethical considerations around sensitive personal data. Ensuring robust informed consent processes that address how AI technologies are employed is essential. Moreover, anonymizing and securing data while maintaining its utility for AI presents considerable challenges. By addressing these challenges, AI can fulfill its promise of transforming IBD clinical research, leading to more effective and equitable healthcare outcomes.
AI in clinical trials is still relatively novel, and regulatory frameworks remain under development. There is currently no universally accepted standard for validating AI tools in clinical trials, making it difficult to guarantee their reliability and reproducibility. Integrating AI into clinical trials necessitates significant adjustments to established workflows. Clinicians and trial staff often lack the technical expertise needed to effectively operate and interpret AI tools. This requires extensive training and collaboration with data scientists. Additionally, resistance to AI adoption may arise from stakeholders skeptical about the accuracy of these systems or concerned about job displacement. The development and implementation of AI systems in clinical trials can be resource-intensive, with high initial investment requirements. Running advanced AI algorithms demands substantial computing infrastructure, which may not be available in all clinical trial settings. Additionally, continuous updates, validation, and monitoring of AI systems are necessary to ensure their ongoing accuracy and relevance, contributing to long-term costs. AI models also require rigorous validation to ensure they can be applied effectively across diverse trial populations. However, many models are tested on limited datasets, raising concerns about their generalizability. Differences in data sources, trial protocols, and patient demographics can lead to inconsistent results when applying the same AI model across multiple settings.
While AI holds tremendous potential to transform clinical trials, its implementation faces numerous challenges. Overcoming these obstacles requires a multidisciplinary approach that involves collaboration between researchers, clinicians, data scientists, ethicists, and regulators. Key strategies include enhancing data standardization, ensuring transparency in AI systems, establishing clear regulatory guidelines, and fostering education and collaboration among stakeholders. Addressing these challenges will enable AI to become a powerful tool for making clinical trials more efficient, equitable, and effective. The FDA’s guidance provides a comprehensive framework for the responsible use of AI in drug and biological product development, emphasizing the need for transparency, validation, and patient-centric approaches.4,5,135 A recent report from the World Health Organization also addresses the ethics of AI in health care and offers recommendations for its governance. 136 As the field continues to evolve, ongoing collaboration between researchers, regulators, and industry stakeholders will be essential to realize AI’s full potential in IBD research.
Conclusion
The integration of AI into clinical trials for IBD represents a significant advancement in gastroenterology research and patient care, ushering in a new era of precision medicine. AI technologies, including ML and predictive analytics, are revolutionizing trial design, patient recruitment, endpoint assessment, data analyses, personalized treatment strategies, and the monitoring and prediction of treatment responses. The ability to assess deeper levels of healing, such as barrier healing, will enhance therapeutic strategies and potentially organ-sparing approaches.
By leveraging large datasets, AI enhances the accuracy and diversity of participant selection while providing deeper insights into disease mechanisms. Its ability to customize treatment plans for individual patients promises improved outcomes and reduced side effects, which is crucial in IBD management due to the variability in patient responses. However, despite its potential, the adoption of AI presents critical challenges that require careful consideration. For instance, AI systems built on supervised learning inherently depend on the assumption that the input data, often derived from physician diagnoses or clinical observations, is accurate. This reliance underscores the importance of high-quality, well-annotated datasets to minimize errors and biases. In the context of IBD, where diagnosis and disease characterization can be complex, ensuring the reliability of input data is essential to avoid perpetuating inaccuracies through AI-driven analyses.
Another important consideration is the cost associated with incorporating AI into clinical trials. Developing and maintaining AI systems requires significant investment in infrastructure, including high-performance computing capabilities, data integration platforms, and skilled personnel such as data scientists and bioinformaticians. Moreover, the ongoing need for algorithm validation, retraining, and compliance with regulatory standards adds to the financial burden. While these costs may be prohibitive for some institutions, they must be weighed against the potential long-term benefits, such as more efficient trials, personalized treatments, and reduced healthcare expenditures resulting from improved disease management.
In conclusion, while AI represents a paradigm shift in IBD research and clinical trials, its successful implementation will depend on addressing these foundational challenges. Collaboration between researchers, clinicians, regulatory authorities, and industry stakeholders will be crucial to ensuring that AI technologies are accurate, transparent, and accessible. By prioritizing data quality, cost-effectiveness, and ethical standards, the integration of AI has the potential to significantly enhance treatment outcomes and advance the fields of IBD and gastroenterology.
