Abstract
Background:
Video capsule endoscopy (VCE) has been proven to accurately diagnose small-bowel inflammation and predict flares among patients with quiescent Crohn’s disease (CD). However, data regarding its predictive role in this population over an extended follow-up are scarce.
Objectives:
To predict clinical exacerbation and to assess the yield of Lewis score in identifying CD patients with future clinical exacerbation during an extended follow-up (>24 months).
Design:
A post hoc analysis study.
Methods:
We included adult patients with quiescent small-bowel CD who were followed with VCE, inflammatory biomarkers, and magnetic resonance enterography in a prospective study (between 2013 and 2018). We extracted extended clinical data (up to April 2022). The primary composite outcome (i.e. clinical exacerbation) was defined as intestinal surgery, endoscopic dilation, CD-related admission, corticosteroid administration, or biological/immunomodulator treatment change during follow-up.
Results:
Of the 61 patients in the study [median age 29 (24–37) years, male 57.4%, biologic treatment 46.7%], 18 patients met the primary outcome during an extended follow-up [median 58.0 (34.5–93.0) months]. On univariable analysis, complicated [hazard ratio (HR) 7.348, p = 0.002] and stricturing disease phenotype (HR 5.305, p = 0.001) were associated with higher risk for clinical exacerbation during follow-up. A baseline VCE middle small-bowel segment Lewis score (midLS) ⩾ 135 identified patients with future exacerbation [AUC (area under the curve) 0.767, 95% confidence interval (CI) 0.633–0.902, p = 0.001, HR 6.317, 93% negative predictive value], whereas the AUC of the conventional Lewis score was 0.734 (95% CI: 0.589–0.879, p = 0.004). Sensitivity analysis restricted to patients with either complicated (n = 34) or stricturing (n = 26) disease phenotype revealed that midLS still predicted clinical exacerbation during follow-up (AUC 0.747/0.753, respectively).
Conclusion:
MidLS predicts treatment failure in quiescent CD patients (median follow-up of 5 years) independently of disease phenotype.
Introduction
Crohn’s disease (CD) is a state of chronic, relapsing, and remitting intestinal inflammation, which over time can lead to tissue damage such as strictures, fistulas, and abscesses.1,2 Mucosal healing and clinical remission (i.e. deep remission) have proven to be beneficial treatment targets, which are associated with improved disease course and prevention of poor clinical outcomes and future complications in patients with CD.3–5
There is no reference standard for disease monitoring of patients with CD. 6 Inflammatory biomarkers (e.g. C-reactive protein and fecal calprotectin) are recommended as intermediate medium-term tests4,7 to monitor inflammation in CD,1,6 although they provide no mucosal visualization. On the other hand, strict monitoring of mucosal inflammation requires invasive (e.g. ileocolonoscopy) and costly [e.g. magnetic resonance enterography (MRE)] procedures, and their predictive yield is still incomplete. 8
Video capsule endoscopy (VCE) affords a noninvasive visualization of the mucosal surface of the entire small bowel. 9 Its use is endorsed in newly diagnosed patients with CD by both American 1 and European 6 guidelines. Small-bowel VCE scores (i.e. Niv and Lewis scores) were proven to accurately predict future clinical outcomes among patients with CD,7,10,11 and some of the studies have even delineated distinctive Lewis score thresholds which were associated with worse future clinical outcomes.7,10 However, data regarding the yield of small-bowel VCE scores during an extended follow-up (>24 months) are lacking, and definitive recommendations to use them have not yet been implemented in the existing guidelines.
In this study, we aimed to predict worse clinical outcomes among patients with quiescent CD over an extended follow-up (>24 months). We also aimed to assess Lewis score yield in identifying patients with future clinical exacerbation during the extended follow-up, as we demonstrated for the shorter follow-up (up to 24 months). 7
Materials and methods
Study design and population
This was a post hoc analysis of adult patients (⩾18 years old) with quiescent ileal or ileocolonic CD (L1 or L3) who were enrolled between 2013 and 2015, and followed with clinic visits, inflammatory biomarkers, MRE and VCE (SB-III and PillCam colon capsule; Given Imaging, Yoqneam, Israel) as part of a previously published prospective study. 7
Eligibility criteria included the following: Crohn’s disease activity index (CDAI) ⩽ 150 or mild symptoms (CDAI ⩽ 220) for 3–24 months prior to inclusion, and the absence of corticosteroid use and/or medication change during the same period. Only patients with proven small-bowel patency, as tested by patency capsule (PC) ingestion [i.e. intact PC passage through the small bowel within 30 h (Given Imaging, Yoqneam, Israel)], 12 were included.
Study outcomes
The primary composite outcome was a clinical exacerbation defined as intestinal surgery, endoscopic dilation, CD-related admission, need for corticosteroids, or biological/immunomodulator treatment initiation or change during the follow-up (excluding cases of dose intensification). We aimed to identify predictors for the predefined clinical exacerbation over an extended follow-up.
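The composite outcome can be read as a logical "any of" rule over its components, with dose intensification explicitly left out. The following is a minimal sketch of that reading; the record type and all field names are hypothetical and are not taken from the study's case report forms.

```python
from dataclasses import dataclass


@dataclass
class FollowUpEvents:
    """Hypothetical per-patient follow-up record; field names are illustrative only."""
    intestinal_surgery: bool = False
    endoscopic_dilation: bool = False
    cd_related_admission: bool = False
    corticosteroids: bool = False
    biologic_or_immunomodulator_start_or_change: bool = False
    dose_intensification_only: bool = False  # recorded, but does not count as an event


def met_composite_outcome(ev: FollowUpEvents) -> bool:
    """Return True if any qualifying component occurred during follow-up."""
    return any([
        ev.intestinal_surgery,
        ev.endoscopic_dilation,
        ev.cd_related_admission,
        ev.corticosteroids,
        ev.biologic_or_immunomodulator_start_or_change,
    ])
```

Under this sketch, a patient whose only change was a dose intensification would not be counted as having a clinical exacerbation.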
Data extraction and definitions
The following baseline clinical and laboratory data were extracted 13 from the case report forms (CRFs) of the previously published study 7 : clinicodemographic parameters such as age, sex, body mass index (kg/m2), smoking status, disease duration, age at disease onset, involved bowel segments, the presence of perianal disease, extraintestinal manifestations, the presence of complicated disease phenotype (stricturing and/or penetrating), past disease-related hospitalization, past disease-related abdominal operation, past use of corticosteroids, biologics at baseline, C-reactive protein (mg/L) and fecal-calprotectin (µg/g) levels, and baseline MRE scoring to evaluate small-bowel inflammatory activity and damage [Lemann, Magnetic Resonance Index of Activity (MaRIA), and Clermont scores14–16]. Mucosal inflammation was quantified for each of the small-bowel tertiles (equally divided based on the capsule transit time), using the Lewis score system. 17 Conventional Lewis score (convLS) was then determined, based on the highest score of the three tertiles. 17
Data regarding the study outcomes were gathered 13 from the electronic medical records. To attenuate the effect of missing data, we telephoned each patient who lacked adequate clinical follow-up to complete data collection for the composite outcome.
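The tertile division and the convLS definition given above can be illustrated with a short sketch. It assumes that capsule timestamps are expressed in minutes and that a frame falling exactly on a tertile boundary is assigned to the later segment; both conventions, and all function and variable names, are assumptions made for illustration.

```python
def tertile_of(timestamp_min: float, sb_entry_min: float, sb_exit_min: float) -> str:
    """Assign a capsule frame to a small-bowel tertile by elapsed transit time."""
    if sb_exit_min <= sb_entry_min:
        raise ValueError("small-bowel exit must come after entry")
    if not sb_entry_min <= timestamp_min <= sb_exit_min:
        raise ValueError("timestamp lies outside the small-bowel transit window")
    fraction = (timestamp_min - sb_entry_min) / (sb_exit_min - sb_entry_min)
    if fraction < 1 / 3:
        return "proximal"
    if fraction < 2 / 3:
        return "middle"
    return "distal"


def conventional_lewis_score(proximal: float, middle: float, distal: float) -> float:
    """convLS: the highest (worst) of the three tertile Lewis scores."""
    return max(proximal, middle, distal)


# Hypothetical tertile scores for one patient (not study data).
conv_ls = conventional_lewis_score(proximal=0, middle=225, distal=135)
```

In this hypothetical patient, the middle tertile carries the worst score, so convLS and midLS coincide; they differ whenever the proximal or distal tertile is the most inflamed.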
Statistical analysis
Discrete variables were presented as proportions. Continuous variables were assessed for normal distribution by the Shapiro–Wilk test and expressed as median [interquartile range (IQR)] or as mean ± standard deviation, as appropriate. Patients with and without clinical exacerbation were compared using the two-sample t-test or Mann–Whitney U test for continuous variables, while the χ2 test with Yates correction was used for discrete ones. We defined follow-up duration as the time elapsed from enrollment to the last available follow-up of each patient. 13
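For readers unfamiliar with the continuity correction, the χ2 test with Yates correction for a 2 × 2 table can be sketched as follows. This is an illustrative standard-library implementation, not the SPSS procedure used in the study, and the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def chi2_yates_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Yates-corrected chi-square statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must contain at least one observation")
    # Continuity correction, clamped at zero so that it never overshoots.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / denom
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = exacerbation yes/no, columns = feature present/absent.
chi2, p = chi2_yates_2x2(10, 5, 5, 10)
```

The correction subtracts half the sample size from the absolute cross-product difference before squaring, which makes the test more conservative in small samples.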
Survival analysis for clinical exacerbation (i.e. the primary composite outcome) was conducted using Kaplan–Meier curves with the log-rank test. Univariable analysis regarding the primary composite outcome was performed using Cox proportional hazards regression [hazard ratio (HR); 95% confidence interval (CI)] for continuous variables and the log-rank test for categorical variables. Log-minus-log plots and multicollinearity estimation were used to evaluate the Cox proportional hazards assumptions.
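The Kaplan–Meier (product-limit) estimate underlying the survival curves can be sketched in a few lines. The example below is a simplified illustration with hypothetical follow-up times in months; it omits confidence intervals and the log-rank test, and it is not the SPSS routine used for the analyses.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (event time, survival probability just after that time) pairs."""
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        n_leaving = sum(1 for ti in times if ti == t)
        if n_events:
            # Patients censored at time t are still counted as at risk at t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical data: months of follow-up and whether exacerbation occurred (False = censored).
months = [6, 12, 12, 20, 30, 45]
exacerbation = [True, False, True, False, True, False]
curve = kaplan_meier(months, exacerbation)
```

Each step of the curve multiplies the running survival probability by the fraction of at-risk patients who remained event-free at that time, so censored patients contribute information up to their last follow-up.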
A receiver operating characteristic (ROC) curve was constructed and the area under the curve (AUC) was calculated for diagnostic tests with continuous results. For each ROC curve, the optimal cutoff was determined by the Youden index, and the corresponding sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) were computed. 7
All statistical tests were two-sided and p < 0.05 was considered statistically significant. Statistical analyses (including survival graphs) were performed using SPSS software (IBM SPSS Statistics for Windows, version 26; IBM Corp., Armonk, NY, USA, 2019).
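As an illustration of the ROC analysis described above, the sketch below computes the AUC as the probability that a patient with the outcome has a higher score than a patient without it, and scans observed scores for the cutoff that maximizes the Youden index (J = sensitivity + specificity − 1). The "score ⩾ cutoff is positive" rule, the function names, and the toy data are assumptions for illustration and do not reproduce the study's SPSS output.

```python
from typing import Dict, List


def roc_auc(scores: List[float], labels: List[bool]) -> float:
    """AUC as P(case score > control score), counting ties as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], labels: List[bool]) -> Dict[str, float]:
    """Try each observed score as a '>= cutoff' rule; keep the one with the highest J."""
    if not any(labels) or all(labels):
        raise ValueError("both outcome classes are required")
    best: Dict[str, float] = {}
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and not y)
        sens, spec = tp / (tp + fn), tn / (tn + fp)
        j = sens + spec - 1
        if not best or j > best["youden_j"]:
            best = {
                "cutoff": cutoff, "youden_j": j,
                "sensitivity": sens, "specificity": spec,
                "ppv": tp / (tp + fp),
                # NPV is undefined when nobody falls below the cutoff.
                "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
            }
    return best


# Hypothetical Lewis scores and outcomes (not study data).
scores = [0, 0, 8, 112, 135, 225, 450, 900]
labels = [False, True, False, False, False, True, True, True]
auc = roc_auc(scores, labels)
best = youden_cutoff(scores, labels)
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on how common the outcome is in the cohort, so predictive values obtained at a given cutoff do not transfer directly to populations with a different event rate.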
Results
Out of 90 patients who were screened with PC and MRE for small-bowel patency, 29 patients were excluded (17 patients due to retained PC, 6 patients withdrew consent, 3 patients due to clinical flare, 2 patients due to treatment escalation following MRE findings, and 1 patient due to technical reason); thus, 61 were enrolled between 2013 and 2015 (Figure 1) and underwent VCE procedure.

Figure 1. Study flowchart.
Overall, 18 patients met the primary composite outcome (29.5%) during the study follow-up [median of 58.0 (IQR: 34.5–93.0) months]. One patient underwent endoscopic dilation (1.6%), 6 patients were hospitalized due to CD clinical flare (10%), 7 patients needed corticosteroid treatment (11%), and 16 patients had their treatment modified with biologics/immunomodulators (26%). There were no cases of CD-related surgery during the study follow-up. The median time to clinical exacerbation (i.e., the primary composite outcome) among these patients was 32.5 (IQR: 9.75–62.25) months. There were no cases of VCE retention or obstructive complications in any of the 61 VCE procedures.
Patients’ baseline characteristics are presented in Table 1. Demographics were overall comparable between patients with and without clinical exacerbation: median age of 32 (IQR: 21–44) years versus 29 (IQR: 26–36) years, respectively (p = 0.537), and male gender proportion of 72.2% versus 51.2%, respectively (p = 0.129). Most of the patients had a baseline CDAI score of ⩽150 [16/18 (88.9%) versus 38/43 (88.4%) of the patients with and without clinical exacerbation during follow-up, respectively (p = 1.000)]. The rest of the patients had a baseline CDAI score of 151–220 [2/18 (11.1%) versus 5/43 (11.6%), respectively]. Baseline disease-related features were well balanced between the groups, except for a higher rate of complicated disease phenotype among patients with clinical exacerbation compared with the controls (89% versus 43%, respectively, p = 0.002). Biologics use was less prevalent among patients with clinical exacerbation compared with the controls (22.2% versus 57.1%, p = 0.013).
Table 1. Baseline characteristics of patients with and without clinical exacerbation during follow-up.
Note: Continuous variables are presented as median (interquartile range) while categorical ones are presented as proportions (%).
Conventional Lewis score was defined as the worst segment score.
Data were missing for one patient.
Data were missing for nine patients.
Patients with stenotic and penetrating features.
Involvement of proximal small bowel which was defined by prior health record documentation.
Data were missing for two patients.
Data were missing for three patients.
CD, Crohn’s disease; CDAI, Crohn’s disease activity index; LS, Lewis score; MaRIA, Magnetic Resonance Index of Activity.
Median measures of C-reactive protein, middle small-bowel tertile LS (midLS), convLS (i.e. worst segment score), and the examined MRE scores were higher among patients with clinical exacerbation compared with the controls. No significant difference was observed between the groups in median fecal-calprotectin levels or in the median Lewis scores of the proximal and distal small-bowel tertiles (Table 1).
Of the 16 patients who needed a treatment modification during follow-up, three were on biologics (i.e. adalimumab) at baseline. Methotrexate was added to adalimumab in one patient, while treatment was changed to ustekinumab in the others. Of the rest of the patients whose treatment was changed (n = 13), nine patients were on azathioprine/mercaptopurine, four patients had no treatment, and one patient was on 5-aminosalicylic acid at baseline. Among them, infliximab, adalimumab, and vedolizumab were prescribed for seven patients, five patients, and one patient, respectively, during follow-up. The median time to treatment modification during follow-up was 28.5 (10.25–73.25) months. The median time to steroid prescription (i.e. prednisone or budesonide) during follow-up was 35 (15–47) months.
Univariable analyses regarding the primary composite outcome are summarized in Table 2. Complicated (p = 0.002) and stricturing disease phenotype (p = 0.001) were the only clinicodemographic variables associated with higher risk for future clinical exacerbation on univariable analysis. We found fecal-calprotectin level (p = 0.027), midLS (p < 0.001), distal small-bowel tertile Lewis score (p = 0.007), convLS (p = 0.006), Lemann score (p < 0.001), MaRIA score (p = 0.034), and Clermont score (p = 0.033) to be associated with higher risk for clinical exacerbation. Because there were only 18 cases of clinical exacerbation in this cohort, the event-per-variable ratio was too low to support multivariable regression analysis, which we therefore did not perform.
Table 2. Univariable and ROC (receiver operating characteristic) analyses regarding the predefined composite outcome (intestinal surgery, endoscopic dilation, CD-related admission, need for corticosteroids, or biological/immunomodulator treatment change).
ROC analysis was performed only for diagnostic tests with continuous results.
AUC, area under the curve; CD, Crohn’s disease; CI, confidence interval; LS, Lewis score; MaRIA, Magnetic Resonance Index of Activity.
On ROC curve analysis regarding future clinical exacerbation, baseline midLS had the highest AUC value (0.767, 95% CI: 0.633–0.902, p = 0.001) followed by convLS with an AUC value of 0.734 (95% CI: 0.589–0.879, p = 0.004) (Table 2); the difference between the two AUCs was not statistically significant. MidLS ⩾ 135 was the best threshold to identify patients with higher probability for clinical exacerbation (HR 6.317, 95% CI: 1.820–21.922) with a sensitivity, specificity, PPV, and NPV of 86%, 73%, 53%, and 93%, respectively (Figure 2). The ideal cutoff of convLS was ⩾368 with a sensitivity, specificity, PPV, and NPV of 75%, 80%, 62%, and 88%, respectively.

Figure 2. Kaplan–Meier graph of survival without the predefined composite outcome (intestinal surgery, endoscopic dilation, CD-related admission, need for corticosteroids, or biological/immunomodulator treatment change) among the entire cohort population, stratified by middle small-bowel segment Lewis score (midLS) ⩾ 135 versus < 135.
Even among patients with either complicated (n = 34) or stricturing disease phenotype (n = 26), midLS (Figures 3 and 4) and convLS still identified patients with higher risk of clinical exacerbation. Although no significant difference was observed, convLS tended to better predict future clinical exacerbation compared with midLS among patients with complicated disease phenotype [AUC 0.783 (95% CI: 0.624–0.942, p = 0.005) versus 0.747 (95% CI: 0.578–0.915, p = 0.014)]. The best thresholds were convLS ⩾ 195 (HR 8.800, 95% CI: 1.153–67.172) and midLS ⩾ 178 (HR 2.833, 95% CI: 1.027–7.814), respectively. Among patients with stricturing disease phenotype, midLS and convLS identified patients with higher risk for clinical exacerbation [AUC 0.753 (95% CI: 0.564–0.942, p = 0.029) versus 0.741 (95% CI: 0.544–0.938, p = 0.037)], with ideal thresholds of midLS ⩾ 160 (HR 3.774, 95% CI: 1.156–12.128) and convLS ⩾ 435 (HR 2.792, 95% CI: 0.850–9.171), respectively. Among patients who had not been treated with biologics at baseline (n = 32), the middle tertile was the only small-bowel segment whose Lewis score significantly identified patients with a higher risk for future clinical exacerbation (AUC = 0.744, 95% CI: 0.569–0.920, p = 0.019).

Figure 3. Kaplan–Meier graph of survival without the predefined composite outcome (intestinal surgery, endoscopic dilation, CD-related admission, need for corticosteroids, or biological/immunomodulator treatment change) among patients with complicated disease phenotype, stratified by middle small-bowel segment Lewis score (midLS) ⩾ 178 versus < 178.

Figure 4. Kaplan–Meier graph of survival without the predefined composite outcome (intestinal surgery, endoscopic dilation, CD-related admission, need for corticosteroids, or biological/immunomodulator treatment change) among patients with stricturing disease phenotype, stratified by middle small-bowel segment Lewis score (midLS) ⩾ 160 versus < 160.
None of the examined MRE scores or inflammatory biomarkers adequately identified clinical exacerbation among the cohort population during the follow-up (AUC < 0.7; Table 2).
Discussion
In this study, we aimed to explore long-term clinical outcomes among patients with quiescent CD during an extended follow-up. We found that midLS ⩾ 135 accurately identified future clinical exacerbation among these patients, outperforming the examined inflammatory biomarkers and MRE scores. To the best of our knowledge, this was the longest follow-up (median of almost 5 years) of patients with quiescent CD undergoing VCE to evaluate the predictive role of the VCE Lewis score for future clinical exacerbation in this population.
Lewis score is a VCE scoring system to quantify small-bowel mucosal inflammation and stenosis of patients with CD. 17 Mucosal inflammation is then classified into normal/clinically insignificant (<135), mild (135–790), and moderate to severe (⩾790). Using the Lewis score system to diagnose CD (score > 135) has proven very useful, with high rates of sensitivity (82.6–92%) and NPV (87.9–96%).18–20 There was also a significant association between a higher Lewis score and the need for treatment escalation, intestinal resection, and hospital admission among newly diagnosed CD patients, within the first year after diagnosis. 18 Our group demonstrated that VCE-based monitoring of patients with quiescent CD was accurate, and that a Lewis score cutoff ⩾ 350 had the highest yield in predicting clinical flares (defined as ΔCDAI > 70 points from baseline and CDAI > 150 or need for rescue treatment) during 24 months of follow-up. 7 Recently, Nishikawa et al. presented a Lewis score cutoff ⩾ 270 as a predictor of CD-related emergency hospitalization and clinical flares (defined by the need for treatment change/endoscopic intervention) within a 2-year follow-up of CD patients with or without disease activity. 10 They also showed that among patients who met that cutoff, treatment modification led to improved clinical outcomes during follow-up, compared with patients whose treatment was unchanged. 10 We demonstrated that VCE-based monitoring identified patients with quiescent CD who had a higher risk for future clinical exacerbation over an extended follow-up (median 5 years), independently of disease phenotype. To the best of our knowledge, this was the longest clinical follow-up of patients with quiescent CD who underwent baseline VCE.
VCE use among newly diagnosed patients with CD is highly recommended.1,6 It allows visualization of small-bowel segments previously considered obscured, resulting in more accurate diagnosis and classification of disease extent in patients with CD. 21 Jejunal and/or ileal involvement of CD is associated with worse long-term clinical outcomes,22–25 while ileal involvement is generally more difficult to treat, compared with colonic involvement. 22 Thus, it is conceivable that midLS, which probably represents a substantial part of the jejunal and ileal segments, was the most accurate predictor for future clinical exacerbation in patients with quiescent CD (AUC = 0.767). Our findings were consistent with our previous research, 7 in which the Lewis score of the middle small-bowel segment had the highest AUC (0.79) compared with the Lewis score of either the proximal (AUC = 0.64) or distal (AUC = 0.68) small-bowel segments, to predict clinical flares during 24-month follow-up. In that study, the Lewis scores of the middle small-bowel segment and the conventional one were equally accurate (AUC = 0.79) in predicting flares during 24-month follow-up. 7 In this study, using a broader definition of disease outcomes and an extended long-term follow-up (median 5 years), we found that midLS tended to better predict future clinical exacerbation compared with the conventional one (AUC 0.767 versus 0.734). Future prospective research with a larger cohort is of prime importance to further establish these findings, and to appropriately implement them in real-life practice for patients with CD.
The Lewis score system divides the small bowel into three segments (i.e. proximal, middle, and distal) according to the transit time along the small bowel. 26 Therefore, the anatomical extent of each segment may be influenced by several factors which have been previously found to hasten (e.g. prokinetics) or delay (e.g. aging, intestinal stenosis, diabetes) small-bowel transit time.27,28 Yet, many reports in this field are conflicting, including those studies in which intra-individual transit time variations have been tested.27,28 Our cohort consisted exclusively of CD patients; consequently, all of them were prone to delayed small-bowel transit time. 29 However, these patients were younger [median age of 29 (24–37) years] than the age groups previously reported to have a delayed small-bowel transit time (65–75 27 , >75, 27 and >40 29 years old). Our findings were consistent even among patients of the stricturing disease phenotype group (n = 26), as midLS still identified future clinical exacerbation in this population, although the small-bowel transit time may be delayed in patients with intestinal stenosis. 27
Previously published studies have demonstrated the prime importance of biologics in reducing surgery rates among CD patients30–32; however, improvements in diagnostic tools and disease monitoring play an important role in managing CD as well 32 . Biologics use among patients with inflammatory bowel disease (IBD) has some disadvantages, including immunogenicity, infectious complications, liver injury, neurological lesions, and skin manifestations, 33 as well as nonadherence to biologics, which is quite common in this population. Consequently, the potential beneficial effects of this group of medications in patients with IBD are limited. 34 In this study, biologics had been more commonly used among patients who did not experience clinical exacerbation during follow-up, compared with those who did. Thus, we could not exclude a possible effect of their use on the study outcomes. However, among the patients without biologics at baseline, midLS still identified those with a higher risk for future clinical exacerbation. We believe that this finding may better guide an appropriate use of biologics, to improve disease control in this population.
Limitations
This study has several limitations. First, we could not use CDAI score changes as an outcome, because these data were unavailable during the extended follow-up and existed only for the initial prospective follow-up (up to 2018). Instead, we defined a primary composite outcome composed of broadened and stringent disease outcomes, which may further strengthen our findings. Second, we cannot exclude a possible bias whereby physicians of patients who had a high baseline Lewis score were more inclined to modify their treatment compared to patients with normal Lewis score. However, all patients had quiescent disease at the time they underwent VCE, and the median time between VCE and the primary composite outcome was 32.5 (IQR: 9.75–62.25) months [28.5 (10.25–73.25) months to treatment change], making it unlikely that VCE results alone led to treatment modification, rather than clinical progression judged by the physician much later. Third, we had a modest cohort size with a low event-per-variable ratio, which prevented us from performing multivariable analysis. However, it was a well-characterized cohort composed of patients with quiescent CD, enabling us to perform sensitivity analyses based on disease phenotype, and on a distinct group of patients without biologics at baseline. Moreover, a sample size of only 54 patients (26 and 28 patients in the midLS < 135 and midLS ⩾ 135 groups, respectively) would have been appropriate to fulfill the statistical constraints of our findings (i.e. survival analysis in regard to midLS ⩾ 135, at a significance level of 5% and a power of 80%). Finally, since patients with retained PC were excluded, generalizability of the study’s findings is limited.
However, this study extends the prognostication scope of capsule endoscopy, whereby patients with retained PC as their first diagnostic step have been shown to have worse long-term outcomes (Ukashi, AJG 13 ), and this study shows that patients with confirmed patency who undergo VCE can be further risk-stratified based on midLS inflammation.
Conclusion
In conclusion, to the best of our knowledge, this is the longest follow-up of patients with quiescent CD undergoing VCE. For the first time, we demonstrated that midLS is an accurate predictor of clinical exacerbation (median follow-up 5 years) among patients with CD, independently of disease phenotype. Thus, midLS may identify patients with higher risk for treatment failure, guiding stricter monitoring and more rigorous management modification in this population.
