Abstract
Colorectal cancer (CRC) is a leading cause of mortality, and early detection through colonoscopy is critical for reducing death rates. However, up to 26% of precancerous polyps are missed, often due to variability in physician skill. Simulation-based training (SBT), such as training on physical simulators (PS), has the potential to improve patient outcomes in colonoscopy by reducing cecal intubation time (CIT). PS offer more realistic experiences than virtual simulators but lack integrated feedback, limiting their effectiveness in polyp detection (PD) training. Results showed that experts achieved lower CIT than novice residents, that residents face key challenges in navigation, scope manipulation, and fold inspection, and that experts employ techniques such as lumen centering, re-advancement, and fold inspection to enhance PD. These findings emphasize the need for targeted feedback within PS training to improve colonoscopy performance. Feedback-driven training built on expert strategies may reduce polyp miss rates and improve CRC screening outcomes.
Keywords
Introduction
Colorectal cancer (CRC) is one of the most common and deadly cancers worldwide, accounting for approximately 52,000 deaths in the U.S. in 2022 (U.S. Cancer Statistics Colorectal Cancer Stat BiTe, 2024). Because early detection dramatically improves patient outcomes, colonoscopy has become the primary screening tool—resulting in over 15 million colonoscopies performed annually in the U.S. alone (Gangwani et al., 2023). During a colonoscopy, a physician inserts a long, flexible endoscope into the colon, advances it to the cecum, then withdraws it while carefully inspecting the mucosa for precancerous polyps or adenomas. Early identification and removal of these lesions significantly reduces CRC mortality rates (Baxter et al., 2012); however, the effectiveness of colonoscopy depends on the physician’s ability to detect polyps (Qayed et al., 2017).
A key quality metric in colonoscopy is a physician’s adenoma detection rate (ADR), defined as the proportion of colonoscopies performed by a physician in which at least one adenomatous polyp is detected (Gupta, 2016). Higher ADRs are associated with lower rates of interval colorectal cancer and cancer-related mortality (Corley et al., 2014). Although trainees may meet ADR benchmarks during residency programs, studies have shown that these detection rates tend to decrease afterwards (Sarvepalli et al., 2019), highlighting a need for improved and sustained training approaches. In addition, small polyps (1–5 mm) are particularly difficult to detect, with miss rates as high as 26% (Van Rijn et al., 2006). Another essential quality metric is the cecal intubation rate (CIR), which measures the percentage of cases in which a physician successfully advances the colonoscope tip past the ileocecal valve to visualize the cecum (Hamada et al., 2023). CIR is crucial to ensuring a complete and thorough examination, and lower CIRs are associated with decreased ADR (Von Renteln et al., 2017). While the US Task Force on Colorectal Cancer recommends a 95% CIR, studies have found that first-year endoscopy fellows attain a CIR of only 83% (Lee et al., 2008), underscoring the need for better training for novice practitioners.
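As a minimal sketch of the two definitions above, the following Python snippet computes ADR and CIR from a list of procedure records. The `Procedure` structure, its field names, and the example records are hypothetical and chosen only for illustration; they are not study data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Procedure:
    """One colonoscopy record (hypothetical structure, for illustration only)."""
    adenomas_detected: int   # number of adenomatous polyps found
    cecum_reached: bool      # True if the cecum was intubated


def adenoma_detection_rate(procedures: List[Procedure]) -> float:
    """Proportion of procedures in which at least one adenoma was detected."""
    if not procedures:
        raise ValueError("at least one procedure is required")
    return sum(p.adenomas_detected >= 1 for p in procedures) / len(procedures)


def cecal_intubation_rate(procedures: List[Procedure]) -> float:
    """Proportion of procedures in which the cecum was reached."""
    if not procedures:
        raise ValueError("at least one procedure is required")
    return sum(p.cecum_reached for p in procedures) / len(procedures)


# Made-up records, not study data.
example = [
    Procedure(0, True),
    Procedure(2, True),
    Procedure(1, True),
    Procedure(0, False),
]
adr = adenoma_detection_rate(example)   # 2 of 4 procedures had an adenoma
cir = cecal_intubation_rate(example)    # 3 of 4 procedures reached the cecum
print(f"ADR = {adr:.2f}, CIR = {cir:.2f}")
```

Note that ADR counts procedures with at least one adenoma, not the total number of adenomas, so a procedure with two adenomas contributes the same as a procedure with one.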
Colonoscopy is considered a complex procedure due to the anatomy and curved shape of the colon (Wullstein et al., 1999). One of the primary challenges physicians face is looping, where the endoscope coils within the colon, often leading to incomplete procedures and potentially missed polyps due to failure to reach the cecum (Abuelazm et al., 2023). Another common difficulty is navigating and inspecting behind the folds and bends of the colon, areas where precancerous lesions can be easily overlooked (Witte & Enns, 2007). These challenges directly impact both ADR and CIR, emphasizing the need for targeted training and feedback that focus on scope manipulation and thorough inspection techniques. A promising approach is to identify and incorporate the strategies employed by expert physicians, which have been shown to improve procedural quality and have informed both medical training programs (Upanc et al., 2015) and the design of simulators (Hegde et al., 2020). In particular, research has highlighted several expert techniques that improve ADR, including maintaining sufficient withdrawal time, re-inspecting the right side of the colon (Gubbiotti et al., 2022), performing retroflexion (bending the endoscope to view backwards), and careful inspection during insertion (Brand & Wallace, 2017). Therefore, further identification and integration of expert inspection strategies into colonoscopy training could be key to improving ADR and overall procedural quality.
Simulation-based training (SBT) (Sawaya et al., 2021) is a widely adopted method in medical education, allowing physicians to acquire clinical skills without exposing patients to unnecessary risk (Singh et al., 2014). SBT offers a controlled and safe learning environment and has been shown to reduce cecal intubation time, thereby improving patient outcomes (Koch et al., 2015). Two commonly used simulation modalities are virtual reality (VR) simulators and physical simulators (PS). VR simulators provide advanced visualizations, objective performance feedback, and tailored skill assessments to support polyp detection training (Mu et al., 2024). Feedback during SBT is a critical component of effective learning (Barry Issenberg et al., 2005), and without it, no meaningful learning occurs (Mahmood & Darzi, 2004). However, the effectiveness of VR simulators is limited by a lack of realistic haptic feedback, equipment mismatch, and occasional latency issues, all of which can reduce training transferability (Mu et al., 2024). In contrast, PS offer superior haptic feedback and more realistic scope navigation (Loukas et al., 2012) but rely on expert feedback, making the training process time-consuming and costly. Furthermore, most PS platforms lack polyps and do not generate automated performance feedback, limiting their ability to support objective skill evaluation.
Combining the haptic realism of manikin simulation with the feedback and polyp detection training available in VR is crucial for improving ADR and CIR in medical training. To improve ADR and CIR in colonoscopy through simulation-based training, this study sought to gain a deeper understanding of the challenges that residents experience as they perform a colonoscopy insertion and withdrawal task and to identify the expert physician strategies needed for thorough colon inspection. These findings can guide the development of targeted feedback and training strategies for PS, which can potentially improve ADR and patient outcomes. The study aims to answer the following research questions (RQ): RQ1: How do ADR, CIT, and CIR differ between experts, intermediate, and novice residents in simulation-based training? RQ2: What are the main challenges faced by medical residents during colonoscopy insertion and polyp detection? RQ3: What strategies do expert physicians use to ensure thorough inspection of the colon during colonoscopies?
Materials and Methods
Participants
Five experts (4 male, 1 female) and 12 residents (10 male, 2 female) were recruited from Hershey Medical Center. Experts were defined as physicians who had performed more than 250 colonoscopies. Residents were classified as “intermediate” if they had performed 11–100 colonoscopies (n = 5) and as “novice” if they had performed 0–10 colonoscopies (n = 7). Novice residents were pre-screened for basic knowledge of endoscope handling before recruitment. Ten residents were in general surgery programs, one was in emergency medicine, and one was in anesthesia. Eleven residents had previous simulator training, six had hands-on training, and one had no previous training.
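The experience thresholds above can be written as a small classification rule. The sketch below is illustrative only: the function name is hypothetical, and because the text does not assign a label to counts of 101–250, the function returns `None` for that range rather than guessing.

```python
from typing import Optional


def classify_experience(colonoscopies_performed: int) -> Optional[str]:
    """Map a colonoscopy count to the experience label used for participants.

    Thresholds follow the participant definitions: novice 0-10,
    intermediate 11-100, expert more than 250. Counts of 101-250 are not
    assigned a label in the text, so None is returned (an assumption made
    here to avoid inventing a category).
    """
    if colonoscopies_performed < 0:
        raise ValueError("count cannot be negative")
    if colonoscopies_performed <= 10:
        return "novice"
    if colonoscopies_performed <= 100:
        return "intermediate"
    if colonoscopies_performed > 250:
        return "expert"
    return None


for count in (0, 10, 11, 100, 150, 251):
    print(count, classify_experience(count))
```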

Modified Kyoto Kagaku simulator with sections A, B, and C.

A silicone replica of a polyp inside the modified Kyoto colon.
If a participant had not reached the cecum, the endoscope was maneuvered to the cecum by the research team during survey completion. Participants were then instructed to perform the withdrawal task, which involved retracting the endoscope while detecting polyps and continuing the think-aloud protocol. Prior to withdrawal, an example polyp was shown. During the task, participants verbally indicated when they detected a polyp; detection times were later recorded through video analysis. A 15-minute time limit was given for this task. Following withdrawal, participants completed a second five-question survey consisting of three Likert/open-ended items and two open-ended questions assessing their performance, identifying challenges, and suggesting skills for improvement (see survey here). An overview of the procedure steps is shown in Figure 3.

Steps of Procedure.
Qualitative Data Analysis
To understand the challenges residents face during colonoscopy insertion and withdrawal and to identify the strategies experts use for inspection, inductive content analysis (Hsieh & Shannon, 2005) was conducted on two data sources: responses to the two fully open-ended survey questions (what challenges residents face and what skills residents need to improve upon during insertion/withdrawal) and the “think aloud” audios of experts during the withdrawal task. Only expert audios were used because prior work has used cognitive processes from experts to identify important skills that can be integrated into simulation-based training (Cannon-Bowers et al., 2013). Survey responses were compiled in Microsoft Word, and the “think aloud” audios of experts were transcribed using Otter.ai and manually checked for accuracy. A codebook was then developed by two graduate researchers using an inductive analysis approach (Hsieh & Shannon, 2005); see codebook here. To develop the codebook, the two researchers listened to one audio and went through three surveys together to identify the nodes/themes that occurred in the audios and survey responses. This represented around 20% of each data type. Next, each researcher separately coded one video and three survey responses, and the inter-rater reliability (IRR) was calculated using NVivo software, yielding a Cohen’s kappa of .79 between the two raters. Since this level of agreement was considered good (Fleiss et al., 2013), the remaining files were coded by a single rater.
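For readers unfamiliar with the agreement statistic, the sketch below computes Cohen’s kappa for two raters who each assign one code per segment. It is a textbook formulation, not necessarily the procedure NVivo applies internally, and the segment labels are made up for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of segments")
    n = len(rater_a)

    # Observed agreement: share of segments given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement from each rater's marginal code frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single code")
    return (observed - expected) / (1.0 - expected)


# Made-up codes for four segments, not study data.
rater_1 = ["navigation", "navigation", "folds", "folds"]
rater_2 = ["navigation", "folds", "folds", "folds"]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on 3 of 4 segments (75%), but chance agreement is 50%, so kappa is 0.50. This is why a kappa value is reported on its own scale rather than as a percentage of agreement.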
Quantitative Metrics
To investigate differences in PDR, CIT, and CIR between experts, intermediates, and novices, the following quantitative metrics were computed for all participants:
Polyp Detection Rate (PDR) was defined as the number of polyps detected divided by 15, based on counts from withdrawal task videos. Cecal Intubation Rate (CIR) was recorded as a binary outcome indicating whether the participant reached the cecum within the 8-minute time limit. Cecal Intubation Time (CIT) was defined as the duration from scope insertion to cecal intubation; if the cecum was not reached, the maximum time of 480 s was recorded.
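The three metric definitions can be sketched in Python as follows. The constants come from the definitions above (15 polyps, 480 s insertion limit); the function names and the convention of passing `None` when the cecum was never reached are assumptions made for illustration.

```python
from typing import Optional, Tuple

TOTAL_POLYPS = 15          # denominator used for PDR
INSERTION_LIMIT_S = 480.0  # 8-minute limit for the insertion task


def polyp_detection_rate(polyps_detected: int) -> float:
    """PDR: number of polyps detected divided by the 15 polyps placed."""
    if not 0 <= polyps_detected <= TOTAL_POLYPS:
        raise ValueError("detected count must be between 0 and 15")
    return polyps_detected / TOTAL_POLYPS


def cecal_intubation(time_to_cecum_s: Optional[float]) -> Tuple[bool, float]:
    """Return (cecum reached within the limit, CIT in seconds).

    Pass None if the cecum was never reached (hypothetical convention).
    When the cecum is not reached within the limit, CIT is recorded as
    the maximum time of 480 s.
    """
    reached = time_to_cecum_s is not None and time_to_cecum_s <= INSERTION_LIMIT_S
    cit = float(time_to_cecum_s) if reached else INSERTION_LIMIT_S
    return reached, cit


# Made-up values, not study data.
print(polyp_detection_rate(12))
print(cecal_intubation(200.0))
print(cecal_intubation(None))
```

Recording 480 s for unsuccessful attempts censors CIT at the limit, so group means for participants who often fail to reach the cecum understate how long intubation would actually have taken.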
Results
This section presents our results by research question. For purposes of this analysis, “f” refers to the frequency with which a node was mentioned, “E” denotes an expert, “I” denotes an intermediate resident, and “R” denotes a novice resident.
RQ1: How do ADR, CIT, and CIR differ between experts, intermediate, and novice residents in simulation-based training?
The average PDR was 0.76 ± 0.077 for experts, 0.63 ± 0.22 for intermediates, and 0.51 ± 0.13 for novices across both tasks. A two-way ANOVA was conducted to examine the effects of expertise and condition on PDR. Assumptions were checked: no outliers were identified via boxplot inspection, and Shapiro-Wilk tests indicated normality (p > .05); however, Levene’s test indicated unequal variances (p = .017). The interaction between expertise and condition was not significant, F(2, 11) = 1.299, p = .312, partial η² = .191. All pairwise comparisons are reported with 95% confidence intervals and Bonferroni-adjusted p-values. The main effect of expertise was significant, F(2, 14) = 3.994, p = .042, partial η² = .363. Experts had a mean PDR 0.251 (95% CI [0.009, 0.494]) higher than that of novices, a statistically significant difference (p = .042). The main effect of condition was significant, F(1, 15) = 4.77, p = .045, partial η² = .242, with higher PDR in Condition 1 (M = 0.72, SD = 0.061) than in Condition 2 (M = 0.547, SD = 0.051).
For CIR, all experts and intermediates intubated the cecum in both tasks. In contrast, 1/2 novices (50%) in Condition 1 and 3/5 (60%) in Condition 2 failed to intubate. A two-way analysis of variance (ANOVA) was conducted to examine the effects of expertise and task condition on cecal intubation time (CIT). Residual analysis was performed to assess ANOVA assumptions. Boxplot inspection revealed no outliers. Shapiro–Wilk tests indicated that residuals were normally distributed for each cell, p > .05, and Q–Q plots supported this. Levene’s test indicated a violation of homogeneity of variances, p = .026. The interaction effect between condition and expertise on CIT was not statistically significant, F(2, 11) = 1.21, p = .33, partial η² = .18. There was no significant main effect of condition, F(1, 11) = 0.83, p = .38, partial η² = .07. However, there was a significant main effect of expertise, F(2, 11) = 13.50, p = .001, partial η² = .71. Bonferroni-adjusted pairwise comparisons revealed that experts (M = 208.83, SE = 32.16) had significantly shorter CIT than novices (M = 421.25, SE = 29.47), with a mean difference of −221.80 s, 95% CI [−338.16, −105.49], p < .001. Intermediates (M = 254.00, SE = 32.16) also had significantly shorter CIT than novices, with a mean difference of −167.62 s, 95% CI [−279.04, −56.21], p = .005. Overall, PDR was significantly lower for novices, but not for intermediates, than for experts, and was lower in the more difficult case than in the easier case. Experts achieved shorter CIT and higher CIR than novices, highlighting the importance of targeted training in both polyp detection and cecal intubation.
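As a reading aid for the effect sizes reported under RQ1, partial η² can be recovered from an F statistic and its degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the F values and degrees of freedom exactly as reported above; the function name and the 0.005 rounding tolerance are choices made here for illustration.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from F and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, partial eta squared as reported in the text)
reported = [
    ("PDR interaction", 1.299, 2, 11, 0.191),
    ("PDR expertise",   3.994, 2, 14, 0.363),
    ("PDR condition",   4.77,  1, 15, 0.242),
    ("CIT interaction", 1.21,  2, 11, 0.18),
    ("CIT condition",   0.83,  1, 11, 0.07),
    ("CIT expertise",   13.50, 2, 11, 0.71),
]

recomputed = {
    label: partial_eta_squared(f, df1, df2) for label, f, df1, df2, _ in reported
}
for label, f, df1, df2, stated in reported:
    value = recomputed[label]
    # Flag whether the recomputed value matches the stated one within rounding.
    print(f"{label}: recomputed {value:.3f}, reported {stated}, "
          f"match within 0.005: {abs(value - stated) < 0.005}")
```

Because the F values in the text are themselves rounded, small differences in the third decimal place are expected.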
RQ2: What are the main challenges faced by medical residents during colonoscopy insertion and polyp detection?
The second research question was developed to gain a deeper understanding of the challenges residents experience while performing a colonoscopy insertion and withdrawal task. Inductive content analysis identified that the most frequently mentioned challenge was navigating the colon (f = 14), defined as “challenges in navigating through the colon to reach the cecum during colonoscopy insertion”. For example, I1 wrote “navigating sharp turns,” R6 mentioned “Sharp turns were difficult to navigate,” and I8 answered “Hard to advance at descending/transverse.” The next challenge was scope manipulation (f = 10), defined as “difficulty in knowing how to manipulate and use the endoscope effectively.” R3 mentioned “remembering directionality of scope wheels” and R2 answered “using scope control”. Another challenge was inspecting folds/bends (f = 7), defined as “difficulty in inspecting the colon wall for polyps due to the folds and bends”. For example, I4 mentioned “difficulty to visualize backside of folds” and R9 mentioned “properly assess all aspects of colon”. Other challenges included force exerted (f = 3), defined as “uncertainty in applying the appropriate pressure to advance the scope”, and looping (f = 1), defined as “difficulty in advancing the scope due to the formation of a loop in the colon”. R3 mentioned “how much pressure to apply” and I8 mentioned “pulling back loops”. These results indicate that residents experience difficulties in navigating the colon during insertion, using the endoscope efficiently during insertion and withdrawal, and inspecting different sections of the colon for polyps due to folds/bends. As such, there is a need to implement strategies and feedback mechanisms during simulation-based training to overcome these challenges.
RQ3: What strategies do expert physicians use to ensure thorough inspection of the colon during colonoscopies?
The aim of our third research question was to understand the techniques experts use for thorough inspection. Inductive analysis identified that the most frequently mentioned strategy was inspecting folds/bends (f = 18), defined as “Ensuring thorough inspection of folds and bends to detect hidden polyps and prevent missed lesions”. For example, E1 mentioned “..there’s some folds that were hard to visualize on the way in, so I have to be more careful”, E3 mentioned “..this is the obscure to lumen, so let’s just make sure behind these folds we’re not hiding another polyp”, E2 mentioned “..big bend there, which I didn’t investigate very well, so I’m going to go back around that bend. I’ll go back in”, and E5 mentioned “..trying to navigate around the folds to make sure that I don’t miss anything.” The next most frequently mentioned technique was lumen centering (f = 16), defined as “centering the scope back to the lumen to ensure colon wall is inspected thoroughly”. For example, E4 mentioned “..recenter and find center”, E1 mentioned “..let me recenter myself”, and E3 mentioned “..keeping the lumen centered.”
Another technique was withdrawing and readvancing the scope (f = 10), defined as “retracting the endoscope to inspect the colon wall and reinserting it to reassess areas, to ensure no polyps are missed.” E3 mentioned “..you got to push the scope back in and go back in and take a good look around,” E4 mentioned “..there’s an inside bend here, so I’m kind of going back in,” E2 mentioned “... I’m going to go back in with my scope,” and E5 mentioned “..pull back and reassess.” Another technique was scope handling strategies (f = 9), defined as “various maneuvering techniques used to navigate the colon and ensure comprehensive wall inspection”. E4 mentioned “..having the scope off the table is helpful, because I can spin it freely”, E3 mentioned “..I’ll try to put it in the 12 o’clock position. I’ll head back in”, E5 mentioned “..I’m intentionally torquing over and looking inside the bend”, and E1 mentioned “..cross-checking some of my scope distances so that I don’t just fall past it and then miss an area” and “..to look for polyps again, I mostly torque left, right, and then thumb up and down, with those two motions, I can kind of evaluate all the colon”. The final technique was slow withdrawal for inspection (f = 6), defined as “withdrawing the endoscope slowly to ensure a thorough inspection of the colon wall and reduce the risk of missing polyps”. E2 mentioned “..I’m just scanning, looking for any potential polyps and just withdrawing slowly,” and E4 mentioned “..just come back slowly looking for polyps.” These results highlight expert strategies such as thoroughly inspecting folds and bends, centering the scope to the lumen, readvancing the scope to reinspect areas to avoid missing hidden polyps, using specific handling techniques (e.g., torquing, thumb positioning, cross-checking distances), and slow withdrawal to ensure thorough inspection.
Discussion, Conclusion, and Future Work
This study contributes to the design of SBT by identifying specific resident challenges and expert strategies that influence ADR and CIR, providing a foundation for actionable, feedback-driven training. Addressing RQ1, we found that expert endoscopists outperformed only novice residents in PDR, CIT, and CIR, aligning with previous findings (Plooy et al., 2012), and that PDR was lower in the more difficult case. These differences reinforce the performance gap and underscore the need for effective training in colonoscopy. For RQ2, we investigated the specific procedural challenges residents face. Intermediate and novice residents particularly struggled with navigating the colon during insertion and effectively manipulating the scope, skills essential to reaching the cecum efficiently. These findings confirm existing literature (Witte & Enns, 2007) and also provide new insight into what feedback should be targeted during SBT. During withdrawal, residents often found it challenging to fully inspect colonic folds and bends, which may have contributed to missed polyps. RQ3 focused on identifying strategies experts use to ensure thorough mucosal inspection. Experts employed a range of techniques, including inspection behind folds (Sedlack, 2022), frequent re-centering of the scope for lumen visualization (Phitayakorn et al., 2009), and re-advancing the scope to revisit areas (Walsh et al., 2021). These findings point to specific skills that could be reinforced through real-time prompts during SBT, guiding trainees to re-center, re-advance, and inspect behind folds as they progress through procedures. Experts also demonstrated more advanced scope control techniques, such as torquing, “thumb up/thumb down” maneuvers, monitoring insertion depth, and deliberate withdrawal (Atia et al., 2015), which are often underemphasized in current training. The identification of these expert strategies provides actionable targets for enhancing colonoscopy training.
Instructional modules and objective feedback tools that monitor scope position, mucosal coverage, and inspection completeness could help close the performance gap between residents and experts. These findings align with research showing that understanding expert cognitive strategies supports the development of effective training interventions (Craig et al., 2012; Upanc et al., 2015). Real-time feedback systems that incorporate these techniques may improve trainee outcomes by fostering expert-like inspection behavior and may increase ADR, since research has indicated that feedback improves ADR (Scalvini et al., 2025).
Future work will focus on developing a graphical user interface (GUI) that provides real-time visual feedback during colonoscopy training. This system will address the challenges identified in this study, including insertion navigation, re-centering, and thorough withdrawal inspection, by offering targeted prompts based on expert strategies such as optimal insertion paths, scope re-centering, re-advancement, and highlighting missed areas during inspection. Its effectiveness will be evaluated by comparing PDR and CIT metrics in training sessions with and without feedback. Eye-tracking data will also be analyzed to uncover expert visual attention patterns, supporting the development of tailored gaze training modules to enhance ADR and CIR. While this study provides valuable insights, limitations remain: the small sample size limits generalizability, and slight shifts in polyp locations could have impacted PDR. Future work should include more participants and follow-up interviews and should ensure consistent polyp positioning.
Footnotes
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Coauthors Dr. Miller and Dr. Moore own equity in Medulate, which may have a future interest in this project. Company ownership has been reviewed by the University’s Individual Conflict of Interest Committee and is currently being managed by the University.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research reported in this manuscript was supported by the National Institute of Diabetes and Digestive and Kidney Diseases under award number R01DK137230.
