Abstract
Background:
Endoscopic recognition of reflux esophagitis is critical for the evaluation and treatment of patients with gastroesophageal reflux disease; however, there are limited data on the need for education to minimize interobserver disagreement in clinical practice.
Methods:
We created an educational program for the LA classification on the International Working Group for the Classification of Oesophagitis (IWGCO) website that included endoscopic video recordings of the distal esophagus. Participants completed an entry survey before the training module and were able to proceed to subsequent training videos once they provided the correct LA classification. Participants then completed a test module—an 80% score was required to pass. Descriptive analyses and regression analyses were performed to analyze data.
Results:
In the entry survey, 83/90 (92%) participants reported using the LA classification for the majority or all patients with reflux symptoms. However, only 31/90 (34%) participants reported feeling very or completely confident in the use of the LA classification. Only 3/71 (4.2%) participants correctly classified all 9 training videos on their first attempt. The testing module was completed by 60 participants, 16 (26.7%) of whom passed after one attempt, with 32 (53.3%), 8 (13.3%), and 1 (1.7%) passing after 2, 3, and 4 attempts, respectively. There was no significant correlation between the number of attempts to successfully pass the testing module and participant characteristics.
Conclusion:
Even after a training module, more than 73% of participants (44/60) did not pass the test module on their first attempt. Further structured education around the LA classification is needed.
Key Learning Points
92% of participants reported using the LA classification for the majority or all patients with reflux symptoms but only 34% of participants reported feeling very or completely confident in its use.
To ensure standardized reporting and appropriate management of patients with esophagitis, further education is needed for endoscopists to apply the LA classification effectively.
Introduction
Gastroesophageal reflux disease (GERD) occurs when stomach contents enter the esophagus, causing symptoms or mucosal damage. GERD is extremely common in North America and is reported to be one of the most common outpatient diagnoses in the United States.1 Erosive esophagitis, a complication in about 30% of patients with GERD, is diagnosed and graded for severity by upper gastrointestinal endoscopy.2
The Los Angeles (LA) classification system is the most widely used grading system for erosive esophagitis; it is used to standardize endoscopic descriptions and to predict the clinical course of these patients.3-5 The classification is based on the extent or severity of visible breaks in the mucosa of the distal esophagus. There is evidence that LA Grades B to D are conclusive for GERD when compared with pH monitoring, whereas only 17.6% of patients with LA Grade A lesions had abnormal acid exposure.6
Previous studies have shown that, despite good agreement between experts, interobserver agreement is significantly lower among less experienced clinicians. Daperno et al. showed that a dedicated training program in the optical endoscopic assessment of inflammatory bowel disease severity can increase interobserver agreement to the level achieved by experts.7 Good interobserver agreement is extremely important for facilitating accurate communication between health care providers, for guiding clinical therapy, and for standardizing the assessment of disease severity in research studies. In this study, we aimed to investigate whether a standardized set of training modules could result in more consistent grading of erosive esophagitis severity and therefore increased interobserver agreement between participating endoscopists.
Methods
We created an online educational module for the LA classification on the International Working Group for the Classification of Oesophagitis (IWGCO) website (www.IWGCO.net) which included endoscopic video clips demonstrating the distal esophagus and gastroesophageal junction. Videos were selected from a bank of videos recorded by members of the IWGCO during previous research programs and the videos were assessed and graded by 2 or more IWGCO experts. IWGCO experts were all gastroenterologists with >20 years of experience and expertise in the LA classification. Participants completed an entry survey which asked (1) how frequently they used the Los Angeles (“LA”) Grade system for reporting endoscopic findings in patients with reflux symptoms, (2) how confident they were that they could accurately grade the severity of esophagitis, (3) how many patients per year they endoscoped for reflux symptoms, and (4) whether they endoscoped patients to check for healing of erosive esophagitis. After completing the entry survey, participants watched a 20-minute presentation describing the LA classification system.
Eligibility Criteria: Health care practitioners who visited the IWGCO website were invited to register for an account, free of charge, and to participate in the study. Participants’ registration information was validated with an online check of their name, country and affiliation to avoid and, if necessary, to reject spam registrations.
Survey Design and Collection: Participants completed the entry survey outlined above before starting the training module (Figure 1). The survey was designed by DA, PSh, and PS. Videos included cases with LA Grade A (3), B (3), C (4), and D (1) as well as cases with no esophagitis (3). These 14 videos were randomly assigned to 2 training modules (5 videos in the first module and 4 videos in the second module) and a test module (5 videos). After each of the 9 training videos, the participant was informed whether or not their answer was correct; if the answer was incorrect, they could review the training video again and provide another answer. Participants could proceed to the next training video only once they had provided the correct LA classification, and they were able to re-take the training module and exams as many times as they wished. After completing the training module successfully, participants completed the test module (5 videos); a score of 80% was required to pass. Participants were not informed of the correct answers until they had scored all 5 test videos. In a post-exam survey, participants were asked (1) to name 3 key learnings, (2) how they would change their practice, and (3) how IWGCO could improve the course.
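The gating and pass rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the IWGCO platform's code: the function names are hypothetical, and the answer key is taken from the caption of the test-video figure (Q1 = B; Q2 = B; Q3 = A; Q4 = A; Q5 = C).

```python
from typing import Sequence

PASS_THRESHOLD = 0.80                  # passing score stated in the Methods
TEST_KEY = ["B", "B", "A", "A", "C"]   # correct grades for the 5 test videos


def fraction_correct(answers: Sequence[str], key: Sequence[str]) -> float:
    """Fraction of test videos that were graded correctly."""
    if len(answers) != len(key):
        raise ValueError("one answer is required per test video")
    correct = sum(a == k for a, k in zip(answers, key))
    return correct / len(key)


def passes_test(answers: Sequence[str], key: Sequence[str] = TEST_KEY,
                threshold: float = PASS_THRESHOLD) -> bool:
    """A test attempt passes when at least 80% (4 of 5) videos are correct."""
    return fraction_correct(answers, key) >= threshold


def attempts_until_correct(responses: Sequence[str], correct_grade: str) -> int:
    """Training gate: count responses up to and including the first correct one."""
    for n, response in enumerate(responses, start=1):
        if response == correct_grade:
            return n
    raise ValueError("participant never reached the correct grade")
```

Under this rule, a participant who misgrades one of the 5 test videos still passes (4/5 = 80%), whereas two errors (3/5 = 60%) require another attempt.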

Figure 1. Program content for the LA classification Educational Program on the International Working Group for the Classification of Oesophagitis website.
We collected data on how often participants used the LA classification system in clinical practice, how confident they were with using the classification system and the number of upper endoscopies performed annually for reflux symptoms. We also asked if they perform endoscopy to check for healing of erosive esophagitis. Additionally, information was collected on the participants’ level of training, location, and specialty (gastroenterology, surgery, nurse, trainee). The training module was advertised through communication channels with various societies (American College of Gastroenterology, American Foregut Society, Canadian Association of Gastroenterology, International Society for Diseases of the Esophagus) as well as networks of members of the IWGCO.
Outcome Measures: The primary outcome was the change in agreement and grading performance from baseline to the end of the course. Secondary outcomes were whether participant characteristics, such as years of experience or specialty, affected accuracy in determining the LA grade of the videos in the module.
Statistical analysis: Descriptive analyses were performed to summarize participants' characteristics (years in endoscopic practice, primary practice setting, number of patients scoped per year for reflux symptoms, frequency of use of and confidence in using the LA classification, and whether they endoscope patients to check for healing of erosive esophagitis) and the general performance on training and exam videos (number and percentage of participants who passed each video and the number of attempts). Univariable Poisson regression analyses were performed on data from the 57 participants who passed the test module, evaluating the association between 7 predictors (the 6 participant characteristics noted above and the number of attempts required to pass the training module) and the dependent variable, the number of attempts needed to pass the test module (range 1 to 4). Because of the small number of participants (n = 57) who completed the training and test videos and the large number of potential predictors, we planned a priori that only significant predictors (P < .05) would be entered into the multivariable Poisson regression analysis. IBM SPSS® Version 28.0.1.1 was used to perform the analyses.
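As an illustration of what a univariable Poisson regression estimates, the sketch below fits a log-linear model of test attempts on training attempts by Newton-Raphson using only the Python standard library. It is not the authors' analysis, which was run in SPSS; the function name is hypothetical and the ten data points are invented purely for demonstration.

```python
import math
from typing import List, Tuple


def fit_poisson(x: List[float], y: List[int],
                max_iter: int = 50, tol: float = 1e-10) -> Tuple[float, float]:
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve information * step = score.
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data, NOT study data: training attempts and test attempts.
training_attempts = [9, 9, 10, 11, 12, 13, 15, 18, 20, 25]
test_attempts = [1, 2, 1, 2, 2, 2, 3, 2, 3, 4]

b0, b1 = fit_poisson(training_attempts, test_attempts)
rate_ratio = math.exp(b1)  # multiplicative change in expected test attempts
print(f"rate ratio per additional training attempt: {rate_ratio:.3f}")
```

The exponentiated slope is the quantity interpreted in the Results: the factor by which the expected number of test attempts changes for each additional training attempt.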
Results
Among 90 participants who completed the entry survey, 64 (71.1%) reported that they used the LA classification for reporting endoscopic findings in all cases for patients with reflux symptoms (most cases, n = 19, 21.1%; some cases, n = 5, 5.6%; not reported, n = 2, 2.2%). Only 3 (3.3%) answered that they were completely confident that they could accurately grade the severity of esophagitis (very confident, n = 28, 31.1%; moderately confident, n = 40, 44.4%; slightly confident, n = 17, 18.9%; not at all confident, n = 2, 2.2%). Half of the participants reported that they performed endoscopy for reflux symptoms on 50–150 patients per year (n = 45, 50.0%; <50 [rarely], n = 15, 16.7%; 151–250, n = 18, 20.0%; >250 [1 or more per day], n = 12, 13.3%). Only 9 participants (10.0%) endoscoped all patients with erosive esophagitis or complications (e.g. stricture, Barrett’s esophagus) to check for healing of erosive esophagitis; the remaining 81 participants reported that they would repeat endoscopy to check for healing in all patients with severe erosive esophagitis (Grade C and D) or complications (n = 59, 65.6%), only in patients who had severe esophagitis with complications (n = 10, 11.1%), or only in those who had severe esophagitis without complications (n = 12, 13.3%).
Of the 71 participants who started the training module, 50 were adult gastroenterologists, 2 pediatric gastroenterologists, 2 internists, 3 trainees, 2 nurses, 1 a physician assistant, and 10 surgeons; one participant did not provide data. When asked “What number of patients per year do you endoscope for reflux symptoms?”, 10 participants reported performing fewer than 50 EGDs per year, 35 reported 50 to 150, 15 reported 151 to 250, and 9 reported more than 250 (1 or more per day). Although 10 participants performed fewer than 50 EGDs per year, all participants reported being involved with the endoscopic grading of erosive esophagitis in “some cases” at a minimum. Two participants, both gastroenterologists, did not respond to the question of how often they use the LA classification (Table 1).
Table 1. Participant Demographic Information.
Note. GI = gastroenterologist. Two gastroenterologists did not respond to the demographic questions.
In all, 61 participants (85.9%) completed all training videos. Only 3/71 (4.2%) participants correctly classified all 9 training videos on their first attempt; all others required multiple attempts. The percentage of participants who passed each video on the first attempt ranged from 44.3% to 90.8% (Figure 2). One participant passed all 9 training videos but did not participate in the test module. Of the 60 participants who started the test module, 57 (95%) passed; 16 (26.7%) passed after one attempt, whereas 32 (53.3%), 8 (13.3%), and 1 (1.7%) passed after 2, 3, and 4 attempts, respectively (Figure 3). Of the 3 participants who did not pass the test module, 2 made one attempt and failed, and one made 2 attempts and failed both. None of the 3 “Completely confident” participants passed all 9 training videos the first time (their total number of training attempts ranged from 10 to 13), and all took 2 attempts to pass the test module.

Figure 2. Percentage of participants who passed the training videos on their first attempt and the average number of attempts per video.

Figure 3. Number of attempts and percentage of participants who passed the testing module.
Fifty-seven participants passed the test module; 21/57 (36.8%) participants passed with 4/5 (80%) videos correct and 36/57 (63.2%) passed with 5/5 (100%) videos correct. Across 114 attempts at the 5 videos within the test module (114 × 5 = 570 responses), 380/570 (66.7%) responses were correct (regardless of participant and final score), and the percentage of correct responses for individual videos ranged from 50.9% to 70.8% (Figure 4).
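The test-module proportions can be recomputed directly from the counts reported in this section. The short Python sketch below does so; the variable names are illustrative, and only the counts come from the text.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


started_test = 60
passed_by_attempt = {1: 16, 2: 32, 3: 8, 4: 1}  # attempts -> participants passing

passed_total = sum(passed_by_attempt.values())            # 57 participants
not_passed_first = started_test - passed_by_attempt[1]    # 44 participants

print(pct(passed_total, started_test))          # 95.0
print(pct(passed_by_attempt[1], started_test))  # 26.7
print(pct(not_passed_first, started_test))      # 73.3
print(pct(21, passed_total))                    # 36.8 (passed with 4/5 correct)
print(pct(36, passed_total))                    # 63.2 (passed with 5/5 correct)
print(pct(380, 114 * 5))                        # 66.7 (correct responses overall)
```

Note that the 44 participants who did not pass on their first attempt include the 3 who never passed the test module.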

Figure 4. Percentage of correct answers in 5 post-training test videos. (Correct answer: Q1 = B; Q2 = B; Q3 = A; Q4 = A; Q5 = C).
In the Poisson regression analyses, we included only the 57 participants who passed the test module; the dependent variable was the number of attempts it took to pass the final exam (range 1 to 4). There was no significant association between the number of attempts needed to pass the test module and any of the 6 participant characteristics (years in endoscopic practice, primary practice specialty, number of patients scoped per year for reflux symptoms, frequency of use of and confidence in using the LA classification, and whether the endoscopists checked patients for healing of erosive esophagitis; all P > .05). Only the number of training attempts (range 9-25) was significantly associated with the number of attempts to pass the exam, with a rate ratio of 1.052 (95% CI = 1.002-1.104; P = .041). That is, for each additional training attempt, the expected number of attempts to pass the exam is 1.052 times higher. In a sensitivity analysis in which primary practice groups were classed as gastroenterologists vs all others, the results were not significantly different.
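Because the Poisson model is multiplicative, the reported estimate of 1.052 per additional training attempt compounds over larger differences in training attempts. The minimal Python sketch below makes that arithmetic explicit; the function name is hypothetical, and the calculation assumes the fitted log-linear relationship holds across the observed range of 9 to 25 training attempts.

```python
import math

RATE_RATIO = 1.052  # reported estimate per additional training attempt


def expected_multiplier(extra_training_attempts: int,
                        rate_ratio: float = RATE_RATIO) -> float:
    """Factor by which the expected number of test attempts is multiplied."""
    return rate_ratio ** extra_training_attempts


# Regression coefficient on the log scale that corresponds to the estimate.
log_coefficient = math.log(RATE_RATIO)

# A participant needing 5 more training attempts than another would be
# expected to need about 1.29 times as many test attempts.
print(round(expected_multiplier(5), 2))
```

Across the full observed span of 16 training attempts (9 versus 25), the same calculation gives `expected_multiplier(16)`, which shows that the effect, although statistically significant, is modest in absolute terms.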
The post-exam survey was completed by 36 participants. Among the key learnings named, responses about endoscopic techniques for optimizing esophagitis assessment were the most common. These results are summarized in Table 2. Of the 38 responses about how the course might change the participants’ clinical practice, the most common was using the LA classification (10). There were 35 responses about course feedback, and the most common request was for more videos (10).
Table 2. Participant Post-training Survey Feedback.
Discussion
The LA classification was first proposed in 1996 with the goal of reducing interobserver variability, as prior grading scales depended on assessment of the depth of mucosal breaks. Initially, there was good interrater agreement for mild esophagitis (Grade A) and severe esophagitis (Grade D), but interrater agreement for Grades B and C was poor. The grading system was modified in 1999 to improve interrater agreement by redefining LA Grade D esophagitis as involving 75% or more of the circumference of the esophagus. A 2011 study by Genta et al. used data from outpatient endoscopy centers in the United States to assess the use of the LA classification and Savary-Miller grades for esophagitis.8 They found that an LA grade was provided in 27.9% of their 19 778 cases with endoscopic esophagitis. That study was limited because only information available to the outpatient pathology center was assessed and only community endoscopists were included. To our knowledge, there have been no recent assessments of the use of the LA classification.
Although more than 90% of participants in the present study reported using the LA classification in their practice, only a third were confident in using it. This suggests that there is inadequate awareness of the nuances of the LA classification system among endoscopists. Even after successful completion of a training module, greater than 73% of participants required more than one attempt to correctly classify the testing videos. The LA classification may require a review, particularly since LA grade A has not been shown to be reliably related to acid exposure or clinical outcomes. Multiple studies have shown that interobserver agreement in grading erosive esophagitis was better among experts than trainees; however, most participants in our study were fully qualified endoscopists so this could not be assessed in the current study.9,10
A strength of this study is that we had a wide range of participants from various practice locations and with varying years of experience. We also had a good completion rate with 86% of participants completing all the training modules. The study provides real world evidence, and practising gastroenterologists were recruited globally to provide a breadth of experience in our participants.
There were several limitations to this study. The number of participants was limited, as was the number of videos available to learn from in the modules. Additionally, no structured feedback was provided to participants, given the online nature of the modules. The participants were also a selected population of endoscopists who were likely interested in the LA classification and in enhancing their performance, so the study results may not be representative of all endoscopists.
This study demonstrates that further structured education around the LA classification is needed to standardize LA grading among endoscopists and ensure consistent reporting for patients with erosive esophagitis. We did see an improvement in performance, from 3/71 (4.2%) participants correctly classifying all 9 training videos on their first attempt to 16/60 (26.7%) passing the testing module after one attempt. A standardized set of training modules could result in more consistent grading of erosive esophagitis and therefore increased agreement among providers. There is a need to incorporate education on the grading of esophagitis into endoscopy training and ongoing professional education. Given the variability in assessing the severity of erosive esophagitis, one of the most common endoscopic diagnoses, this study suggests that there may be similar needs for training in the accurate endoscopic diagnosis and evaluation of other GI diseases.
Footnotes
Author Contributions
KD was involved in the data collection, analysis and writing of the manuscript. DW was involved with the data collection, analysis and manuscript preparation. YY performed data analysis and assisted in manuscript preparations. PS was involved with data collection and contributed to the manuscript. PSh and DA conceptualized the study and supervised the data collection, analysis, and manuscript preparation.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: KD, DW and YY have no conflicts of interest to declare. Paul Sinclair is a consultant for IWGCO. Prateek Sharma is a consultant for Olympus Corporation, Boston Scientific, Salix Pharmaceuticals, Cipla, Medtronic, Takeda, Samsung Bioepis and CDx. He also has received grant support from ERBE and Fujifilm. DA has received research grants from Nestlé Health Sciences and the Weston Family Foundation; honoraria from Sanofi for advisory board participation, from Amgen, Fresenius Kabi and Takeda for speaking engagements and from CALIBR for consulting. DA is also an unremunerated advisory board participant for Cinclus Pharma and Phathom Pharmaceuticals and co-founder of AI VALI Inc.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: IWGCO received funding (unrestricted) from Phathom Pharmaceuticals to develop the learning platform. No funding for analyses and publication was received. The IWGCO (International Working Group for the Classification of Oesophagitis) is a not-for-profit organization which brings together expertise from around the world. Members of the group are engaged in researching and testing criteria that are designed to meet the needs of routine patient care and research that are suitable for adoption as worldwide standards.
Ethical Approval
No patient consent was required for this study.
Consent Statement
No patient consent was required for this study.
Use of Artificial Intelligence
No artificial intelligence was used to prepare the text of this manuscript. ChatGPT4 was used only to assist in creating components of the figures in the visual abstract.
