Abstract
Importance:
Otolaryngology residents take the otolaryngology training examination (OTE) yearly to assess their fund of knowledge. The Accreditation Council for Graduate Medical Education (ACGME) milestone evaluations are also conducted semiannually. Accurate prediction of training examination performance allows identification of residents who are performing well and those who need targeted remediation. Prior studies in other specialties have attempted to use milestone evaluations to help predict in-training examination scores.
Objective:
In this study, we aim to identify whether ACGME milestone evaluation scores predict OTE performance.
Design:
Milestone ratings and OTE scores for residents at 2 US otolaryngology residency programs were collected. Multivariable analysis was performed using linear mixed modeling. We considered a 2-tailed P value of ≤ .05 as statistically significant.
Setting:
Two US otolaryngology residency programs
Participants:
Forty-eight otolaryngology residents in postgraduate years 2 to 5
Results:
Otolaryngology training examination scores and ACGME milestone evaluations were collected from 48 residents from postgraduate year 2 to 5 between the years 2014 and 2017. One hundred eight OTE scores were available. Linear mixed-effect models were constructed, and after adjusting for level of training and OTE year, the total milestone rating made a negligible impact in estimating OTE percentage correct (β = −.01, P = .9). Similarly, total milestone rating demonstrated a minimal contribution in approximating OTE national stanine score after adjusting for the level of training (β = −.003, P = .9).
Conclusions and Relevance:
In our study, ACGME milestone evaluations were not predictive of residents’ OTE performance. What these milestone evaluation data mean and how they should be used remain unanswered questions. We should aim to identify the most effective applications of the milestone data collected yearly by otolaryngology programs.
Introduction
Otolaryngology residents nationwide take the otolaryngology training examination (OTE) yearly in March to assess, quantify, and compare the factual knowledge acquired by each resident. The OTE represents an objective assessment of a resident’s current knowledge in all areas of otolaryngology. 1 Prior studies in the general surgery literature noted that resident and program factors influence performance on the American Board of Surgery In-Training Examination (ABSITE). Specifically, Kim et al and Chang et al observed a positive association between ABSITE performance and Medical College Admission Test score, United States Medical Licensing Examination step scores, personal beliefs, and study habits. 2,3 Residency program factors such as tracking reading by program directors, dedicated remediation, and even vacation schedules also affect ABSITE scores. 4–6
Otolaryngology training examination scores are an important metric for resident assessment. In fact, OTE scores in the upper 3 quartiles in the final 2 years of residency training are associated with a 97% passage rate for first-time examinees taking the American Board of Otolaryngology (ABOto) Written Qualifying Examination. 7 Various educational methods have been proposed to increase resident OTE scores with mixed results. Reh et al showed that a learner-centered education curriculum conferred a statistically significant improvement in their residents’ OTE stanines. 8 In contrast, a small randomized trial testing the impact of free online otolaryngology education modules on OTE scores found no quantifiable improvement in overall scores. 9 Platt et al used periodic question-based assessments throughout the year and found strong correlations between assessment results and resident OTE scores. Residents with inadequate OTE scores were identified and remediated successfully. 10 An accurate method of predicting residents’ OTE performance could help identify those who have adequate knowledge acquisition and those who may need targeted remediation. However, no such standardized method of resident evaluation has been identified.
The Accreditation Council for Graduate Medical Education (ACGME) milestone assessment is one possible measure that could help predict a resident’s OTE performance. Residents are assessed semiannually using the ACGME milestones, which were designed as a framework for programs to assess the development of key areas of physician competency in a specialty or subspecialty. 11 Milestones are targets and descriptors for progress as residents move through their training; scores are reported from 1 to 5, with 0.5 increments allowed. 12 Milestones in otolaryngology are reported for 17 specific subcategories representing skill areas in the 6 ACGME core competencies: patient care (PC), medical knowledge (MK), systems-based practice, problem-based learning, professionalism (PROF), and interpersonal communication skills. Resident milestone scores are determined during a semiannual meeting of the clinical competency committee (CCC), a designated group of key program faculty and leadership.
The OTE is important for predicting future board examination success, and at this point, there are few, if any, effective ways of detecting OTE underperformers early. In this study, we seek to characterize the relationship between otolaryngology milestones evaluations and OTE scores, with the goal of identifying a method of predicting resident OTE score.
Methods
Milestone ratings and OTE scores were obtained for 48 residents at 2 US otolaryngology residency programs between the years 2014 and 2017. We included residents in postgraduate years (PGYs) 2 to 5 in the study. Postgraduate year 1 examinees were excluded from this study as their examination participation was only required at one of the 2 residency programs. Otolaryngology training examination scores are reported by percentage correct, scaled score, overall national stanine, overall group stanine, and individual national and group scores by subspecialty (allergy, fundamentals, head and neck, laryngology, otology, pediatrics, plastic and reconstructive, rhinology, sleep). All score reports were deidentified and assigned an identification number by the program coordinators. One resident missed 2 OTE administrations during residency for personal reasons; all other residents in training from 2014 to 2017 had an OTE score report for every year.
The ACGME milestone ratings assigned during semiannual CCC meetings throughout the study period were also collected and deidentified. In general, milestone ratings were developed based on discussion of collective feedback from program faculty about clinical knowledge and operative skills, formative written evaluations, and written examination performance.
Statistical Analysis
One hundred eight OTE in-service scores were available for statistical analysis. Residents included in the study had between 1 and 4 OTE scores available depending on their level of training. As such, a repeated measures correlation was used to calculate the bivariate correlations between milestone ratings and percentage correct, as well as between milestone ratings and national stanine score. 13 Multivariable analysis was performed using linear mixed modeling with individual residents included as a random effect to account for within-subject correlation. All statistical analyses were performed in R version 3.3.2 (R Foundation for Statistical Computing, Vienna, Austria). We considered a 2-tailed P value of ≤.05 as statistically significant.
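As a rough illustration of the repeated measures correlation described above, the sketch below computes the coefficient as the Pearson correlation of values centered on each resident's own mean, which is one standard way to express the common within-subject association. It is written in Python rather than the R used for the study's analysis; the function name and the toy data are hypothetical, and it is not the authors' code.

```python
from collections import defaultdict
from math import sqrt


def repeated_measures_correlation(records):
    """Return (r, degrees_of_freedom) for (subject_id, x, y) records.

    Each subject's x and y values are centered on that subject's own means,
    so differences between subjects do not contribute to the coefficient.
    """
    by_subject = defaultdict(list)
    for subject, x, y in records:
        by_subject[subject].append((x, y))

    x_centered, y_centered = [], []
    for pairs in by_subject.values():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        for x, y in pairs:
            x_centered.append(x - mean_x)
            y_centered.append(y - mean_y)

    sxy = sum(a * b for a, b in zip(x_centered, y_centered))
    sxx = sum(a * a for a in x_centered)
    syy = sum(b * b for b in y_centered)
    if sxx == 0 or syy == 0:
        raise ValueError("no within-subject variation in x or y")

    r = sxy / sqrt(sxx * syy)
    # Error degrees of freedom: observations - subjects - 1 (common slope).
    dof = len(x_centered) - len(by_subject) - 1
    return r, dof


# Hypothetical data: (resident id, milestone rating, OTE percentage correct).
# Resident "B" has higher ratings but lower scores than resident "A" overall.
example = [
    ("A", 1, 10), ("A", 2, 11), ("A", 3, 12),
    ("B", 4, 1), ("B", 5, 2), ("B", 6, 3),
]
r, dof = repeated_measures_correlation(example)
print(r, dof)
```

In this toy data set the two hypothetical residents differ widely in their averages, yet the within-resident association is perfectly linear, so the coefficient is 1 with 3 degrees of freedom (6 observations minus 2 residents minus 1).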
Results
Otolaryngology training examination scores (n = 108) and milestone evaluations were collected from 48 PGY 2 to 5 residents. On average, residents answered 64.5% ± 6.3% of the questions correctly and received a national stanine score of 5 ± 1.7. Their score metrics increased with each year of training (Table 1).
Table 1. Otolaryngology Training Examination Scoring Breakdown by Program Year.a
Abbreviation: PGY, postgraduate year.
a A P value for the differences between program years could not be calculated because the scores are repeated measures with variable missing data.
Bivariate correlations between percentage correct and milestone ratings demonstrated positive correlation, with r values ranging from 0.53 to 0.66. National stanine scores also exhibited positive correlation with milestone ratings on bivariate correlation analysis, with r values ranging from 0.49 to 0.61 (Table 2).
Table 2. Repeated-Measures Correlation for Percentage Correct and Milestone Rating as well as National Stanine and Milestone Rating.
Abbreviations: ICS, interpersonal communication skills; MK, medical knowledge; PBL, problem-based learning; PC, patient care; PROF, professionalism; SBP, systems-based practice.
Linear mixed-effect models were constructed to better characterize the relationship between OTE scores and milestone ratings. After adjusting for level of training and the OTE year, the total milestone rating had a negligible impact on estimating OTE percentage correct (β = −.01, P = .9). Similarly, after adjusting for the level of training, total milestone rating also demonstrated a minimal contribution in approximating OTE national stanine score (β = −.003, P = .9; Table 3). Adjusted models evaluating the predictive value of individual milestone ratings in estimating OTE percentage correct and national stanine all demonstrated nonsignificant contributions (Table 4).
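For readers who want the adjusted model in symbols, one plausible way to write the percentage-correct model is sketched below. The notation is ours and the exact parameterization is an assumption: the source states only that level of training and OTE year were adjusted for and that residents entered as a random effect, so treating PGY and year as indicator terms and assuming normally distributed random intercepts and errors are conventional choices, not reported details.

```latex
\begin{aligned}
\text{OTE}_{ij} &= \beta_0 + \beta_1\,\text{Milestone}_{ij}
  + \boldsymbol{\beta}_2^{\top}\,\text{PGY}_{ij}
  + \boldsymbol{\beta}_3^{\top}\,\text{Year}_{ij}
  + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here i indexes residents and j indexes that resident's examinations; u_i is the resident-level random intercept that accounts for within-subject correlation, and β_1 corresponds to the reported total milestone coefficient (β = −.01 for percentage correct). Under the same assumptions, the national stanine model would omit the year term, since it was adjusted for level of training only.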
Table 3. Linear Mixed-Effect Model Regression Coefficients.
Abbreviation: PGY, postgraduate year.
Table 4. Linear Mixed-Effect Model Regression Coefficients for Individual Milestones.
Abbreviations: ICS, interpersonal communication skills; MK, medical knowledge; PBL, problem-based learning; PC, patient care; PROF, professionalism; SBP, systems-based practice.
Discussion
The current study examined whether ACGME milestone evaluations are predictive of OTE scores and found no statistically significant association, after adjustment for level of training, in 2 large US otolaryngology programs over a 4-year period. Our data show that resident percentage correct on the OTE increases with more advanced training level. The breakdown of overall resident performance by PGY is shown in Table 1. Linear mixed-effects regression analysis showed statistically significant improvement in score in each subsequent year of training (Table 3). However, although bivariate correlations between milestone ratings and OTE scores were positive (Table 2), adjusted regression modeling showed no statistically significant association between milestone subcategory score and OTE percentage correct or national stanine score. Kimbrough et al studied the relationship of ACGME milestones to ABSITE scores in 69 general surgery residents and found the MK1 milestone was predictive of ABSITE scores. 14 Unlike the Kimbrough study, we did not find any component of the milestone evaluation, MK or other, that predicted OTE performance.
Puscas reported that OTE scores in the final 2 years of residency correlate well with successful passage of the ABOto board examination. 7 Unfortunately, the OTE is only offered in March of each year, and by the time OTE scores are posted, it may be too late to enact an effective intervention to address resident knowledge deficits. As such, programs need a tool to predict resident OTE underperformers early, so that focused efforts can be made to ensure that residents are on track with MK acquisition. Semiannual milestone evaluations represent a good candidate option for helping to identify examination underperformers; however, in this study, we found that after accounting for confounders such as level of training, milestone scores do not correlate well with OTE performance.
This study is limited by a small sample size including residents from only 2 training programs. Additionally, while milestone scores are formulated with input from multiple attending physicians who observe the residents in all facets of their clinical education, the OTE scores are discussed as a component in formulating a resident’s milestone evaluation and thus may influence a resident’s milestone rating. We do note, however, that OTE performance represents only a small component of a resident’s overall milestone score. Future studies should focus on other potential tools that may predict resident OTE performance, in order to allow early intervention for underperformers. Additionally, this study raises the question of how to best interpret and utilize the information gained from yearly ACGME milestone evaluations. It also perhaps highlights the differences between subjective (milestone) and objective (OTE) evaluation of resident knowledge and performance.
Conclusion
Otolaryngology residents are evaluated semiannually using ACGME milestone evaluations and yearly using the OTE. Our small study shows a lack of predictive value of ACGME milestone ratings relative to performance on the OTE. The exact meaning and impact of milestone evaluation data, and how they should be used, remain unanswered questions in graduate medical education. We advocate a continued search for effective applications of the milestone data collected yearly by otolaryngology programs.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
