Abstract
A prospective randomized controlled pilot study was performed to determine whether video self-assessment improves competency in mastoidectomy and to assess interrater agreement between expert and resident evaluations of recorded mastoidectomies. Sixteen otolaryngology residents were recorded while performing cadaveric mastoidectomy and randomized into video self-assessment and control groups. All residents then performed a second recorded mastoidectomy. Performance was evaluated by blinded experts with a validated assessment scale. Video self-assessment did not lead to greater skill improvement between the first and second mastoidectomy. Interrater agreement was fair to substantial between the expert evaluators and between resident self-evaluations by recall and by video review. Agreement between experts and residents was only slight to fair; residents consistently rated their performance higher than experts did (P < .05). In conclusion, 1 session of video self-review did not improve competence in mastoidectomy over standard practice. While experts agree on assessments, residents may overestimate their competency in performing cadaveric mastoidectomy.
Mastoidectomy is a critical skill for the surgical management of the ear, temporal bone, and skull base. Historically, cadaveric temporal bone simulation has been the standard method of acquiring skills to become proficient in mastoidectomy.
One novel approach to improving mastoidectomy education is through video self-assessment, similar to a postgame review by athletes. Video self-assessment of simulated and real-time procedures has proven effective in improving technical skills in several surgical specialties, including general surgery, urology, and gynecology.1,2
In this pilot work, our goals were to (1) determine if video self-assessment improves resident skill in cadaveric mastoidectomy over standard training and (2) establish the interrater reliability between expert and resident assessments of skill based on recorded mastoidectomy.
Methods
Study Design
This was a prospective randomized pilot study among otolaryngology residents at the University of Minnesota (N = 16). Participation was voluntary. The University of Minnesota Institutional Review Board deemed this study exempt from review (No. 1501E59582).
After reviewing the performance evaluation tools, all residents performed a cadaveric mastoidectomy recorded with a microscope-mounted camera, followed by self-assessment via recall. Participants were block randomized by training year to the intervention group (which received the recording) or to the control group (which did not). The intervention group reviewed the recording within 7 days and performed self-assessment. Both groups performed a second mastoidectomy 7 to 10 days after the first, on a temporal bone from the same side as in the first session, followed by self-assessment via recall. All participants were then given the video of their second mastoidectomy to review and self-assess within 7 days.
Two attending neurotologists and 1 neurotology fellow served as expert assessors of each recording. They were blinded to study group, resident year, and order of mastoidectomy. Experts met prior to evaluating the videos to establish consistent evaluation techniques.
Assessment Instruments
The Task-Based Checklist (TBC) and the Global Rating Scale (GRS)—developed by Francis et al for mastoidectomy and modified to enable video review—were used for all self- and expert assessments (see Supplemental Table S1, available at www.otojournal.org/supplemental).3 The following data were recorded: (1) time from initiation of cortical drilling to completion of the mastoidectomy stages and (2) number of injuries to relevant structures. Demographic and satisfaction surveys were also administered.
Statistical Analysis
We compared the change in outcome measures from the first to the second mastoidectomy between the study groups with 2-tailed paired t tests. The mean of the scores assigned by the 3 expert evaluators was used for analysis. Injury counts were compared with chi-square tests. Significance was set at P < .05.
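To make the change-score comparison concrete, the paired t statistic can be computed directly from each resident's pair of scores. The sketch below is illustrative Python with invented score values, not the authors' analysis (which was run in Stata):

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic: mean within-pair change divided by its standard error."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Hypothetical Global Rating Scale scores for one study arm (invented for illustration)
first_scores = [2.0, 2.5, 3.0, 2.0, 3.5]
second_scores = [2.5, 3.0, 3.5, 3.0, 4.0]
t_stat = paired_t(first_scores, second_scores)
```

The resulting t statistic would then be compared against the t distribution with n − 1 degrees of freedom to obtain the 2-tailed P value.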
Interobserver agreement was determined with weighted kappa statistics, interpreted per the criteria of Landis and Koch for levels of agreement.4 Stata 14.2 (StataCorp, College Station, Texas) was used for all analyses.
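As an illustration of the agreement metric, weighted kappa for two raters can be sketched in pure Python, along with the Landis and Koch interpretation bands (poor, slight, fair, moderate, substantial, almost perfect). This is a minimal sketch, not the authors' Stata code, and it assumes linear disagreement weights, which the paper does not specify:

```python
def weighted_kappa(r1, r2, categories):
    """Linearly weighted Cohen's kappa for two raters' ordinal scores."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(r1)
    # Observed joint proportions of rating pairs
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    # Marginal proportions for each rater
    p1 = [sum(row) for row in obs]
    p2 = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Linear disagreement weights: |i - j| / (k - 1)
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            num += w * obs[i][j]
            den += w * p1[i] * p2[j]
    return 1.0 - num / den

def landis_koch(kappa):
    """Landis and Koch labels for levels of agreement."""
    if kappa < 0:
        return "poor"
    for cutoff, label in [(0.20, "slight"), (0.40, "fair"),
                          (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= cutoff:
            return label
    return "almost perfect"
```

Perfect agreement yields kappa of 1, and disagreements farther apart on the ordinal scale are penalized more heavily than near misses.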
Results
Study groups were balanced on resident experience based on training year and number of mastoids previously drilled in the laboratory (mean ± SD, 6 ± 4 vs 5 ± 4) and operating room (30 ± 24 vs 31 ± 22). There were no significant differences between the first and second mastoidectomy in TBC or GRS scores, completion times of the mastoidectomy stages, or injury counts, either between study arms or within either arm (Table 1). Injury counts were low for all structures except the tegmen (11 injuries).
Table 1. Change in Scores on Expert Evaluations and Time to Completion between the First and Second Mastoidectomy in the Video Self-assessment and Control Groups.
Abbreviation: EAC, external auditory canal.
Change between the first and second mastoidectomy. A negative value indicates that the score on the second mastoidectomy was lower (worse).
P values are for the paired t tests comparing the change from the first to second mastoidectomy between the study groups.
Interrater agreement (Table 2) was fair to substantial among the expert evaluators (κ = 0.23-0.62), with the highest agreement on overall surgical performance. Interrater agreement was fair to substantial among resident self-evaluations by recall and video review (κ = 0.40-0.78). There was only slight to fair agreement between expert ratings and resident ratings (κ = 0.03-0.25). For all TBC and GRS items, resident ratings were 0.41 to 1.51 points higher than the mean expert ratings (all P < .007; Figure 1).
Table 2. Weighted Kappa Statistics for Interrater Agreement.
Abbreviation: EAC, external auditory canal.
Self-recall vs self-video assessment.
Agreement among 3 expert assessments.
Weighted kappa was calculated for agreement between each expert (n = 3) and resident self-video assessments. The mean (range) of the 3 kappas is presented here.

Figure 1. Expert vs resident self-assessment of recorded mastoidectomy performance with the Task-Based Checklist and Global Rating Scale. Mean values are presented, with error bars indicating SD. EAC, external auditory canal.
Residents rated satisfaction with video assessment highly (4 ± 1.4 out of 5), and 78% said that they would repeat the study.
Discussion
In this study, 1 episode of video self-review did not produce improved competence in mastoidectomy over standard training. Expert and resident assessments conflicted, as residents consistently rated themselves higher than experts did.
Our findings should be considered in the context of other literature. Malik et al demonstrated a negative association between time spent in the temporal bone laboratory and resident competence in mastoidectomy.5 In their study and ours, residents did not receive expert feedback or coaching. Expert feedback is critical for complex task learning. Hu et al demonstrated that, in contrast to standard practice, video-based coaching sessions on surgical procedures generate more questions and more detailed discussions between attendings and residents about intraoperative decision making.6 Feedback is also important for the development of accurate self-estimations of skill.7-9 Thus, these findings suggest that mastoidectomy simulation without feedback may not promote new skill development and may even lead to bad habits, as residents may not recognize or correct their own technical errors and inefficiencies.
Video review also allows experts to reflect on educational techniques, identifying areas that require focus. For example, we increased our emphasis on early definition of tegmen contours to address the high rates of injury observed. Furthermore, we learned that recordings provide an accessible and efficient means of observing resident drilling and may augment evaluation and feedback when attendings cannot be present during each laboratory dissection.
The primary limitation of our study is its small sample size, drawn from a single institution. Enrollment in multiple programs would increase study power and external validity. Furthermore, as self-video review was limited to 1 session, additional video review sessions with multiple cadaveric temporal bone dissections may lead to improvements that could not be detected in this study. Future work will determine if joint video review with an expert may afford greater educational value.
Author Contributions
Disclosures
Supplemental Material
Supplemental material for "Randomized Controlled Pilot Study of Video Self-assessment for Resident Mastoidectomy Training" by Ashok R. Jethwa, Christopher J. Perdoni, Elizabeth A. Kelly, Bevan Yueh, Samuel C. Levine, and Meredith E. Adams, in OTO Open: The Official Open Access Journal of the American Academy of Otolaryngology-Head and Neck Surgery Foundation.
Footnotes
This article was presented at the 2017 AAO-HNSF Annual Meeting & OTO Experience; September 10-13, 2017; Chicago, Illinois.
Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.
References
