Abstract
INTRODUCTION
Standardized patient (SP) encounters allow medical students to practice physical examination skills and clinical reasoning. SP cases are used for learning and assessment, but recorded encounters can also be valuable curriculum evaluation tools. We aimed to review SP encounters to improve abdominal examination skills and the broader physical examination curriculum.
METHODS
We reviewed recorded SP encounters of third-year medical students on surgery clerkship rotation. Students examined a cisgender woman presenting with acute right lower abdominal pain. We observed abdominal examinations to determine which maneuvers were attempted and completed correctly. We then used these outcomes to develop targeted clerkship training for the subsequent student cohort. Our intervention targeted abdominal examination gaps by explaining how to integrate abdominal examination findings with a focused history for surgical patients. We evaluated the intervention's impact on abdominal examination skills with third-year medical students in comparison (2021-2022, n = 119) and intervention (2022-2023, n = 132) groups.
RESULTS
In both the comparison and intervention groups, nearly all students attempted at least 1 general examination maneuver like auscultation, palpation, percussion, or rebound tenderness. Only 40% of students in the comparison group attempted an advanced maneuver like the Rovsing, Psoas, or Obturator sign. After the intervention, 75% of students in the intervention group attempted an advanced maneuver (χ2(1, 251) = 31.0, p < .001). Cohorts did not gain skills over time through the clerkship. Rebound tenderness was frequently assessed incorrectly by students in both groups, with many avoiding the right lower quadrant entirely.
CONCLUSIONS
This project highlights how medical students struggle to utilize abdominal examination maneuvers and integrate findings. The results also showed that students did not consistently learn advanced examination skills either before or during clerkship rotation, contrary to what clinical faculty may commonly assume. Finally, this work demonstrates how SP encounters can be used to evaluate and improve surgical education curriculum.
Introduction
Physical examination skills are critical diagnostic tools. When students do not know physical examination skills or cannot discriminate when to implement the appropriate physical examination skills, they cannot produce the necessary evidence to support the correct diagnosis. 1 Furthermore, these deficiencies can impact performance and clinical competency of incoming residents/postgraduate trainees. 2 Assessments of medical students’ physical examination skills suggest that students are frequently below an acceptable level of competence at different points of their training.3–5 Ensuring that students have a foundation on which to further develop and hone their skills is vital for success in residency and beyond as deficiencies in physical examination skills are a major contributor to medical errors. 6
In the United States, medical school is traditionally structured with the first 2 years devoted to the pre-clinical curriculum and the second 2 years in clinical rotations or clerkships; most medical schools begin to integrate patient evaluation and assessment into the pre-clinical curriculum with standardized patients (SPs), including patient interviewing and physical examination skills. This transition is often felt to be stressful and jarring by medical students with initial feelings of inadequacy. 7 A systematic review examining the transition from pre-clinical to clerkships noted that perceived deficiencies of preparedness could be predominantly addressed with curricular interventions including clinical skills refreshers. 8 Advanced skills were generally felt to be better instructed during clerkship. 9 Additional focused pre-clinical teaching sessions on limited or sensitive topics like genital, breast, and anorectal examinations are well received by students.10,11 However, the impact of similar workshops or other advanced instruction is limited when the training takes place outside of the clinical or patient-care context, and the results are often based on survey data that demonstrates an improvement in confidence (usually in a pre- and post-survey format) while competency and skill retention are not always assessed.12–14
Clinical teaching faculty generally assume that students arrive to their clinical rotations having retained the requisite baseline knowledge from their pre-clinical training. 15 Thus, students will continue to build on these skills with additional experiences on rotation by working with the care team and modeling skills observed in the clinical learning environment in a more self-directed manner. 15 The objective structured clinical examination (OSCE) provides a degree of objectivity and assessment that is generally lacking in faculty clinical evaluation of students. 16 While most clerkships do use OSCE as part of their evaluation and assessment of students, these are rarely graded in detail by clinical faculty. The systematic evaluation of clinical skills assessments of clerkship students can help to identify skill gaps and evaluate the impact of targeted training interventions.
In surgery, appropriate abdominal examination skills are crucial for evaluating and assessing the surgical patient, yet student examinations on surgery clerkship show notable deficiencies. 17 Abdominal examination skills, like most physical examination skills, are often introduced to students in the pre-clinical years with limited clinical context to support a comprehensive understanding. It is critical to assess whether students are retaining these abdominal examination skills and utilizing them appropriately to develop their differential diagnosis. Faculty at our institution anecdotally identified a persistent gap between expected and demonstrated skill sets surrounding the abdominal examination. The goal of this work was to determine if students reliably apply skill training in clinical settings by observing how surgery clerkship students demonstrate abdominal physical examination skills in an SP encounter of a typical surgery consult that students would see on rotation. SP encounters at our institution are recorded and have been a useful tool to objectively evaluate student performance. We aimed to identify deficiencies in abdominal examination skills in third-year clerkship students, create a targeted intervention that reinforces the clinical and diagnostic relevance of abdominal examination skills, and evaluate the impact of this intervention.
Material and Methods
Setting and subjects
All third-year medical students at the University of Louisville School of Medicine complete an SP encounter during their 8-week surgery clerkship rotation as part of an existing assessment. There are 6 cohorts that complete their surgery clerkship in an academic year. Students rotate in a variety of surgical settings, including both operating rooms and outpatient clinics. In these rotations, students have the opportunity to practice history, physical examination, and consultations of surgical patients while participating in the daily care of patients in the hospital.
Two groups of third-year medical students on their surgery clerkship rotations were assessed in this study. The 2021-2022 comparison group (n = 119) was analyzed without any curriculum intervention. The following 2022-2023 intervention group (n = 132) completed a brief, 1 h curriculum intervention at the beginning of their surgery clerkship. This study was approved and granted a waiver of informed consent by the University of Louisville Institutional Review Board (IRB number 20.0156). The reporting of this study conforms to the Standards for Quality Improvement Reporting Excellence in Education (SQUIRE-EDU) Publication Guidelines for Educational Improvement (Supplementary File 1). 18
SP assessment
Both groups completed the same required SP encounter during their surgery clerkship. The encounter consisted of a 38-year-old woman who was assigned female at birth and presented with acute, right lower quadrant (RLQ) pain. The case was chosen as the presentation allows a variety of diagnoses that depend on an appropriate patient history and physical examination maneuvers. The expected diagnosis for this patient was appendicitis, and details in the patient's history that would support this finding included: vague, dull abdominal pain transitioning to severe RLQ pain at McBurney's point; history elements consistent with focal peritonitis; guarding on examination; and severe pain in the RLQ with light and deep palpation, along with rebound tenderness. The relevant abdominal examination maneuvers for this case were identified by the surgery clerkship faculty (Table 1). 19
Table 1. Relevant abdominal examination skills for the surgery clerkship standardized patient assessment.
Students were given 15 min to complete the SP encounter and were instructed to: obtain a history pertinent to this patient's problem, perform a relevant physical examination, develop a differential diagnosis, and discuss their impressions and any initial plans with the patient. The students debriefed with the SPs after the encounter as part of the standard assessment feedback for SP cases at our institution; the feedback was not related to the research question specifically.
As the case was a preexisting assessment for the surgery clerkship, SPs were not recruited for the study and SPs were not informed that physical examination skills specifically were being studied. Rather, SPs from the same pool completed the same training for this case across groups and presented the case to both groups in the same way.
Intervention
The additional training completed by the intervention group included a 1 h, surgeon-led (author DTA) didactic session in which students were refreshed on potential differentials, physical examination maneuvers, and critical thinking regarding evaluation of acute abdominal pain. Because the pre-clinical curriculum taught the examination skills and included a practical component in which students practiced examination maneuvers on SPs, the intervention was included within the clerkship curriculum of didactic lectures focused on demonstration and the appropriate application of these skills but did not include an additional practical component for students. The session was informed by the deficiencies in examination maneuver utilization that were identified in the comparison group of students although the faculty did not connect the session explicitly to the SP assessment—ie, the new curriculum was integrated along with other clerkship didactic training but students were not told that their SP assessment would cover these skills specifically.
The session made explicit the clinical relevance of abdominal examination skills for common surgical problems. During this session, the faculty explicitly described how and why physical examination errors (such as testing for rebound tenderness in only 1 quadrant or starting a palpation examination in the quadrant where the patient feels pain) could affect the examination findings and clinical reasoning for common surgical problems. The faculty described how to effectively perform the abdominal examination even when a patient expresses pain and why it was necessary to do so in the surgical setting. The intervention occurred prior to the SP assessment for each cohort.
Data collection
Each student's clerkship SP encounter was recorded, and video encounters were dual-coded by 2 independent video coders (authors HM, TH, JS). Examination maneuvers were divided into general and advanced examination maneuvers based on how our curriculum taught the skills to pre-clinical students. The general abdominal examination maneuvers were those taught during the first pre-clinical year in our curriculum and identified as foundational, in which clinicians expected the students to be proficient during their pre-clinical years (including auscultation, palpation, percussion, rebound tenderness). Although we did not anticipate percussion specifically being needed to diagnose this case, we observed percussion skills for possible clinical reasoning relevance (eg, assessing for tympany). The advanced abdominal examination skills were those introduced in the second pre-clinical year at our institution (Rovsing sign, Psoas sign, or Obturator sign), and surgical clerkship faculty expected students to have been introduced to these skills as pre-clinical students but assumed that students would master these skills during their clinical years.
The video coders initially trained with 2 attending surgeons (authors DTA and TW) to ensure that coders understood the proper abdominal examination techniques that would be analyzed in this study. Each coder then coded 5 initial training videos and compared their outcomes to the attending surgeon to ensure that the examinations were observed consistently. The coders then watched each video-recorded SP encounter and assessed whether the physical examination maneuvers were attempted by the student, and if so, whether the maneuver was completed correctly or incorrectly. If the student performed a skill incorrectly, the reason was also described by the coder. For any encounters where the physical examination was not visible, the encounter was not included in the analysis. The coders also recorded the differential diagnosis discussed with the patient. Any discrepancies between coders were addressed by returning to the original recording and/or by consulting an attending surgeon on the project for review.
Statistics and analysis
We used non-parametric analyses to accommodate the non-normal data distributions in this study. All statistical analyses were completed using R (version 3.6.3). 20 Alpha was set at 0.05 per convention.
Our primary outcomes were the total numbers of general and advanced examination maneuvers that students attempted, compared between the comparison and intervention groups. We summarized the frequency of each physical examination maneuver and compared the proportion of each group that attempted advanced maneuvers using a chi-square test. We also used a Wilcoxon rank sum test to compare the number of maneuvers (general or advanced) attempted between study groups.
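As an illustration of the rank sum statistic, the following minimal Python sketch computes W for two small groups with tied values. The analyses in this study were run in R; this sketch only assumes the convention used by R's wilcox.test, in which W is the rank sum of the first group minus n(n + 1)/2. The function names and the per-student counts are hypothetical and are not study data.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def rank_sum_w(group_x, group_y):
    """W for group_x: its rank sum in the pooled data minus n_x(n_x + 1)/2."""
    ranks = midranks(list(group_x) + list(group_y))
    n_x = len(group_x)
    return sum(ranks[:n_x]) - n_x * (n_x + 1) / 2


# Hypothetical counts of attempted maneuvers per student (illustration only).
intervention = [2, 3, 3, 4]
comparison = [1, 2, 2, 3]
w_statistic = rank_sum_w(intervention, comparison)
print(w_statistic)  # 13.0
```

Swapping the two groups gives n_x × n_y minus this value (here 16 − 13 = 3), which is why the direction of the comparison matters when reading a reported W.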
Our secondary outcomes included comparing the number of attempted maneuvers among the student cohorts within each group using a Kruskal-Wallis test. Pairwise cohort comparisons of significant outcomes were performed using Dunn’s test of multiple comparisons using rank sums, 21 and the Benjamini-Hochberg method was used to adjust for multiple comparisons. 22
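The Benjamini-Hochberg adjustment applied to the pairwise comparisons can be sketched as follows. This is a minimal Python illustration of the standard step-up procedure, not the code used in the study (in R, p.adjust with method = "BH" performs the same adjustment). The function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise cohort comparisons.
raw_p_values = [0.01, 0.04, 0.03, 0.20]
adjusted_p_values = benjamini_hochberg(raw_p_values)
print([round(p, 4) for p in adjusted_p_values])  # [0.04, 0.0533, 0.0533, 0.2]
```

In this example, the comparison with a raw p-value of .04 no longer falls below an alpha of .05 after adjustment, which shows how the correction can change which pairwise differences are reported as significant.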
Another outcome of interest was the correctness of attempted maneuvers. We summarized the frequency that each physical examination maneuver was completed correctly between the comparison and intervention groups and also summarized the common errors among the abdominal examination skills in both groups.
Results
All students (100%, n = 119) in the comparison group attempted at least 1 general examination maneuver while only 40% (n = 47) attempted an advanced maneuver. Students completed more general examination maneuvers correctly than advanced examination maneuvers, which were generally just not attempted. Similarly, nearly all students (99%, n = 131) in the intervention group attempted at least 1 general examination maneuver. However, a significantly larger proportion of the intervention group attempted advanced maneuvers, with 75% (n = 99) of students attempting at least 1 advanced examination maneuver (χ2(1, 251) = 31.0, p < .001).
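The chi-square statistic can be checked from the counts reported above. The sketch below is a Python illustration that assumes a 2 × 2 test with Yates' continuity correction, which is the default for 2 × 2 tables in R's chisq.test; the correction is an assumption here because the text does not state it. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test for the 2 x 2 table [[a, b], [c, d]] with 1 degree of freedom."""
    n = a + b + c + d
    difference = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, never below zero.
        difference = max(0.0, difference - n / 2)
    statistic = n * difference ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: comparison group, intervention group.
# Columns: attempted an advanced maneuver, did not attempt one.
statistic, p_value = chi_square_2x2(47, 119 - 47, 99, 132 - 99)
uncorrected, _ = chi_square_2x2(47, 119 - 47, 99, 132 - 99, yates=False)
print(round(statistic, 1), round(uncorrected, 1))  # 31.0 32.4
```

Under this assumption, the corrected statistic matches the reported value of 31.0, while the uncorrected statistic would be 32.4; both give p < .001.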
Between the comparison and intervention groups, there was not a significant increase in the median number of general maneuvers attempted (W = 7197, p = .212) but there was a significant increase in the median number of advanced maneuvers attempted (W = 4269, p < .001). In the comparison group, students attempted more general examination maneuvers (Mdn = 2 per student) and completed them correctly (Mdn = 2) compared to advanced examination maneuvers (Mdn = 0 attempted/completed correctly) (Figure 1). Students in the intervention group attempted more abdominal examination maneuvers overall (Mdn = 4 vs. 2 in the comparison group). This reflected more attempts at both general (Mdn = 3) and advanced (Mdn = 1) maneuvers per student, although the number of correct maneuvers increased for advanced skills only (Mdn = 1). Nearly all students in the comparison (97%, n = 115) and the intervention (95%, n = 125) groups discussed appendicitis as the likely diagnosis with the patient, suggesting that variation in the physical examination did not actually influence students’ ultimate diagnosis.

Figure 1. Boxplots indicate the distribution of abdominal examination maneuvers attempted by students in the comparison (n = 119) and intervention (n = 132) groups; the boxes represent the interquartile range between the first and third quartiles and the whiskers denote the lowest and highest values exclusive of outliers (beyond 1.5 × the interquartile range); the median is represented by either a line inside the box or a heavy line on the edge in the case of the median overlapping a quartile.
We did not observe strong temporal patterns in the data across the 6 cohorts each year. Early and late cohorts (ie, student cohorts that completed their surgery rotation at different points in the academic year) attempted a similar number of general maneuvers in both the comparison group (H = 7.8, p = .170) and the intervention group (H = 8.7, p = .121). We did identify significant differences among cohorts for advanced maneuvers in both the comparison group (H = 26.5, p < .001) and the intervention group (H = 15.3, p = .009). However, the post hoc comparisons did not show a meaningful increase of skills through the academic year in either group (Figure 2). In the comparison group, increases in advanced skills in later cohorts were not sustained, suggesting that differences were not driven by meaningful improvements in students’ skills. Improvements were also not consistent in the intervention group, with 1 low cohort in September seeming to drive the variation.
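The relationship between the reported H statistics and p-values can be illustrated with a short Python sketch. It assumes that each Kruskal-Wallis statistic is referred to a chi-square distribution with k − 1 degrees of freedom and that k = 6, following the 6 cohorts per academic year described under Setting and subjects. The function name is hypothetical, and small differences from the reported p-values are expected because H is rounded to 1 decimal place.

```python
import math


def chi_square_sf_df5(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    root = math.sqrt(x)
    normal_tail = math.erfc(root / math.sqrt(2))  # P(|Z| > sqrt(x))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # normal density at sqrt(x)
    # Closed form for odd degrees of freedom, written out for df = 5.
    return normal_tail + 2 * density * (root + root ** 3 / 3)


# H statistics as reported in the text for each group and maneuver type.
reported_h = {
    "comparison, general": 7.8,
    "intervention, general": 8.7,
    "comparison, advanced": 26.5,
    "intervention, advanced": 15.3,
}
approximate_p = {label: chi_square_sf_df5(h) for label, h in reported_h.items()}
for label, p in approximate_p.items():
    print(f"{label}: p is approximately {p:.3f}")
```

Under these assumptions, the approximate p-values agree with the reported values to within rounding, which is consistent with 5 degrees of freedom.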

Figure 2. Distribution of abdominal examination maneuvers attempted by each cohort in the comparison (A) and intervention (B) groups across the academic year; the boxes represent the interquartile range between the first and third quartiles and the whiskers denote the lowest and highest values exclusive of outliers (beyond 1.5 × the interquartile range); the median is represented by either a line inside the box or a heavy line on the edge in the case of the median overlapping a quartile; matching letters indicate significant differences in maneuvers between cohorts within each group (eg, within the comparison group, the “a” boxplot label denotes a significant difference in the median number of advanced maneuvers attempted between the July and March cohorts).
We observed some consistent trends in incorrect examination maneuvers (Figure 3). In the comparison group, only 59% of students correctly assessed rebound tenderness: many students performed the examination in the left lower quadrant and/or completely avoided the RLQ where the patient reported pain, so they did not appropriately assess whether the patient had peritonitis on examination. In the intervention group, even fewer students (37%) correctly assessed rebound tenderness, with the same general trend of avoiding the RLQ that was seen in the comparison group. For palpation, students often palpated in fewer than 4 quadrants, with many students again avoiding the RLQ where the patient's pain originated. SPs additionally indicated to the SP program that during feedback in both groups, students described that they “did not want to hurt” the patient during the examination, which is why students did not complete this critical portion of the examination.

Figure 3. Accuracy of abdominal examination maneuvers attempted by the comparison (left column) and intervention (right column) groups.
However, some other students incorrectly began their palpation examination in the RLQ rather than ending at this point, which could impact how patients respond to examination in other parts of the abdomen. For the Obturator sign, many students flexed the patient's knee but rotated the leg rather than the hip or rotated the hip externally rather than internally.
Discussion
This study of abdominal examination skills in surgery clerkship highlights how medical students struggle to implement physical examination maneuvers taught in pre-clinical curriculum and meaningfully integrate the findings to develop a differential diagnosis. While our curriculum intervention did result in significant improvements in the attempted use of advanced examination skills, there was still a substantial gap among students in the intervention cohort. The results also showed that students did not reliably learn advanced abdominal examination skills either before or during their clerkship rotation. Finally, this work demonstrates how SP encounters can be used to evaluate clinical skills to identify gaps in the pre-clinical curriculum.
Development and retention of physical examination skills
Few students in the comparison cohort attempted advanced maneuvers, and we observed many errors among students performing general maneuvers. It is possible that students did not remember these skills from their pre-clinical training, or they did not know how to apply the physical examination skills within a clinical context. These gaps also identify limitations in students’ clinical reasoning where students treat patients as a multiple choice question with a “correct” diagnosis answer rather than collecting evidence to develop a broad differential. 23 Our study focused on the process of developing a differential, and in this case, our students did not gather the appropriate physical examination evidence to make the conclusion for the patient even if they did arrive at the diagnosis correctly. While some educators may consider more rigorous pre-clinical curriculum to address gaps like the ones seen in this study, other work suggests that such pre-clinical training does not always translate to superior clinical evaluation skills.24,25 Furthermore, clinical-level students cannot always apply skills in the appropriate clinical scenario even though they are able to perform skills under direct supervision and guidance. 26
These data also provide an objective assessment looking across the entire clerkship. We found that student cohorts did not improve meaningfully over the academic year. This suggests that, contrary to what many pre-clinical and clinical faculty might assume, these skills are not being taught extemporaneously or reinforced on clerkship in a meaningful way. Clerkship directors may consider integrating additional SP encounters at the beginning of clerkships as a pre-assessment so that students can refresh their skills and receive feedback on deficiencies, which they can then work on during the clerkship. This would allow students to perform well on the OSCE at the end of the clerkship and show directed growth from their starting point.
Our brief, clinically focused intervention increased the likelihood that students attempted advanced abdominal examination maneuvers. It is possible that the clinical faculty being precise about the thought process behind using abdominal examination skills and connecting examination findings to surgical problems specifically helped some students see the clinical reasoning necessity of the examination maneuvers. This type of instruction in which an expert makes the implicit explicit is critical to help learners move from novice to expert, 27 which may be difficult to do in pre-clinical settings. Students are better able to make clinical judgments on surgical disease processes after their clerkship but the use of appropriate physical examination skills remains elusive. 28
Implications
Advanced examination skills are critical to developing a differential diagnosis but were not used effectively by most students on surgery clerkship, which suggests that students in our program did not reliably learn to apply advanced examination skills in pre-clinical years or during clerkship rotations. It is critical to be purposeful about addressing these gaps in clerkship so that they do not extend into residency. 2 This evaluation and targeted intervention process can be modeled for various clinical skills that are observed to be deficient by clinical faculty as part of quality improvement efforts.
Furthermore, our results highlight the role of SP case review in curriculum evaluation. SP cases are often used for individual learning and assessment, but they can provide rich data for curriculum evaluation. These data are more rigorous than typical self-report satisfaction or confidence surveys used in many evaluations. The video-review process can even benefit trainees who code the encounters. 29 At our program, analyzing these encounters was instrumental to identifying gaps around physical examination knowledge not previously known and instructing based on these deficiencies.
Limitations and future directions
The generalizability of the specific physical examination findings is supported by the large sample size of this project, but it is limited because the study encompassed only 2 cohorts of medical students at 1 institution. The specific gaps identified in this study likely reflect the curriculum at our institution, but the process modeled here could be utilized at any institution.
While it is possible that students may behave differently in an SP case compared to a case with an actual patient, 30 this is unlikely since (1) the student knows the SP is an actor and not in real pain, and (2) the case is graded, which likely elicits thoroughness from students. Relatedly, potential case leakage around the diagnosis of appendicitis would be expected to prompt better or more thorough examinations (ie, if students knew the correct diagnosis, it would be easier to come prepared to perform the correct examination steps). Instead, we found that students were not making the connection between the abdominal examination skills and clinical reasoning. Future research efforts should focus on the remaining gaps that were observed in the intervention group. Introducing additional hands-on training with clinical faculty or more structured assessments or focused teaching in the clinical learning environment could help students further develop these skills.
Conclusion
Our project highlights a gap between the instruction and application of abdominal examination maneuvers. Pertinent advanced abdominal examination skills are critical for surgical evaluation but are not performed correctly by a majority of students on surgery clerkship. A brief, clinically focused intervention improved some aspects of students’ abdominal examination skills. Systematically evaluating clerkship SP cases in detail can help faculty develop curriculum to improve medical students’ clinical skills.
Supplemental Material
Supplemental material, sj-docx-1-mde-10.1177_23821205241272382 for Assessing Abdominal Examination Skills in a Surgery Clerkship Standardized Patient Encounter for Curriculum Improvement by Hannah Marshall, Laura A. Weingartner, Taylen Henry, Jensen Smith, Tiffany Wright, Carrie A. Bohnert, M. Ann Shaw and Dylan T. Adamson in Journal of Medical Education and Curricular Development
Footnotes
Acknowledgments
The authors thank Mimi Reddy and Zack Kennedy from the Standardized Patient Program at the University of Louisville for their help in scheduling and accessing SP encounters. The authors also thank the Summer Research Scholars Program and the Health Sciences Center Research Office for their support.
Author Contributions
HM contributed to the acquisition, analysis, and interpretation of data for the work; drafting the work; final approval of the version to be published; and agreement to be accountable for all aspects of the work. LAW and DTA provided substantial contributions to the conception and design of the work and analysis and interpretation of the data for the work; drafting of the work; final approval of the version to be published; agreement to be accountable for all aspects of the work. TH and JS contributed to the acquisition, analysis, and interpretation of data for the work; reviewing the work critically for important intellectual content; final approval of the version to be published; and agreement to be accountable for all aspects of the work. TW, CAB, and MAS provided substantial contributions to the conception and design of the work; reviewing the work critically for important intellectual content; final approval of the version to be published; and agreement to be accountable for all aspects of the work.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Approval
This study was approved by the University of Louisville Institutional Review Board (IRB number 20.0156).
Patient Consent
This study has been granted a waiver of informed consent by the University of Louisville Institutional Review Board.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Summer Research Scholars Program at the University of Louisville School of Medicine in Louisville, Kentucky through student research scholarships to HM, TH, and JS.
Supplemental Material
Supplemental material for this article is available online.
