Abstract
PURPOSE
With the transition of USMLE Step 1 to Pass/Fail, Step 2 CK carries added weight in the residency selection process. Our goal was to develop a Step 2 predicted score to provide to students earlier in medical school to assist with career mentoring. We also sought to understand how the predicted scores affected students' plans.
METHOD
We used traditional statistical models and machine learning algorithms to identify predictors of Step 2 CK performance. Predicted scores were provided to all students in the Class of 2024 at a large allopathic medical school. A cross-sectional survey was conducted to assess whether the estimated score influenced career or study plans.
RESULTS
The independent variables that resulted in the most predictive model included CBSE score, Organ Systems course exam scores, and Phase 2 (third-year clinical clerkship) NBME percentile scores (Step2CK = 191.984 + 0.042 (CBSE score) + 0.294 (Organ Systems) + 0.409 (Average NBME)). The standard error of the prediction model was 7.6, with better accuracy for predicted scores greater than 230 (SE 8.1) than for those less than 230 (SE 12.8). Nineteen percent of respondents changed their study plan based on the predicted score. Themes identified from the predicted score included reassurance for career planning and the creation of anxiety and stress.
CONCLUSION
A Step 2 Predicted Score, created from pre-existing metrics, was a good estimator of Step 2 CK performance. Given the timing of Step 2 CK, a predicted score would be a useful tool to counsel students during the specialty and residency selection process.
Background
The United States Medical Licensing Examination (USMLE) Step 1 is a standardized exam used to assess a student's knowledge of basic science concepts and their application to clinical skills1. Step 1 scores have been correlated with specialty board pass rates and have historically been used as a filter to allow for more holistic review of residency applications2,3. The shift from a three-digit numeric score to a pass/fail outcome has placed additional focus on the Step 2 Clinical Knowledge (CK) exam4–7.
Step 2 CK provides a quantitative measure of a candidate's medical knowledge and readiness for clinical practice8. The score has also been used beyond this initial intent9–13. It has been used by residency programs as a screening tool, especially for more competitive specialties, to decrease the number of applications to review during the selection process14,15. Since many students do not take Step 2 CK until late in their third year or the beginning of their fourth year, they often lack a numeric score during a critical period of the residency counseling process. Knowledge of an anticipated Step 2 CK score would potentially assist students and their advisors in the residency application process16,17.
The objective of our study was to develop a predictive tool for USMLE Step 2 CK based on pre-existing metrics18,19. We describe the creation of the tool and the dissemination of the score at our large allopathic medical school. Additionally, we discuss the impact of this tool on students' study behaviors and specialty selection.
Method
Creation of a step 2 predicted score
In 2020, the IU School of Medicine began a project to analyze student data with the goal of identifying predictors of key student outcomes. A team of representatives from Business Intelligence, Educational Affairs, and Biostatistics analyzed and verified the demographic, pre-matriculation, and educational data for 896 students between 2019 and 2022. We included students who matriculated after 2016, the first year of a new medical school curriculum. The team used traditional statistical models and machine learning algorithms to identify predictors of Step 2 CK performance. Predictive variables assessed included scores for courses in the first and second year of medical school, scores on clinical clerkship National Board of Medical Examiners (NBME) exams, Medical College Admission Test (MCAT) performance, and Comprehensive Basic Science Examination (CBSE) performance (Table 1). The predictive variables identified in this preliminary project (Appendix A) were then evaluated using more than 20 linear regression models in SPSS and Excel to identify a usable predictive model. For internal validation of the regressions, the data set was randomly split into a training set (80%) and a testing set (20%).
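As a minimal sketch of this 80/20 internal validation, using synthetic stand-in data (the real, de-identified student records are not public) and ordinary least squares in place of the study's SPSS workflow:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for three of the candidate predictors (illustrative
# distributions only): CBSE score, Organ Systems course average, and mean
# NBME subject-exam percentile, for a cohort the size of the study's (896).
n = 896
X = np.column_stack([
    rng.normal(70, 8, n),    # CBSE score
    rng.normal(88, 4, n),    # Organ Systems course average
    rng.normal(60, 20, n),   # mean NBME percentile
])
# Simulated Step 2 CK outcomes with noise comparable to the reported SE.
y = 192 + X @ np.array([0.04, 0.3, 0.4]) + rng.normal(0, 7.6, n)

# Random 80% training / 20% testing split, as described in the Method.
idx = rng.permutation(n)
train, test = idx[: int(0.8 * n)], idx[int(0.8 * n):]

# Fit ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(train)), X[train]])
coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)

# Standard error of prediction on the held-out 20%.
pred = np.column_stack([np.ones(len(test)), X[test]]) @ coef
se = np.sqrt(np.mean((y[test] - pred) ** 2))
```

The held-out standard error is what lets a school judge whether the model generalizes beyond the cohort it was fit on, rather than merely fitting historical scores.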
Variables assessed for inclusion in the step 2 predictor model
Abbreviations: CBSE, Comprehensive Basic Science Examination; NBME, National Board of Medical Examiners; GPA, Grade Point Average; MCAT, Medical College Admission Test; CPBS, Chemical and Physical Foundations of Biological Systems; CARS, Critical Analysis and Reasoning Skills.
Although some NBME subject exams were identified as more predictive than others, not all students complete the subject exams in the same order, which required modifying the optimal model. We determined that the average percentile across all completed NBME subject exams, calculated by comparing a student's score to the national percentiles provided by the NBME, would allow each student to receive a predicted score at any point during the third year. Once the final model was selected, data were processed in late January 2023 and made available to students that spring.
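The rolling average described above can be sketched in a few lines; the dictionary-based interface and the clerkship names are illustrative assumptions, not the school's actual implementation:

```python
def mean_nbme_percentile(percentiles):
    """Average the national percentiles for whichever NBME subject exams
    a student has completed so far, so a predicted score can be generated
    at any point during the third year.

    `percentiles` maps clerkship name -> national percentile (0-100);
    exams not yet taken are None (or simply absent).
    """
    taken = [p for p in percentiles.values() if p is not None]
    if not taken:
        raise ValueError("no NBME subject exams completed yet")
    return sum(taken) / len(taken)

# Hypothetical mid-year student: two exams completed, one pending.
avg = mean_nbme_percentile({"Medicine": 72, "Surgery": 58, "Pediatrics": None})
```

Averaging percentiles rather than raw scores is what makes exams taken in different orders, and in different subjects, comparable within one running input to the model.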
Score dissemination and evaluation
The Step 2 Predicted Score was issued as a range (+/− 7) and provided to all students (337) in the Class of 2024 during their third year of medical school (Appendix B). Every student at IUSM has a Lead Advisor and students were expected to meet with them following receipt of the Step 2 Predicted Score to discuss the results and to prepare for Step 2 CK which occurred at the start of their fourth year.
A cross-sectional, anonymous survey was provided to all fourth-year medical students in February 2024, following completion of Step 2 CK and residency interviews. The survey was developed internally by members of the Medical Student Affairs team at IUSM. Survey questions addressed the accuracy of the Step 2 Predicted Score and whether it changed the student's preparation plan for Step 2 CK. Students were asked if they altered their specialty or career plans based on the Step 2 Predicted Score. Students could share their actual Step 2 CK score as a range instead of an exact result to help ensure anonymity from the study team. The survey underwent an iterative refinement process, with team members reviewing and revising the questions and answer choices to ensure clarity and relevance. A free-text option was provided to allow students to share their concerns anonymously. The research project was determined to be exempt by the Indiana University School of Medicine IRB (#23419). The reporting of this study adheres to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement criteria (Appendix C)20.
Statistical methods
The survey for this paper was generated using Qualtrics software, Version February 2024 (Qualtrics, Provo, UT, USA; https://www.qualtrics.com).
Descriptive statistics were analyzed with SPSS version 28.0. Qualitative text was reviewed by the authors and free text responses grouped into themes (“Create a thematic analysis for the following free text,” ChatGPT, OpenAI, February 2024)21,22. Responses were reviewed to select representative quotes for each theme. All output was reviewed for quality and integrity of comments.
Results
The independent variables that resulted in the most predictive model included CBSE score, Organ Systems course exam scores, and Phase 2 (third-year clinical clerkship) NBME percentile scores (Step2CK = 191.984 + 0.042 (CBSE score) + 0.294 (Organ Systems) + 0.409 (Average NBME)) (Appendix A). The constant (191.984) is the prediction if all other variables are zero; in this context, it is the lowest possible score the model could predict. The P-values of the variables in this equation were <0.001 except for CBSE, which was 0.041. The standard error of the prediction model was 7.6, with an R² of 0.667. When applied to actual test scores, the standard error differed by Step 2 score: 8.1 when the predicted score was greater than 230 and 12.8 when it was less than 230 (Figure 1).
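The published equation can be wrapped in a small helper that also reports the +/- 7 range in which the score was later issued to students; the sample inputs below are hypothetical:

```python
def predict_step2ck(cbse, organ_systems, mean_nbme_percentile):
    """Published model: Step2CK = 191.984 + 0.042*(CBSE score)
    + 0.294*(Organ Systems average) + 0.409*(average NBME percentile)."""
    return (191.984
            + 0.042 * cbse
            + 0.294 * organ_systems
            + 0.409 * mean_nbme_percentile)

def predicted_range(cbse, organ_systems, mean_nbme_percentile, half_width=7):
    """Report the prediction as the +/- 7 range issued to students."""
    point = predict_step2ck(cbse, organ_systems, mean_nbme_percentile)
    return (round(point - half_width), round(point + half_width))

# Hypothetical student: CBSE 70, Organ Systems average 88, mean NBME
# percentile 60 -> point prediction 245.336, issued as roughly 238-252.
score = predict_step2ck(70, 88, 60)
low, high = predicted_range(70, 88, 60)
```

Note that because the NBME term is a percentile (0-100) and the course term a percentage, the coefficients are not directly comparable as measures of importance; they are scale-dependent.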

Step 2 prediction model when compared to actual step 2 CK score.
The model was then applied to a broader group. Out of 1632 students taking Step 2 CK from 2019 to 2023, 120 (7.35% of the cohort) were predicted to score below 230. For these students, the model tended to over-predict, with 69% scoring below their predicted value.
Three hundred forty-nine students were in the Class of 2024 and 146 students (42%) responded to our survey. One hundred twenty-four students (85%) stated that they reviewed the Step 2 Predicted Score prior to taking the Step 2 CK exam. Sixty-one students out of 117 (54%) felt that the Step 2 Predicted Score was “about what they expected,” while 26/117 (22%) felt that the score was “higher than expected” and 28/117 (24%) felt it was “lower than expected.”
One hundred sixteen students responded to the question "did you change your Step 2 CK preparation plan after knowing your predicted score." Ninety-four (81%) did not, while 22 (19%) stated that they did change their preparation plan. The most common changes to students' study plans for Step 2 CK included increased utilization of a question bank (55%), lengthened preparation time (50%), and purchase of additional study resources (32%) (Figure 2).

Changes to step 2 CK study preparation based on step 2 predicted score.
One hundred eleven (97%) students did not change their specialty or career plans based on their Step 2 Predicted Score while 4 (3%) stated that they did. For those students that changed their specialty plans, one sought out a more competitive specialty while three considered a less competitive specialty for residency.
Students' actual Step 2 CK scores are presented in Figure 3. Sixty students (52%) felt that their score on Step 2 CK was higher than they were expecting, while 44 (38%) felt it was as expected and 11 (9%) felt it was lower than expected.

Self-reported range of actual step 2 CK score.
Free text responses were collected and grouped into themes. Representative quotes were verified for accuracy and authenticity. Three themes emerged: Career Planning, Accuracy, and Anxiety and Stress.
“I felt like I had no grasp on where I stood in terms of being competitive as a residency applicant, and the estimator was my first guess that I might be on the right track! I thought the data was helpful, allowed me to feel more confident in my decision to apply to a more competitive specialty, and ultimately pushed me to study to achieve the predicted score for me.”

“I think the score prediction should be taken lightly as it is often predicting based off of early step 2 dedicated and things that aren't super important like preclinical scores.”

“I was surprised how close the estimator was to my actual score which was the very high end of my predicted score.”

“It scared me into studying much harder than was probably healthy when I would have been satisfied with an average score.”
Discussion
Our Step 2 Predictor, created from pre-existing metrics based on objective academic performance, was a good estimator of Step 2 CK performance. In the new era of Step 1 being Pass/Fail, medical students may be less able to judge their competitiveness for some specialties. Given the timing of Step 2 CK—shortly before residency applications are due—a Step 2 predicted score provided to students prior to that time could be a useful tool for students and advisors to appropriately counsel students during the specialty and residency selection process.
Our work adds to the literature on the development of Step 2 predictors. A recent publication of a smaller cohort reported the development of a Step 2 prediction model that included MCAT scores, NBME Customized Assessment Service (CAS) exams, and NBME subject exams4. The only other publication we could find reported a Step 2 model that included undergraduate GPA, MCAT score, USMLE Step 1 score, and NBME clinical subject exam scores18. Interestingly, our model did not find that MCAT scores or undergraduate GPA added to the predictive value. All IUSM students are required to take the CBSE (the NBME Comprehensive Basic Science Exam) in the last 2 months of their second year of preclinical work in a controlled setting. Though Step 1 scores are no longer available, a student's score on the CBSE, a Step 1 practice exam, did contribute to our model.
Our prediction model used readily available metrics such as preclinical grades and clerkship examinations. There are predictors available via third-party resources, as well as one via Reddit, a social media platform and online community23. However, the lack of transparency may be an issue, and some variables may not be applicable to every medical school. The ability to customize the predictor to individual student-level performance is useful for individualizing counseling, so a Step 2 predictor model created and managed internally by a medical school has many benefits.
Overall, our model performed well; however, the prediction errors were larger than expected. Regression analysis specifically for the lower scores, whether predicted or actual, did not result in any improvements. It is notable that the average Step 2 CK score increased slightly each year from 2019 to 2022, with a 3.5-point increase in 2023, with the difference particularly pronounced at the lower end of the score distribution. There are likely variables outside of the model that account for this fluctuation in scores less than 230. Based on this less precise prediction, all students who had a predicted score below 230 were required to meet with their Lead Advisor to receive their score. All other student prediction scores were delivered via our secure learning management site. These required meetings allowed for a 1:1 conversation about the model's less reliable predictive value in their specific case and the development of an individualized study strategy. This meeting was also designed to provide the opportunity for these students to discuss potential changes to their career plans, seek extra academic or emotional support, and obtain appropriate referrals if desired. Regardless, this predicted model is intended to serve as a guide rather than an exact score prediction.
We also sought to understand how students felt about their predicted score and whether the information altered their study plans. Feedback from our students was mixed. Many found it helpful for confidence and goal setting, and it helped some with career and residency planning. Others cited increased anxiety from the predicted score; it is unclear how that anxiety affected their Step 2 CK scores. This highlights the importance of personalized guidance in career advising. Advisors can assist with score interpretation within the context of a student's overall academic performance and specialty selection. They should also be prepared to manage anxiety and to set realistic expectations about the predictive tool's limitations.
The potential to increase anxiety around Step 2 scores was considered a potential unintended consequence of supplying students with a predicted score. There is essentially no literature regarding the effect of anxiety on Step 2 scores, since nearly all prior work has focused on the stress surrounding Step 1 preparation. A review of the literature on anxiety related to Step 1 preparation revealed only a very modest inverse correlation with Step scores24, and a supportive environment is the biggest mitigator of this stressor25. School-supplied exam prep sessions have also been shown to decrease stress and anxiety26. The strong bond and frequent contact that our students have with their Lead Advisor, as well as the availability of free, unlimited mental health services, free, unlimited peer tutoring, and a free CBSE practice test provided by the school, reassured us that we were providing appropriate and supportive stress mitigation services.
Finally, students altered their study plans, spending more time doing practice questions and lengthening their preparation period. Whether this contributed to more than half of students scoring better than anticipated is unclear but is an optimistic potential outcome.
Strengths of our study include our sample size, a comprehensive data set of exam scores, and student feedback on the tool's implementation. Conducted at Indiana University School of Medicine, the largest allopathic medical school in the country, this study included data from 1632 students taking the Step 2 CK exam over 4 years (2019–2023). This large sample size enhances the generalizability of the findings and strengthens the statistical power of the predictive model.
Limitations
Our study is not without limitations. Prediction accuracy varied significantly, with the model performing better for students with predicted scores above 230. The standard error was higher for those with predicted scores less than 230, suggesting that factors influencing lower scores were not captured by the current variables in the model. Reliance on these variables may miss other influences such as clinical experiences, socioeconomic factors, and study habits27–29. Additionally, selection bias in our survey may have accounted for a disproportionate number of positive responses. Our study was also not powered to determine the impact of consequential validity, the positive or negative social consequences resulting from the use of this assessment30,31. Though our sample size was large, the study was conducted at a single institution, which may limit its external generalizability. Additionally, since each medical school has a unique curriculum, exam scores from preclinical years may not be as predictive at other institutions. However, this proof-of-concept study should serve as a helpful starting point and encourage others to explore the development of their own models.
Given the added importance of Step 2 CK in the current residency application process, providing student support to ensure success on this exam is critical32. A Step 2 predictor based on pre-existing metrics is one potential solution, provided that appropriate guidance and advising accompany its distribution. Future directions include refining the model to enhance its accuracy and relevance and providing workshops and counseling sessions to help students and advisors interpret and use these predicted scores effectively.
Conclusion
A Step 2 Predicted Score, created from pre-existing metrics, was a good estimator of Step 2 CK performance. Given the timing of Step 2 CK, a predicted score would be a useful tool to counsel students during the specialty and residency selection process.
Supplemental Material
Supplemental material for Impact of a USMLE Step 2 Prediction Model on Medical Student Motivations by Anthony Shanks, Ben Steckler, Sarah Smith, Debra Rusk, Emily Walvoord, Erin Dafoe and Paul Wallach in Journal of Medical Education and Curricular Development:
sj-docx-1-mde-10.1177_23821205251321812
sj-docx-2-mde-10.1177_23821205251321812
sj-docx-3-mde-10.1177_23821205251321812
Footnotes
Acknowledgements
The authors would like to thank our Medical Student Affairs Team and Lead Advisors for their work.
Authors' contributions
All listed authors made a significant contribution to the concept, design, acquisition, analysis, and interpretation of the data. BS and SS led the acquisition and analysis; AS drafted the article, and all other authors revised it and approved the final version for publication. All authors agree to be accountable for all aspects of the work.
Consent
Participants were provided with informed consent prior to volunteer participation in the anonymous survey. Collected responses were kept confidential and secure and only accessed by study team members.
DECLARATION OF CONFLICTING INTERESTS
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
This study was approved as exempt by the Indiana University School of Medicine Institutional Review Board #23419.
FUNDING
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this paper is available online. Appendix A: POSO Score Model Results Appendix B: Information given to students when the Step 2 Estimator was delivered. Appendix C: STROBE guidelines.
References