Sage Journals: Discover world-class research

Abstract

French

Introduction

Large language models (LLMs) such as OpenAI's GPT-4o are increasingly used to summarize information and report trends in available data for medical education. For integrated plastic surgery, the utility of LLMs to recommend taking a research year has not been established. We aim to establish the reliability of ChatGPT reproducibility of research year recommendations for medical students applying to integrated plastic surgery.

Methods

De-identified, self-reported integrated plastics applicant profiles in publicly available Google Sheets from 2022–2025 were assembled. Inputs provided to GPT-4o (three runs per profile) included Step 2 CK (Clinical Knowledge) score, AOA designation, and research productivity. Research-year status and match outcome were withheld. The model returned a binary recommendation to pursue a research year. Reproducibility was summarized as cross-run concordance. We compared model recommendations with applicants’ actual research-year decisions.

Results

Of 98 entries, 55 complete profiles were retained. Mean Step 2 CK was 258.3 (SD = 10.4). Applicants reported a mean 20.1 (SD = 19.9) research presentations, 3.84 (SD = 3.6) first-author publications, and 9.18 (SD = 6.4) total publications. Twenty-one eligible applicants (51.2%) reported AOA. Overall, 98.2% (54/55) matched. Across the three computed runs, there was a 98% concordance in recommendations. The LLM recommended a research year for 32.7% (18/55) of entries, whereas 45.5% (25/55) actually undertook one (p = 0.208). Agreement between model recommendations and applicant decisions was 41.8% (p = 0.28).

Conclusion

ChatGPT demonstrated internal consistency, but its recommendations could not predict which students would take a research year en route to a successful residency match.

Keywords

Large language models artificial intelligence integrated plastic surgery research year residency advising

Get full access to this article

View all access options for this article.

References

Cardillo

. 40+ Chatbot Statistics (2025). Exploding Topics. Published online May 1, 2025. Accessed August 19, 2025. https://explodingtopics.com/blog/chatbot-statistics.

Zhu

Wang

Chen

, et al. Performance of ChatGPT on US medical licensing examinations: Potential for AI-assisted medical education using large language models. PLoS Digit Health. 2023;2(7):e0000596. doi:10.1371/journal.pdig.0000596

Zarei

Mamaghani

Abbasi

, et al. Application of artificial intelligence in medical education: A review of benefits, challenges, and solutions. Medicina Clínica Práctica. 2024;7(2):100422. doi:10.1016/j.mcpsp.2023.100422

Association of American Medical Colleges. Principles for the responsible use of artificial intelligence in and for medical education. Published July 2024. Accessed August 19, 2025. https://www.aamc.org/about-us/mission-areas/medical-education/principles-ai-use.

Cho

Lee

Kopatsis

, et al. Could residency application resources benefit from centralization? Survey insights from fourth-year medical students. J Med Educ Curric Dev. 2025;12:23821205251391536. Published 2025 Dec 2. doi:https://doi.org/10.1177/23821205251391536. doi:10.1177/23821205251391536

Mehta

Sinno

Thanik

Weichman

Janis

Patel

. Matching into integrated plastic surgery: The value of research fellowships. Plast Reconstr Surg. 2019;143(2):640-645. doi:10.1097/PRS.0000000000005212

Lane

Akhter

Crowe

Morrison

Lopez

. Perceptions of achievability, representation, and access to plastic surgery training and mentorship among medical students. Plast Reconstr Surg Glob Open. 2025;13(6):e6889. Published 2025 Jun 17. doi:10.1097/GOX.0000000000006889. doi:10.1097/GOX.0000000000006889

Myers

Amalfi

Ramanadham

. Mentorship in plastic surgery: A critical appraisal of where we stand and what we can do better. Plast Reconstr Surg. 2021;148(3):667-677. doi:10.1097/PRS.0000000000008295

Seneviratne

Kodikara

Abeykoon

Palpola

. Perception of medical undergraduates on artificial intelligence in medical education: Qualitative exploration. JMIR Med Educ. 2025;11:e73798. Published 2025 Sep 19. doi: https://doi.org/10.2196/73798

10.

Bankins

Jooss

Restubog

Marrone

Ocampo

Shoss

. Navigating career stages in the age of artificial intelligence: A systematic interdisciplinary review and agenda for future research. J Vocat Behav. 2024;153:104011. doi:10.1016/j.jvb.2024.104011

Would AI Take a Research Year? Pilot Study Evaluating the Reliability of ChatGPT in Advising Plastic Surgery Applicants on Research Years