Abstract

Introduction: Large language models (LLMs) like ChatGPT are used by medical trainees and professionals for learning and clinical support. This study determined how Canadian plastic surgery residents utilize and perceive LLMs for their training.

Methods: A cross-sectional survey was distributed to all Canadian, English-speaking plastic surgery trainees (N = 100). Descriptive statistics and conventional content analysis were used to describe quantitative and free-text responses, respectively.

Results: A total of n = 36 responses were collected (36% response rate) from Canadian plastic surgery residents. Among residents, 83.3% reported using LLMs for any purpose, and 63.9% reported using the technology for plastic surgery education. The most frequently utilized LLMs were ChatGPT (83.3%), BingAI (11.1%), and Gemini (8.3%). More than half of residents reported using LLMs a minimum of once per week (50.1%). The most common applications included explaining concepts (58.3%), explaining procedures (33.3%), answering lecture questions (27.8%), and creating presentations (27.8%). Of respondents, 94.4% reported not having received education or training on the use of LLMs, and 37.1% reported concerns with the use of the technology for plastic surgery learning. The themes that emerged from the free-text responses were categorized into 3 groups: (1) advantages, including time efficiency and summarization, (2) disadvantages, including concerns about inaccuracies, confidentiality, and over-reliance, and (3) recommendations, such as didactic teaching sessions and workshops.

Conclusions: LLMs are commonly used by Canadian plastic surgery residents for a variety of purposes. Most residents have not been trained on the optimal use of the technology, and surgical residency programs should consider formal LLM instruction to leverage the capabilities of this tool and mitigate potential harms.

Keywords

large language models, artificial intelligence, ChatGPT, resident education

Introduction

Large language models (LLMs) are artificial intelligence (AI) systems that leverage vast amounts of data to understand and generate human-like language, allowing users to converse with the technology.¹ LLMs like Chat Generative Pre-Trained Transformer (ChatGPT by OpenAI, California, USA) have garnered significant academic interest, with many publications analyzing the technology's application across medicine and surgery since its release in November 2022.

The potential applications of LLMs for plastic surgery education, practice, and research are vast. ChatGPT is proficient in answering general medical questions, and in February 2023, Kung et al² showed that ChatGPT-3.5 could perform at or near the passing threshold for the United States Medical Licensing Exams (USMLE). This seminal study inspired widespread interest in the use of the technology for medical education and sparked debate regarding its optimal application in the field. In plastic surgery specifically, LLMs have been investigated as a medical education tool for resident trainees. In 2023, Gupta et al³ demonstrated that ChatGPT-3.5 could answer 55% of questions correctly on the 2022 plastic surgery in-service training exam. This was corroborated in 2023 by Humar et al,⁴ who used the same exam and added that ChatGPT-3.5 would rank in the 49th percentile when compared to first-year integrated plastic surgery residents in the United States. When ChatGPT-4 was released less than 6 months later, Hubany et al⁵ followed up on these studies and demonstrated that the updated ChatGPT-4 scored an average of 18.5% higher than ChatGPT-3.5 on in-service exams from 2018 to 2022. On its best exam, ChatGPT-4 performed at the 97th percentile compared to first-year residents, and at the 55.7th percentile compared to sixth-year trainees.⁵ These studies demonstrate the technology's application as a learning aid and outline its advantages, including breaking down complex topics and providing detailed explanations.

Although the capabilities of LLMs are certainly promising, there are potential drawbacks regarding their implementation, including concerns about over-reliance on the technology for clinical medicine, privacy issues, and intellectual property conflicts. The use of the technology for clinical medicine without diligent physician oversight may result in patient morbidity and mortality, as the technology is not always accurate. These systems can present “LLM hallucinations,” which are responses with incorrect information presented as factual. The same authors who demonstrated the potential successes of LLMs above also cautioned that LLMs should only be used as a tool to supplement traditional methods, because their outputs cannot be considered objective and require review for accuracy. Furthermore, the use of LLMs for clinical documentation without the redaction of personal health information poses privacy concerns for patients. LLM use for career advancement purposes, like development of personal statements, review of curricula vitae, and composition of manuscripts, may be cause for concern in terms of intellectual property and other ethical issues. Although LLMs are intriguing, nuance is required to ensure their appropriate use, particularly in the healthcare field, where adherence to professional and ethical standards is paramount.

Overall, if used appropriately, LLMs have the potential to assist plastic surgery residents within their training, research, and practice. However, review of the literature demonstrates a paucity of evidence regarding how residents currently use and perceive the technology. This study aims to determine how Canadian plastic surgery residents use LLMs and describe their current perceptions of the use of the technology for plastic surgery learning. This may guide educators and program directors in how to support resident success in an increasingly technologically advanced world.

Methods

This is a cross-sectional, national study conducted from September 7, 2024, to February 7, 2025. This study received research ethics board approval prior to commencement.

An 18-question survey for Canadian plastic surgery residents was created on REDCap, designed to explore participants’ patterns and perceptions of LLM use for plastic surgery learning (Supplemental Appendix 1).

The surveys were sent to the program administrators of all English-based plastic surgery residency training programs in Canada (University of British Columbia, University of Calgary, University of Alberta, University of Manitoba, Western University, McMaster University, University of Toronto, University of Ottawa, McGill University, and Dalhousie University) for distribution to their residents. Three reminder emails were sent to participants throughout the survey period.

Quantitative results were expressed as a percentage of respondents for each question. For free-text responses, two independent reviewers used conventional content analysis to group responses into themes and subthemes. A third senior author was available to resolve any discordant groupings.
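The two-reviewer step described above can be illustrated with a minimal Python sketch. It is not the authors' actual workflow: the function name `reconcile`, the response IDs, and the example codings are all hypothetical, and the sketch only shows how agreed groupings could be separated from discordant ones that would go to the third senior author.

```python
from typing import Dict, List, Tuple


def reconcile(
    reviewer_a: Dict[str, str], reviewer_b: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Split coded responses into agreed themes and discordant response IDs.

    Hypothetical helper: each dict maps a response ID to the theme that
    one reviewer assigned to it.
    """
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("both reviewers must code the same responses")

    agreed: Dict[str, str] = {}
    discordant: List[str] = []
    for response_id, theme_a in reviewer_a.items():
        if theme_a == reviewer_b[response_id]:
            agreed[response_id] = theme_a      # both reviewers chose the same theme
        else:
            discordant.append(response_id)     # refer to the third senior author
    return agreed, discordant


# Hypothetical codings, invented for illustration only (not study data).
coder_a = {"resp_01": "advantages", "resp_02": "disadvantages", "resp_03": "recommendations"}
coder_b = {"resp_01": "advantages", "resp_02": "recommendations", "resp_03": "recommendations"}

agreed, discordant = reconcile(coder_a, coder_b)
print("Agreed:", agreed)
print("Needs third reviewer:", discordant)
```

Keeping the discordant IDs in a separate list mirrors the idea that only disagreements need escalation, while concordant groupings are accepted directly.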

No LLMs were used for data analysis or manuscript composition.

Results

A response rate of 36% (n = 36/100) was calculated for Canadian plastic surgery residents. All postgraduate years were represented (Figure 1).

Figure 1. Resident Respondents by Postgraduate Year (PGY).

Of respondents, 83.3% (n = 30/36) reported having used an LLM in the past. The types of LLMs used by residents varied, with 83.3% reporting use of ChatGPT, 11.1% reporting use of BingAI, and 8.3% reporting use of Gemini (Google).
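As a quick check on how such figures are derived, the following Python sketch expresses a count as a percentage of respondents, rounded to one decimal place. The helper name `percent_of` is an assumption introduced here for illustration; only the two count pairs stated in the text (36 of 100 residents responding, 30 of 36 respondents reporting LLM use) are used.

```python
def percent_of(count: int, total: int, digits: int = 1) -> float:
    """Return `count` as a percentage of `total`, rounded to `digits` decimals.

    Hypothetical helper for illustration; not part of the study's analysis code.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return round(100 * count / total, digits)


# Counts stated in the text.
response_rate = percent_of(36, 100)  # responses received / residents surveyed
any_llm_use = percent_of(30, 36)     # respondents reporting any past LLM use

print(f"Response rate: {response_rate}%")
print(f"Any LLM use: {any_llm_use}%")
```

With a denominator of 36, each respondent shifts a percentage by roughly 2.8 points, which is worth keeping in mind when comparing the reported proportions.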

Over half of residents (63.9%) reported using LLMs for plastic surgery learning or practice. Among those who used LLMs, frequency of use varied from daily to only a few times (Figure 2).

Figure 2. Frequency of Large Language Model (LLM) Usage for Plastic Surgery Learning.

The patterns of LLM usage were investigated by grouping usages into 5 contexts: (A) clinical (Figure 3), (B) self-study (Figure 4), (C) lecture/teaching (Figure 5), (D) research (Figure 6), and (E) career-related (Figure 7), with specific usage scenarios outlined within each context. Respondents most frequently reported using LLMs for explaining concepts (58.3%) (Figure 4), explaining procedures (33.3%) (Figure 3), answering questions (27.8%) (Figure 5), and creating presentations (27.8%) (Figure 7).

Figure 3. Clinical Use of Large Language Models (LLMs).

Figure 4. Self-Study Use of Large Language Models (LLMs).

Figure 5. Lecture/Teaching Use of Large Language Models (LLMs).

Figure 6. Research Use of Large Language Models (LLMs).

Figure 7. Career-Related Use of Large Language Models (LLMs).

Other LLM applications mentioned by participants in free-text responses included dictating operative notes, composing scholarship applications, simplifying concepts, proofreading emails, and summarizing articles to review.

Most respondents (94.4%) reported having received no formal education regarding the optimal usage of LLMs for plastic surgery learning. Many respondents (34.4%) reported concerns with the use of the technology for plastic surgery learning.

Conventional content analysis of free-text responses revealed 3 themes with corresponding subthemes, including (1) advantages, (2) disadvantages, and (3) recommendations.

Advantages

The main perceived advantages of using LLMs for plastic surgery learning were identified as time efficiency and summarization of material. For time efficiency, residents had the following comments regarding LLMs:

“Makes studying and emails more efficient (R24),”

“Increases efficiency of multiple aspects of residency (R22),”

“Great synthesizer, good info, time saving (R14),”

“Efficiency with research, studying (R5).”

Regarding the summarization of material, residents expressed the following:

“[Creates] good info summary and basis off which to study (R2),”

“Great tool to synthesize information, create efficiency (R3),”

“Synthesizing large pieces of data, summarizing extensive research articles (R9),”

“Useful for obtaining knowledge regarding a certain topic, synthesis/exam prep (R36),”

“Large output of information and synthesis of information (R35).”

Disadvantages

The main perceived disadvantages of using LLMs for plastic surgery learning were identified as concerns about inaccurate information, confidentiality breaches, and overreliance.

For inaccurate information, residents commented the following:

“Sometimes inaccurate (R1),”

“[Can provide] misinformation, out of date information (R5),”

“Misinformation, lack of important details (R23),”

“Details being missed out in the summary and [LLMs] not knowing the nuances of plastic surgery (R24).”

Residents also expressed concerns regarding confidentiality:

“Ensuring patient safety and confidentiality are upkept (R22),”

“I worry about patient confidentiality and quality if we are using it to help with clinical cases. I think education on it is important (R16).”

Regarding concerns about overreliance on LLMs, residents mentioned:

“Loss of benefit of developing own mental frameworks/pathways (R17),”

“Less ‘self-learning’ (R19),”

“Becoming reliant on LLMs and reducing our level of critical thinking (R23).”

Recommendations

Residents’ main recommendations for the future use of LLMs in plastic surgery learning included a desire for didactic teaching:

“Provide formal information (R35),”

“A teaching session on utility (R28),”

“Proper training/teaching sessions on how to appropriately use [LLMs] (R26),”

“Class on practical use of LLMs as a resident for clinical work, and also for research (R23),”

“Experts in the field to give lectures (R13),”

“Didactic teaching by an expert on practical use (R5).”

Discussion

Our national survey demonstrated that many Canadian plastic surgery residents employ LLMs as part of their plastic surgery learning, and that their use is frequent and serves a variety of purposes. The most common uses included explaining concepts or procedures, creating presentations, and answering questions. Most residents have not received any guidance or teaching on the optimal usage of LLMs for plastic surgery learning, and many report concerns with the use of the technology for this purpose. Respondents noted advantages including time-efficiency and effective summarization of material, and disadvantages including inaccurate information, confidentiality issues, and overreliance. Respondents expressed a clear desire for didactic teaching sessions on the effective and safe use of LLMs in plastic surgery learning.

In the general surgery literature, St John et al⁶ assessed residents’ perceptions of AI and found that 68% of residents believed that AI could enhance knowledge of medicine, but 77% of these residents were concerned about its use in medicine. These results are concordant with our findings regarding the potential benefits of LLM use for resident education, and echo the concern regarding safe and ethical implementation of the technology. Interestingly, their study also found that increased familiarity with AI was associated with positive perceptions of the technology, suggesting that increased training with the technology may help reduce concerns regarding its implementation in medical training. In a nonsurgical setting, Fried et al⁷ conducted a survey of internal medicine residents from North Carolina, USA, and found an LLM usage rate of 26% and a similar lack of formal LLM education. Participants in this survey expressed a desire for increased exposure to LLMs (94%), with many residents (87%) stating that training programs should consider formal instruction on the optimal usage of the technology, including hands-on experiences and didactic sessions. The most frequent uses of the technology among these participants included developing differential diagnoses, researching treatments, and creating educational materials. While these findings from internal medicine residents further corroborate the results of our study, there may be significant differences in the use of the technology between medical and surgical training programs due to different clinical responsibilities.

With formal instruction and expert-curated guidelines, plastic surgery residents could be enabled to safely and effectively leverage LLMs for clinical practice and research. This tool may improve efficiency and reduce the burden of repetitive tasks that could effectively be addressed by LLMs. Instructive sessions and guidelines should be curated by LLM and plastic surgery education experts to ensure a balance between optimizing the technology's benefits and mitigating its potential harms. Based on recommendations from Fried et al,⁷ preferred LLM education materials include learning modules integrated into orientation sessions, educational conferences, or other hands-on learning experiences such as academic half-day sessions or case-based simulations. These materials should include input from experts in other fields such as ethics, computer sciences, and law, as the implications of LLM integration into surgical training span many disciplines.⁷,⁸

Our study is limited by its modest response rate (36%) and its self-reported nature. Although over one-third of Canadian plastic surgery residents completed the survey, the response rate and study design suggest the possibility of reporting bias. The rapid and relatively recent proliferation of the technology, along with limited instruction on its use from residency programs, may lead to uncertainty about the appropriateness of incorporating LLMs into plastic surgery learning, thereby making residents hesitant to report their usage.

Future research should involve the creation of educational materials on LLM use for plastic surgery trainees, including didactic sessions and hands-on sessions created by experts in the field. Instruction on utilizing the technology in a safe, effective, and professionally and ethically sound manner is essential, particularly considering that its use is already common among residents. At the time of writing, there are no studies investigating how trainees respond to formal training on the use of LLMs in medical teaching or clinical domains. Without instruction, surgical trainees may fall victim to the pitfalls of LLM use, including over-reliance, privacy issues, and ethical concerns. However, when equipped with adequate instruction and understanding, residents may be able to learn, practice, and research plastic surgery with enhanced efficiency. In a time of ever-increasing demands on residents and physicians, small steps to improve efficiency and decrease mental and physical burden may have great impacts on wellness and longevity, and ultimately on patient care.

Conclusions

LLMs are powerful tools that can aid plastic surgery residents in many different aspects of their training. Our survey demonstrates that a large proportion of Canadian plastic surgery residents use LLMs for their plastic surgery training, and that their use of the technology is frequent and spans the study, research, and practice of plastic surgery. Residency programs should consider offering their residents education sessions on the optimal usage of LLMs for plastic surgery training.

Supplemental Material

Supplemental material, sj-docx-1-psg-10.1177_22925503251400370 for The Use of Large Language Models in Postgraduate Plastic Surgery Training: A National Survey of Plastic Surgery Residents by Jacob Wise, Lindsay Bjornson, Chloe Wong and Grayson A. Roumeliotis in Plastic Surgery

Footnotes

Acknowledgments

Thank you to Alissa Dozois for her help with this project.

Authors’ Contributions

All authors contributed to the conduct of this research, analysis of data, and composition of the manuscript.

Consent to Participate

Informed consent was obtained from all individual participants included in the study.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical Considerations

The Ottawa Health Science Network Research Ethics Board approved our survey study (OHSN-REB Protocol #: 20240356-01H, CRRF ID: 5693) on July 18, 2024. All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Previous Presentations and Publications

The methodology and preliminary results of this study were presented at the 2024 Canadian Conference for the Advancement of Surgical Education, and the results were presented at the 2025 Canadian Society of Plastic Surgeons Annual Meeting.

Supplemental Material

Supplemental material for this article is available online.

References

1. OpenAI. Introducing ChatGPT. https://openai.com/index/chatgpt/

2. Kung, Cheatham, Medenilla, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. doi:10.1371/journal.pdig.0000198

3. Gupta R, Herzog I, Park JB, et al. Performance of ChatGPT on the plastic surgery inservice training examination. Aesthet Surg J. 2023;43(12):NP1078-NP1082. doi:10.1093/asj/sjad128

4. Humar P, Asaad M, Bengur FB, Nguyen V. ChatGPT is equivalent to first-year plastic surgery residents: evaluation of ChatGPT on the plastic surgery in-service examination. Aesthet Surg J. 2023;43(12):NP1085-NP1089. doi:10.1093/asj/sjad130

5. Hubany SS, Scala FD, Hashemi K, et al. ChatGPT-4 surpasses residents: a study of artificial intelligence competency in plastic surgery in-service examinations and its advancements from ChatGPT-3.5. Plast Reconstr Surg Glob Open. 2024;12(9):e6136. doi:10.1097/GOX.0000000000006136

6. St John, Cooper, Kavic. The role of artificial intelligence in surgery: what do general surgery residents think? Am Surg. 2024;90(4):541-549. doi:10.1177/00031348231209524

7. Fried, Dorn, Leland, et al. Large language models in internal medicine residency: current use and attitudes among internal medicine residents. Discov Artif Intell. 2024;4:70. doi:10.1007/s44163-024-00173-w

8. Paranjape, Schinkel, Nannan Panday, Car, Nanayakkara. Introducing artificial intelligence training in medical education. JMIR Med Educ. 2019;5(2):e16048. doi:10.2196/16048
