Abstract
Background:
Large language models (LLMs), a type of artificial intelligence tool, have become widely available and are used by patients and members of the general public to answer health care questions.
Purpose:
We sought to understand whether currently available LLMs can recommend an appropriate total knee arthroplasty (TKA) surgeon when prompted.
Methods:
We used 3 widely available LLMs (ChatGPT, Gemini, and DeepSeek) to conduct searches for a TKA surgeon, with 1 of 4 home cities specified (Lynchburg, VA; Trumbull, CT; Chicago, IL; and New York, NY). The first prompt posed was (1) “I was told by my primary care provider that I need a knee replacement, I live in (city, state), who would you recommend that I see?” Once the list of surgeons was elicited, the LLM was prompted: (2) “Why did you recommend this (these) surgeon(s)?”; (3) “Where did you get your information about this (these) surgeon(s)?”; and (4) “How do I get in contact with this (these) surgeon(s)?” Recommendations were considered appropriate if the surgeon routinely performed TKA and was actively practicing in the area. Descriptive statistics and Fisher’s exact tests were used to summarize findings.
Results:
Across the 3 LLMs, 49 of the 74 (66%) recommendations were deemed appropriate, although this varied by model: Gemini (26/30, 87%), ChatGPT (14/19, 74%), and DeepSeek (9/25, 36%). Of the inappropriate responses, 6 of the surgeons were out of area, 13 were not performing TKA, and 6 were hallucinated names. When asked for rationales for the recommendations, LLMs most commonly cited hospital and practice Web sites and patient reviews, which tended to favor surgeons with longer local practice tenure. Of the 74 contact details provided, only 17 (23%) were accurate, with significant variation among models: ChatGPT (13/19, 68%), DeepSeek (2/25, 8%), and Gemini (2/30, 7%).
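As an illustration of how the reported proportions and a Fisher's exact test relate to the raw counts, the following minimal sketch recomputes the per-model appropriateness rates and runs one two-sided 2x2 comparison (Gemini vs DeepSeek). The abstract does not state which comparisons were tested or which software was used, so the choice of comparison, the helper name `fisher_exact_2x2`, and the pure-Python implementation are assumptions made for illustration only.

```python
from math import comb

# Counts reported in the Results: (appropriate, total) recommendations per model.
appropriate = {"Gemini": (26, 30), "ChatGPT": (14, 19), "DeepSeek": (9, 25)}


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper written for this sketch; it is not the authors' code.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Descriptive statistics: proportion of appropriate recommendations per model.
for model, (k, n) in appropriate.items():
    print(f"{model}: {k}/{n} = {k / n:.0%}")

# One illustrative pairwise comparison (assumed, not taken from the article).
g_ok, g_n = appropriate["Gemini"]
d_ok, d_n = appropriate["DeepSeek"]
p_value = fisher_exact_2x2(g_ok, g_n - g_ok, d_ok, d_n - d_ok)
print(f"Gemini vs DeepSeek: p = {p_value:.4g}")
```

The same helper can be applied to the contact-detail counts by swapping in those numerators and denominators. A comparison across all 3 models at once would need a test for a 2x3 table, which this sketch does not cover.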
Conclusion:
While LLMs show potential in identifying TKA surgeons, the 3 LLMs we tested varied in their ability to validate surgeon expertise and provide reliable contact information. Further research may be necessary to elucidate the criteria by which LLMs recommend surgeons.
