Abstract

Background:

Large language models (LLMs), a type of artificial intelligence tool, have become widely available and are used by patients and members of the general public to answer health care questions.

Purpose:

We sought to understand whether currently available LLMs can recommend an appropriate total knee arthroplasty (TKA) surgeon when prompted.

Methods:

We used 3 widely available LLMs (ChatGPT, Gemini, and DeepSeek) to conduct searches for a TKA surgeon, with 1 of 4 home cities specified (Lynchburg, VA; Trumbull, CT; Chicago, IL; and New York, NY). The first prompt posed was (1) “I was told by my primary care provider that I need a knee replacement, I live in (city, state), who would you recommend that I see?” Once the list of surgeons was elicited, the LLM was prompted: (2) “Why did you recommend this (these) surgeon(s)?”; (3) “Where did you get your information about this (these) surgeon(s)?”; and (4) “How do I get in contact with this (these) surgeon(s)?” Recommendations were considered appropriate if the surgeon routinely performed TKA and was actively practicing in the area. Descriptive statistics and Fisher’s exact tests were used to summarize findings.
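The prompting protocol described above can be sketched as a short script. This is an illustration only: it assumes one four-prompt sequence per model and city, the names `MODELS`, `CITIES`, `INITIAL`, `FOLLOW_UPS`, `build_prompts`, and `sessions` are hypothetical, and no LLM is actually queried.

```python
from itertools import product

# Models and home cities as listed in the Methods.
MODELS = ["ChatGPT", "Gemini", "DeepSeek"]
CITIES = ["Lynchburg, VA", "Trumbull, CT", "Chicago, IL", "New York, NY"]

# Prompt (1), with the home city filled in.
INITIAL = (
    "I was told by my primary care provider that I need a knee "
    "replacement, I live in {city}, who would you recommend that I see?"
)

# Prompts (2) to (4), asked once the list of surgeons has been elicited.
FOLLOW_UPS = [
    "Why did you recommend this (these) surgeon(s)?",
    "Where did you get your information about this (these) surgeon(s)?",
    "How do I get in contact with this (these) surgeon(s)?",
]


def build_prompts(city):
    """Return the ordered four-prompt sequence for one home city."""
    return [INITIAL.format(city=city)] + FOLLOW_UPS


# Assumption: every model is asked about every city in its own session.
sessions = {(model, city): build_prompts(city)
            for model, city in product(MODELS, CITIES)}
```

Under this assumption the sketch yields 12 sessions (3 models by 4 cities), each holding the 4 prompts in the order given in the Methods.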

Results:

Across the 3 LLMs, 49 of the 74 (66%) recommendations were deemed appropriate, although this varied by model: Gemini (26/30, 87%), ChatGPT (14/19, 74%), and DeepSeek (9/25, 36%). Of the 25 inappropriate recommendations, 6 named surgeons who were out of area, 13 named surgeons who were not performing TKA, and 6 were hallucinated names. When asked for rationales for the recommendations, LLMs most commonly cited hospital and practice Web sites and patient reviews, which tended to favor surgeons with longer local practice tenure. Of the 74 contact details provided, only 17 (23%) were accurate, with significant variation among models: ChatGPT (13/19, 68%), DeepSeek (2/25, 8%), and Gemini (2/30, 7%).
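The appropriateness percentages can be reproduced from the reported counts, and a Fisher's exact test can be run on any pair of models. The following is a minimal sketch, not the authors' analysis: the abstract does not say which comparisons were tested, so the Gemini versus DeepSeek comparison is an assumption, and `counts`, `pct`, and `fisher_exact_2x2` are hypothetical names.

```python
from math import comb

# (appropriate, total) recommendation counts as reported in the Results.
counts = {"Gemini": (26, 30), "ChatGPT": (14, 19), "DeepSeek": (9, 25)}


def pct(k, n):
    """Percentage k/n rounded to the nearest whole number."""
    return round(100 * k / n)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so floating-point ties with p_obs are included.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


total_ok = sum(k for k, _ in counts.values())
total_n = sum(n for _, n in counts.values())
overall = pct(total_ok, total_n)  # 49 of 74

# Assumed comparison: Gemini versus DeepSeek appropriateness.
g_ok, g_n = counts["Gemini"]
d_ok, d_n = counts["DeepSeek"]
p_gemini_vs_deepseek = fisher_exact_2x2(g_ok, g_n - g_ok, d_ok, d_n - d_ok)

for model, (k, n) in counts.items():
    print(f"{model}: {k}/{n} ({pct(k, n)}%)")
print(f"Overall: {total_ok}/{total_n} ({overall}%)")
print(f"Gemini vs DeepSeek, two-sided Fisher p = {p_gemini_vs_deepseek:.5f}")
```

The test is written out with `math.comb` so the sketch has no dependencies beyond the standard library; a statistics package would normally be used in practice.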

Conclusion:

While LLMs show potential in identifying TKA surgeons, the 3 LLMs we tested varied in their ability to validate surgeon expertise and provide reliable contact information. Further research may be necessary to elucidate the criteria by which LLMs recommend surgeons.

Keywords

artificial intelligence, total knee arthroplasty, patient participation, information seeking behavior, orthopedic surgeon

Can Artificial Intelligence Models Appropriately Recommend Knee Arthroplasty Surgeons?