Abstract
Background
Large language models are a type of artificial intelligence that can understand language and generate responses to text inputs. This presents potential within healthcare to improve triage of common conditions with established care pathways, such as lateral elbow tendinopathy (LET). However, their application to clinical scenarios requires evaluation.
Methods
Four questions regarding LET investigation and management were posed to ChatGPT-3.5, which was asked to provide five evidence sources for its answers. Five clinical scenarios were then posed to the model, simulating consultations with typical and red-flag features. Responses were evaluated by three upper-limb consultants using the DISCERN tool.
Results
Overall quality was unanimously rated as moderate for both the question and scenario responses, representing potentially important but not serious shortcomings. The model correctly identified the diagnosis and red-flag features and signposted accordingly. References cited were found not to exist in 40% of cases. Where references were correctly cited, issues identified included erroneous terminology, exclusion of recent evidence, and misinterpretation of findings.
Conclusions
While this technology's ability to identify the diagnosis and red-flag features when presented with clinical scenarios shows promise, application in the clinical setting is not yet justified, given the limitations in the evidence base of its recommendations and its lack of real-time access to evidence.