Abstract
Purpose
This study assessed the readability, reliability and accuracy of patient information leaflets on Descemet Membrane Endothelial Keratoplasty (DMEK), generated by seven large language models (LLMs). The aim was to determine which LLM produced the most patient-friendly, comprehensible and evidence-based leaflet, measured against a leaflet written by clinicians from a tertiary centre.
Methods
Each LLM was given the prompt, “Make a patient information leaflet on Descemet Membrane Endothelial Keratoplasty (DMEK) surgery.” Readability metrics (Flesch–Kincaid Grade, Flesch Reading Ease, Automated Readability Index, Gunning Fog Index), reliability metrics (DISCERN, PEMAT), misinformation detection and reference analysis were recorded for each response. A weighted scoring system normalised results on a 0–100% scale.
Results
The clinician-generated leaflet scored the highest (92%). Claude 3.7 Sonnet had the top LLM score (77.8%), with strong readability and referencing. ChatGPT-4o followed closely (70.9%) but lacked references. DeepSeek-V3, Perplexity AI and Google Gemini 2.0 Flash achieved moderate scores. ChatGPT-4 and Microsoft CoPilot scored the lowest due to limited reliability and misinformation.
Conclusions
LLMs show promise in generating patient education material but vary in reliability and accuracy. Claude 3.7 Sonnet was the best-performing LLM, though none matched the quality of the clinician-generated leaflet. LLM-generated leaflets therefore require clinician oversight before safe clinical use.
