Abstract
Background:
Psychiatric discharge summaries are vital for ensuring continuity of care, yet they are often written in technical language that can be difficult for patients to understand and may cause emotional distress or reinforce stigma. With increasing patient access to medical records, there is a pressing need to develop communication tools that are both comprehensible and emotionally safe.
Aim:
This study aimed to evaluate the diagnostic fidelity, linguistic clarity, emotional sensitivity, treatment comprehension, and readability of psychiatric discharge summaries rewritten by ChatGPT-4 based on real clinical cases.
Methods:
This was the first study in South America to examine the use of a generative language model for rewriting psychiatric discharge summaries. A mixed-methods, observational cross-sectional design was applied. Twenty-five anonymized clinical cases were rewritten using ChatGPT-4. Three psychiatrists independently assessed each AI-generated summary across four dimensions: diagnostic fidelity, clarity of language, perceived emotional risk, and understanding of treatment. Readability was evaluated using the Fernández-Huerta Index and the INFLESZ Scale. A thematic analysis of evaluators’ written comments was also conducted.
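The two readability measures named above are standard Spanish-language adaptations of the Flesch formula. As a minimal sketch (not the authors' analysis code), the Fernández-Huerta score and the Flesch-Szigriszt perspicuity index that the INFLESZ Scale bands into difficulty levels can be computed from syllable, word, and sentence counts as follows; the band labels follow the published INFLESZ cut-offs:

```python
def fernandez_huerta(syllables: int, words: int, sentences: int) -> float:
    # Fernández-Huerta (1959) readability formula for Spanish text;
    # higher scores indicate easier text.
    p = 100.0 * syllables / words      # syllables per 100 words
    f = 100.0 * sentences / words      # sentences per 100 words
    return 206.84 - 0.60 * p - 1.02 * f

def szigriszt_pazos(syllables: int, words: int, sentences: int) -> float:
    # Flesch-Szigriszt perspicuity index, the score that the
    # INFLESZ Scale classifies into difficulty bands.
    return 206.835 - 62.3 * (syllables / words) - (words / sentences)

def inflesz_band(score: float) -> str:
    # INFLESZ difficulty bands (Barrio-Cantalejo et al.).
    if score > 80:
        return "very easy"
    if score > 65:
        return "fairly easy"
    if score > 55:
        return "normal"
    if score > 40:
        return "somewhat difficult"
    return "very difficult"
```

For example, a text with 180 syllables, 100 words, and 10 sentences scores above 80 on the Flesch-Szigriszt index and falls in the "very easy" INFLESZ band.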
Results:
Summaries generated by ChatGPT-4 were rated positively, particularly for clarity and treatment explanation. Significant improvements in readability were observed across all diagnostic groups (p < .001), with mean values surpassing recommended thresholds for general comprehension. However, five summaries remained below those thresholds, and some diagnostic inaccuracies were noted (e.g., omissions in cases of bipolar disorder). Evaluators also flagged emotionally charged or stigmatizing language in a few cases.
Conclusions:
ChatGPT-4 can enhance the accessibility and emotional appropriateness of psychiatric discharge communication, supporting more patient-centered care. Nevertheless, professional oversight remains critical to ensure clinical accuracy and contextual sensitivity. Future research should include patient feedback, assess long-term outcomes, and explore hybrid human-AI collaboration models.