Abstract
Purpose
Artificial intelligence (AI) is increasingly integrated into healthcare, including psychiatric care. This study evaluates ChatGPT-4o’s reliability in answering frequently asked antidepressant-related questions by comparing its performance with that of psychiatrists across four key dimensions: accuracy, conciseness, readability, and clarity.
Design
A comparative study analyzing ChatGPT-4o-generated responses and those of psychiatrists with at least five years of clinical experience.
Setting
Participants were recruited through institutional and professional networks and provided with standardized questions derived from authoritative treatment guidelines.
Subjects
Twenty-six psychiatrists participated, and ChatGPT-4o responses were generated using a standardized prompt for each question.
Measures
Two independent psychiatrists evaluated accuracy and conciseness using a blinded rating system. Readability was assessed with the Flesch-Kincaid Grade Level test, and clarity was measured with the Writing Clarity Index Calculator.
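The Flesch-Kincaid Grade Level is a fixed formula over word, sentence, and syllable counts. A minimal sketch of the computation (syllable counting itself is assumed to be handled by whatever text-analysis tool produced the counts):

```python
def fk_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch-Kincaid Grade Level.

    FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

    The result approximates the US school grade needed to read the text.
    """
    return (0.39 * (total_words / total_sentences)
            + 11.8 * (total_syllables / total_words)
            - 15.59)


# Example: a 100-word, 5-sentence response with 150 syllables
# scores at roughly a 10th-grade reading level.
grade = fk_grade(total_words=100, total_sentences=5, total_syllables=150)
```

A lower grade level indicates text accessible to a broader patient population, which is why the metric is commonly used for patient-education material.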
Analysis
The Shapiro-Wilk test assessed normality. Paired t-tests were used for normally distributed data, and the Wilcoxon signed-rank test for non-normally distributed data. Statistical significance was set at P < .05.
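The decision rule described above (Shapiro-Wilk gating the choice between paired t-test and Wilcoxon signed-rank) can be sketched as follows; this is an illustrative reconstruction using `scipy.stats`, not the authors' actual analysis script:

```python
import numpy as np
from scipy import stats


def compare_paired(scores_a, scores_b, alpha=0.05):
    """Compare two paired score sets, choosing the test by normality.

    Shapiro-Wilk is applied to the paired differences; if they look
    normal (p > alpha), a paired t-test is used, otherwise the
    Wilcoxon signed-rank test.
    """
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    _, p_normality = stats.shapiro(diffs)

    if p_normality > alpha:
        test_name = "paired t-test"
        _, p_value = stats.ttest_rel(scores_a, scores_b)
    else:
        test_name = "Wilcoxon signed-rank"
        _, p_value = stats.wilcoxon(scores_a, scores_b)

    return test_name, p_value, p_value < alpha


# Hypothetical accuracy ratings for the same questions from two sources
gpt = [4.2, 3.8, 4.5, 4.0, 3.9, 4.3, 4.1, 3.7, 4.4, 4.0]
psy = [4.0, 4.1, 4.3, 4.2, 4.0, 4.1, 4.2, 3.9, 4.3, 4.1]
test_name, p_value, significant = compare_paired(gpt, psy)
```

The ratings in the example are invented for illustration; the study's own P values (e.g., P = .0645 for accuracy) came from the psychiatrists' blinded scores.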
Results
ChatGPT-4o showed comparable accuracy to psychiatrists (P = .0645) but was significantly more concise (P = .0019). Readability differences were not statistically significant (P = .0892), while psychiatrists provided clearer responses (P = .0059).
Conclusion
ChatGPT-4o delivers accurate and concise responses, highlighting its potential as a patient education tool. However, psychiatrists offer greater clarity, underscoring the indispensable role of clinical expertise in psychiatric care.