Abstract

Women with gestational diabetes mellitus (GDM) require individualized assistance to navigate the complexities of blood glucose control, dietary modifications, and potential medical interventions. In Denmark, the prevalence of GDM increased from 1.7% in 2004 to 4.2% in 2017, reflecting a global trend.1 Despite this increase, financial resources allocated to the health care sector have grown only marginally. Projections indicate that by 2030, a substantial number of additional health care professionals, including doctors, nurses, and nursing aides, will be required to maintain the current level of health care service provision.2
Artificial intelligence (AI)–driven chatbots, such as OpenAI’s ChatGPT, are conversational agents that emulate human interaction through written communication.3,4 Such chatbots have the potential to lighten or streamline text-related tasks for health care personnel, such as writing summaries for health journal documentation or responding to messages.3,4
Current digital telehealth interventions enable health care providers to interact with patients through various platforms, including email and video calls.3 However, these interventions often face challenges related to inflexibility. In contrast, AI chatbots offer flexible, on-demand, and personalized support, thereby addressing the limitations of traditional telehealth services. Thus, it is imperative for health care systems to adapt and integrate these innovations to ensure optimal care for conditions such as GDM.
The objective of this proof-of-concept study was to evaluate the clinical accuracy and sentence construction quality of responses generated by large language model (LLM)-based chatbots to 10 commonly asked questions related to GDM. The questions are presented in Supplementary Material S1. Six clinicians assessed and scored responses (on a scale of 1 to 5; low to high) from ChatGPT (v4.0, based on GPT-3.5-turbo-0125), DanskGPT (based on LLaMa v.2), and a clinician. ChatGPT was fine-tuned using non-sensitive data collected from Facebook groups, websites with frequently asked questions, and local clinical guidelines. The origin of the responses was blinded during the assessment process. The differences in scores were tested statistically using the Friedman test with post hoc analyses.
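As an illustration of the statistical comparison described above, the following is a minimal sketch of a Friedman test followed by pairwise post hoc comparisons. The scores are hypothetical and are not the study data. The text does not specify which post hoc procedure was used, so the choice of Wilcoxon signed-rank tests with a Bonferroni correction, the function name `compare_sources`, and the layout of one row per rated item are all assumptions made for this example.

```python
from itertools import combinations

import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Hypothetical 1-5 scores, NOT the study data: one row per rated item
# (for example a rater-question pair), one column per response source.
sources = ["ChatGPT", "DanskGPT", "Clinician"]
scores = np.array([
    [5, 4, 4],
    [5, 3, 4],
    [4, 4, 3],
    [5, 3, 4],
    [4, 3, 3],
    [5, 4, 4],
    [5, 2, 3],
    [4, 3, 4],
    [5, 4, 5],
    [5, 3, 3],
    [4, 4, 4],
    [5, 3, 4],
])


def compare_sources(scores, names):
    """Friedman test across all sources, then pairwise post hoc tests.

    Assumption: post hoc = Wilcoxon signed-rank tests with Bonferroni
    correction; the source text only says "post hoc analyses".
    """
    # Friedman test: each column is one related sample.
    stat, p_value = friedmanchisquare(*scores.T)

    pairs = list(combinations(range(len(names)), 2))
    posthoc = {}
    for i, j in pairs:
        _, p_pair = wilcoxon(scores[:, i], scores[:, j])
        # Bonferroni: multiply by the number of comparisons, cap at 1.
        posthoc[(names[i], names[j])] = min(1.0, p_pair * len(pairs))
    return stat, p_value, posthoc


stat, p_value, posthoc = compare_sources(scores, sources)
print(f"Friedman chi-square = {stat:.2f}, p = {p_value:.4f}")
for pair, p_adj in posthoc.items():
    print(f"{pair[0]} vs {pair[1]}: adjusted p = {p_adj:.4f}")
```

The Friedman test suits this design because the same items are scored for every response source and the 1-to-5 scale is ordinal. With small samples and many tied scores, SciPy may fall back to a normal approximation for the pairwise tests, so the adjusted P values should be read as approximate.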
The assessment of clinical accuracy yielded median [25th;75th percentile] scores of 5 [4;5] for ChatGPT, 4 [3;4] for DanskGPT, and 4 [3;4] for the clinician’s answers, with a significant difference observed.
The assessment of sentence construction quality showed median [25th;75th percentile] scores of 4.5 [4;5] for ChatGPT, 3 [3;4] for DanskGPT, and 5 [4;5] for the clinician’s answers, with a significant difference observed.
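The summary format used above, median [25th;75th percentile], can be reproduced with a short sketch such as the following. The example scores are hypothetical, the helper name `summarize` is an assumption, and NumPy's default linear interpolation is used; other quantile definitions can give slightly different quartiles for small samples.

```python
import numpy as np


def summarize(scores):
    """Return 'median [Q1;Q3]' for a sequence of ordinal scores."""
    q1, median, q3 = np.percentile(scores, [25, 50, 75])
    # ':g' drops trailing zeros, so 4.0 prints as 4 and 4.5 stays 4.5.
    return f"{median:g} [{q1:g};{q3:g}]"


# Hypothetical 1-5 ratings for one response source, NOT the study data.
example_scores = [5, 4, 5, 5, 4, 5, 3, 5, 4, 5]
print(summarize(example_scores))  # prints "5 [4;5]"
```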

Score heatmap. Heatmaps of the clinician survey scores, presenting average clinical accuracy and sentence construction quality scores for the clinician, DanskGPT, and GPT-4 across the 10 questions.
In conclusion, the results suggest that LLM-based chatbots have the potential to serve as supplementary counseling tools in gestational diabetes care. However, further evidence is needed to consolidate these findings and investigate potential limitations.
Supplemental Material
Supplemental material, sj-docx-1-dst-10.1177_19322968241265882 for The Potential of Large Language Model-Based Chatbot Solutions for Supplementary Counseling in Gestational Diabetes Care by Lukas Lindstrøm, Mia Clausen, Nina Albrektsen Jensen, Maria Hartman Nielsen, Amar Nikontovic and Simon Lebech Cichosz in Journal of Diabetes Science and Technology
Footnotes
Abbreviations
GDM, Gestational diabetes mellitus; AI, Artificial intelligence.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
