Abstract
Background:
Artificial intelligence (AI), specifically deep learning-based large language models (LLMs), is quickly emerging as a tool for supporting clinical decision-making. We asked whether current LLMs can accurately and safely perform clinical decision-making for obesity medication prescribing.
Methods:
Twenty hypothetical patients visiting weight management clinics were created with detailed medical and personal histories. Four obesity medicine experts were asked to choose their first-choice weight loss medication, as were four LLMs: GPT-4o, Llama3, Gemini1.5, and Claude3.5. All the LLMs underwent identical prompt tuning and were configured with a uniform temperature setting of 1.0. All medication choices by the obesity medicine experts were presumed correct, and a binary system was used to record whether each LLM did or did not select one of the medications proposed by the experts. Hallucination incidence and high-risk medication choices were also evaluated.
Results:
Gemini1.5 and GPT-4o had the highest matching rate with experts (55%), followed by Claude3.5 Sonnet (50%) and Llama3 (30%). Only Gemini1.5 avoided high-risk or contraindicated medications in all cases. Among the LLMs, Llama3 had a significantly higher hallucination incidence rate than the other models (p < 0.01). Some of the LLMs’ choices and rationales were creative and reasonable, offering insights that the experts had not considered.
Conclusion:
LLMs can offer valuable insights into clinical decisions for obesity treatment. However, safety and accuracy concerns persist due to hallucinations, and it is premature to use LLMs as clinical decision support tools. They need refinement through fine-tuning, detailed prompt tuning, real-world data incorporation, and rigorous validation to ensure reliability and safety.
Introduction
Artificial intelligence (AI), specifically deep learning-based large language models (LLMs), is quickly emerging as a tool to support clinical decision-making. In health care, AI is already used for multiple applications: to aid diagnostics, enhance the efficiency of health systems, personalize treatment recommendations, offer risk predictions, identify patients for early interventions, and assist with clinical documentation.1–5 Although still in the early phases, LLMs have already demonstrated their ability to improve the efficiency and effectiveness of clinical practice.1
In clinical decision-making, especially when prescribing antiobesity medication (AOM), the process requires consideration of both clinical and nonclinical factors. Consequently, the final choice of medication can vary among providers, and overlooking any of these factors may significantly affect the treatment decision. LLMs could support this process by helping providers weigh these factors systematically and minimize errors. AI approaches to AOM prescribing that improve efficiency and accessibility should be considered, given the large mismatch between the supply of health care providers who can deliver comprehensive weight management care and the population-level need. It is estimated that by 2030, 50% of the global population will be living with either overweight or obesity,6 underscoring the importance of exploring all avenues for safe medication prescribing to improve access to AOMs for patients, including decision support from LLMs.
However, the safety of LLM tools as clinical decision-making aids has not been fully evaluated across specific clinical scenarios, including AOM prescribing. AI as a clinical decision-making tool for medication prescribing has previously been used in a number of applications,7 including: support for adverse drug reaction detection and prediction,8–10 helping pharmacists and physicians more easily sort through patient data and avoid prescribing errors via clinical decision support systems11 and computerized physician order entry,12 and individualized dose optimization for chemotherapy.13 All of these AI-aided prescribing strategies aim to reduce errors related to medication interactions and improve medical system efficiency.
In this study, we explore whether four LLMs can accurately and safely perform clinical decision-making for AOM prescribing. We compared the LLMs’ AOM decisions with choices made by obesity medicine expert clinicians, including physicians, nurse practitioners, and physician assistants. We evaluated unsafe choices by the LLMs, as well as creative or surprising medication decisions, and explored the rationales the LLMs provided for their choices.
Methods
Twenty hypothetical patients visiting weight management clinics were created with detailed medical and personal histories (patient cases in Supplementary Data S1). Medical information for the patients included age, gender, occupation, medical insurance coverage, chief complaint, history of weight challenges and prior attempts to decrease weight, past medical history, family and social history, a brief dietary history, current medications, a physical exam (including height, weight, body mass index, blood pressure, heart rate, respiratory rate, temperature), and laboratory tests (basic metabolic panel, complete blood count, liver enzymes, thyroid stimulating hormone and free T4, fasting blood glucose, lipid panel, and A1c). Based on the information provided in the cases, four obesity medicine experts (MD, NP, or PA) were asked to choose their first-choice weight loss medication, as were four LLMs: GPT-4o by OpenAI, Llama3 (70B model) by Meta, Gemini1.5 by Google, and Claude3.5 Sonnet by Anthropic.
For prompt tuning, each LLM was accessed through an AI testing platform while maintaining a uniform temperature setting of 1.0 across all models. Specifically, GPT-4o was run on GPT Playground, Gemini1.5 on Google AI Studio, Claude3.5 Sonnet via the Anthropic API, and Llama3 on Ollama. Except for Llama3, which was operated offline on a personal computer, all models were tested online.
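The study used the vendors’ interactive testing platforms rather than scripts; purely as an illustration of the uniform temperature configuration, a roughly equivalent programmatic setup using each vendor’s public Python SDK might look like the following sketch (the model identifiers and prompt string are placeholders, not the study’s exact configuration).

```python
# Illustrative sketch only: the study used interactive platforms (GPT
# Playground, Google AI Studio, Anthropic API, Ollama), not this script.
# Model names and the prompt below are placeholders.
import os
from openai import OpenAI
import anthropic
import google.generativeai as genai
import ollama

prompt = "Case 1: <patient history>. Choose one antiobesity medication."
TEMPERATURE = 1.0  # uniform setting across all four models

# GPT-4o via the OpenAI SDK
gpt = OpenAI().chat.completions.create(
    model="gpt-4o",
    temperature=TEMPERATURE,
    messages=[{"role": "user", "content": prompt}],
).choices[0].message.content

# Claude 3.5 Sonnet via the Anthropic SDK
claude = anthropic.Anthropic().messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    temperature=TEMPERATURE,
    messages=[{"role": "user", "content": prompt}],
).content[0].text

# Gemini 1.5 Pro via the google-generativeai SDK
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini = genai.GenerativeModel("gemini-1.5-pro").generate_content(
    prompt, generation_config={"temperature": TEMPERATURE}
).text

# Llama 3 70B locally via Ollama
llama = ollama.chat(
    model="llama3:70b",
    messages=[{"role": "user", "content": prompt}],
    options={"temperature": TEMPERATURE},
)["message"]["content"]
```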
All LLMs followed the same prompt tuning process. Each model could choose only one medication, and selecting nonmedical treatment options was not permitted. However, the models were allowed to opt for non-FDA-approved medications with proven benefits for weight loss, including combination therapies such as bupropion with naltrexone or phentermine with topiramate. They were instructed to consider both clinical and nonclinical information to recommend the most feasible and effective medication for each case (Supplementary Data S2).
All medication choices by the obesity medicine experts were presumed correct, and a binary system was used to record whether each LLM did or did not select a medication proposed by any of the four expert clinicians (Table 4). Hallucination incidence and high-risk medication choices by the LLMs were evaluated by a fifth obesity medicine expert clinician (MD).
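This binary scoring rule is simple enough to state as a short sketch; the case label and medication names below are illustrative placeholders, not study data.

```python
# Hypothetical sketch of the binary scoring rule: an LLM scores 1 on a
# case if its first-choice medication matches any expert's first choice.
expert_choices = {"case_01": {"semaglutide", "phentermine-topiramate"}}
llm_choices = {"case_01": "semaglutide"}

def match_score(case_id: str) -> int:
    """Return 1 if the LLM's pick appears among the experts' picks, else 0."""
    return int(llm_choices[case_id] in expert_choices[case_id])

matches = sum(match_score(c) for c in expert_choices)
matching_rate = matches / len(expert_choices)  # e.g., 11/20 = 55% in Results
```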
Categorical variables were expressed as numbers and percentages (frequencies). To compare the agreement rate with the experts, the AI hallucination incidence rate, and the rate of high-risk medication selection among the LLM groups, Fisher’s exact test was applied. All analyses were performed using SPSS version 29.0 (IBM, Chicago, Illinois, USA). A two-sided P value of <0.05 was considered statistically significant.
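For reference, the reported group comparison can be reproduced outside SPSS. The sketch below applies Fisher’s exact test via scipy (which handles 2x2 tables, so models are compared pairwise) to the hallucination counts reported in the Results (Table 5); it is an illustrative re-computation, not the study’s actual analysis pipeline.

```python
# Illustrative re-computation only; the study's analyses were run in SPSS.
# scipy's fisher_exact handles 2x2 tables, so models are compared pairwise.
from scipy.stats import fisher_exact

N_CASES = 20
llama3_hallucinations = 17   # from the Results (85%)
gemini_hallucinations = 2    # from the Results (10%)

table = [
    [llama3_hallucinations, N_CASES - llama3_hallucinations],
    [gemini_hallucinations, N_CASES - gemini_hallucinations],
]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"two-sided p = {p_value:.2e}")  # far below 0.05, consistent with p < 0.01
```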
This project was reviewed by the Mass General Brigham Institutional Review Board, classified as nonhuman subjects research, and therefore exempt from IRB approval. The decision was made on June 18, 2024.
Results
All four LLMs made medication choices for the patient cases with varying degrees of alignment with expert opinions. They also exhibited hallucinations (medication choices that included misinformation or incorrect rationales, even when they matched expert opinions) and unsafe AOM choices (Tables 1–5). Gemini1.5 and GPT-4o had the highest matching rate of medication choices with expert clinicians (55%), followed by Claude3.5 Sonnet (50%) and Llama3 (30%) (Tables 1, 4–5, Fig. 1). The worst performing model was Llama3, which made four high-risk medication choices (20%) and had 17 (85%) hallucinations. The best performing was Gemini1.5 Pro, which made no high-risk drug choices and had only two (10%) hallucinations. In the middle were GPT-4o, which made two high-risk drug choices (10%) and had four (20%) hallucinations, and Claude3.5 Sonnet, which made one (5%) high-risk medication choice and had four (20%) hallucinations (Tables 1, 4–5, Fig. 1).
Table 1. Antiobesity Medication Choices by Large Language Models. H, hallucination; U, unsafe AOM choice; C, creative choice.
Table 2. Example Hallucinations and Unsafe Medication Choices by Large Language Models.
Table 3. Valid Artificial Intelligence Prescribing Rationales That Did Not Match Clinician Choices (Creative Choices).
Table 4. Antiobesity Medication Choices by Expert Clinicians: Binary Yes/No (1/0), if the Large Language Model Matched with at Least One Expert Opinion.
Table 5. Number of Large Language Model Hallucinations, High-Risk Drug Choices, and Antiobesity Medicine Choices Matched with Expert Clinicians (out of 20 Cases).

Fig. 1. Percentage of LLM hallucinations, high-risk drug choices, and AOM choices matched with expert clinicians (out of 20 cases). LLM, large language model; AOM, antiobesity medicine.
There was also overlap between hallucinations and unsafe medication choices; some choices by the LLMs were contraindicated (unsafe) but were still selected with false rationales (hallucinations). For example, Claude3.5 Sonnet chose Qsymia (phentermine, topiramate) for a patient with kidney stones, stating, “Kidney stone consideration: topiramate, a component of Qsymia, has been shown to have some protective effects against kidney stones, which is relevant given Linda’s history.” This is incorrect, as topiramate increases the risk of kidney stones.14 Only Gemini1.5 avoided high-risk or contraindicated medications in all cases. Further examples of hallucinations and unsafe medication choices, with the rationales provided by each LLM and expert clinician comments, can be found in Table 2.
Discussion
There was significant variability in performance across LLMs when evaluated for the number of hallucinations, unsafe medication choices, and choices matched with expert clinicians (Table 5, Fig. 1). Although all the LLMs had hallucinations, Llama3 had significantly more than the others. This may be because Llama3 was the only model run as a standalone program on a personal computer, which limited its choices and learning. A “Meta AI” chatbot based on Llama3 is operated by Meta; however, it is already pretrained with multiple algorithms, and all medical questions are strictly prohibited, so we were unable to perform any tests using this online-based chatbot. Gemini1.5 was the only LLM that did not choose any contraindicated or high-risk medications across all 20 cases. Except for Llama3, which chose either Wegovy (semaglutide) or Qsymia (phentermine, topiramate) for all 20 cases, all the LLMs showed reasonable variability (>3 different medications) in their medication choices. The variability in hallucinations and medication choices between the LLMs is likely due to differences in pretraining, the period during which each model is trained on a large general dataset, which differs across LLMs. This is also likely why no LLM chose a newer medication, such as Zepbound (tirzepatide): the study was run in July 2024, and the LLMs did not have knowledge of Zepbound in their training datasets.
Some of the LLMs’ choices and rationales included creative insights that are still valid even though they did not match the experts’ choices (Table 3).
LLMs also accurately factored insurance considerations into AOM decision-making. For example, Gemini1.5 Pro selected metformin for a self-paying patient with the rationale, “John is self-paying, making cost a significant factor in medication choice. He’s already on metformin, and increasing his dosage for weight loss would be considerably more affordable than starting a newer, branded medication.” While weight loss with metformin will likely be minimal, this is still a creative rationale for a medication choice and illustrates the sophistication of the LLM’s decision-making in select cases. As another example of sophisticated risk-benefit and insurance coverage considerations, Gemini1.5 Pro chose Wegovy for a Medicare patient, stating, “Cardiovascular Benefits: Given Arthur’s history of stroke and myocardial infarction, the established cardiovascular risk reduction benefits associated with Wegovy (semaglutide) make it a compelling choice. Medicare Coverage: Medicare coverage for weight-loss medications is limited, so careful verification of Arthur’s specific plan and potential out-of-pocket costs for Wegovy is essential. While cost is a factor, the potential long-term health benefits and cost savings associated with preventing future cardiovascular events should be considered. Discussing these factors and exploring potential financial assistance programs (if needed) with Arthur would be essential.” In this case, a portion of the cost of Wegovy should be covered by Medicare given the patient’s history of both stroke and myocardial infarction, but verifying coverage remains a relevant consideration, and this rationale illustrates another example of sophisticated decision-making by an LLM.
A number of limitations are inherent to LLM use as clinical decision tools. One issue inherent in LLM use for AOM prescribing is the rapidly changing nature of the AOM landscape and clinical recommendations, as well as changes in insurance coverage plans. Real-world data incorporation and pretraining take time for LLMs, which means that in a rapidly changing clinical landscape, they may not be able to offer clinical recommendations that are current. As discussed above, pretraining lag time is likely why none of the LLMs chose a newer medication such as Zepbound, as this study was run in July 2024. Another limitation is the tradeoff introduced by higher temperature settings for LLMs: a higher temperature can act as a proxy for creativity but can also produce inconsistent outcomes. For example, if the same query is presented to the same LLM multiple times, it may yield different responses and varied rationales for selecting an AOM.15
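As a brief illustration of this temperature tradeoff, the sketch below (the model name and prompt are placeholders, not the study’s exact queries) re-issues an identical query several times; at temperature 1.0 the sampled responses, and hence the recommended AOM, can differ across runs.

```python
# Placeholder model and prompt; illustrates response variability at
# temperature 1.0, not the study's actual configuration.
from openai import OpenAI

client = OpenAI()
prompt = "Given this patient case <...>, choose one antiobesity medication."

answers = set()
for _ in range(5):
    resp = client.chat.completions.create(
        model="gpt-4o",
        temperature=1.0,  # higher temperature -> more varied token sampling
        messages=[{"role": "user", "content": prompt}],
    )
    answers.add(resp.choices[0].message.content)

# At temperature 1.0 the five runs may produce several distinct answers;
# lowering the temperature toward 0 makes responses far more repeatable.
print(f"{len(answers)} distinct responses out of 5 runs")
```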
Another limitation of this study is that the LLMs were not trained specifically for obesity medicine. Consequently, there were still a significant number of errors in formulating treatment plans. The results might differ if the LLMs underwent domain-specific training; it is possible to train a model to increase the accuracy of generated data, diagnoses, and treatment options, similar to how we train human clinicians in the use of AOMs.
Based on these results, many LLM tools are not yet ready to act as the primary decision maker for AOMs. Providers should counsel patients that LLMs and other AI tools should never be used alone to diagnose or treat medical conditions, and patients should always consult with a trained physician to discuss risks and benefits of different medications for overweight and obesity.
Conclusions
AI is emerging as a promising tool for information synthesis and clinical problem solving in medicine. However, none of the LLMs tested in this study exceeded a 55% matching rate with obesity expert clinicians for AOM choices. Safety and accuracy concerns persist due to hallucinations and unsafe medication choices. Therefore, it is premature to use any of the four LLMs we tested (GPT-4o, Llama3, Gemini1.5, and Claude3.5 Sonnet) as clinical decision support tools for AOM prescribing. At the same time, our study highlights that LLMs can offer meaningful perspectives on clinical decisions for obesity treatment, including ideas that clinicians might not typically consider. Matching rates and safety concerns can be addressed through fine-tuning, retrieval-augmented generation, detailed prompt engineering, and/or the integration of real-world data into LLMs. Rigorous validation will be essential to ensure their dependability and safety before these models can be used in clinical settings to guide obesity treatment.
Footnotes
Authors’ Contributions
D.W.K. conceived of the presented idea and served as the corresponding author. B.F. developed the article. H.L. performed the data analysis. T.T., T.D., and C.C. contributed to data collection and verified the findings. C.M.A. supervised the work and encouraged B.F. throughout article development. All authors discussed the results and contributed to the final article.
Author Disclosure Statement
B.F.—Nothing to disclose. H.L.—Nothing to disclose. T.T.—Nothing to disclose. T.D.—Nothing to disclose. C.C.—Nothing to disclose. C.M.A. has participated on advisory boards for Altimmune, Inc; BioAge; Biolinq Inc; CinFina Pharma, Inc; Cowen and Co, LLC; Covidien LP; Form Health, Inc; Fractyl Health, Inc; Gelesis SRL; Lilly USA, LLC; L-Nutra, Inc; NeuroBo Pharmaceuticals, Inc; Nutrisystem; OptumRx, Inc; Pain Script Corp; Palatin Technologies, Inc; Pursuit By You; Redesign Health Inc; ReShape Lifesciences Inc; Riverview School; Roman Health Ventures Inc; Vida Health, Inc; Veru Inc; Wave Life Sciences; Xeno Biosciences; and Zyversa Therapeutics, Inc. D.W.K.—Nothing to disclose.
Funding Information
No funding was received for conducting this study.
References
Supplementary Material
Supplementary Data S1 (hypothetical patient cases) and Supplementary Data S2 (prompt tuning instructions) accompany this article.
