Abstract
Background
Worldwide, patients are increasingly being offered access to their full online clinical records including the narrative reports written by clinicians (so-called “open notes”). Against these developments, there is growing interest in the use of generative artificial intelligence (AI) such as OpenAI's ChatGPT to co-assist clinicians with patient-facing documentation.
Objective
This study aimed to explore the effectiveness of OpenAI's ChatGPT 3.5 and GPT 4.0 in generating patient-facing clinical notes from three fictional general practice narrative reports.
Methods
On 1 October 2023 and 1 November 2023, we used ChatGPT 3.5 and 4.0, with a standardized prompt, to generate patient-facing notes from three validated fictional general practice notes written in the style of British primary care records, covering three commonly presented conditions: (1) type 2 diabetes, (2) major depressive disorder, and (3) a differential diagnosis for suspected bowel cancer. Outputs were analyzed for reading ease, sentiment, empathy, and medical fidelity.
Results
ChatGPT 3.5 and 4.0 wrote longer notes than the originals and embedded more second-person pronouns, with ChatGPT 3.5 scoring higher on both. ChatGPT expanded abbreviations, but readability metrics showed that the notes required a higher reading proficiency, with ChatGPT 3.5 demanding the most advanced level. Across all notes, ChatGPT offered higher signatures of empathy across cognitive, compassion/sympathy, and prosocial cues. Medical fidelity ratings varied across all three cases, with ChatGPT 4.0 rated superior.
Conclusions
While the transformed notes showed improved sentiment and empathy metrics, compared with the originals they required higher reading proficiency and omitted details, impacting medical fidelity.
Introduction
Worldwide, health institutions are increasingly opening online patient access to medical records via secure portals and apps. 1 Online record access (ORA) can include test results, lists of medications, and narrative reports written by clinicians (the latter often referred to as "open notes"). 2 In some countries, the practice is advanced.3,4 In the United States, from April 2021 the 21st Century Cures Act mandated that providers offer patients access to their online clinical records, without charge. 5 In the Nordic countries, ORA has been implemented incrementally, starting around 2010. 1 For example, in Finland, Omakanta (My Kanta) was rolled out between 2010 and 2015. 6 In Sweden, patients first obtained ORA through 1177 in 1 of 21 regions in 2012, 7 with countrywide implementation reached by 2018. In the United Kingdom, from November 2023 it became mandatory for general practitioners (GPs) working in NHS England to enable prospective ORA by default to patients aged 16 or older. 8
In the era of open notes, the functionality of medical records is evolving. The record is no longer only an aide-mémoire or communication tool for clinicians but now has the additional purpose of rapidly conveying health information to patients and their caregivers.1,9 With the knowledge that patients may now be reading what they write, in some surveys physicians report changing how they document medical information and the language they use.10–12 While the extent of changes to documentation post-ORA is not well understood, 13 conceivably some modifications might be positive, for example removing dense medical terminology, problematic acronyms (e.g. "SOB" for shortness of breath, or "F/U" for follow up), or omitting potentially offensive medical vernacular (such as "patient denies" or "patient complains of"). However, other changes might risk undermining the accuracy and completeness of records.14,15 Additionally, although few objective studies have explored the potential for additional work burdens of open notes, 16 some clinicians report spending longer writing documentation for ORA.12,17 Some suggest that there is an essential tension in dual-function documentation written for both clinicians and patients, 18 and that notes should ideally be created for their respective readerships.18,19
Increasingly, it is recognized that the use of generative artificial intelligence (AI) may offer a long-term strategy to assist clinicians with undertaking such documentation, including co-writing open notes.19–21 The increased accessibility of large language models (LLMs) such as OpenAI's Chat Generative Pre-trained Transformer (ChatGPT), Meta's Large Language Model Meta AI (LLaMA), and Google's Pathways Language Model 2 (PaLM2) makes them particularly viable. These tools have the ability to recognize and summarize data, 22 and to present content in a variety of requested styles, including embedding empathic and supportive language. 23 In addition, the speed of responses combined with their conversational fluency means uptake has been rapid.
Moreover, preliminary evidence already suggests that clinicians are adopting LLM-powered chatbots for a variety of tasks including assisting with documentation.24,25 In October 2023, a survey conducted with the American Psychiatric Association found that 44% of respondents had used ChatGPT 3.5 and 33% had used 4.0 "to assist with answering clinical questions," with 70% of psychiatrists believing that "documentation will be/is more efficient" as a result of these tools. 25 Even more pressingly, a study of 1006 UK GPs conducted in February 2024 found that 20% reported using generative AI tools in clinical practice; of those who answered affirmatively and were invited to clarify further, 29% reported using these tools to generate documentation after patient appointments. 26 These findings highlight the need for further research into the adequacy of these tools to assist with writing clinical notes.
Despite their considerable promise, these tools come with well-documented limitations.27–29 The nature of the datasets on which these models are trained is critical: any biases embedded in the training set, or among the human agents involved in labeling or training the AI, may become baked into responses. In addition, the more accessible LLM-powered chatbots are not exclusively trained on medical texts and treat the varied quality of information available on the internet indiscriminately. Furthermore, the routine under-recruitment of female participants, racial and ethnic minorities, and seniors in research means that disparities may already be embedded in published medical texts.30–34 Combined, these factors influence the scope of coded and biased responses offered by LLM-chatbots. 35
Other problems include a lack of consistency in responses, and "hallucinations"—the tendency of LLM-chatbots to invent false information. 29 Despite their fluency, these tools do not understand the information fed into them, leading to a variety of elementary linguistic shortcomings, such as an inability to understand negation and a susceptibility to confusion from changes in word order or rephrasing of questions.27,36 For example, inputting the same question to ChatGPT rarely elicits the same response.37,38 Yet, their quick and compelling conversational tone means LLM-chatbot responses can appear authoritative and factual, leading to risks of misinformation.39,40
Notwithstanding these shortcomings, LLM-powered chatbots carry enormous potential to co-assist clinicians when it comes to writing documentation. We emphasize that ChatGPT is not trained specifically on medical data, and therefore medical-grade models such as Google's Med-PaLM 2 may do a superior job writing documentation. However, given the commercial availability of ChatGPT, the fact that it appears to be the most widely adopted LLM chatbot, 26 and preliminary studies indicating physicians are already using it, we have chosen to focus on ChatGPT in this study.25,26 Furthermore, while the promise of documentation assistance is frequently alluded to in academic medical journals,41–43 to date there is scarcely any experimental exploration of the effectiveness of LLM tools in writing open notes. 44 In a randomized controlled study led by Baker et al., 45 ChatGPT generated longer and more detailed documentation compared with typing or dictation methods; however, it also embedded errors and hallucinations.
To address current research gaps, we used ChatGPT 46 —the most widely adopted generative AI chatbot—to examine its ability to translate physician documentation into patient-facing notes. First, we aimed to compare the linguistic properties of the original and generated ChatGPT notes. Second, we aimed to assess the patient-centeredness of the original and ChatGPT notes by analyzing for readability and empathy. Third, we aimed to assess the medical fidelity of the generated notes.
Methods
To answer the research questions, we carried out a study in the United Kingdom between October 2023 and January 2024, in which we used ChatGPT to transform fictional primary care notes into patient-facing notes.
Materials
We used three fictional primary care notes written in the style of free text entries by GPs in England. These entries were devised by one of the authors (BM) who works as a GP in England. Each note was independently validated for authenticity by a panel of six UK-based GPs. Entries were devised to encompass three commonly presented chronic conditions in primary care: (a) a diagnosis of type 2 diabetes, (b) a diagnosis of major depressive disorder, and (c) a differential diagnosis where the probable opinion was bowel cancer (see Supplemental Appendix 1). Notes were devised to maximize authentic levels of detail including acronyms and potential for offensive language, and were deliberately fictionalized to avoid ethical concerns associated with using real patient clinical data, including potential for de-identification.28,47
Procedure
Each fictionalized note was copied and pasted by author CB into ChatGPT together with the following prompt: "Write an understandable and empathic clinical note to be read by the patient described in this record:". We compared the responses of ChatGPT 3.5, which is free to users and is described by OpenAI as "Our fastest model, great for most everyday tasks", and ChatGPT 4.0, which functions behind a paywall and is described as "Our most capable model, great for tasks involving creativity and advanced reasoning". 46 This was done on 1 October 2023 and repeated on 1 November 2023 in light of findings that ChatGPT's outputs are inconsistent and may change over time. 48 The resulting ChatGPT notes were then saved in a format preserving text formatting and layout (Supplemental Appendix 1).
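The study used the public ChatGPT web interface rather than the API. For readers wishing to approximate the procedure programmatically, the sketch below issues the study's prompt through OpenAI's Python SDK; the model identifiers and the placeholder note are our assumptions, and hosted API models differ from the web versions used here, so outputs will not match ours exactly.

```python
# Minimal sketch (not the procedure we followed): issuing the study prompt
# via OpenAI's Python SDK. Model names are illustrative stand-ins for the
# web versions of ChatGPT 3.5 and 4.0 used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = ("Write an understandable and empathic clinical note to be read "
          "by the patient described in this record:")

def transform_note(original_note: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the model to rewrite a clinician note as a patient-facing note."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{PROMPT}\n\n{original_note}"}],
    )
    return response.choices[0].message.content

# Example usage with a hypothetical placeholder note (the study's fictional
# notes are in Supplemental Appendix 1):
# print(transform_note("T2DM review. HbA1c 58. F/U 3/12.", model="gpt-4"))
```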
Data analysis
To measure the linguistic properties of the notes, we compiled word counts, words per sentence, pronoun usage (categorized using LIWC-22), and counts of unique abbreviations for the original and generated notes.
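LIWC-22 is proprietary software; as a rough illustration of the pronoun tally reported in Table 1, the sketch below counts pronouns using the category lists given in the table note. The simple regex tokenizer is our assumption and will not reproduce LIWC-22 exactly.

```python
# Illustrative approximation of the LIWC-22 pronoun counts in Table 1.
# The category word lists are taken from the table note; the tokenizer
# is simplified and is an assumption, not the study's pipeline.
import re
from collections import Counter

PRONOUNS = {
    "1st person, sg.": {"i", "me", "my", "myself"},
    "1st person, pl.": {"we", "our", "us", "lets"},
    "2nd person": {"you", "your", "u", "yourself"},
    "3rd person, sg.": {"he", "she", "her", "his"},
    # "themsel*" is a LIWC wildcard matching themself/themselves
    "3rd person, pl.": {"they", "their", "them", "themself", "themselves"},
}

def pronoun_counts(text: str) -> Counter:
    """Tally pronouns per LIWC-style category in a note."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for category, words in PRONOUNS.items():
        counts[category] = sum(tokens.count(w) for w in words)
    return counts

def words_per_sentence(text: str) -> float:
    """Compute the WPS metric used in Table 1 (crude sentence splitting)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / max(len(sentences), 1)
```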
To assess patient-centeredness, AK, JH, IM, and CB compiled readability and empathy metrics.
To derive readability metrics, we applied the Flesch Reading-Ease test, the Flesch-Kincaid Grade Level test, and the Gunning Fog Index to each note. We also conducted sentiment analysis to profile the emotions and sentiments signaled in each note.
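This excerpt does not state which implementation was used to compute the readability scores; below is a minimal sketch using the open-source textstat package, assuming an off-the-shelf implementation approximates the scoring reported in Table 2.

```python
# Minimal sketch, assuming an off-the-shelf implementation; the excerpt
# does not name the tool used to compute the readability scores.
# pip install textstat
import textstat

def readability_metrics(text: str) -> dict:
    """Return the three readability scores reported in Table 2."""
    return {
        # Higher = easier to read (lower scores mean harder text).
        "flesch_reading_ease": textstat.flesch_reading_ease(text),
        # Approximate US school grade required to understand the text.
        "flesch_kincaid_grade": textstat.flesch_kincaid_grade(text),
        # Higher = harder; also expressed as a grade-like level.
        "gunning_fog": textstat.gunning_fog(text),
    }

# Example: compare an original note against a generated note.
# for name, note in {"original": original, "gpt35": gpt35_note}.items():
#     print(name, readability_metrics(note))
```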
The qualitative analysis employed a theoretically deductive thematic approach following the six-phase process outlined by Braun & Clarke. 52 To promote theme saturation, all ChatGPT notes were analyzed together. Deductive qualitative analysis was chosen for the analysis of empathy because there is considerable latitude in how empathy is defined in medical contexts with the potential to lead to problematic inferences.53,54 Following previous work, we interpreted empathy as a multifaceted construct encompassing four dimensions: affective empathy (the capacity to feel what others are feeling which may include affective reactions), cognitive empathy (the capacity to identify, interpret and demonstrate understanding about another person's emotional state), compassion/sympathy (signals of warmth or feeling for someone's wellbeing), and prosocial behavior (signals of helping).55–59 While ChatGPT is incapable of feeling or grasping human experiences, perceptions of empathy may be embedded via textual cues. Using the four empathy dimensions, authors AK and CB marked phrases and sentences in the original and ChatGPT notes which signaled these dimensions. Once the text passages that signal empathy were extracted, AK coded them through low-level descriptive codes. Double-coding of passages was permissible and codes were not restricted to a single empathy dimension due to semantic overlap between dimensions. Once all text passages were coded, AK created a list of categories by merging semantically similar codes. The category list for each dimension was drafted by AK and finalized through iterative discussions between AK and CB.
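To make the double-coding rule concrete, the sketch below shows one hypothetical way to represent coded passages so that a single passage can carry codes in more than one empathy dimension, as permitted in our scheme; the passage and codes are illustrative, not study data.

```python
# Hypothetical representation of the deductive coding scheme: a passage may
# be tagged with codes in more than one empathy dimension (double-coding).
from dataclasses import dataclass, field
from collections import Counter

DIMENSIONS = {"affective", "cognitive", "compassion/sympathy", "prosocial"}

@dataclass
class CodedPassage:
    text: str
    # Maps each applicable dimension to its low-level descriptive codes.
    codes: dict[str, list[str]] = field(default_factory=dict)

# Illustrative example (not from the study data):
passage = CodedPassage(
    text="I understand this news may feel overwhelming; please contact me "
         "with any questions.",
    codes={
        "cognitive": ["acknowledges emotional state"],
        "prosocial": ["invites contact with note writer"],
    },
)

def dimension_counts(passages: list[CodedPassage]) -> Counter:
    """Tally how often each dimension is signaled across all passages."""
    return Counter(dim for p in passages for dim in p.codes)

print(dimension_counts([passage]))  # Counter({'cognitive': 1, 'prosocial': 1})
```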
To assess medical fidelity, we invited a panel of UK-based GPs to rate how far each ChatGPT note preserved the clinical detail of the original on a 7-point Likert scale (from "1 – Not at all" to "7 – Fully preserves clinical detail") and to leave free-text comments on the generated notes.
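Table 4's note states that fidelity calculations excluded missing data; below is a minimal sketch of that aggregation, with hypothetical ratings standing in for the panel's responses.

```python
# Minimal sketch of the fidelity aggregation described in Table 4's note:
# mean Likert ratings computed with missing responses excluded.
# The ratings below are hypothetical, not study data.
from statistics import mean

def mean_rating(ratings: list[int | None]) -> float:
    """Average 1-7 Likert ratings, skipping missing (None) entries."""
    observed = [r for r in ratings if r is not None]
    return mean(observed) if observed else float("nan")

panel = [6, 5, None, 7, 4, 6]  # six raters, one missing response
print(round(mean_rating(panel), 2))  # 5.6
```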
Ethical considerations
Based on the NHS Research Ethics Committee Review Decision Tool, the study did not require ethical approval since we used only fictional clinical notes entered into a publicly accessible website and did not collect or analyze any personally identifying information. The invited panel of experts were informed of the purposes of the investigation and provided written consent to the further analysis of the data and its use in educational and scientific publications.
Results
Linguistic metrics
Linguistic metrics of the notes are presented in Table 1. For all three cases, the ChatGPT 3.5 notes were longer than the original. The 4.0 notes were also longer than the original but shorter than the 3.5 notes. The original notes contained no second-person pronouns (for example, "you," "your") in the diabetes and depression cases but some in the cancer case. Both ChatGPT 3.5 and 4.0 notes had a higher presence of second-person pronouns across all cases, with ChatGPT 3.5 being the highest.
Linguistic metrics of original note and ChatGPT notes from 1 October 2023.
Note: WPS: words per sentence. Pronouns detected by LIWC-22: 1st person, sg.—I, me, my, myself; 1st person, pl.—we, our, us, lets; 2nd person—you, your, u, yourself; 3rd person, sg.—he, she, her, his; 3rd person, pl.—they, their, them, themsel*. In the count of abbreviations, only unique instances were included and “XY” was excluded as it was placeholder initials for the fictitious note.
There were more abbreviations in the original notes than in any of the generated ones, and none of them were explained in the original notes. ChatGPT 3.5 decoded (expanded) most abbreviations, while ChatGPT 4.0 more often explained them.
Readability metrics
For all cases, the ease of readability decreased from the original note to the ChatGPT 3.5 and 4.0 notes according to the Flesch Reading-Ease test, the Flesch-Kincaid Grade Level test, and the Gunning Fog Index (see Table 2). In almost all cases, the note generated by ChatGPT 3.5 required the highest reading proficiency.
Readability metrics of original note and ChatGPT notes from 1 October 2023.
Note: For the Flesch Reading-Ease test, the lower the score the more difficult the text is to read. For the Flesch-Kincaid Grade Level, the score represents a US grade level. For the Gunning Fog Index, the higher the score the more difficult the text is to read.
Empathy metrics
Sentiment analysis of the original and ChatGPT notes showed a general increase in positive-leaning sentiment in the generated notes (see Figure 1).

Frequency of identified emotions and sentiments in the original and ChatGPT notes from 1 October 2023.
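This excerpt does not name the sentiment tool behind Figure 1; as a hedged illustration only, a rule-based analyzer such as NLTK's VADER can produce the kind of positive/negative leaning scores summarized there.

```python
# Illustrative only: the excerpt does not name the sentiment tool used for
# Figure 1. NLTK's rule-based VADER analyzer is one way to score the
# positive/negative leaning of a note.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

note = "We will work through this together, and I am here to support you."
print(sia.polarity_scores(note))
# e.g. {'neg': 0.0, 'neu': ..., 'pos': ..., 'compound': ...}
```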
Results from the qualitative analysis of the presence of the predefined empathy dimensions can be seen in Table 3. There were only four instances in the original notes that were coded as containing empathic text. In contrast, a variety of empathic text was identified in the ChatGPT notes.
Dimensions of empathy in the original note and all ChatGPT notes broken into categories supplemented by exemplary quotes.
Note: Text passages could signal more than one empathy dimension and category, which led to the same categories appearing in multiple dimensions.
Medical fidelity
Six out of the 10 invited GPs judged the medical fidelity of the ChatGPT notes (see Table 4).
Medical fidelity measures of original note and ChatGPT notes from 1 October 2023.
Note: The Likert scale to rate the preservation of clinical detail ranged from “1 – Not at all” to “7 – Fully preserves clinical detail.” Calculations were made excluding missing data.
Seventy free-text comments were left by the panel of GPs, giving additional insight into their opinions of the generated notes. Emerging themes and categories are illustrated in Table 5.
Themes and categories emerging in the free-text comments left by GPs about the ChatGPT notes.
Discussion
Main findings
The study aimed to assess the effectiveness of using LLM-chatbots in writing patient-facing clinical documentation. We asked ChatGPT to rewrite three fictitious clinical notes in an understandable and empathic manner for the patient. The generated notes were longer and required higher reading proficiency but contained more positive sentiment and signaled several dimensions of empathy. Medical fidelity ratings varied, and the clinical detail of the original notes was not always preserved.
All ChatGPT notes increased in total length as well as sentence length compared to the original. Therefore, it is unsurprising that the readability tests indicated a need for a higher reading proficiency, as readability formulas are based on sentence and word lengths. 49 Notably, the original notes comprised multiple medical abbreviations, none of which were decoded or explained, in keeping with traditional clinical documentation practices. In contrast, ChatGPT notes reduced the use of abbreviations and usually explained most of them. Arguably this approach enhances readability as previous research has shown that patients find the use of medical abbreviations and jargon confusing when reading their notes.60,61
Empathy-signaling language was almost completely absent in the original notes, which was in stark contrast to the generated notes. The generated notes implied understanding of the hypothetical patient's emotional state, provided reassurance and validation, exhibiting cognitive empathy. Compassion and sympathy were also projected by ChatGPT, for example, through offers of support and well wishes, as well as through commitments to prioritizing the patient and not leaving them alone. Empathy was also expressed through descriptions of various prosocial behavior such as explanations, encouragement to contact the note writer, or further referrals. Affective empathy, a dimension of empathy associated with affective responses and “catching” another individual's feelings, was unsurprisingly lacking in both the original and generated notes.
While rewriting the notes in understandable and empathic language was explicitly requested in the prompt, we also assessed whether ChatGPT maintained the medical fidelity of the original notes, as is reasonably expected of a clinical note. Medical fidelity ratings varied, as did the number of GPs who reported they would use the generated notes unchanged. Analysis of the free-text comments revealed potential reasons for this. GPs liked various aspects of the generated notes: they appreciated the content, the layout, and the tone, and found they could envision using the notes with some alterations to the content. However, at times GPs did not appreciate the structure or the content, finding crucial details missing or incorrect assumptions made by extrapolating beyond the original note. Such hallucinations could pose potential ethical and legal risks, as noted in some comments ("My main concern about the ChatGPT note is that it does not include the examination findings which is a medicolegal issue." [GP2]).
While GPs found the tone of some of the notes acceptable, at other times they questioned whether the documentation was overly friendly or suitable in a British context. Conceivably, some patients might be offended while others might find the style beneficial. The tone and language that ChatGPT uses, and its cultural acceptability to different readers, is receiving increasing attention. As previously explained, ChatGPT is commonly perceived as authoritative due to its confident tone. 29 In our study, guided by the prompt for an empathic note, ChatGPT adopted a friendly tone uncharacteristic of the typical British context. This is likely due to the overrepresentation of US-originating text in the training of GPT, though OpenAI has not divulged what training materials were used. 62 While this may not pose a problem in the lay use of ChatGPT, it may hinder adoption in cultural and clinical settings outside the United States or require additional editing by physicians who use these tools.
Comparison with previous work
Our study supports previous research by Baker et al. 45 which shows ChatGPT writes longer and more detailed notes. As reported in that study, our study suggests that ChatGPT has the potential to improve clinical documentation by producing more comprehensive and organized notes. While our study found that fidelity was preserved for some notes, our panel of doctors, like that of Baker et al., also detected exclusions of information and potential hallucinations.
Applying readability metrics to analyze note content, in line with Pradhan et al., 63 who investigated the use of generative AI to write educational materials for cirrhosis, we found that ChatGPT responses, and especially those of version 3.5, were more demanding, requiring higher reading grades than the original GP note. However, as noted earlier, some aspects of the documentation, such as unpacking abbreviations, may have mitigated this, and further research is needed to gauge patients' opinions. Indeed, should patients use the internet to supplement their understanding of clinical documentation, as surmised by Blease, 20 a study by Walker et al. 64 reported that version 4.0 embeds information of comparable quality to static internet searches.
Ayers et al. 23 reported that ChatGPT offered more empathic responses than clinicians, and our study supports this finding. Advancing this work, we probed the generated responses to analyze different dimensions of empathy and found ChatGPT offered a variety of signatures of empathy. An experimental study into lay perspectives on physician empathy by Gerger et al. 54 found that, compared with nonempathic interactions, quality of care was rated higher when physicians reacted with cognitively empathic or compassionate responses, with no significant difference reported between affective empathy and no empathy, which were rated as offering lower-quality care. Although our study did not include patient perspectives, given that generated responses were particularly high on cognitive empathy, compassion/sympathy, and prosocial behavior, our findings may suggest that patients might consider its documentation to be empathic. This is something that deserves further scrutiny, including whether patients can discern the difference between notes written by GPs and those written with the assistance of AI.
Notably, other research has examined the ethical perspectives on generative AI assisting with documentation, weighing up the benefits and risks especially with respect to exposing patients’ private and sensitive health information to third parties via these tools.19,20,47 Allen et al. 65 argue that, in light of the quality of disclosures that clinicians currently offer to patients, there may be an important role for generative AI in augmenting and strengthening patient autonomy by improving informed consent processes. The present empirical study does not directly resolve such concerns but can help inform answers to those questions by offering further information on the quality of documentation that ChatGPT can offer.
Regulation of generative AI is evolving rapidly.66,67 In the European context, including England, ORA is widespread, but authorities, including in the EU, are currently reviewing whether OpenAI's ChatGPT complies with the General Data Protection Regulation (GDPR) when informed consent is not obtained, and whether processing without individual patient consent can be justified on public health grounds. 68 In the United States, such tools are already becoming embedded in electronic health systems to assist with administrative tasks. For example, Epic, the US software vendor with the largest national share of hospital electronic medical records, 69 is piloting the integration of ChatGPT services 70 aimed at compliance with the Health Insurance Portability and Accountability Act (HIPAA), which lays out federal standards for protecting patients' sensitive health information from being shared by "covered entities"—that is, providers—with other third parties. Furthermore, an Azure HIPAA-compliant ChatGPT 4.0 service already exists, 71 with new speech-to-text innovations underway in this space. 72 Despite regulatory and implementation advances, it is unclear how legislation will intersect in a practical way with these tools in clinical practice. 20 Medical bodies have issued advice about the ethical use of these tools, although guidance has been criticized as limited.73–77 Despite the lack of guidance, physicians already report using generative AI to assist with their job.24,25
Strengths and limitations
This study has some strengths and limitations. The inclusion of diverse cases within primary care, and fictional notes written by a practicing UK GP and authenticated by a panel of practicing GPs, are strengths of the study. Our study compared the quality of the original note with both ChatGPT 3.5 and 4.0 responses, a consideration that is important if we are to compare these tools with current practice. Going beyond other studies, we also examined empathy as a multidimensional construct, exploring the signature of empathic cues in the documentation in more detail.23,78 Another strength was employing linguistic metrics to assess pronoun usage, and an original finding of the paper was the higher adoption of more directive, second-person pronouns in ChatGPT's patient-facing notes.
The study has several limitations. The cases were restricted to three fictional clinical notes to uphold privacy and confidentiality; conceivably, however, use of a wider range of documentation written by actual doctors could have influenced our findings. The prompt was also limited to a single sentence, without the provision of examples or corrective follow-up feedback. While this was done in an attempt to recreate a typical ChatGPT user, prompt engineering in a healthcare context is a rapidly developing field. 79 In addition, our nonpurposive sample of GPs rating the documentation was small and may have affected the findings. Another limitation was that we used generative AI to rewrite documentation without input from GPs. However, it is not envisaged that these tools are close to replacing or disintermediating doctors in writing documentation. 22 Rather, commentators propose that if these tools comply with patient privacy requirements and are adopted in the future, clinicians may always be required to act as overseers to ensure note accuracy, appropriateness, and safe use.19,29 A potential limitation of sentiment analysis is that some words may be misclassified without additional contextual information; for example, the word "including" might be scored as positive sentiment within the phrase "including blood in your stool." The use of a deductive qualitative approach to empathy analysis helped to mitigate this limitation. Our study also did not include patients' perspectives on the content, style, and readability of ChatGPT outputs; ultimately, however, patients' views will be critical to understanding the adequacy of these tools in assisting with documentation, including how doctors should edit generative AI notes.
Further experimental studies are required to investigate the adequacy of these tools in assisting with clinical documentation, including how they respond to different prompts. 38 For example, studies could examine repeated prompts of the same fictional case to assess long-term medical fidelity and test the temporal consistency of content changes. Further, a variety of generative AI tools trained on clinical data are increasingly being adopted in practice. Studies should also examine physicians' views about the ease of use of these tools within clinical workflows, and their potential efficiency in assisting with documentation.80,81 Finally, more research is needed to explore patients' opinions on clinical documentation created, or co-created, by generative AI. For example, studies could explore how patients perceive "empathy" that is generated by chatbots.
Conclusions
Compared with fictionalized GP notes across three case studies, ChatGPT 3.5 and 4.0 wrote longer notes, embedding a higher presence of second-person pronouns. ChatGPT also decoded medical abbreviations, but readability metrics showed that the generated notes required a higher reading proficiency, with ChatGPT 3.5 demanding the most advanced level. Across all notes, ChatGPT offered higher signatures of empathy across cognitive, compassion/sympathy, and prosocial cues. Medical fidelity ratings varied across all three cases, with version 4.0 rated superior. Despite this, some GPs reported that they would be willing to use the generated notes unchanged. Our study highlights the need for deeper, ongoing analysis of the quality of clinical documentation generated by LLM-powered chatbots, and for greater guidance for clinicians about how to adopt these tools safely and ethically.
Supplemental Material
sj-docx-1-dhj-10.1177_20552076241291384 – Supplemental material for Generative artificial intelligence writing open notes: A mixed methods assessment of the functionality of GPT 3.5 and GPT 4.0 by Anna Kharko, Brian McMillan, Josefin Hagström, Irene Muli, Gail Davidge, Maria Hägglund and Charlotte Blease in DIGITAL HEALTH
sj-docx-2-dhj-10.1177_20552076241291384 – Supplemental material for Generative artificial intelligence writing open notes: A mixed methods assessment of the functionality of GPT 3.5 and GPT 4.0 by Anna Kharko, Brian McMillan, Josefin Hagström, Irene Muli, Gail Davidge, Maria Hägglund and Charlotte Blease in DIGITAL HEALTH
Acknowledgements
The authors thank the panel of GPs for contributing their expertise. The authors would also like to thank the five UK-based primary care physicians who validated the authenticity of the fictional primary care notes.
Contributorship
CB and AK were involved in conceptualization, investigation, supervision, and writing–original draft preparation; AK, CB, JH, and IM in data analysis; CB, AK, and BM in project administration; AK and JH in visualization; and AK, CB, BM, GD, JH, IM, and MH in writing–review & editing. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Data availability statement
Data are available as appendices.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
The study did not require ethical approval since we used only fictional clinical notes entered into a publicly accessible website and did not collect or analyze any personally identifying information.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: BM is funded by the National Institute for Health and Care Research (NIHR Award ref: NIHR300887). The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. AK, JH, and MH were funded by NordForsk through the funding to Nordic eHealth for Patients: Benchmarking and Developing for the Future, NORDeHEALTH, (Project #100477). CB and MH were funded by “Beyond Implementation of eHealth” (2020-01229) awarded by the Swedish Research Council for Health, Working Life and Welfare (FORTE). CB, JH, MH & AK were supported by “AI in Healthcare Unleashed: Responsible and Ethical Implementation of Large Language Model Chatbots in Clinical Workflows and Patient Care” (2024-00039) awarded by the Swedish Research Council for Health, Working Life and Welfare (FORTE).
Guarantor
CB
Supplemental material
Supplemental material for this article is available online.
References