Abstract
Introduction
Digital Health Technologies (DHTs) are currently being funneled through legacy regulatory processes that are not adapted to the particularities of this new technology class. In the absence of adequate regulation of DHTs, the briefing of a patient by their healthcare provider (HCP) as a component of informed consent can represent the last line of defense before potentially harmful technologies are employed on a patient.
Methods
This exploratory study utilizes a case vignette of a machine learning-based technology for the diagnosis of ischemic heart disease that is presented to a group of medical students, physicians, and bioethicists. What constitutes the necessary standard and content of the HCP–patient briefings is explored using a survey (N = 34). Whether participants actually provide a sufficient HCP–patient briefing is evaluated based on audio recordings.
Results and Conclusions
We find that participants consider that the use of artificial intelligence in medical contexts should be declared to patients and argue that the explanation should currently follow the standard required of other experimental procedures. Further, since our study provides indications that the implementation of HCP–patient briefings lags behind the identified standard, opportunities for incorporating training on the use of DHTs into medical curricula and continuous training schedules should be considered.
Introduction
An increasing number of Digital Health Technologies (DHTs) are coming to market and are gaining in use among patient–consumers and Healthcare Professionals (HCPs). While there is no universal definition of DHTs, we refer to them as digital technologies applied to the execution of medical functions, such as diagnosis, prognosis, and treatment of disease and other conditions of human health. The U.S. FDA AI/ML database showed 521 entries of artificial intelligence (AI) or machine learning (ML)-enabled approved medical devices as of October 2022. 1 While not the same, the terms DHT and AI are often used interchangeably, although we suggest that AI should be understood as a technology enabling certain forms of DHT. 2 The pipeline of new devices coming to market is well filled, given high funding from both business 3 and governmental/academic sources. Even though DHTs are becoming increasingly prevalent in clinical practice, 4 regulation currently lags significantly behind technological capability and use. As a consequence of this “regulatory gap,” harms may occur in violation of principles such as safety/non-maleficence, efficacy/beneficence, and privacy, among others. 5
Legislative bodies and regulatory agencies may not be able to catch up immediately to this regulatory gap and prevent unsafe, ineffective, or otherwise harmful technologies from coming to market. Thus, user handling of DHTs—both by patient–consumers and professional HCPs—may act as another layer of preventing harm. In the context of professional use, the HCP–patient explanation of medical procedures takes a central role in enabling the ethical use of technologies via establishing informed consent.
At the same time, much debate is currently taking place around the transparency of the underlying mechanisms of digital health devices as a requirement for their being considered trustworthy. An entire field of Explainable AI has evolved. 6 It is striking that such discussions center on the complex interactions between transparency and other ethical principles, primarily fairness, but not on practical aspects and the currently prevalent standard of transparency in medicine.
That this is not a debate distanced from patients’ concerns has recently been shown. In a focus group-based study, patients indicated concerns about the safety of AI, suffering from introduced biases and reduced autonomy—and indicated they expect their HCP to ensure that AI use is safe. 7 The satisfaction of all stakeholders within healthcare delivery will lead to a higher degree of implementation and potential benefits to healthcare systems, payors, providers, and—most importantly—patients.
The HCP–patient explanation standard for digital health devices
What constitutes informed consent in detail depends on the jurisdiction, procedure, and patient. However, certain elements have universal applicability mirroring the bioethical principle of autonomy: (1) the patient must have legal and/or mental capacity to make decisions, (2) the decision of consent must be free of controlling influences and/or coercion, and (3) the patient must have the information necessary to make the decision. While all three elements are critical, the latter is most relevant when DHTs as a relatively new technology with limited application experience by users are involved.
What constitutes information necessary for the patient to make the decision? Over time and with a shift from a paternalistic to a more patient-centered approach, the reasonable patient standard has taken a dominant position across jurisdictions, that is, considering what an average patient would need to know as opposed to what a typical physician would say about a procedure. 8
This information content then depends largely on the nature of the procedure, that is, whether it is a low-risk and standard procedure or a high-risk and research study procedure representing the respective ends of the risk spectrum. In Switzerland, the Swiss Academy of Medical Sciences (SAMW) has issued a medicolegal guideline differentiating three classes of procedures requiring different levels of briefings: (1) standard therapy (with on-label or off-label subcategories), (2) experimental therapy, and (3) research study.
The closest category applicable to DHTs in this analogy is experimental therapy, which constitutes non-standard therapy or treatment in the absence of a standard treatment. This will remain the case until specific DHTs have become incorporated into standard practice. Given the close interconnection between medical functions (screening, diagnosis, prognosis, treatment, prediction, alleviation, monitoring), this definition of the standard should also be applied to such other acts, including diagnostic procedures that may be invasive or non-invasive. The SAMW has established a catalog of components that an HCP–patient explanation should contain in the case of an experimental therapy 9 that may be deemed a maximum standard in such non-research settings.
Contribution of this study
Although there are tools to establish informed consent, for example, leaflets or interactive patient decision-aids, the HCP–patient briefing in verbal format takes center stage. It can serve as an important reflection point for both HCP and patient in determining whether to proceed with an experimental procedure or its alternatives. Given the rapid developments in DHTs and the regulatory gap described above, it is the last line of defense in preventing harm. What the HCP–patient briefing should look like and whether healthcare providers are ready to provide such briefings is currently unclear. There is initial work on what constitutes informed consent in Medical AI from a legal–theoretical perspective 10 as well as on the disclosure risks that stem from Medical AI use from a philosophical–theoretical perspective. 11 Empirical research has been lacking in this area, however.
In that context, the aims of this exploratory study are (1) to assess what components are perceived necessary for a sufficient HCP–patient briefing when AI-based tools are used in the provision of medical services (specifically: diagnosis) and (2) to identify the congruence (or discrepancy) between identified necessary components of the briefing and executed briefing following the presentation of a case vignette.
Methods
To fulfill the aims of the study, both quantitative (survey) and qualitative methods (presumption-focused coding of audio recordings generated by the participants) were employed. In the survey, participants were asked to choose components of an explanation that are to be included when discussing DHT use with their patients. The list of components presented to participants is the catalog list required for experimental treatments by SAMW. For the qualitative part, participants were asked to record and submit an HCP–patient briefing later in the session. The audio recordings were transcribed and coded according to the SAMW requirements list. Coding was done by two independent coders (researchers JDI and MC) in Microsoft Excel and resulted in an inter-coder reliability of 87.4%. For coder conflict resolution, a verbal discussion led to 100% consensus in the second iteration.
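As an illustration of how an agreement figure of this kind can be obtained, the following minimal sketch computes simple percent agreement between two coders. It assumes that the reported inter-coder reliability is a plain percent agreement, which the text does not state explicitly; the function name `percent_agreement` and the example codings are hypothetical and are not the study data.

```python
from typing import Sequence


def percent_agreement(coder_a: Sequence[bool], coder_b: Sequence[bool]) -> float:
    """Return the share of coding decisions on which both coders agree, in percent."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same items")
    if not coder_a:
        raise ValueError("no coding decisions supplied")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codings (not the study data): one entry per
# (recording, briefing component) decision, True if the coder
# judged the component to be mentioned in the recording.
coder_1 = [True, True, False, True, False, False, True, True]
coder_2 = [True, True, False, False, False, False, True, True]

agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.1f}% agreement")
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic could be substituted in the same place if that is what a study reports.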
Participants and setting
Twenty-one medical students (11 male, 6 female, 4 did not declare) between Year-2 and Year-4 (15 Year-2, 3 Year-3, and 3 Year-4) who were taking elective classwork on AI in medicine provided by instructors at the University of Zurich (UZH), Switzerland, participated in the study from Fall 2020 to Fall 2021 and submitted complete survey responses. All UZH medical students must pass a required Patient Communication & Interaction class, which includes practicing their HCP–patient briefing skills in the first year of their studies. Also included in the sample are eight licensed physicians (2 male, 6 female) and five PhD candidate/postdoc-level bioethicists (2 male, 3 female). Six medical students, three physicians, and two bioethicists also submitted an audio recording. A pre-study undertaken in spring 2020 informed the live online format of the study. Participation was voluntary and participants could choose to receive bookstore or supermarket vouchers worth 20–30 CHF. Due to the ongoing COVID-19 pandemic, students participated in the study online during a live streaming session.
To assess what components are necessary for a sufficient HCP–patient briefing when AI-based tools are used in the provision of medical services, an online survey was employed and administered during class. While the term HCP is broader in nature than the participant categories of medical students and physicians, we see them as useful proxies. The English language survey was implemented using LimeSurvey hosted on the University of Zurich servers. To identify the congruence (or discrepancy) between identified necessary components of the briefing and recorded briefing, a link to a University of Zurich-hosted file server was provided where participants could upload self-recorded simulated HCP–patient briefings following the presentation of a case vignette. The study was approved in accordance with the institutional review process at the Faculty of Medicine, UZH governing studies not requiring cantonal ethics board approval.
Implementation of the study session, survey, and audio recording
Following a short video sequence from a popular TV show showcasing issues of trust in medical products and the HCPs administering them, the informed consent statement for participating in the study was discussed verbally and the link to the online survey with the same written informed consent statement was provided. Participants were free to fill out the survey or, alternatively, to follow the exercises that were also presented in class. The approximately one-and-a-half-hour-long sessions contained four segments. First, a recapitulation of medical functions and ML was provided. Second, a case vignette was presented following an introduction to ischemic heart disease (IHD). This introduction included information on the definition, etiology, prevalence, diagnostic criteria, risk factors, and treatment options for the disease. In addition, the underlying data and the performance of ML models based on different algorithm classes were presented. Third, participants were asked whether the use of an ML-based tool should be disclosed to the patient and what an HCP–patient explanation should contain, choosing from a menu of options based on the Swiss guideline on briefings in non-standard therapy uses. A free-text field for further input was provided. Participants were then asked to assume the role of an HCP and explain the procedure to the patient sketched in the case vignette; they had approximately 15 min to record the statement, which they then uploaded. Fourth, the sessions ended with a general overview of the state of the art of ML in cardiology. Supplemental Addendum 1 contains the relevant survey questions and answer options.
Case vignette
The following case vignette was developed together with a board-certified cardiologist to provide participants with a relatable context and a basis for the simulated HCP–patient explanation. The concrete tool described in this case vignette can be understood as a clinical decision support system, which is one type of DHT.
You are a physician practicing medicine in a small ski resort in Graubuenden. It is a bitterly cold Christmas Week and a 45-year-old male tourist—who just started his month-long ski vacation with his family arriving from all over Europe—shows up in your practice. He complains of pain in the chest when carrying his skis to the lift on a hill in the morning. Resting, the pain disappears quickly. The patient mentions having had a checkup earlier this year at a hospital during a medical tourism stay abroad and he shares the records with you from his smartphone. Due to the ongoing COVID-19 crisis, he cannot travel to that hospital again and they are unavailable for consults during the holiday season. The anamnesis, labs, and other exams point towards IHD and exclude acute coronary syndrome/myocardial infarction but you are uncertain about the presence of IHD. The patient categorically denies undergoing any sort of invasive procedures (including angiography) and certainly does not want to see a specialist at a hospital or leave the vacation chalet. The patient does not take any medications.
You are left with having to make a diagnosis and potentially starting a treatment regimen.
You have access to a digital health device application utilizing data from a few hundred patients developed at UZH. This application utilizes exactly the data points that you have available. Its performance is at approximately 86% accuracy versus the gold standard of approximately 97%.
The goal for the device development was to predict the heart disease state (simplification as coronary arterial stenosis ≥50% signifies the presence of disease, <50% absence) using only non-angiographic data listed in the table on the next page.
ML model
The ML modeling underlying the heart disease case vignette is described in Supplemental Addendum 2. The model was explained during the session such that the participants obtained a basic understanding of how the example on the case vignette worked.
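To make the target definition in the vignette concrete, the sketch below applies the stated simplification (coronary arterial stenosis of at least 50% counts as disease present) and computes accuracy as the share of correct predictions. It is not the model described in Supplemental Addendum 2; the helper names `has_disease` and `accuracy` and all example values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def has_disease(stenosis_percent: float) -> bool:
    """Vignette simplification: stenosis >= 50% signifies presence of disease."""
    return stenosis_percent >= 50.0


def accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of cases in which the prediction equals the reference label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical reference stenosis values in percent (not study data).
stenosis = [10.0, 50.0, 72.5, 49.9, 85.0]
reference = [has_disease(s) for s in stenosis]

# Hypothetical outputs of some classifier using non-angiographic data.
predictions = [False, True, False, False, True]

print(f"accuracy: {accuracy(predictions, reference):.2f}")
```

Accuracy alone does not distinguish missed disease from false alarms, which is one reason the uncertainty of the result matters for what is explained to the patient.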
Results
Survey: Components of a sufficient HCP–patient briefing when AI-based tools are used
Ninety-five percent of the medical students deemed the declaration that an ML-based tool in the diagnosis process was to be used to be a necessary part of the HCP–patient briefing. All physicians (100%) and all bioethicists (100%) concurred.
The results on which components an HCP–patient briefing should contain varied considerably. There was broad agreement (at least two-thirds responding with inclusion) among all three groups that the current health state of the patient, the risks of the procedure, alternatives, and the right to a second opinion should be mentioned. Other providers, details of the procedure, and the off-label-specific details of the procedure found much less agreement for inclusion among medical students and physicians. Bioethicists generally desired more components for inclusion. The detailed results can be found in Table 1.
Table 1. Survey results (percentage that answered that component should be included in HCP–patient explanation, absolute numbers in brackets) for medical students, physicians, and bioethicists.
Audio recordings: What was explained to the vignette patient
HCP–patient briefings lasted between 0 min 57 s and 3 min 5 s for medical students (six recordings), 1 min 49 s and 3 min 1 s for bioethicists (two recordings), and 1 min 18 s and 4 min 15 s for physicians (three recordings). Two of the medical students opted to provide the briefing in German rather than English; all others provided it in English. Because few audio recordings were submitted by participants other than medical students, only medical student recordings were analyzed.
All medical students included a mention of the technology being used to assist in making a diagnosis for the vignette patient. Also included in the respective briefing recordings by most or all were the categories “information on the procedure” (100%), (some) “details of the procedure” (83%), and “mention of alternatives” (100%). In only half of the briefings (50%) were the categories “current state of the patient” and the “right to a second opinion” mentioned. Risks were mentioned in only a third of the recordings. All other components were either not mentioned at all or mentioned in fewer than 20% of the recordings.
The match between the survey and audio recordings shows differences
Comparing the results of the two study components, that is, the survey of what components are deemed necessary for a sufficient HCP–patient briefing and the audio recordings of what was actually explained, shows differences in both possible directions. As listed in Table 2, medical students selected most components (8 out of 11) as necessary in the survey more frequently than they included them in the actual briefing as recorded. A gap was particularly apparent (>50%) for mentioning the costs of the procedure, the right to withdraw, and a reasonable period of time to consider, as well as the associated risks of the procedure. On the other hand, information on the procedure and details of the procedure were provided more frequently than was considered necessary in the survey.
Table 2. Comparison of survey results (percentage choosing requirement of mentioning explanation component, absolute numbers in brackets) and audio recording coding results (percentage actually mentioning explanation component, absolute numbers in brackets) showing a significant gap (lack of mentioning).
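The comparison described above amounts to subtracting, for each component, the share of recordings that mention it from the share of survey respondents who deem it necessary. The following minimal sketch shows that arithmetic for a single component. The recording count (2 of 6) follows from the statement that risks were mentioned in a third of the six medical student recordings; the survey count (18 of 21) is a hypothetical placeholder and is not taken from Table 2. The function names are likewise illustrative.

```python
def share(count: int, total: int) -> float:
    """Return count / total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def opinion_execution_gap(survey_yes: int, survey_n: int,
                          recorded_yes: int, recorded_n: int) -> float:
    """Survey share minus recording share, in percentage points.

    A positive value means the component was deemed necessary more often
    than it was actually mentioned in the recorded briefings.
    """
    return share(survey_yes, survey_n) - share(recorded_yes, recorded_n)


# Component "risks of the procedure":
#   recordings: 2 of 6 (a third of the six medical student recordings)
#   survey:     18 of 21 is a HYPOTHETICAL placeholder count
gap = opinion_execution_gap(survey_yes=18, survey_n=21,
                            recorded_yes=2, recorded_n=6)
print(f"gap: {gap:.1f} percentage points")
```

A negative result would correspond to the opposite direction reported for information on and details of the procedure, which were mentioned more often than they were deemed necessary.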
Discussion
Survey: Not all components of an HCP–patient briefing for non-standard procedures deemed necessary for AI use
The list presented to study participants contains all required components of briefings by HCPs to patients when non-standard treatments are used according to the SAMW, constituting soft law in Switzerland. As the case vignette and technology represent a non-standard procedure context, it was surprising to find that medical students do not deem all components essential. The greatest need (>80% choosing these as necessary) was seen in explaining risks, alternatives, the current state of the patient as well as costs followed by explaining details of the procedure as well as the rights of having some time to consider and to a second opinion (at least two-thirds choosing those as necessary).
This difference between medicolegal guidelines and the opinions of those surveyed can be termed a guideline–opinion gap. This may be unproblematic when the individual case does not warrant the inclusion of some components in the briefing (note that the case vignette was shown before the survey, so a framing by the case can be expected). While survey data for physicians are more limited, some differences may exist, particularly regarding the rights to withdraw and to a second opinion. Bioethicists, in contrast, largely deem all presented components necessary parts of an HCP–patient explanation. This disconnect between clinical practitioners and theoretical bioethicists is strong and may be noteworthy when also considering a potential gap between bioethical guidelines and clinical implementation. Bioethicists are rarely also the clinicians providing the explanation, and an overly long list of requirements for inclusion proposed in theoretical elaborations on informing about AI in medical contexts 12,13 may lead to more information being offered than can reasonably be expected to be absorbed by the patient. In the written sphere, the terms of use of a mobile app for download may be such an example of legally required but seldom useful information.
A gap between what is deemed necessary and what is executed in an HCP–patient briefing
While fewer components were already identified as necessary for inclusion in the HCP–patient briefing than stipulated by the medicolegal guideline, these components were not included in the recordings as frequently as they should have been according to the survey. Given the particularities of the presented case vignette that underlies the audio recording, the difference between survey mentions of what is required and actual mentions in the recording (Δ = value_survey − value_recording) needs to be considered separately for each component. Critical from a safety, efficacy, and general harmfulness perspective are the risks of the procedure (here: uncertainty of the target diagnosis and the lack of a differential diagnosis process, potentially leading to a lack of intervention or a wrongful intervention) that should have been mentioned; only 33% of participants mentioned any risk of the proposed procedure at all. The essential patient rights to withdraw and to have a period of time to consider were also not mentioned in a majority of cases despite being deemed necessary, leading to a significant opinion–execution gap. However, given the particularities of the case, their inclusion in the briefing is not essential: the vignette patient wants a non-invasive diagnostic procedure, to which a later withdrawal and time for consideration seem to have limited applicability. Also, costs play a very minor role in the use of an existing application, and their lack of mention does not seem critical. Since an experimental, unapproved procedure always has off-label character (strictly: it has no label legally), the lack of mention of this character is of concern. What unequivocally stands as a critical difference is that risks were not mentioned. The opinion–execution gap has been described in other medical contexts as a “knowledge–behavior gap,” with rationalizations and other conditions being identified as reasons for the gap. 14
All downhill from here?
The patient briefing components analyzed here only constitute a fraction of an HCP–patient briefing that is embedded into an HCP–patient relationship. The focus on unidirectional knowledge transmission from an HCP (medical student) to a patient in this study narrows this down in scope for the sake of the experiment to one of the basic building blocks of informed consent: the provision of information. It is thus worrying to find that under the simplified and optimized conditions presented here to the participants (introduction to disease, explanation of the technical background of DHT, written survey and presentation on explanation components, unidirectional explanation without a patient), the audio recordings still showed a significant opinion–execution gap for medical students, particularly with regard to risks. This is in some contrast to the finding that medical students tend to be better communicators at the beginning of their medical education journey, whereas later, such skills deteriorate. 15 Communication skills are also positively associated with clinical outcomes, including patient safety. 16 At the same time, junior physicians are more likely to benefit from the support of AI-based tools in their diagnosis-making process than more experienced physicians 17 and will be more likely users. Such junior physicians would also be more affected by healthcare information technology breakdowns. 18 When even those with recent patient communication training and exposure to this technology class cannot provide an explanation sufficient even according to their self-defined standard, doubt must be cast on the readiness of the wider HCP population to explain this broad technology class.
Limitations and future studies
Owing to its exploratory nature, this study has several limitations. First, the low participant numbers do not permit a generalization beyond the immediate context of the study. Second, fewer participants submitted an audio recording than survey responses. This means there is not necessarily a representative match between those participants who submitted the survey and those who submitted the audio recording, which limits the comparability of results from those two study components. A one-to-one match between the survey and audio recording would have been preferred, but the online setting made this difficult, as the token approach would likely have led to technical issues for some participants during the live sessions and thus a lower participation rate. Third, the audio recording of a simulated HCP–patient explanation lacks the two-way communication pattern that is typical of an actual HCP–patient communication situation and is thus not entirely representative of real-world communication. Nevertheless, as the focus is on the HCP explanation of technology use to a patient and on identifying congruence (or discrepancy) between the content components of an HCP–patient briefing and the actual content in a simulation, this limitation has only limited impact.
Despite those limitations, the results do help in formulating hypotheses for future, larger studies and can inform areas of concern: first, we hypothesize that there is a guideline–opinion gap, that is, that medical students and physicians did not deem all stipulated components in the applicable guideline as necessary. Second, we hypothesize that there is an opinion–execution gap, that is, those components that were identified as necessary to be included in an HCP–patient briefing (particularly risks of the procedure) were not actually executed to the same degree. Future studies should explore assessing HCP readiness (also by understanding the reasons for the opinion–execution gap) and propose changes to the medical curriculum and/or continuous training requirements for HCPs.
Conclusion
The presented technology use based on the case vignette falls under the non-standard/experimental use category and therefore warrants a higher standard of explanation to patients. The survey part of this study showed that participants did not deem all stipulated components as necessary (guideline–opinion gap). The audio recording part of this study then demonstrated that some of those components that were identified as necessary to be included in an HCP–patient briefing (particularly risks of the procedure) were not actually executed to the same degree (opinion–execution gap). This finding based on the limited data of this exploratory study should thus form the hypothesis of larger studies differentiating potentially between different technologies, different physician specialties, and seniorities.
Where DHTs are non-standard, relatively recently introduced procedures, we recommend for the time being that they be treated as experimental procedures in terms of the standard of HCP–patient briefings, as applicable. Ideally, the applicability of related guidelines, such as the one from SAMW, should be explicitly mentioned in such. We also recommend exploring opportunities to strengthen training in HCP–patient communication and to develop communication materials, such as patient decision-aids, to supplement the information provided by HCPs in this evolving technology area. DHTs join a series of new technology classes recently introduced into clinical and near-clinical practice. The experience with personal genome testing, for example using a tiered-layered-staged model 19 for informed consent, may provide pointers that should be explored.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076221147423 for The use of artificial intelligence applications in medicine and the standard required for healthcare provider-patient briefings—an exploratory study by Jeffrey David Iqbal and Markus Christen in Digital Health
Footnotes
Acknowledgements
The authors would like to thank Corine Mouton Dorey, MD, PhD, for her critical insights on the design of the case vignette as well as her supervision in the first session of this study. The authors would also like to thank Carlos Cotrini Jimenez, PhD and Nikola Biller-Andorno, MD, PhD, for their guidance in the design of the study.
Contributorship
JDI designed the study, developed the protocol for this work, and conducted the study sessions. All authors were involved in data analysis procedures. JDI was responsible for the writing of this manuscript with significant input and review by MC. All authors approved the final content of the manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
The study was approved in accordance with the institutional review process at the Faculty of Medicine, UZH, Switzerland governing studies not requiring cantonal ethics board approval.
Funding
The authors received no financial support for the research and/or authorship of this article. Open Access costs were financed by the University Library Zurich.
Guarantor
JDI
Supplemental material
Supplemental material for this article is available online.
References
