Abstract
Objective
Individuals who work on health data systems and services are uniquely positioned to understand the risks of health data collection and use. We designed and conducted a survey assessing the perceptions of those who work with health data around health data consent, sharing, and privacy practices in healthcare and clinical research.
Methods
A 43-item online survey was distributed via a market research firm to individuals (18+) who work with health data in the United States from March to April 2023. Descriptive statistics were calculated for all variables. Associations with demographic variables were assessed using Pearson's chi-squared test.
Results
Most of our respondents (61.7%) reported that they would trust people to use their health data across various sectors, but more respondents trusted those working in academic medical research (86.5%) and healthcare offices (89.9%) compared to those working in industry (68.2%). Despite this reported trust, a strong majority believed that individuals should have complete control over their health data (97.3%), specific consent should be obtained for each use of their health data (92.0%), and that there should be higher standards of consent and privacy for health records data than other types of data (93.7%).
Conclusions
Based on our findings, we might infer that people who work with health data generally trust institutions across sectors to protect their health data. However, many would prefer to have complete control over who has access to their health data and how it is used. These insights should be explored further through qualitative studies.
Introduction
The utilization of large electronic health data sets has advanced rapidly over the past two decades across academic research, healthcare, and industry settings. The resulting regulatory structure prioritizes enabling free flow and aggregation of health data to improve research capacity, clinical practice, and profits over promoting the ability of individuals to control the disposition of their own health data. 1 Simultaneously, most individuals are concerned about, but largely unaware of, how their own data, including health data, is collected, stored, used, and shared.1,2 This disconnect is especially concerning in the current environment where health data professionals are incentivized to identify new and unexpected uses of health data, including health data that has already been collected. 3
Broad consent requires that individuals agree to the use of their de-identified data for future undefined purposes. 4 The United States (U.S.) federal regulation known as The Common Rule, which sets requirements for clinical research involving human subjects, only requires that broad consent be collected from individuals providing health data as long as that data becomes de-identified. 4 Institutional Review Boards (IRBs) that oversee human subjects research waive consent requirements for studies that use secondary de-identified data obtained through broad consent. 5 This framework assumes trust in research institutions to responsibly steward one's health data, which includes protecting it from malicious actors, preventing irresponsible uses, and only sharing it with equally responsible institutions. Previous qualitative work in this field has found that participants have concerns about sharing their health information with researchers and worry that other entities may have access to their data without their knowledge.6,7
The passage of the Health Information Technology for Economic and Clinical Health Act in 2009 incentivized providers to implement electronic medical record (EMR) systems and dramatically increased the percentage of hospital systems and provider offices that utilize EMR systems.8,9 Since then, EMR systems have become a ubiquitous component of healthcare in the U.S. Results from a 2020 survey show that patients who had privacy concerns related to EMR data were more likely to withhold information from their providers than those with no concerns. 10 The shift toward electronic health data services necessitates comprehensive and clear guidelines around how healthcare organizations should collect, use, and share individuals' health data.
For-profit industry is also increasingly collecting and using health data. Large companies like Meta and Google partner with healthcare organizations and use tracking technologies to collect health data that is not protected under the Health Insurance Portability and Accountability Act (HIPAA).11,12 These companies have been reluctant to share how this data is used, making it difficult for regulators to create standards of privacy, security, and informed consent. 13 Consequently, the public has concerns around sharing health data with corporations.14,15
Considering these opaque practices and minimal regulation, it is unsurprising that the public has concerns around who has access to their health data and for what purpose.1,6,7,14–23 A 2022 survey of U.S. patients published in a press release by the American Medical Association reported that almost 75% of respondents are concerned about the privacy of their personal health information and that most were unaware of what companies have access to their data. 1 Furthermore, the public's reported intention to share health data for primary purposes (e.g. communications with care providers) was higher than for secondary purposes (e.g. public health research), and willingness to share health data for secondary purposes is influenced by profit prioritization, impact on patient care, length of time the data will be shared for, and trust in parties who can access the data.18,19 Although the U.S. public has concerns around the privacy, security, and management of their health data, they continue to share health data electronically because they appreciate benefits like ease of access to information and improved communication with providers.20–23
Those who work with health data in their professions witness, or are at least familiar with, practices, risks and benefits associated with the collection, sharing, protection, and use of health data. Unlike the general population whose attitudes toward health data use have previously been studied, those who work with health data have first-hand experience navigating the world of electronic health data as both administrator and user. Therefore, their perspectives could help inform policy considerations and assess the effectiveness of ethics training curriculums for health data management. In this study, we sought to understand the perceptions of people who work with health data in the U.S. around health data privacy, sharing and consent practices.
Methods
Survey design
Survey constructs were informed by previous studies related to public attitudes around data privacy, sharing and consent,16,17,21 and semi-structured interviews conducted in 2021 with professionals who work with health data in the U.S., such as analysts, scientists, managers, or c-suite executives. The goal of these interviews was to explore privacy, security, and other ethical concerns associated with the collection, use, and management of health data. The research team analyzed qualitative transcripts to identify themes and tensions that were not previously captured in the literature on data privacy, sharing and consent. The team used these qualitative findings to inform the design of this survey, which was intended to systematically capture the qualitative themes and tensions in a larger sample through a closed-ended survey.
The 43-item survey was developed by a team of experts in bioethics and biostatistics and was refined through iterative conversations among the study team. The survey (Supplementary File) assessed five domains: (1) personal data sharing behaviors and attitudes; (2) ethical considerations around data privacy and informed consent; (3) attitudes toward professional responsibilities around privacy and consent; (4) perspectives on institutional culture around ethics on health data in healthcare and health research; and (5) demographics. This paper, focusing on personal attitudes and behaviors of people who work with health data related to data privacy, sharing and consent practices, will discuss results related to domains one, two, and five. We will describe the results of survey domains three and four in a future publication that focuses on professional attitudes and behaviors of people who work with health data related to data privacy, sharing and consent practices.
Survey domains one, two, and five consisted of 15 closed-ended questions (including 10 demographic questions) and 13 Likert scale questions, yielding a total of 28 survey items. Respondents were given the option to select “Other” and provide a short text response for demographic questions on political party, education level, race, and gender.
Sampling strategy
The study sample was drawn from nationwide panels of demographically diverse groups through a market research firm. Panel members were eligible for recruitment if they resided in the U.S. and were over the age of 18. A screener question at the beginning of the survey was used to ensure that all respondents’ jobs required them to work with health data on a regular basis. Respondents provided their written consent to participate through the market research firm and completed the survey online through Qualtrics in exchange for points that may be used for online transactions. Incentive amounts and types differed across the panel suppliers that partner with the research firm.
To confirm that the survey could be completed properly, the market research firm disseminated the survey for a “soft launch” to 90 respondents. To reduce measurement error, the market research firm removed respondents who completed the survey in under 3 minutes (25% less than the median completion time for respondents included in the soft launch), indicating that they were likely unengaged. 24 Additionally, the firm used digital fingerprinting and encrypted survey links to ensure that only validated panelists could take the survey and that each could take it only once.
Survey administration
This single-site survey was disseminated from March 22 to April 7, 2023. The sample size was selected to ensure a sufficiently low margin of error (+/-3.3%) at a 95% confidence level.25,26
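The text does not state how the margin of error was derived. As an illustrative sketch only, the Python below assumes the conventional normal-approximation formula for a proportion, the most conservative value p = 0.5, and no finite-population correction. The function names are hypothetical, and this is not presented as the authors' calculation.

```python
import math
from statistics import NormalDist


def margin_of_error(n: int, p: float = 0.5, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(moe: float, p: float = 0.5, confidence: float = 0.95) -> int:
    """Smallest sample size whose margin of error does not exceed `moe`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / moe**2)


if __name__ == "__main__":
    # Assumed target: +/-3.3% at 95% confidence (values taken from the text).
    print("Required n:", required_sample_size(0.033))
    # Margin implied by the final analytic sample of 1156 respondents.
    print("Margin at n=1156 (%):", round(100 * margin_of_error(1156), 2))
```

Under these assumptions, a margin of +/-3.3% at 95% confidence corresponds to a target of roughly 880 respondents, and the final sample of 1156 respondents would imply a margin of about +/-2.9%.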
Statistical analysis
All quantitative analyses were conducted using Stata/MP 17.0 for Linux (College Station, TX). Frequencies and percentages were calculated for categorical and Likert scale questions. Associations for categorical questions were calculated using Pearson's chi-squared test.
Since categories for several of the demographic responses had a small number of respondents, assumptions of the
We applied Bonferroni's correction by setting the significance threshold to 0.0005 instead of the typical 0.05 to reduce the risk of false discoveries arising from conducting over 300 hypothesis tests.29,30 Therefore,
Results
Study population
Surveys were administered to over 300 unique panel suppliers; 3689 people clicked on the survey link; 2913 people passed the eligibility criteria and screener question; 1741 responses were removed from the sample because they completed the survey in under three minutes or for other reasons that would compromise data integrity; 16 responses were removed because the respondents did not complete more than 50% of the survey questions.
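The attrition counts in this paragraph can be checked with simple arithmetic. The sketch below uses only the counts reported above. The function and variable names are illustrative, and the number screened out is derived here on the assumption that everyone who clicked the link but did not pass was excluded at the eligibility or screener stage.

```python
def sample_flow(clicked: int, passed_screening: int,
                removed_quality: int, removed_incomplete: int) -> dict:
    """Return the derived counts at each stage of the sample flow."""
    # Derived, not reported: respondents who clicked but did not pass screening.
    screened_out = clicked - passed_screening
    # Respondents remaining after both removal steps.
    final_sample = passed_screening - removed_quality - removed_incomplete
    return {"screened_out": screened_out, "final_sample": final_sample}


if __name__ == "__main__":
    flow = sample_flow(
        clicked=3689,           # clicked the survey link
        passed_screening=2913,  # passed eligibility criteria and screener
        removed_quality=1741,   # speeders and other data-integrity removals
        removed_incomplete=16,  # completed 50% or fewer of the questions
    )
    print(flow)
```

The derived final count agrees with the 1156 respondents reported for the final survey sample.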
The final survey sample included a total of 1156 respondents who were primarily between the ages of 25–44 (60.8%), women (67.4%), non-Hispanic (84.9%), and White (71.7%). Most respondents worked in industry (54.4%), followed by the non-profit sector (24.4%). Respondents’ education levels ranged from high school to doctoral degrees, and the most commonly reported education level was a bachelor's degree (38.2%). Respondents resided across all regions of the U.S. and reported a range of political affiliations, professional roles, and years of experience working with health data (Table 1).
Population characteristics.
Concerns related to data privacy, sharing, and consent in healthcare and research
Half (50.0%) of respondents reported that they had a similar level of concern about who has access to their data compared to friends and family; 34.8% said they were more concerned and 15.2% said they were less concerned (Table 2).
Concerns related to data privacy, sharing, and consent in healthcare and research.
Compared to other groups, those who resided in the Southwest region of the U.S., as well as those who identified as men, Black or African American, American Indian or Alaskan Native, and Hispanic were more likely to be more concerned than their family and friends about who has access to their data (all
Most respondents agreed that to the extent that they have privacy concerns, their concerns are related to financial security (92.3%), people stealing their private information and using it in ways that could cause harm (90.8%), and not wanting other people to know their private information (89.8%). Most respondents also reported that they would be comfortable knowing that the health data collected by their doctor's office was being used in an academic research study on cancer risks (77.8%). A similar proportion reported that they would be comfortable knowing that the study would use artificial intelligence (AI) or machine learning (ML) (79.5%) (Table 2).
Compared to other groups, women were more likely to be uncomfortable knowing that their health data collected by their doctor's office is being used in an academic research study on cancer risks (
Attitudes toward trusting different sectors to protect health data
Overall, 68.2% of respondents agreed that they trust people who work with health data in for-profit industry settings to protect their health information. A higher proportion agreed that they trust academic medical researchers (86.5%) or healthcare provider offices (89.9%) with this information. In total, 61.7% of respondents agreed that they trust people who work with health data in all three sectors (Table 3).
Attitudes toward trusting different industries to protect health data.
Respondents who said “strongly” or “somewhat” agree to all three questions in the table were categorized as “agree” and all others were categorized as “disagree”.
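The categorization rule in this table note can be sketched as a small function. This is an illustration rather than the authors' analysis code (the analyses were run in Stata): the response labels and the function name are assumptions, and the survey's exact wording may differ.

```python
# Assumed Likert labels that count as agreement (hypothetical wording).
AGREE_RESPONSES = {"strongly agree", "somewhat agree"}


def trusts_all_sectors(industry: str, academic: str, provider: str) -> str:
    """Collapse three sector-trust responses into one composite category.

    Returns "agree" only when every response is "strongly" or "somewhat"
    agree; any other combination is categorized as "disagree".
    """
    responses = (industry, academic, provider)
    all_agree = all(r.strip().lower() in AGREE_RESPONSES for r in responses)
    return "agree" if all_agree else "disagree"


if __name__ == "__main__":
    print(trusts_all_sectors("Somewhat agree", "Strongly agree", "Strongly agree"))
    print(trusts_all_sectors("Somewhat disagree", "Strongly agree", "Strongly agree"))
```

A single non-agreeing response is enough to place a respondent in the "disagree" category, which is why the composite figure can be no higher than the lowest of the three sector-specific figures.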
Compared to other groups, women were less likely to trust people who work with health data in for-profit industry settings to protect their health information (
Attitudes toward current data privacy, sharing, and consent standards in healthcare and research
Almost all respondents agreed that a person should have complete control over who has access to their individual health data (97.3%). Similarly, respondents agreed that research projects should not use a person's health data unless consent has been received for that specific project (92.0%). Additionally, respondents agreed that there should be higher standards of consent and privacy for health records data than other types of data (93.7%) (Table 4).
Attitudes toward current data privacy, sharing, and consent standards in healthcare and research.
Respondents were divided on whether people generally understand how data is stored, shared, and protected well enough to provide meaningful consent for their data to be collected and used (agreed: 57.9%, disagreed: 42.1%) (Table 4). Compared to the other groups, women, those who work in the academic sector, and those who said they report to someone who reports to the CEO were more likely to disagree that people generally understand how data is stored, shared, and protected well enough to provide meaningful consent for their data to be collected and used (
79.4% of respondents agreed that de-identified health data can be both useful and have no risk for reidentification (Table 4). Compared to other professional roles, those who are CEOs or Presidents were more likely to agree with this statement (
Discussion
Our survey of those who work with health data in their profession found that 50.0% of respondents felt that they are similarly concerned about who has access to their data compared to their family and friends, 15.2% felt they are less concerned, and 34.8% felt they are more concerned. Most of our respondents (61.7%) also reported that they would trust people to use their health data across various sectors, but more respondents trusted those working in academic medical research and healthcare offices compared to those working in industry. Despite this reported trust, a strong majority believed that individuals should have complete control over their health data (97.3%), specific consent should be obtained for each use of their health data (92.0%), and that there should be higher standards of consent and privacy for health records data than other types of data (93.7%).
Our finding that those who work with health data had more trust in healthcare providers or academic researchers compared to those in the industry sector is supported by previous work around public attitudes toward health data privacy.1,15 A 2015 survey study found that although half of respondents were likely to consent to both healthcare and research uses of their data, significantly fewer would consent to healthcare uses compared to research uses of their electronic health data. 17 This differs from our finding that those who work with health data had more trust in healthcare provider offices than in academic researchers. This discrepancy may reflect the public's increasing comfort over the past several years with sharing health data electronically with their providers as EMR systems became more widely used and understood.
Additionally, previous qualitative studies reported patient concerns that health data shared for research purposes would be shared with other entities, especially when shared without their consent.6,7 This is consistent with our finding that almost all participants would prefer specific over broad consent for uses of their health data. Qualitative studies have also found that research participants were concerned about their sensitive health information (e.g. chronic disease diagnoses) being shared with health insurance companies that could prevent them from obtaining affordable insurance. 7 This aligns with our findings that data privacy concerns are related to financial security.
Previous studies have found that those who identify as Black non-Hispanic, Hispanic, American Indian or Alaskan Native and those who had not completed a college degree have more privacy concerns than those in other groups.1,16,18,19 These results are consistent with our findings that members of racial/ethnic minority populations and those with lower education levels had more concerns around health data privacy, sharing, and consent. However, we also found that women tended to have more concerns around who has access to their data and for what purpose. Our results also indicate that those who identified as Black or African American, American Indian or Alaskan Native, and Hispanic were more likely to report greater levels of concern about who has access to their health data compared to their family and friends. These findings emphasize that those who work with health data regularly are more attuned to the risks involved with sharing health data electronically, especially for populations who experience disproportionately worse outcomes related to healthcare access and quality (e.g. racial/ethnic minority groups and women). 32 The resulting mistrust of healthcare organizations observed among members of minority populations has been associated with underutilization of health services and, consequently, adverse health outcomes. 33 Further research is necessary to assess potential explanations for differences in data privacy perceptions, attitudes, and behaviors among these under-served groups.
Policies and strategies around data privacy, sharing, and consent should balance the need for societal benefits and organizational profit with the importance of individual autonomy. Consumer Subject Review Boards consisting of committees aimed at optimizing the ethical use of data could help organizations and institutions weigh the risks and benefits of data use decisions, especially in contexts that do not legally require IRB approval. 34 Tiered, Dynamic, or Meta consent models give individuals more control over how their data is protected and shared. 35 Visual analogs, called pictorial legal contracts, enable those with language barriers or lower literacy levels to fully comprehend data privacy agreements and informed consent forms to make an informed decision. 36 Finally, advanced settings on web browsers allow users to pre-select data privacy and sharing preferences for use across all websites on the browser and reduce the burden on users to read and comprehend multiple consent forms. 36 These mechanisms can potentially increase the public's knowledge regarding intended uses of their health data, but there should be strong evidence to support technical feasibility and improved individual control over health data before they are implemented. The effectiveness of these mechanisms must also be weighed against potential increases in costs and any impediments they might create to the acquisition of valuable research data and improved health outcomes. 35 The opinions of those who work with health data regularly should likely be included in these kinds of cost-benefit analyses.
In January 2021, the Food and Drug Administration published an action plan for approving AI or ML software applications as medical devices and published a report in March 2024 on the proposed collaborative effort to oversee the development and use of AI software while considering data use transparency and cybersecurity concerns. 37 Additionally, the Federal Trade Commission amended the Health Breach Notification Rule in July 2024 to notify the public when their data collected through a health application has been included in a breach or has been disclosed without their knowledge. 38 In addition to these protections, organizations and institutions should be further incentivized to ethically steward health data by obtaining informed consent, providing practical options for opting out of health data collection, and being transparent about their health data practices. These steps will likely be important for maintaining public trust and adequately respecting individual rights to privacy.
Limitations
The results of this survey must be framed in the context of their limitations. This survey instrument was not validated, but survey constructs were informed by previous studies related to public attitudes around data privacy, sharing and consent,16,17,21 and by semi-structured interviews conducted with professionals who work with health data in the U.S. People who participate in surveys through panels provided by market research firms tend to have greater access to technology and are of higher socioeconomic status than members of the general population. 39 However, those in data-related professions were shown to have median salaries above the median salary across all occupations for 2022 in a study conducted by the Bureau of Labor Statistics, so the generalizability of our results is unlikely to be impacted by this limitation. 40 Furthermore, panel surveys are susceptible to data quality concerns related to respondents’ attention level and “bots” that can be programmed to take the survey multiple times. 41 To address this, we removed respondents who completed the survey significantly faster than the median completion time, the firm limited the number of surveys panelists can take in a week, and the firm used digital fingerprints to validate respondents. Lastly, the market research firm was unable to provide the number of individuals who were sent a survey invitation, so it was impossible to calculate a response rate. This limits our ability to generalize our findings to the entire U.S. population of people who work with health data.
Conclusions
Individuals who work with health data are uniquely situated to understand the risks of health data collection and use, which might position them well to help inform policies around health data privacy, sharing, and consent, and potentially to help gauge the effectiveness of related ethics training. Our results indicate that this population generally trusts institutions across sectors to protect their health data. However, almost all respondents agreed that they would prefer to have complete control over who has access to their health data and how it is used. Exploring these insights further through in-depth interviews with people working in different sectors and at different professional levels could be valuable toward understanding how organizations and institutions can address existing data privacy concerns for the U.S. public. The perspectives of those who work with health data could also be useful as jurisdictions across the globe grapple with policy to balance the trade-off between enabling health data aggregation and protecting the privacy and safety of individuals.
Supplemental Material
sj-docx-1-dhj-10.1177_20552076241290964 - Supplemental material for Data professionals’ attitudes on data privacy, sharing, and consent in healthcare and research
Supplemental material, sj-docx-1-dhj-10.1177_20552076241290964 for Data professionals’ attitudes on data privacy, sharing, and consent in healthcare and research by Katya Kaplow, Max Downey, Darren Stewart, Allan B. Massie, Jennifer D. Motter, Lauren Taylor, John Massarelli, Taylor Matalon, Carolyn Sidoti, Macey L. Levan and Brendan Parent in DIGITAL HEALTH
Footnotes
Abbreviations
Author contributions
BP, LT, MLL, JM, and TM researched the literature and conceived the study. MD and CS were involved in protocol development and gaining ethical approval. BP, KK, JDM, MD, TM, LT, and JM were involved in survey development. BP, KK, and MD were involved in participant recruitment, and manuscript writing. KK, DS, JDM, and ABM were involved in data analysis. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
This study (s21-00785) was approved by The New York University Langone Health Institutional Review Board on November 2nd, 2022. Respondents gave written consent via the market research firm's platform before starting the survey.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this work was supported by the Robert Wood Johnson Foundation [78443]. The analyses described here are the responsibility of the authors alone and do not necessarily reflect the views or policies of the Robert Wood Johnson Foundation, nor does mention of trade names, commercial products or organizations imply endorsement by the Robert Wood Johnson Foundation.
Guarantor
BP.
Supplemental material
Supplemental material for this article is available online.
References
