Abstract
Technological developments, such as the advent of social networking sites, apps, and tracking ‘cookies’, enable the generation and collection of unprecedented quantities of rich personal and behavioural data, opening up a vast new resource for mental health research. Despite these non-traditional health-related data already forming a vital foundation of many new research avenues, little analysis has been done focusing on the experiences, motivations, and concerns of the individuals already engaged in data sharing and donation practices. This explorative study aims to investigate the experiences of individuals voluntarily donating their data to mental health research, specifically through the open data initiative OurDataHelps.org, which aims to develop effective suicide prevention tools. Qualitative semi-structured interviews and participant observation were used on a small sample of participants, yielding 3 key findings: (1) The relationship between participants and their data traces fluctuated between unconscious agency and hyper awareness through curatorship. (2) Despite having concerns about privacy and surveillance, participants were driven by altruistic motivations to engage with health researchers valued by their community, in the hope that their personal information could be of some benefit to future avenues of research. (3) In most cases represented in this sample group, motivation was found to stem from personal experiences with mental health, suicide, and loss. In the suicide survivor community, the experience of data donation is often valued as a method for emotional processing of a loss, connecting with the experiences of others, or as a way of regaining a sense of ‘purpose’. By understanding the motivations of individual participants, future projects can ensure that data donation processes are a positive experience and ultimately, increase and sustain the huge potential resources for health researchers worldwide.
Introduction
The rapid increase in the collection of personal data, made possible through the accelerating use of digital technologies and datafication, has unlocked potential for citizens to become active participants in the personal data ecosystem. Personal data can be sold, studied, and shared, or it can be hidden and anonymised. One currently evolving practice is the donation of personal data to medical research, offering new opportunities to the advancement of understanding and knowledge of health, illness, and disease.
Health-related data are increasingly being derived from non-biomedical sources. Data from Facebook ‘likes’ can predict age, sex, ethnicity, political views, happiness, use of addictive substances, and sexual orientation. 1 Pharmacovigilance studies have shown Twitter posts with references to medical products can be used to identify adverse events related to a large number of conditions. 2 In addition, mobile phone data have been used for public health surveillance activities, from mapping mental health trends on a national scale, 3 to detecting human mobility in affected regions during the Ebola crisis. 4 Non-biomedical data are having an impact on public health.
Underlying these and many other uses of personal data are a series of legal and ethical challenges concerning the surveillance practices used for aggregating data from both online and offline activities.5,6 Digital companies use information about their users to feed targeted online advertising and machine learning algorithms,7,8 and many digital platforms employ ‘take it or leave it’ terms and conditions, compromising the user’s autonomy over their privacy if they wish to access the Internet economy. 9
Regulation of access to, use of, and management of collected data is pertinent given the complexity of the data sharing ecosystem. However, the often-conflicting interests of the general public, not-for-profit groups, corporations, research institutes, and governments mean that even with safe management practices, opportunities exist for personal data to be leaked, hacked, or breached at the time of transmission or storage.10,11
In the face of these rapidly evolving debates and digital initiatives, eliciting the views and experiences of the individuals actively donating personal data to medical research is crucial. Previous studies have identified that citizens are willing to anonymously share personal data if it would advance research for the good of the public.12,13 However, little work has been done towards assessing the experiences of individuals already engaged in such data sharing activities.
This article reports the findings of an explorative 18-month study of the phenomenon of data donation to health research, guided by the primary question: how do individuals using OurDataHelps.org experience donating their personal data? The aim was to obtain an in-depth understanding of the experience of data donation from the perspective of the participants involved and to explore their perceived value of the experience.
The established qualitative method of participant observation was used to gain familiarity with the online data landscape. Semi-structured interviews were then conducted with users and advisors of the platform Our Data Helps, an online portal collecting personal data for suicide and mental health research. Data were analysed using an open inductive analysis method and a tentative conceptual framework was established. The findings, limitations, and potential avenues for future research are then outlined.
Background
The experiences of data donation described in this article are specific to users of OurDataHelps.org, an online data donation platform established in 2016 to advance social, behavioural, and biomedical research. Powered by research and data analytics company Qntfy, the platform’s aim is to amass large data sets to test machine learning models so as to (1) better understand mental well-being, (2) develop new ethical protocols for online media health research, and (3) support the development of preventive treatments. Collected data types include donors’ publicly posted messages to family and friends on social media networks (Facebook, Twitter, Instagram, etc), wearable sensor data (eg, Fitbit), and workout data (eg, Runkeeper). The platform does not access private or direct messages and does not interfere with donors’ online behaviours. Although it is predominantly an American platform, it receives data donations from across the globe.
Studies conducted by Qntfy are already contributing to literature highlighting the value of non-biomedical data for health research. For example, the results of a study quantifying various aspects of Twitter data via automated methods indicated quantifiable signals relevant to bipolar disorder, major depressive disorder, post-traumatic stress disorder, and seasonal affective disorder. 14 Similarly, another study used sentiment analysis to examine the affective micro-patterns in social media posts. Results suggested micro-patterns in social media posts hold some power to distinguish users with a mental health condition and users with a history of panic attacks or suicide, from their matched controls. 15
Ethics
Throughout its development, this Internet-based research with human participants has followed the digital ethical guidelines of Heider and Massanari, 16 the American Anthropological Association’s statement on ethics, 17 as well as the Ethical Research Protocols for Social Media Health Research of Benton, Coppersmith, and Dredze. 18 The primary ethical obligation of this research, involving people who have donated their data traces to help advance research concerning mental health, was to do no harm and to avoid negatively affecting the well-being of participants. This was particularly important given the vulnerability of participants who have experiences of suicide and loss. Informed verbal and written consent was obtained. Participant feedback was taken on board throughout, and material was omitted or altered on request to reflect participants’ realities and experiences.
Written informed consent was also obtained from participants for their anonymised information to be published in this article. Privacy has been protected through the use of pseudonyms and changed personal details. Moreover, social media posts or comments by, or content from, people who were unaware of the study have not been included.
Research Methods and Analysis
Guiding the research was the question: how is the donation of personal data to OurDataHelps.org experienced? The aim was to explore the phenomenon of data donation so as to further understanding of personal experiences of engaging with a digital participatory data collection model. Following a constructivist epistemological perspective, a mixed-method approach was adopted. To begin with, the established qualitative method of participant observation was used to gain an insight into the broader themes linked to the phenomenon of data donation online. To understand the experience of donation specific to OurDataHelps.org users, semi-structured interviews were then conducted. Data were analysed using an open inductive analysis method and a tentative conceptual framework was established.
Participant observation
Over a period of 12 months, participant observation of the Our Data Helps and Suicide Prevention Social Media (SPSM) communities was conducted. Involvement in these communities was overt, wherein the researcher remained at the centre of the research process. 19 By observing activity on the online community discussion boards and participating in weekly YouTube stream sessions run by SPSM, rapport with the community was established and a familiarity with their interests, concerns, and colloquialisms was achieved. Participant observation also took the form of informal interviews with advisors of Our Data Helps along with data donors from other donation platforms – Tidepool, Waoo, and Data World. These conversations were instrumental for identifying themes and concerns of individuals within the wider data donation community. Given that the Our Data Helps community comprises anonymised individuals, participant observation was also an integral method of sourcing participants for the semi-structured interviews.
As a method of documentation, the keeping of field notes was integral. Notes were taken in 2 forms: (1) an online repository of thoughts and questions and (2) a handwritten notebook of reflections. 20 The data captured included observations, informal conversations, and records of activities relating to the research. Threaded throughout were both an etic voice (the perspective of the researcher, with auto-ethnographic information) and the participants’ voices, providing an emic perspective (the perspective of the subject). 21 Data were then thematically analysed, a multi-step process of coding that established meaningful patterns across the data sets. Illustrated in Figure 1 is the broad range of themes and viewpoints which emerged and influenced the ongoing research and semi-structured interview strategy.

Figure 1. Code clustering: participant observation analysis of the Our Data Helps and Suicide Prevention Social Media community.
Semi-structured interviews
Participants with lived experience of data donation were found using the aforementioned methods. In total, 36 individuals were contacted and from this group, 7 were interviewed. These participants were aged between 23 and 65 years; 4 were women and 3 were men. They came from Canada, America, and England. Their education levels ranged from high school graduation to university professorship. The majority (5) worked in research or health-related professions; the minority (2) worked in creative industries. Over half (4) of these participants were suicide survivors (meaning a family member or friend had died from suicide), and all participants were advocates of mental health research.
Location disparities and privacy requests required the interviews to take place both online (using video calls) and in person. As the interviews were semi-structured, broad, open-ended questions were used to give space for exploration of feelings and memories. Interview data were analysed using an open inductive method – a process of open coding. 22 Following traditional methods of Grounded Theory, categories were developed in the final stages from the clustering of codes, which were then linked to form a tentative conceptual framework. Reported in the section ‘Findings’ are the final themes chosen for relevance to the research question, with extracts from interviews to support the analysis.
Findings
Current data donation landscape
Analysis of the data from participant observation highlighted that data donation exists within an international ecosystem of health-related data sharing activities. This ecosystem included data philanthropy, data collaboratives, data pooling, data markets, and data brokers. Figure 2 illustrates the network of data donation platforms documented from the observations of online discussions, informal conversations, and semi-structured interviews.

Figure 2. Documented health-related data donation platforms.
The experience of data donation
Underpinning participants’ experiences of donation were various understandings of personal data, as well as a multiplicity of perspectives surrounding its value. To begin with, the value of ‘personal data’ was underlined by notions of privacy and protection. As an algorithmic entity, it was also tied to concepts of the quantified self, emphasising quantitative identifiers as opposed to qualitative identifiers. Moreover, participants’ recognition that their personal data had value did not coincide with comprehension of its essence, extent, shape, or form. Not a single participant understood the extent of what their personal data comprised, nor had they any idea of how to access their data. Value was tied not to the entity itself but to its potential application and usability for research benefiting the greater good. At the core of each participant’s experience was hope that personal information could be of use to health research.
Theoretically, to quantify the value of personal data for data donors, one could use a differential measurement method, 23 comparing the value donors give to data before the act of donation with the value given after the data are utilised in research. The issue with this approach is that the value of data stems from uses and insights unanticipated at the time of donation, which makes it fundamentally impossible, at this time, to quantify the value of future uses of data. Moreover, participants in this study had no knowledge of the actual applications of their personal data, rendering this measuring method inapplicable.
The monetary value of personal data – the data donor perspective
From an economic perspective, value is underpinned by desire and exchange, as objects are measured by how much people are willing to give up to obtain them. 24 The value participants attributed to their personal data varied considerably: data that one participant considered valuable, another did not. Nevertheless, all participants in this study understood their personal data to be an economic asset of particular value to the marketing industries. One interviewee commented that the economization of information was the ‘price to pay’ for tools and services online.
Participants also spoke of feeling a sense of disappointment at the commodification and economization of personal information. However, they spoke as if the very voicing of the idea of changing the economic world was idealistic and idiosyncratic. One can attribute this to the robustness and strength of the capitalist imaginary, which renders even the search for alternative systems unrealistic. 25
Value, in the sociological sense
Value is the social construction of what is good, proper, or desirable and is defined by and dependent on the community in question. 26 For the community using Tidepool, a non-profit open source data platform that collects data from individuals living with type 1 diabetes, the value of personal data lies in the potential to enhance diabetes management and therefore improve the health of individuals. In the Our Data Helps community, personal data took on a twofold valorisation. First, it was valued for its contribution to mental health research developing effective suicide prevention tools. Second, the very experience of data donation was valued by the suicide survivor community for its healing efficacy and its documentation of their distinct bereavement. Underscoring all of this was a familiarity with a community and/or a strong affinity with the cause.
Data donation and the grieving process
Two themes of suicide survivor bereavement were identified as instrumental to Our Data Helps’ development and the gifting experiences of donors. The first was the emotional need to make sense of a suicide death. The second was the ability of data donation to contribute to a sense of ‘purposefulness’. Our Data Helps facilitated both of these bereavement processes by providing a way for suicide survivors to help others with shared experiences and to help change the narrative of both suicide and suicide bereavement – which further demonstrates the way in which social media, online groups, and Internet platforms are becoming resources for support. At the same time, while the altruism of some participants’ gifting of personal data stemmed from their bereavement and experience as suicide survivors, this was not true for all the participants.
Perceptions and concerns of private and public
Experiences shared by participants revealed a general awareness and acceptance that digital technologies have altered the underlying architecture of social interaction and information distribution. Specifically, participants identified the properties which are changing the rules as being: persistence (the Internet does not forget), replicability (copy and paste), search-ability, and invisibility. Conversations spread, contexts collapse, and new technologies consistently destroy attempts at erecting digital walls. Despite the promises and suggestions of ‘private messaging’, ‘lists of curated friends’ and ‘personalization’, the data donors interviewed were conscious of the possible shift of content from private to public. In one participant’s words, ‘
The awareness of continuous surveillance affected participants. Whereas some turned to the collective strength of communities for protection, others established their own set of social structures, using filters and posting rituals. Data donors’ primary concern was what their data were being used for. There was a fear that analysis of Internet activity could reveal personal truths and lead to the suspension of civil liberties (particularly for those on suicide watch). This concern was overcome through a strong sense of trust in the platform – established through the support of the suicide survivor SPSM community, a connection to another data donor, or through a friendship with the platform developers.
Identifying with data
When asked if their activity online was reflective of themselves, participants shared views that ranged from ‘separation by extreme curatorship’ to ‘solidarity by harmonization’. In general, data donors perceived their collections of personal data to be social entities and actively constructed facets of their identities. The role of social media in constructing these identities was elemental, for most participants formed digital traces through daily social networking on Facebook, Twitter, and Instagram. Simultaneously, participants recognised that identities defined in the social web are not complete because the less attractive daily truths and personal details are curated out. Although the accuracy of self-published content is uncertain, what is reflective of individuality are the details, ‘
Discussion
The results of participant observation provided the opportunity for exploration of the thematic framework in which the data donation community and happenings are embedded both online and offline. Highlighted was the international nature of data sharing activities along with the concurrence of technology, health, and legislation as significant discussion topics. Semi-structured interviews then permitted Our Data Helps donors to share their experiences of donation to health research. Participants acknowledged that their personal data donation to research was a value-laden experience, one that engages with notions of commodification, 27 yet challenges capitalistic frameworks; a phenomenon signifying a surveillance culture, 28 while also symbolizing communities supporting potential research and health innovations.
Despite sharing concerns of surveillance, the economization of personal information and a collapse of ‘private’ and ‘public’ online, participants were united by a sense of purpose and their hope for better mental health research and hence treatments. Having established trust with the platform, participants chose to make their information available to researchers rather than let it remain in the hands of platforms and companies whose intentions and use of private information are commercially focused or undisclosed. Altruism was thus a key motivational factor. This echoes studies of blood and organ donation,29,30 simultaneously challenging Graeber’s concept of ‘self interested calculation’ and Mauss’s theory of ‘obligatory and interested’ gift giving.31,32 Moreover, the experiences of participants who were suicide survivors aligned with literature and clinical studies of suicide bereavement, 33 in that they were driven to do something positive specifically in response to loss. More research is needed to delve deeper into this sensitive area of how data donation is involved in bereavement processes.
What also emerged from the explorations of data donation experiences was evidence that participants felt connected to their data, insofar as they were aware of its existence. Data were a truthful mirror of sorts to the participants’ habits and general persona, but suspicion and distrust were conveyed regarding the authenticity of these representations. Moreover, participants were not informed as to how to access their data or interpret it. This raises questions around the validity of their informed consent – for data were donated that were not understood, and the research uses were not completely known at the time of donation.
Limitations
There are a number of issues and limitations that constrain the findings of this explorative research, which used mixed qualitative methods. To begin with, a major limitation was the small sample of participants interviewed, all of whom were English speaking, from a Western background, and comfortable or motivated to talk openly about their experiences. Accordingly, findings of this study cannot be extrapolated or generalised to a wider population. More time and resources would have enabled a wider and more diverse sample, so as to gain a comprehensive understanding of the experience of data donation internationally – especially because data donation is a new and constantly developing field. A potential avenue for further research would be expanding the sample to include subjects who had either actively opted out of participating in data donation projects or were not involved in the community, to further understand the concerns potentially limiting the participation of more individuals.
Conclusions
Despite the small sample size and a number of methodological biases inherent in using qualitative methods of anthropological origin, this preliminary exploratory study has highlighted several directions for future research. At this stage, only the experiences of the Our Data Helps community of suicide survivors, mental health researchers, and advocates were represented. Nevertheless, the findings offer normative implications that can be taken forward into future research of data donation platforms and the data sharing ecosystem. By recognising and understanding the motivations and concerns of individual participants, future projects can ensure that data donation processes are a positive experience and ultimately, this could help increase and sustain the potential resources for health researchers worldwide. Future research could focus more on quantitatively measuring the value of the experience of data donation and the value of different types of personal data donations. Additional investigations of the validity of informed consent in data donation practices, along with the role of data donation for bereavement processes, are needed. A detailed mapping of the international network of data donation practices would be of great value to the research community for it could identify which communities are adopting and which are excluded from this model of data collection.
Footnotes
Acknowledgements
Professor Undine Frömming supervised the project and guided the development of the methodological and theoretical framework.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding:
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Author Contributions
JS conceived, designed and carried out this research project. JS analyzed the results and developed the theoretical framework. She is the sole author of this paper.
