Abstract
Precision medicine and digital phenotyping are two prominent data-based approaches within digital medicine. While precision medicine historically used primarily genetic data to find targeted treatment options, digital phenotyping relies on big data derived from digital devices such as smartphones, wearables and other connected devices. This paper first focusses on the aspect of data type to explore differences and similarities between precision medicine and digital phenotyping. It outlines different ways of data collection and production and the consequences thereof. Second, it shows how these sorts of data influence dominant beliefs in the field: precision medicine relies on the dominant understanding of ‘genetic determinism’ imported from genetics, while digital phenotyping builds on the logic of ‘data fundamentalism’. In the end, the analysis shows how digital data informs the potentials as well as the challenges of precision medicine and digital phenotyping: better health care for (some) individuals, connected with individualisation and responsibilisation for all, and a predicted shift from reactive to preventive medicine. Additionally, data-based approaches might facilitate epistemological and ontological redirections for the whole field of medicine that will also affect knowledge production and prompt a reassessment of the value of different types of knowledge (quantifiable vs. non-quantifiable), with all its consequences. Institutionally, this might lead to shifts in the distribution of power towards experts in big data related technologies, i.e. private companies.
This article is part of the special theme on Digital Phenotyping. The full list of articles in this special theme is available at: https://journals.sagepub.com/page/bds/collections/digitalphenotyping
Introduction
Precision medicine (PM) and digital phenotyping (DP) are buzzwords in medicine. The promise they contain is already expressed in their naming. PM aims to offer a precise and targeted treatment for an individual or a group of people based on clinical data. DP promises to categorise people into clinically relevant categories for the treatment of mental diseases, with the help of digital data. Both promise revolutionary shifts in medicine and healthcare. Both approaches are data-driven, have been developed only in recent decades and hold out the prospect of changing the field of medicine on the basis of digitisable data. They are both said to transform medicine as a scientific field and as a healthcare discipline. PM claims to offer individualised treatment for specific diseases which could not only solve major healthcare challenges, but could shift the whole medical field from responsive medicine to predictive medicine (Hood et al., 2012). DP seeks to revolutionise mental health by collecting continuous ‘real life data’ from digital devices of everyday use such as smartphones and wearables. With this additional data, knowledge gaps in mental health diagnosis could be closed without burdening patients with further tasks (Huckvale et al., 2019). PM and DP, being data-based, require specific statistical and computational expertise, which results in new stakeholders entering the field of medicine.
Data today is said to be the new gold. Many publications deal with the datafication of our lives as well as with the distinct influence of data that is not only collected and aggregated but also analysed by algorithmic means (Lupton, 2018b; Ruckenstein and Schüll, 2017; van Dijck, 2014). Technical breakthroughs in data storage, transfer and analysis within the last 20 years have provided the technical basis on which approaches like PM and DP could be conceived and developed. Datafication and the introduction of algorithmic tools such as artificial intelligence (AI) have taken place in different fields including politics, the economy and law, but significantly also in health and the field of medicine (Topol, 2019).
Digitalisation in medicine started as early as the 1970s, then under the name of expert systems, which were used to obtain diagnosis and treatment recommendations (Metaxiotis and Samouilidis, 2000). Over the last 15 years datafication, or ‘the conversion of qualitative aspects of life into quantified data’, has picked up (Ruckenstein and Schüll, 2017: 261). The process began with more and more patient information finding its way from paper records to electronic patient files, including the involvement of new techniques with digital output, e.g. in radiology. However, datafication gained momentum when people started using smartphones and wearables and engaging in digital communities, thereby also producing health-related data that can be collected, shared and analysed (Ruckenstein and Schüll, 2017). Thus, both technical breakthroughs and sociocultural developments such as wearable usage and sharing health-related experiences on social media, e.g. as part of the quantified self movement, made it possible for data-driven methods to gain ground in medicine and healthcare as well (Lupton, 2018b; Ruckenstein and Schüll, 2017).
All these processes have resulted in PM and DP not just remaining a vision but becoming reality and constituting two of the most talked-about data-driven approaches in medicine to date. Still, the two approaches have historically distinct starting points. They use different types of data, which are based on different ways of data production. This paper will explore what PM and DP have in common and what separates them based on the type of data they use. In the end, it will show how this data-driven aspect of PM and DP results in similar logics within both approaches, whether in dealing with mental health conditions or diseases based in the realm of the body. First, both approaches will be analysed according to the sort of data they use. In the second part, I will outline the logic behind the main arguments in the field, also based on these types of data. Finally, I will conclude with the challenges PM and DP might pose due to their role as digital data-based approaches. Comparing PM and DP might look like an odd choice, since DP is practically a part of PM, while PM has recently been described as an approach that collects as much varied data as possible to compose a patient's health map. Historically, however, large parts of PM have predominantly dealt with genetic and genomic data, while DP is to date firmly grounded in mental health issues. Also, different stakeholders seem to concentrate on different types of data or different disease types (Mindstrong, 2021; Prainsack, 2017). Contrasting PM based on genetics with DP dealing with mental health conditions reveals the similarities as well as the differences in the working logic of a field that attends to mental illnesses versus a field that predominantly deals with illnesses experts ground in the realm of the body.
For this endeavour, I will focus on the following examples: breast cancer prediction, diagnosis, and treatment as an example of PM based on genetics, and the digital platform Mindstrong which claims to treat mental health issues such as depression, bipolar disorder and PTSD based on DP.
Precision medicine
PM is a medical approach that is best known for aiming to offer a specifically targeted treatment regime to a person or a certain group of people, ideally leading to a ‘personalised’ treatment for each person, which in the end could render common disease labels obsolete (Ferryman and Pitcan, 2018; Prainsack, 2017). Other terms used for this approach in medicine are ‘personalised medicine’ or ‘stratified medicine’. The terms have appeared and been used in different historical, geographical and disciplinary contexts. I will use the term ‘precision medicine’ because it is the more recent term and for the sake of simplicity (Erikainen and Chan, 2019; Ferryman and Pitcan, 2018). Since 1990, PM has also been a buzzword, and the closely related field of genomic science has been able to acquire considerable funding. Examples are the Human Genome Project (HGP) and the ‘all of us’ project in the United States (Cooper and Paneth, 2020; Ferryman and Pitcan, 2018), both with enormous expenditures. The ‘all of us’ project will reach total costs of about 1 billion US$ once finalised (Cooper and Paneth, 2020). PM represents an interdisciplinary endeavour combining biology, medicine, informatics, computer science, mathematics and statistics (Erikainen and Chan, 2019). Computer science, mathematics and statistics serve to store and analyse data. The collaboration of human molecular genetics with modern computer science since the 1950s has been a driving force in establishing PM (Cooper and Paneth, 2020; Ferryman and Pitcan, 2018). The most prominent desiderata of PM for the field of medicine are disease understanding and prediction with a strong focus on cancer, aiming to find a suitable treatment with a low incidence of adverse drug reactions (Cooper and Paneth, 2020; Prainsack, 2017).
Examples are targeted breast cancer treatment or ‘personalised’ antiretroviral treatment for HIV-positive individuals based on treatment optimisation tools (Kumari et al., 2017; Low et al., 2018). Data about genetic predisposition, lifestyle information, clinical data, etc., all of which has to be ‘structured, digital, quantified, and computable data’, is combined to create a ‘unique thumbprint’ of a person that informs their diagnosis and treatment, also called ‘personal health maps’ (Prainsack, 2017: 3–4). The best-known examples to date, however, involve genetic and genomic data.
Digital phenotyping
Psychiatry became more interested in PM after struggling for a long time with the lack of objective markers for mental health conditions, which science regards as a key basis for correct diagnosis and optimal treatment. Under the name of ‘personalized psychiatry’ and other similar labels, scientists aimed to identify genomic or molecular biomarkers of mental health conditions to be able to offer targeted treatments to individuals, an effort Rüppel (2019: 593) describes as being ‘increasingly rearticulated as ‘Big Data’ project’ (Prainsack, 2017). DP is an advancement of this search for biomarkers: to identify ‘digital biomarkers’, it includes in its assessment data collected by digital devices with sensors used in everyday life (Dagum and Montag, 2019: 14). The goals of DP are finding ‘(digital) diagnostic markers’ for mental health conditions by correlating a collection of sensor data and self-reported data, and (thereby) monitoring and predicting mental health statuses based on these biomarkers and behavioural phenotypes (Birk and Samuel, 2020: 1874; Dagum and Montag, 2019). The idea of DP or the digital phenotype was raised at the same time by two groups of scientists, Jain and colleagues and Torous and colleagues, both referring to the usage of large amounts of digital data to find patterns in human behaviours and traits, linking those to ‘disease phenotypes’ (Birk and Samuel, 2020: 1876). The assumption is that mental health conditions show themselves in the digital traces of a person and that ‘behaviour-expressed symptoms’ can thus be identified as ‘behavioural phenotypes’ (Dagum and Montag, 2019: 14; Garcia-Ceja et al., 2018). DP aims to gain data from ‘naturalistic settings in-situ, leveraging the actual real-world’ and only became possible once digital devices such as smartphones and wearables were ubiquitously available (Birk and Samuel, 2020: 1876).
One example of DP is the assessment, through the analysis of online disease communities, that lithium lacks efficacy in slowing down disease progression in individuals with ALS (Jain et al., 2015). Other examples are the analysis of insomnia-related tweets, and the monitoring and prediction of (clinically) relevant outcomes in people with mood disorders such as major depression, bipolar disorder, and schizophrenia based on sensory measurements but also on people's participation in social media (Barnett et al., 2018; Dagum, 2019; Jain et al., 2015). There is a lot of ongoing research trying to tackle different needs, e.g. the RADAR-CNS project of the European Union, which aims to develop an open-access platform around mobile health data, and initiatives to provide mental health treatment by smartphone to low-income populations. However, much of the research is still in an experimental stage (Melcher et al., 2020; Ranjan et al., 2019). DP is currently used mostly in the realm of mental illnesses. However, if DP is thought of as part of a PM that gathers all sorts of data to compose a personal health map, it could be used for all types of diseases (Huckvale et al., 2019; Prainsack, 2017).
Insofar as PM aims at incorporating all sorts of data for diagnosis and personalised treatments, DP seems to be the part of a personalisation which Rüppel describes as being ‘increasingly rearticulated as ‘Big Data’ project’ (Prainsack, 2017; Rüppel, 2019: 593). DP and PM are both data-driven approaches aimed at understanding diseases and offering targeted treatments; only the data they use can vary, from biological, genetic and genomic data to sensor-collected data from digital devices such as smartphones and wearables (DP).
Different sorts of data – different consequences?
This section will focus on an analysis of the distinct sorts of data used in PM and DP, based on the cases of breast cancer treatment for genetics-based PM as an example of disease grounded in the biological and the platform Mindstrong for DP as an example of mental health. In the following, I will point out how the type of data influences the way the dominant beliefs within the fields of PM and DP work.
Type of data and data collection
Ferryman and Pitcan (2018: 3) define PM based on their ethnographic research as ‘the effort to collect, integrate, and analyze multiple sources of genetic and non-genetic data, harnessing methods of big data analysis and machine learning, in order to develop insights about health and disease that are tailored to the individual.’ The data involved is genetic data and a variety of other data including clinical data and lifestyle data (Prainsack, 2017).
One method of data accumulation within PM is to start with data collection simply for the sake of having a large data pool. The HGP is an example of such an approach. Human DNA data began to be collected with the promise that it would inform solutions to health issues. The project started in 1990 with the goal of mapping the entire sequence of human DNA. The main rationale behind it was to find new insights into human health and disease, which helped to secure a huge amount of funding for this project (Ferryman and Pitcan, 2018). The other approach is to conduct genome and sample analysis within pathology for specific targets. The examples on which I will focus involve the identification of specific genetic characteristics in the genome of women with a family history of breast cancer or in the pathological sample of breast cancer patients, in order to decide upon targeted forms of therapy. People with inherited BRCA1 and BRCA2 gene mutations have an increased risk of developing breast and ovarian cancers and should therefore follow more rigid preventive measures. Patients with oncogenic human epidermal growth factor receptor 2 (HER2) positive breast cancer benefit from treatment with trastuzumab and lapatinib, which specifically target HER2, the receptor regulating ‘cell growth, proliferation and differentiation’ (Low et al., 2018: 502). Additionally, genomic research in breast cancer is still ongoing to identify other targets. In general, this genetic data is analysed, i.e. ‘produced’, at specific times by specific experts in health institutes or external labs.
Conversely, DP works with digital data collected through the usage of smartphones, computers, tablets and wearables, such as ‘FitBit’ or the ‘Smartwatch’, and other devices with digital sensors. A variety of hardware and software sensors come into play: hardware sensors such as inertial sensors measuring walking speed, physiological sensors measuring heart rate and ambient sensors measuring temperature, but also software sensors that measure internet activity, social media presence, typing speed, etc. (Birk and Samuel, 2020; Garcia-Ceja et al., 2018). DP can be divided into behavioural phenotyping and digital biomarkers (Dagum and Montag, 2019). Behavioural phenotyping uses digital information such as location, physical activity, mood, speech patterns, typing speed and call activity, but also social media usage and search terms to search for ‘behaviour-expressed symptoms’ (Dagum and Montag, 2019: 14; Garcia-Ceja et al., 2018). It uses, e.g. passive sensing data of GPS location and call logs as a proxy for behaviourally phenotyping loneliness (Birk and Samuel, 2020). Digital biomarkers aim at measuring ‘trait and state changes in neuropathology that can be indicative of disease risk, disease onset, disease progression or recovery’ (Dagum and Montag, 2019: 14). Mindstrong, e.g. claims to have identified ‘a set of digital biomarkers from human-smartphone interactions that correlate highly with select cognitive measures, mood state, and brain connectivity’ that can be related to depression, anxiety, negative and positive affect and other mental conditions (Mindstrong Health, 2021). Assessments Mindstrong uses, e.g. in the ongoing AURORA study on PTSD, are ‘continuous-time accelerometry data, keystroke characteristics, time and duration of phone calls, time and character length of text messages, text words/symbols used, time and number of emails, smartphone screen time, and intermittent GPS data’ (McLean et al., 2020: 5). Saeb et al. (2016) claim mobile phone location data can be used to predict depressive symptom severity and might therefore serve as a biomarker for depression. In summary, DP is based on the digital measurement, collection, analysis and interpretation of enormously varied activities to enhance understanding and treatment of mental health conditions. Related data is collected continuously, live and in situ within ‘real-life settings’ (Birk and Samuel, 2020).
Data producers
Within the genetic part of PM, the processes of data production take place in highly regulated and controlled environments. Following the example of pharmaceutical labs, those facilities need to prove that they follow specific protocols even to be granted permission to produce the data. Adherence to the protocols has to be assured through different measures such as quality control, quality assurance and audits (National Human Genome Research Institute, n.d.; U.S. Government, 2021).1 The facilities are frequently audited regarding their adherence to these protocols. Additionally, special devices used by trained expert personnel, e.g. lab technicians, are required to obtain the data. The data can thus only be produced by people with specialised training who have access to certain facilities (labs). Producing this data is not just laborious but also expensive. The necessary facilities and lab equipment are costly, and so is following the protocols. Large teams of trained personnel are needed and have to be paid to develop and adhere to protocols, assure their quality and maintain the approval status of the whole facility (U.S. Government, 2021). As a result of the resource-intensive and high-priced nature of data production, somebody has to bear the costs, and expenditures are sometimes regulated and kept to a minimum (Low et al., 2018). In the example of breast cancer, testing of the genetic properties of a patient and their sample is done once (Cedars-Sinai Blog, 2019). To summarise, the production of genetic data is initiated actively, conducted at specific time points, and is a laborious and expensive process.
For DP, the collected data is produced and shared by users of digital devices, whose live and local data can be collected continuously. The ways of collection are enormously varied, depending on the digital device and sensors used (Birk and Samuel, 2020; Garcia-Ceja et al., 2018). Often this data arises as a by-product of everyday practices, such as typing speed and click behaviour, without personal data being collected, as in the example of Mindstrong. In contrast to the actively initiated data production in PM, this data is provided actively (when self-reported) as well as passively (e.g. when cell phone usage is monitored) by the users. Prainsack (2017: 21) calls the users of the devices ‘prosumers’, a combination of producers of data and consumers of information. Most of this data falls under the term big data, which has been described with properties such as ‘volume, velocity, variety, exhaustive in scope, resolution, relational and flexible’ (Kitchin, 2014: 1; Rüppel, 2019). This means that it entails huge amounts of data with a high granularity. The aspect of continuous real-life data collection is named as one major advancement of DP compared to other diagnostic tools for mental health conditions (Dagum, 2019). Big data is simultaneously described as precise, because it is produced close to a person and in real life, and as messy or unclean, because it might have been collected for other purposes and in an uncontrolled environment with sensors that might not be equipped to collect data of research quality (Huckvale et al., 2019; Kitchin, 2014; Torous et al., 2016). There is a whole industry dealing (in the double sense) with health-related data, called the data industry (Cosgrove et al., 2020; Prainsack, 2017). These companies offer services around data storage, collection, cleaning, trading and analysis (Kitchin, 2014).
The producer of the data (or ‘prosumer’) is every individual using digital technologies like smartphones and wearables, or other connected devices with sensors (Garcia-Ceja et al., 2018; Prainsack, 2017). Additionally, any person who interacts on an online platform such as Facebook and Twitter or is simply conducting an online search provides data for this corpus of data.
The focus on the different types of data used so far in PM based on genetic data and in DP reveals the following differences between the two approaches: (1) the type of data, lab-generated data versus sensor-derived data from ubiquitously available digital devices; (2) the frequency of collection, rarely versus moment by moment, live and in situ; and (3) the producers, professional labs versus users of digital devices. The next section will show how these characteristics of data influence the logic within each of the two approaches.
Logic within the field
Drawing on Annemarie Mol's ‘logic of care’, I will use the term ‘logic’ for a persisting rationale in the field or a style that seems to be appropriate (Mol, 2008: 1). One could say the logic within the field is also a certain dominant rationale existing in the field which is not questioned. This logic within the field seems to have different nuances based on the type of data used in the field. Historically, genetic or genomic data was used for PM, which was first expanded to clinical data in general and has nowadays resulted in the usage of a whole variety of data that is helpful when composing a personal health map (Prainsack, 2017). As I have pointed out, the most prominent examples of PM to date involve genetic or genomic data. With the establishment of genetics as a key discipline within biology, a certain dominant mindset was also established, driven by the assumption that every question regarding the functions of living organisms including human health and disease could be explained with the help of genetics. This mindset is called ‘genetic determinism’ (Peters, 2012). Cooper and Paneth (2020) point out that in genetically based PM, genetic determinism prevails. Weiss (2017) also calls this logic ‘mendelian fundamentalism’, which is an even more drastic description. Both are based on the assumption that living beings are their DNA, or differently put ‘Genes R Us’, and that it is thus possible to find solutions against illnesses by knowing about the genetics behind them (Peters, 2012: 10). Also, if all the information and logic of the DNA were revealed, it would be possible to understand how the body (and the mind) work (Peters, 2012). Similarly, Weiner et al. (2017: 989) speak of a ‘genetic imaginary’ being invoked where the biological is seen as key for understanding diseases and a molecular understanding of diseases is supposed to lead to a new type of medicine. This logic, too, has its roots in genetics’ ‘molecular vision of life’ (Weiner et al., 2017: 999).
The imaginary ‘is being continuously remade and rearticulated’ and gets actualised as new hopes are being constructed based on new biotechnologies such as gene editing, and the development of biological drugs (Milne, 2020: 103). Ultimately, the imaginary works for the persistence of the cultural power of genetics. However, in focussing on technological inventions, it is obscuring the social determinants of health and illness (Milne, 2020).
Erikainen and Chan (2019: 320) showed how, in the rebranding of ‘personalised medicine’ that happened in the U.S. policy context, the choice fell on ‘precision medicine’ also because the adjective ‘precision’ has an ‘ethically neutral or even positive’ connotation. I suggest that another essential argument within the field is the very aspect of ‘precision’, which as a characteristic not only seems to speak for the possibility of choosing a targeted treatment but surrounds genetic data like an aura. It is the conception that genetic data is precise: first, because of the effort and the way in which it is measured; second, because it is conceptualised as being the basis of all living beings; and third, because it has the image of being all-encompassing and everything we have to know to base our decisions on. The last two arguments are firmly rooted in genetic determinism and rearticulate the genetic imaginary (Keller, 2002; Peters, 2012; Weiner et al., 2017). It seems as if these aspects give the whole field more credibility, and as if the truth is to be found in the details: the closer we look, the closer we get to understanding it (Erikainen and Chan, 2019; Keller, 2002; Weiss, 2017). The hope to find more molecular markers for breast cancer through genomic profiling is one example of this logic (Low et al., 2018). This argument fits very well into the scientific worlds of medical science and biology, which operate under positivist assumptions. It is the mechanistic assumption that understanding a process in enough detail will reveal the truth behind it. Based on this assumption, analysing the problem more precisely will lead to more suitable treatment options simply because the premises were more precise and thus closer to the truth.2 Hopman (2020: 425), in her study on forensic DNA, identifies the logic of accuracy within ‘the search of the uniqueness of the individual’. In a search for the genetic uniqueness of people, this logic results in the accumulation of data.
It functions also as a ‘logic of expansion’ that constantly searches for new genetic ‘“territories” to map’ (Hopman, 2020: 428–429). In this constant search for more data, it permanently has to attract more money. The logic resembles very much the search for the ‘unique thumbprint’ to compile personal health maps Prainsack (2017: 3) speaks about within PM. Also, PM is constantly aiming to attract more money, which will seem to be well invested as ‘precision’ implies PM will make medicine ‘more effective, and thus also cheaper’ (Prainsack, 2017: 79).
Several points of criticism can be raised to counter this logic. First, experts in genomic medicine note that ‘the more we learn about the genome, the more distant it seems to be from a role as a causative agent in most widespread diseases’ and instead acknowledge the role of genomic medicine as ‘a way to do science, not medicine’ (Cooper and Paneth, 2020: 67). Medicine based on genetic knowledge might be helpful for diseases firmly grounded in genetics, but not for other diseases. Thus, for multifactorial diseases, other measures such as public health seem to be more effective, even if they may not have the connotation of ‘precision’ and ultimate truth (Cooper and Paneth, 2020; Weiss, 2017). Second, critical social science perspectives would challenge the idea that a solution can only be found based on mechanistic principles. As we have seen in the genetic imaginary argument, such principles privilege technical solutions over taking into account systemic implications and social determinants of health. They may be important for reasoning in the field. However, decisions in the field of medicine seem to be far more complex, entail much more information and rely on manifold practices, as studies in, e.g., medical anthropology show (Kim et al., 2018; Spinnewijn et al., 2020).
For DP, the logic behind the argument is distinct but arrives at a similar conclusion. Generally, the following goals are expressed for DP: early disease detection and surveillance, identifying and incentivising healthy behaviour, and developing new, more targeted interventions and treatment strategies (Jain et al., 2015). Additionally, stakeholders claim that DP will be ‘providing a more comprehensive and nuanced view of the experience of illness, [because] an individual's interaction with digital technologies affects the full spectrum of human disease from diagnosis, to treatment, to chronic disease management’ (Jain et al., 2015: 462). Several experts within DP extend their hopes further and describe how this additional knowledge gained through big data analysis will also lead to changes in the classification, diagnosis and treatment of diseases ‘in ways that matter most to patients’ (Burnett, 2015; Huckvale et al., 2019; Jain et al., 2015: 463). The Research Domain Criteria Initiative (RDoC) in the U.S., which aims at establishing a new classification system based on ‘basic science’ for mental health conditions, is an institutional step in this direction (Rüppel, 2019: 571). The continuous, in situ and live monitoring of individuals’ behavioural activities, collected as big data, is stylised as the missing piece in understanding mental health because the data can be collected continuously and close to real life (Huckvale et al., 2019; Jain et al., 2015). At the same time, current diagnostic practices for mental illnesses relying on self-reporting of symptoms are framed as not conclusive enough for diagnosis (Huckvale et al., 2019).3 One quest of DP is the search for digital biomarkers by a field frustrated by a long unsuccessful search for biological markers (Birk and Samuel, 2020; Brietzke et al., 2019). A variety of data is used with the aim of better understanding mental health conditions.
This is framed as having the potential not only to guide new ways of measurement and treatment but also to change the classification of diseases and therapeutic measures (Burnett, 2015; Huckvale et al., 2019; Jain et al., 2015). This is interesting when looking at different conceptualisations of big data. Within some fields like data science, big data is still regarded as unclean and messy, and there exists a whole industry that focuses on practices of ‘cleaning’ and preparing big data for analysis. Stakeholders who are pro-DP mark common practices of disease identification, like interviews between patients and doctors, as subjective or at least not sufficient, and at the same time see data collected by digital devices as the new Holy Grail. Thus, it seems as if lived experiences (or digital traces of everyday practices), once they are collected through digital devices which are able to provide them continuously, in situ and live, gain in worth also because it was a digital device which gathered them and because they are collected close to the individual in ‘real-life settings’ (Mau, 2017; Quinn, 2021). Self-tracking with digital devices is framed as providing trustworthy data, in contrast to the individual body's perception, which is marked as untrustworthy or at least not reliable enough to be the sole ground on which diagnosis should be based (Lupton, 2015).
I propose that this way of conceptualising data and assuming it will change a whole field can be termed ‘data fundamentalism’, a term coined by Crawford (2013). Data fundamentalism is defined by ‘the notion that correlation always indicates causation, and that massive data sets and predictive analytics always reflect objective truth’ (Crawford, 2013). Additionally, digital traces are conceptualised as (digital) ‘biomarkers’, borrowing a common concept from science. Accordingly, Mindstrong's homepage indicates: ‘[T]o identify the digital phenotyping features that could be clinically useful, Mindstrong used powerful machine learning methods to show that specific digital features correlate with cognitive function, clinical symptoms, and measures of brain activity in a range of clinical studies’ (Mindstrong Health, 2021). Thus, the case of DP seems to be a combination of data fundamentalism and biologisation of digital traces. It correlates digital features with states of health and illness (‘digital biomarkers’), which have big data as a prerequisite, and thus comes to the conclusion that those ‘reflect objective truth’. However, so far no digital biomarker merits the scientific definition of a biomarker, i.e. ‘a characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention’ (Dagum, 2019: 14). Different scientists working on DP come to the conclusion that digital biomarkers are objective and provide ‘reliable information on each individual's behaviour’ (Brietzke et al., 2019: 223; Huckvale et al., 2019; Jain et al., 2015). Insel, for example, compares the possibility of monitoring brain function by DP to a ‘continuous glucose monitor in the world of diabetes’ (Metz, 2018). These statements show how medical experts in the field stylise DP as providing real and reliable information about people's behaviour that can be correlated with mental health conditions.
They go even further when claiming they could indeed predict mental health status, e.g. relapses into depression, before individuals themselves and professionals would be able to know about it (Dagum, 2019).4 They conceptualise big data on behaviours as a reflection of lived experiences. Actively and passively generated data from people's digital devices is framed as yielding scientifically measured, analysable proxies for mental health and thus as holding objective (and thus ultimate) truth. I would like to add that it is not just about the amount of data (which is usually important in big data) and the collection by a digital device, but about data that has been collected so close to the subject and so continuously that it seems to be a direct window into real life.
Looking in depth at the type of data, and in line with other authors, I propose that for DP, truth is thought to be found in people's behaviours: what time they get up, when and how long they sleep, when and in what way they take part in social media activities, how often they text/call, how much they move, etc. From a social science perspective, people's digital traces may represent digitisable pieces of everyday life practices. However, it is known that the social is digitalised only in a rudimentary form (Birk and Samuel, 2020). Also, many aspects of our everyday lives and routines are not recognised by digital devices, as our complete experiences with those practices are also not digitisable (Mau, 2017; Quinn, 2021). There are different ways to conceptualise the collected digital features. The data about a person has been described as ‘data doubles’, ‘data traces’, ‘data silhouettes’ and ‘digital fingerprints’ (Lyon, 2003: 22; Mindstrong Health, 2021; Quinn, 2021: 2). Loi (2019: 158) criticises this sort of framing, which assumes a one-to-one reflection of a person, and suggests that digital information can indeed be labelled as ‘(digital) extended phenotype’ because it may actually show health-related conditions of a person and additionally bears traces of how the user and other people retroact with it. His framing emphasises that it is also ‘a collective creation, involved in social feedback loops’. Lupton claims that ‘human-data assemblages’ connect body and data, but that people make their data just as data make people (Lupton, 2016, 2018b: 1). Still, these ‘data bodies’ are ‘lively’ in that they are ‘unstable and generative’ and ‘lead a life of their own’ (Lupton, 2018b: 2; Mager and Mayer, 2019: 95, 98–99). These data doubles ‘are not innocent or innocuous virtual fictions […].
They have ethics, politics’ and are thus ‘no longer “doubles” […] but they are intrinsically interwoven with bodily practices and biopolitics’ (Lyon, 2003: 27; Mager and Mayer, 2019: 99). Companies like Mindstrong might have less interest in tracing data back to one particular individual and more in aggregating data about a collective to gather more data for digital biomarkers. Still, the logic within the field of DP holds that the continual flux of ‘real life data’ gives us access to ‘real life’, that is, to a deeper truth or a truth behind it all, which is an authenticity claim. This seems to be equivalent to being closer to people. Critical data studies suggest instead that data traces and the individuals behind them retroact with each other and differ according to who is zooming in, how, and with which epistemology and ontology (Grommé, 2016; Loi, 2019).
This assumption that data is produced continuously and so close to the subject of interest (and is thus tied to the logic of accuracy) upholds the illusion of knowing a deeper truth about this very subject. It claims to have access to all the relevant information about the data subject needed for certain assessments. This seemingly exhaustive knowledge, continuously streamed from real life and conceptualised as objective, is the basis of the assumption that we will be better at deciding which treatment options are ideal and, even more so, that we will be able to classify disease categories anew. In contrast, a critical social science perspective remarks that ‘data doubles’ lead their own lives and sometimes do not have that much similarity with the person from whom they are derived and therefore defy the illusion of being a window to real life (Lyon, 2003: 22; Mager and Mayer, 2019).5
Consequences and future aspects
PM will have different impacts on the field of medicine as a scientific discipline and on healthcare, depending on what is at the centre of analysis: the field's ontology and epistemology or the patient. The logic behind data analysis is driven by mathematical and statistical aspects, leading to a shift towards calculating outcomes for the individual from population data. In the example of breast cancer diagnosis, the sample of individuals with breast cancer is compared to knowledge about ‘genetic characteristics of a population sub-group’ (Low et al., 2018; Prainsack, 2017: 4). This logic of PM from the 1990s and 2000s meant that clinically collected data of certain subpopulations served to find targeted treatment for the individual (Prainsack, 2017). This logic stemming from epidemiology also pervades other medical disciplines. Additionally, PM has a ‘systems medicine’ approach when it comes to the object under investigation. The focus here has been redirected from molecular or cell biology to systems and models of disease pathways. This development presents an ontological and epistemological change in biomedicine (Erikainen and Chan, 2019). When centring the patient, there seems to be a twofold shift. One shift is from the individual to the population when it comes to the logic of prediction, for which population data is used. At the same time, there is a shift from population medicine to individualisation when it comes to how responsibility is framed. This means that rather than looking at population disease risk, the discipline focuses on individual risk or individual prediction. This entails a shift from the population to the individual regarding the responsibility for health. Consequently, the individual can more easily be made responsible, and population-level systemic aspects are not taken into account.
Erikainen and Chan (2019: 314) describe this as ‘responsibilization’ of the individual (rather than medical professionals), which comes from being seen as an individual with autonomy first and not as part of a collective with a solidaristic arrangement (Prainsack, 2017). Additionally, healthcare shifts from being reactive to being proactive, which can be seen in the predictions that are thought to be possible based on genomic data (Erikainen and Chan, 2019; Ferryman and Pitcan, 2018). Breast cancer is an example of this: Low et al. (2018: 503) speak of ‘genomic profiling by clinical sequencing’ as the next step to identify cancer risks in healthy individuals. This is already a reality for people with a strong family history of cancer who are screened for BRCA1/2 gene mutations to decide about more frequent check-ups, preventive measures, or specific forms of treatment.
Similarly, for DP, experts in the field claim that DP is all about ‘person-centered care’, which here too results in more responsibility being given to each individual (Huckvale et al., 2019: 88). However, there are additional far-ranging consequences when analysing the processes within DP. I will now focus on those related to data types. Regarding the field's ontology and epistemology, DP, even more so than PM, is hoped to finally provide the long-sought (digital) biomarkers for mental health conditions. The RDoC, funded in 2009 by the NIMH, is one prominent example of the drive for a change of the field, aiming at moving psychiatry beyond the hitherto ‘descriptive diagnosis’ of mental health conditions with a new classification system based on ‘basic science’ (Rüppel, 2019: 571). Big data is framed as a missing piece that might help to understand diseases in their ‘expression in terms of the lived experience of individuals’ (Burnett, 2015; Huckvale et al., 2019; Jain et al., 2015: 463). The continuous, in situ and live monitoring of a patient's activities is stylised as the missing piece in understanding mental health.
‘Empowerment of the patient’ is a frequently found trope in DP, expressing the hope that people take agency over their health, which seems just another form of the ‘responsibilisation’ found within PM in general (Erikainen and Chan, 2019; Prainsack, 2017). However, it is still questionable who will profit from digitalised psychiatry and who will rather see the downsides. While it has been shown that access to health information, e.g. through wearable use, is empowering for some people, usually those with more resources, social minorities might find it rather disempowering (Birk and Samuel, 2020; Prainsack, 2017). DP has been criticised for disregarding social inequalities and conceptualising them instead as individual mental health conditions (Birk and Samuel, 2020). Real-life monitoring might bring diagnostic and therapeutic opportunities; however, it also opens up the possibility of surveillance (Dagum and Montag, 2019; Lyon, 2003). Banner (2019: 7–8), for example, describes how surveillance through DP might be used ‘to regulate, define and control’ certain populations such as POC or disabled people and ‘used to enforce neoliberal regimes of austerity’, bringing more risk to historically discriminated populations than to others.
Big data in general, and health-related data in particular, is considered to be the new gold. Health data is commodified and profit is made from people's health data (Banner, 2019; Cosgrove et al., 2020; Prainsack, 2017). Digital platforms are programmed with the company's purpose of collecting as much data as possible, and it is a fairly frequent practice that data is sold and analysed for purposes completely distinct from the primary reason for data collection. Cosgrove et al. (2020) lay out that 92% of mental health apps are known to share data with third parties such as Facebook and Google without users having the choice to weigh in on that practice. This practice is called data repurposing and has been critiqued from ethical and social science perspectives (Kitchin, 2014).
To recapitulate and expand: What we see in both PM and DP is the assumption that knowing the truth is enhanced through the data-driven techniques of the field. For DP, the basis is continuous, in situ and live monitoring of everyday life practices. For PM based on genetics and genomics, it is genetic data, marked as holding the ultimate truth about life and being extra precise. The digital form of data is an essential aspect of the expressed logic because only then are the necessary practices of collection, transfer, analysis and interpretation possible. Digitalisation, datafication, big data and their analysis provide additional information to both fields, enabling them to claim to be closer to their own construction of knowledge. With this newly generated knowledge, experts in the field claim that they can offer more fitting treatment options and, finally, better health. Ultimately, data is an essential component of knowledge production within the field, through which the field seeks to differentiate itself from other, less digital-data-prone fields. Being data-driven leads to further consequences, such as the commodification of individuals’ health data and the shift of responsibility to the individual through ‘personalisation’.
Discussion
While both PM and DP have a focus on digitisable data, I have pointed out the differences and similarities of PM and DP related to several aspects of data use. First, I have laid out the type of data and data producers: genetic data deriving from lab facilities for PM versus big data from a variety of digital devices with sensors used in DP. Then I have shown the different systems of data collection: active and rare data production for PM versus moment-by-moment, in situ and live data collection for DP. The focus on digital data also informs the dominant understanding within the fields of PM and DP, and results in genetic determinism and an advancing of the genetic imaginary for PM based on genetics, and in data fundamentalism for DP. Both logics share a trust in digitisable data, which is thought to hold the ultimate truth leading to better health. With this logic, these two data-intensive approaches elevate themselves above other, less digital-data-prone medical disciplines. Evidence of this is the discursive distancing practised by experts of DP. They frame digital traces as ‘objective’ ‘biomarkers’ for mental illnesses against the ‘subjective’ standard of self-reports of mental health symptoms, a framing that may also be due to the unsuccessful search for genetic and molecular biomarkers within psychiatry (Birk and Samuel, 2020; Brietzke et al., 2019: 223; Huckvale et al., 2019: 1). In medicine, as in other scientific disciplines, objectivity ranks higher than subjectivity (Reiss and Sprenger, 2020). Information seems to gain in resilience and worth once it has been collected by a digital device (Lupton, 2015), even more so if it has been collected close to the patient, as is the case in DP. ‘Real-life’ data is conceptualised as being ‘objective’ versus self-reports being framed as ‘subjective’. Consequences might be a shift from reactive medicine to proactive and preventive medicine, resulting in individualisation and responsibilisation of the patient through both PM and DP.
Although DP may offer more possibilities of exchange and support for patients, the constant flow of ‘real-life’ data might open the door to more surveillance through connected devices, which is the other side of 24 h monitoring that, e.g. Mindstrong proposes for its users (Metz, 2018).
When data is the primary focus, the rationale is usually that more data, and sometimes more precise data as in the case of PM, is better. This is a standard rationale within big data, even though it has been criticised extensively, including by the stakeholders involved (boyd and Crawford, 2012; Kitchin, 2014). This means medicine, computer science and statistics now share the same mantra: more data is better. To satisfy this need for data within PM and DP, health institutions have to move further in the direction of digitalisation. This might bring new stakeholders, experts in data-intensive tools, into the picture who offer related products and solutions for problems. The value system of those new stakeholders will find a place in medicine and health. One result will be a further commodification of disease and health and an increased dependence on those entities that deal with data collection, analysis and interpretation, which are often private companies (Lupton, 2018b; Prainsack, 2017; Ruckenstein and Schüll, 2017; Saukko, 2018).
Healthcare for patients may change extensively. Individualisation seems to lead to responsibilisation. Care might change from reactive to proactive and predictive. How will it affect people to know that they are continuously being monitored? To which recursive effects will this lead? Digital sociologists and STS scholars discuss that it is unclear how much patients or so-called data subjects will ultimately profit (Lupton, 2018a). Those people with more resources might indeed be empowered by tailored treatment and access to their own health data and more information. However, critical data scientists point out that social minorities might not be the ones profiting but experience the downside: being monitored can just as easily result in surveillance. People at the social margins might be depersonalised and disempowered (Prainsack, 2017). This opens an entirely new array of questions regarding surveillance and what Foucault (2008: 1) called ‘biopolitics’, the establishment of state control over functions and processes of life. These aspects are especially salient for DP and will be even more salient once PM and DP are combined.
As data-driven approaches, PM and DP bring into play many advantages and disadvantages of datafication, as well as matters pertaining to big data and AI. As critical scholars in the social and human sciences have shown, only certain knowledge is quantifiable and thus digitisable. Many aspects of life cannot be translated into numbers, such as experiential, intuitive, tactile and emotional knowledge (Quinn, 2021). If data-driven methods gain more importance in the field of health and illness, this is expected to have huge consequences for basic epistemologies and ontologies around health and illness. Some of the non-quantifiable and non-digitisable aspects will not find their way into, e.g. decision support tools. Not only will the outcome be different if they do not inform the decision, but they will also be hidden and, in the end, valued less (Lupton, 2015; Mennicken and Espeland, 2019; Quinn, 2021). This is even more critical as both fields are rich in descriptions of how data and knowledge gathered in PM and DP will change the classification of diseases, indicating the influence data-based approaches are expected to have on basic epistemologies and ontologies in medicine and healthcare. Therefore, critical scholars should keep a close eye on these paramount changes. Prainsack (2017: 188) explains how the data valuing logic in PM follows a ‘tacit hierarchy of utility, with digital and computerable data on top, and unstructured, narrative, and qualitative evidence at the bottom’, a logic that was introduced by the technologies used and not by professional experts. DP in particular shows an all-too-simplified view of mental health through its search for digital biomarkers based on behaviour phenotypes. Birk and Samuel (2020) rightfully critique the usage of digital data as a proxy for social life for many reasons. First, important knowledge will not be taken into account and may ultimately be lost.
Many non-quantifiable aspects have for a long time been essential for how decisions have been made within medicine. Social science shows that medicine has always had a more ‘human’ aspect to it than science would claim. Practical knowledge, intuition and nuances in the physician–patient relationship have always been very important not just to how decisions around health and illness have been made, but also to what constitutes the professions in health care (Kim et al., 2018; Spinnewijn et al., 2020). Second, if decisions are taken based on historical data, existing categories might be reified and naturalised (Mau, 2017; Mennicken and Espeland, 2019; Quinn, 2021). This also entails assumptions around normality and bias in algorithms, which are harder to dismantle because of black box phenomena (Birk and Samuel, 2020).
Finally, all the processes described above can also be conceptualised as different aspects of biomedicalisation, the transformation of biomedicine by technoscientific interventions such as computers and genetics, with the consequence that the biological gains in importance. In the preceding analysis I found the following processes of biomedicalisation: a focus on health, surveillance and risk, which we find in PM (genomic profiling) but above all in DP; the ‘technoscientisation of biomedicine’, which can be found in the logics of both PM and DP; and a change in biomedical knowledge, represented by the aim for new disease classifications through data in both PM and DP (Clarke et al., 2003: 166). The responsibilisation of the individual in both PM and DP and the commodification of health data in DP also fit under this umbrella term.
Conclusion and outlook
PM and DP have the potential to change medicine and healthcare, and are already doing so. One basis for these developments is the digital data used and the logics of the field stemming from it. Since possibilities for data collection have expanded, PM is seeking to include all types of data to construct a patient's ‘unique thumbprint’ (Prainsack, 2017: 3). On the side of DP, efforts exist to introduce clinical data to the phenotype analysis, so-called ‘enriched data for DP’ (Liang et al., 2019: 290). Both fields work closely together with data and computer science and share a similar big data logic. Thus, combining all data available, whether genetic and clinical data or ‘real-life data’, would be the next logical step. Some experts already depict a horror scenario of genetic determinism in psychological diseases if both biological and real-life data were collected for mental health conditions (Comfort, 2018).
Additionally, the COVID-19 pandemic has shown us how viral genetic information can be used to influence every aspect of our life, including surveillance and quarantine restrictions. The pandemic is also thought to change our mental healthcare, e.g. through increased access to telehealth (Melcher et al., 2020). These transformations call for social science to engage in the field. There are many questions still to be asked, concerning epistemological and ontological redirections in the field of medicine and healthcare, transformations in categories around health and illnesses with new technologies, changes in doctor–patient relationships and how institutions and people handle health-related data. Institutionally, it will be interesting to analyse power shifts amongst stakeholders and the changes of values and logics that accompany them. Last but not least, addressing questions around health equality and which consequences these developments entail for population health and individual patients is crucial in these changing times.
Footnotes
The author would like to thank three anonymous reviewers for their very insightful comments and Tamara Schwertel, Regina Ammicht-Quinn and Ursula Offenberger for their valuable feedback. Furthermore, she would like to thank Mrunmayee Sathye and Lukas Haeberle for their help on the manuscript.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author received no additional financial support for the research, authorship and/or publication of this article.
