Abstract
Recital 33 GDPR has often been interpreted as referring to ‘broad consent’. This version of informed consent was intended to allow data subjects to provide their consent for certain areas of research, or parts of research projects, conditional on the research being in line with ‘recognised ethical standards’. In this article, we argue that broad consent is applicable in the emerging field of Computational Social Science (CSS), which lies at the intersection of data science and social science. However, the lack of recognised ethical standards specific to CSS poses a practical barrier to the use of broad consent in this field and other fields that lack recognised ethical standards. Upon examining existing research ethics standards in social science and data science, we argue that they are insufficient for CSS. We further contend that the fragmentation of European Union (EU) law and research ethics sources makes it challenging to establish universally recognised ethical standards for scientific research. As a result, CSS researchers and other researchers in emerging fields that lack recognised ethical standards are left without sufficient guidance on the use of broad consent as provided for in the GDPR. We conclude that responsible EU bodies should provide additional guidance to facilitate the use of broad consent in CSS research.
Introduction
During a pandemic, a computational social scientist wants to recruit 1500 young adults to study the effects of the pandemic on their lives. When designing the study, the researcher has not fully decided on the purpose of the data processing, except that it is to be used for scientific research. The researcher plans to collect personal information but remains unsure of which information exactly may be needed. Such information may be sensitive, including data on ethnicity, religion, and vaccination status. When the researcher takes the research proposal to the University’s Research Ethics Committee (REC), the REC is reluctant to offer approval, stating that the lack of specificity regarding the purpose of personal data processing could raise issues regarding the validity of study participants’ consent. To avoid this, the REC recommends that the researcher specifies in more detail the purpose of data processing. Anticipating several interesting insights that are currently challenging to predict, the researcher opts not to define the purpose of data processing beyond scientific research. Instead, they check whether there may be another legal basis for data processing. The researcher prioritises maintaining the autonomy of data subjects regarding the data processing, and thus limits the pursuit of alternative legal bases to those based on informed consent. In the European Union’s General Data Protection Regulation (GDPR) they find Recital 33, which seems to permit data subjects to consent to entire areas of scientific research (rather than clearly delineated topics). This is limited to cases where it is not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection, so long as the research adheres to recognised ethical standards. Intrigued by this option, the researcher looks for recognised ethical standards in the field of computational social science but fails to find any. 
The researcher now wonders whether the lack of recognised ethical standards in computational social science means it is not possible to rely on Recital 33 GDPR, leading them back to square one.
This vignette illustrates the problem addressed in this article: Computational Social Science (CSS) researchers who, as we argue below, would stand to benefit greatly from being able to use broad consent as entailed in Recital 33 GDPR cannot do so due to a lack of recognised ethical standards in the field, which impedes the conduct of cross-border research within the EU. The lack of recognised ethical standards for broad consent can lead to different interpretations by national data protection authorities, meaning researchers engaged in cross-border studies may face challenges in navigating different consent standards across EU Member States. This legal and ethical uncertainty, coupled with potential delays in obtaining approvals from different ethics committees with varying understandings of recognised ethical standards, can also hinder the potential for seamless collaboration in cross-border research.
Different EU regulations have implications for the conduct of research within the Union. 1 When there is a gap between the regulatory mechanisms for research and specific fields or types of research, valuable research might not take place (Rothstein and Knoppers, 2015; Skolbekken et al., 2005). In this article, we highlight one such gap: the gap between broad consent as provided for in the GDPR and CSS practice. Although we contend that GDPR’s broad consent was intended to facilitate research in cases where the exact purpose of data processing is difficult to predict, Recital 33’s requirement for recognised ethical standards presents a limitation in the context of CSS, and arguably in the context of any field of research that is emergent and accordingly cannot fall back on or refer to ‘recognised ethical standards’.
Legal context: Processing of personal data in GDPR
Any processing of personal data within the remit of GDPR requires a legal basis (see Article 6 GDPR and Article 9 GDPR for special categories of personal data). Article 6(1) GDPR enumerates legal bases for the processing of personal data. They include that the data subject has given consent to the processing of their personal data for one or more specific purposes, that it is necessary to process the data for the performance of a contract, that data processing is needed to comply with a legal obligation, that it is necessary to protect vital interests or to perform a task carried out in the public interest, that the data is processed in the exercise of official authority, and that the processing serves legitimate interests pursued by the data controller or a third party. Regardless of the legal basis relied on for data processing, the principles entailed in Article 5 GDPR must also be complied with. Article 5 GDPR requires that personal data be processed lawfully, fairly, and transparently; collected for specified, explicit, and legitimate purposes and not processed in ways incompatible with those purposes; limited to what is necessary in relation to the purpose of processing; accurate; stored (in a form that permits the identification of data subjects) for no longer than necessary; and processed securely to maintain integrity and confidentiality. Additionally, the controller is responsible for ensuring that these requirements are met and needs to be able to demonstrate this.
Article 9 GDPR prohibits the processing of special categories of personal data, unless certain exemptions apply. Special categories of personal data are, among others, data related to ethnic origin, political opinions, religious or philosophical beliefs, genetic data and data concerning sexual orientation. The exemptions for the lawful processing of such data, which are listed in Article 9(2) GDPR, include cases where the data subject has given explicit consent, where processing is required by law, for reasons of substantial public interest, to protect vital interests, and for research or statistical purposes with appropriate safeguards in accordance with Article 89 GDPR. Article 89 GDPR allows for derogations and exemptions from certain rights and obligations when processing personal data for research purposes. Importantly, Article 89 GDPR does not provide a legal basis for processing data but allows Member States to restrict data subjects’ rights under national legislation regarding the processing of personal data for research purposes.
Article 6(1)(b–f) GDPR, Article 9 GDPR, and Article 89 GDPR highlight the legal options for the processing of personal data beyond the specific consent of Article 6(1)(a) GDPR. In this article, as exemplified in the vignette above, we focus on consent as a legal basis for the processing of personal data, specifically on the lawful application of broad consent under the GDPR as provided for scientific purposes in Recital 33. Recitals offer background information on legislation, including the reasons for adoption (den Heijer et al., 2019). In doing so, the recitals, although not legally binding, help with the interpretation and clarification of the Articles. As such, broad consent as entailed in Recital 33 does not, on its own, constitute a legal basis for the processing of personal data. Yet, as we argue, it plays a central role in interpreting the lawful application of broad consent in the scope of the GDPR. Recital 33 must thus be read in conjunction with the type of consent provided for in Article 6(1)(a) GDPR and Article 9(2)(a) GDPR, where it softens ‘the requirement of specificity’ of informed consent (European Data Protection Board (EDPB), 2021; European Data Protection Supervisor, 2020: 19; see also EDPB, 2020a). The EDPB points out that this softening should not be seen as an exception to the Article 5 GDPR principles, especially lawfulness, fairness, transparency and purpose limitation. 2 Furthermore, the question of the legal interpretation of Recital 33 is important despite its legally non-binding status because it has been implemented in legally binding national legislation, including in EU Member States (e.g. Austria 3 and Estonia 4 ) (see also Grady et al., 2015).
Background: On the legality of different types of consent
In medical research, where much of the relevant literature originates, the potential benefits and harms are tied to an individual study that is delimited in both time and content (Faden and Beauchamp, 1986; Maloy and Bass, 2020; Richter and Buyx, 2016). This allows for the fulfilment of the specificity requirement of Article 6(1)(a) GDPR or Article 9(2)(a) GDPR because consent is granted for a specific purpose. In addition, the principles of Article 5 GDPR must be complied with when processing personal data.
With the increased use of biobanks 5 in biomedical research in the late 1990s, it became apparent that the specificity requirement of informed consent to research participation was ill-suited for some types of research (Sheehan, 2011). Richter and Buyx (2016) argue that biobank research, especially of a genetic and genomic nature, differs from traditional research on humans in that potential benefits and harms are less concrete and foreseeable than in traditional medical research. Detailed information about the nature, purpose, and duration of the use of data, often viewed as necessary to fulfil the specificity requirement of informed consent, is difficult to reconcile with the intention of biobank research. This similarly applies to research in CSS, where the same dataset can be used for different research purposes and projects (see e.g. Salganik, 2019). In response to these issues, scholars have put forward several adaptations of informed consent, including open, blanket, dynamic, and broad consent. In this context, it is important to differentiate between consent for research participation and consent as a legal basis for data processing under GDPR. A researcher must have a legal basis for processing data falling under GDPR. This legal basis does not necessarily have to be informed consent, as explained in the previous section. If the researcher wishes to or has to rely on a form of informed consent as the legal basis for data processing (as is the case in the vignette above), however, they will need to ensure that the form of informed consent they choose meets both the ethical requirements for informed consent to research participation mandated by their university or other responsible body, and the legal requirements for processing personal data imposed by GDPR. This section explains how various forms of consent for research participation interact with GDPR provisions that allow the processing of personal data. 
It is meant to help researchers who prefer to rely on informed consent as their legal basis for data processing to understand the legality of the different types of informed consent for research participation that are available.
Open consent, as described by Lunshof et al. (2008), involves donors willingly providing consent for the unrestricted use and disclosure of their data for research purposes. Hallinan and Friedewald (2015) find that open consent fails to meet GDPR’s ‘specific and informed’ consent criteria due to uncertainties about future processing purposes, unidentified data recipients, unanticipated international data transfers, and the broad scope of genetic data collected, arguing that the GDPR’s strict specificity requirements may not align well with the biobanking context. Indeed, they suggest reconsidering GDPR’s consent conditions for biobanking, noting that open consent, while not meeting GDPR requirements, has substantive merits.
Blanket consent refers to situations where individuals donate their samples or data and explicitly consent to their use without any limitations or restrictions, including forensic and commercial use (Helgesson, 2012; Wendler, 2013). The Article 29 Data Protection Working Party (2011) opposes blanket consent, instead suggesting rules to prevent its misuse and underscoring the need for granular, purpose-limited consent.
Dynamic consent addresses the limitations of open and blanket consent in biobanking and large-scale research by emphasising the importance of ongoing communication, participant control over data-use preferences, and the ability to withdraw that consent at any time (Stein and Terry, 2013). Commentary on the legality of dynamic consent within the GDPR remains limited, although Teare et al. (2021) note that explicit consent is recommended in the GDPR for data processing, while at the same time arguing that dynamic consent may help meet the changing legal requirements in different settings. For instance, they discuss whether the digital and updateable nature of dynamic consent could help meet requirements around people’s ability to withdraw consent and have their data deleted or forgotten under the GDPR’s ‘right to be forgotten’.
Broad consent to participate in biobank research provides some flexibility for unforeseen uses. This type of consent involves three key elements: individuals agree to data collection and use in biobanks, grant ‘broad’ consent not tied to specific projects and enable personal data processing for research (Hallinan and Friedewald, 2015; Richter and Buyx, 2016). Despite the vague wording in the GDPR itself, and the lack of any specific reference to the concept as such, broad consent as a legal basis for data processing has found its way into various legal systems, including Austria 6 and Estonia. 7 Details on the exact configuration and conditions of broad consent differ between jurisdictions (Dove and Garattini, 2018). Hallinan (2020) examines the legality of broad consent within the GDPR in the context of biobanking, emphasising the crucial role of legislative intent. Early GDPR drafts required specific consent for research, but the final text, including Recital 33, loosened these requirements for scientific research. Hallinan contends that this change reflects a deliberate legislative intention to support broad consent. Consequently, Hallinan argues that guidance from the EDPB or Article 29 Working Party that contradicts this intent may be considered an undemocratic overreach, as broad consent seeks to grant research participants control over their data’s future use, aligning with GDPR principles. Thus, Hallinan suggests that broad consent should not be prohibited in principle, even when it contradicts statements from the Working Party. Indeed, the author holds that broad consent was widely used before the GDPR and argues that regulators are unlikely to focus on breaches of broad consent post-GDPR, citing limited resources, a lack of complaints, and the complexity of overseeing genomic research (Hallinan, 2020).
CSS and other research that could benefit from broad consent
In this article, we argue that the language in Recital 33 hinders the implementation of broad consent in research domains or practices lacking recognised ethical standards. To illustrate this argument, we use the example of CSS, a burgeoning and interdisciplinary research field positioned at the intersection of social and data science that uses computation to explore behavioural patterns of groups and individuals (Cioffi-Revilla, 2017; Lazer et al., 2009; Salganik, 2019). CSS relies on extensive digital datasets for a diverse range of difficult-to-predict inquiries, and the same data can be used to answer very different research questions (Hox, 2017). Recital 33 GDPR reads that it should apply in cases in which it is ‘not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection’. Accordingly, CSS is an example of a field that could draw significant benefits from the utilisation of broad consent (Salganik, 2019). 8
Recognised ethical standards in Recital 33
European law does not specify how to assess the recognition of ethical standards in CSS or any other discipline, making such evaluation a challenging task. The term ‘recognised ethical standards’ lacks a universal definition within legal frameworks, including upcoming regulations. Neither Recital 33 GDPR nor Recital 16 of the current draft EU AI Act proposal (5 November 2023) defines the term. It is hence unclear when a practice attains the status of a recognised ethical standard within the meaning of the law. In attempting to provide a legal analysis of ‘recognised ethical standards’ in Recital 33, one can make use of four traditional methods for legal interpretation: grammatical, systematic, historical, and teleological. Recitals are generally not the subject of interpretation, but rather part of the interpretation of legal norms (Riesenhuber, 2021). Given the lack of methodology for the analysis of recitals and the need for a systematic approach to understanding the Recital, the term ‘recognised ethical standards’ in Recital 33 will nevertheless be subjected to a traditional legal interpretation.
First, in grammatical terms, the term ‘recognised’ implies that ethical standards enjoy widespread acknowledgment within a pertinent group of individuals, such as a research community. Therefore, recognition suggests that these standards are not only routinely employed but also embraced by practitioners as valid in the realm of research ethics. Second, ‘recognised ethical standards’ can be subject to a systematic interpretation, which involves analysing the provision’s context. Recital 33 is concerned with consent to data processing for scientific purposes, hence the scientific context must be considered when interpreting the term. This limits the interpretation of the term to the ethical standards established in research with personal data, rather than any recognised ethical standards. Third, as part of the historical interpretation regarding European secondary law, a variety of documents, including parliamentary and council documents or proposals of the Commission, are used to clarify ‘present texts in the light of previous circumstances and [take] into account the legislator’s intention’ (David and Brierley, 1985). While the term ‘recognised ethical standards’ was absent from the initial proposal for the GDPR by the Commission, it was subsequently introduced during the Council’s general approach in April 2017 and eventually endorsed by the European Parliament (see Procedure 2017/0002/COD). Since this term does not appear elsewhere in the documentation, any additional implications or historical interpretations remain uncertain. Fourth, from a teleological perspective, which addresses the purpose of a norm, interpreting the reference to ethical standards involves asking how it aims to safeguard the data subject. The objective is hence to partially offset the reduction in the level of protection associated with the use of broad consent by linking it to compliance with recognised ethical standards.
The teleological interpretation therefore shows that Recital 33 refers to the recognised ethical standards concerned with the protection of data subjects (and not, for instance, those primarily aimed at safeguarding good scientific practice). This means that the term can be narrowed down to at least one set of ethical standards. Yet, even if there were clear means to determine when something should be considered a recognised ethical standard outside of the law (e.g. when journals require it for publication, or when research funding bodies mandate it), the relationship with the law must be clear to avoid confusion. A clearer and more precise option would be to refer to the ethical standards required by national legislation, recommended by professional bodies, or recognised by a specified source. In the absence of such a standard, we discuss potentially applicable ethical standards from disciplines and fields adjacent to CSS before arguing why we cannot consider them as recognised.
Issues related to applying social science research ethics principles to CSS
Research is largely governed by non-binding legal instruments, such as professional codes of conduct or legal regulations for specific parts of the research process (Pöschl, 2010). Ethical conduct in biomedical research is guided by documents such as the Helsinki Declaration (World Medical Association, 1964) and the Belmont Report, both of which also apply to social science research (Alasuutari et al., 2008). From these documents, general research ethics principles have emerged, including Respect for Persons, Beneficence, Justice, Respect for Law, and Public Interest (added by the Menlo Report) (Directorate of Science and Technology, 2012; Salganik, 2019). 9
By discussing issues related to the key principles of anonymity 10 and confidentiality, we illustrate that social science research ethics principles are not fully applicable to CSS. Anonymity and confidentiality are vital ethical principles in social science research, safeguarding subjects’ privacy and preventing harm (Israel and Hay, 2006). Anonymity refers to keeping research participants’ or data subjects’ identities hidden, while confidentiality involves safeguarding personal information (ibid.). In the context of CSS, definitions and operationalisations of these concepts are uncertain. Online platforms can create a false sense of anonymity, leading participants to share personal information more freely than they would in offline contexts (Meho, 2006). Guaranteeing anonymity in CSS research is technically challenging due to the risk of re-identification from multiple linkable data sources (de Montjoye et al., 2015; Enyon et al., 2016; Oboler et al., 2012). Beyond CSS, risks of reidentification also threaten the privacy and confidentiality of participants in genetic and genomic research. It is very difficult to fully anonymise genetic data, which can lead to the exposure of sensitive information. As possible solutions, Wjst (2010) suggests more open consent processes without the promise of anonymity, better encoding techniques, legal constraints on data access, and restricted access for collaborators only under confidentiality agreements. Similar concerns apply to the broader issue of reidentification (Rocher et al., 2019). While researchers in the biomedical field can rely on other recognised ethical standards and refer to ongoing discussions about how to approach reidentification in the context of biobanking, researchers working in the fields of AI and CSS lack such guidance (see Staunton et al., 2022).
Ethical dilemmas also arise when participants consent to anonymised data use, yet re-identification becomes possible through new datasets (Leslie, 2022). This situation prompts consideration of the validity of consent and anonymity in CSS research, further bringing to light the ethical implications of individuals consenting to the use of their anonymised data, even when they are aware that re-identification may be possible in the future. As discussions about its design and requirements show, the concept of informed consent and its specificity requirement is not ideal for digital research. Moreover, how ‘recognised’ this ethical standard is within digital research remains unclear. As anonymity and confidentiality are cornerstones of research ethics in biomedical and social science research, the fact that questions around their applicability and configuration in CSS are far from settled represents a major problem.
Issues in applying data science research ethics principles to CSS
Keeping the above problem in mind, Salganik’s (2019) localisation of CSS at the intersection of data science and social science redirects our attention to research ethics principles from data science. While data science research shares similarities with CSS, such as both using digital data to address research questions, there are important differences too, such as data science datasets often being much larger than CSS datasets (Ada Lovelace Institute, 2022). Moreover, whereas CSS questions also focus on social science topics, data science research questions have broader scopes. Importantly, traditional research ethics principles from social and biomedical fields may not fully apply to data science due to distinct privacy issues and the nature of risk (Ada Lovelace Institute, 2022; Leslie, 2022; The Turing Way Community, 2022). Unlike social and biomedical research, risks in data science may emerge over time during the research process, thereby requiring continuous ethics reviews (Ferretti et al., 2021). Additionally, data science research often lacks direct interaction between researchers and participants (Friesen et al., 2021). The evolving landscape of data science research calls for a re-evaluation and adaptation of research ethics principles.
Emerging research ethics principles specific to data science include fairness, accountability, responsibility, and transparency (Barocas et al., 2022). These principles have gained prominence in recent years as essential components of ethical data science and AI practices (AI HILEG, 2019; Friedman and Nissenbaum, 1996). First, fairness is used to operationalise justice through the prevention, monitoring or mitigation of unwanted bias and discrimination (Jobin et al., 2019). Fairness regarding data sets used in data science research tends to highlight the importance of acquiring and ‘processing accurate, complete and diverse data’ (ibid., 2019). Second, accountability and responsibility are rarely defined, despite frequent mentions of ‘responsible AI’ (Gotterbarn and Miller, 2017; Jobin et al., 2019; Mittelstadt, 2019). Specific suggestions related to accountability and responsibility include operating with ‘integrity’ and making it clear who is responsible for what and who bears legal duty (Lagoze, 2014; Orr and Davis, 2020). The issue of who should be held accountable and liable is also up for debate, potentially including AI engineers, designers, ‘institutions’, or ‘industry’ (ibid., 2019). Further debates surround whether humans should always bear the ultimate responsibility for harm emanating from technical artefacts (Novelli et al., 2023). Third, transparency 11 refers to efforts to improve explainability, interpretability, or other forms of communication and disclosure. Transparency is primarily advocated as a strategy to reduce harm and advance AI. Many sources recommend enhanced information sharing by individuals creating or using AI systems to increase openness, while the details of what should be shared, including potential impact and limitations, vary widely across them (Jobin et al., 2019). 12 These discussions show that ethical principles for data science research are still evolving and lack universal recognition (Ada Lovelace Institute, 2022).
RECs are inconsistent in adopting and defining specific principles for AI and data science research: application is not uniform, and different institutions have varied requirements (ibid., 2022). Judging the relevance of existing principles in data science is challenging, especially as their relationship with biomedical ethics principles remains unclear.
The lack of recognised research ethics in CSS has led to proposals for a ‘roadmap’ to develop CSS-specific ethics, drawing from Responsible Research and Innovation (RRI) principles (Leslie, 2022). This roadmap suggests procedural tools for the CSS research community, emphasising context, impact anticipation, stakeholder analysis, normative criteria for impact assessment, reflection on purposes, positionality, power, inclusive engagement, transparency and responsibility. Although the roadmap is an initial step towards CSS research ethics, it is neither comprehensive nor yet recognised as universally applicable by CSS researchers. The overall lack of recognised ethical standards prevents CSS researchers from being able to rely on the GDPR’s broad consent.
In summary, the challenge in applying biomedical and social science research ethics to CSS arises from the difficulty of implementing some of their fundamental principles (e.g. confidentiality). This is compounded by the lack of established ethical guidelines specific to CSS and uncertainty surrounding their formulation. Moreover, data science research ethics are in a state of flux and lack concrete definitions, making them unsuitable to serve as ‘recognised’ for broad consent in CSS research (Ada Lovelace Institute, 2022). For CSS and the search for recognised ethical principles to make broad consent usable, this means that research ethics principles from biomedical and social research are insufficiently applicable, while research ethics principles from data science are still too nascent to be considered ‘recognised’. This highlights the lack of a singular, universally recognised standard for scientific research, as standards vary across disciplines, fields, countries and universities. Introducing a specification in the Recital that identifies sources to consult for determining the recognition of an ethical standard would be an important next step in addressing this situation, a point to which we return in the conclusion. We thus argue that traditional social science, biomedical, and data science research ethics principles are not fully applicable to CSS research ethics, and that CSS-specific ethical standards are yet to be developed.
Why we should be wary of top-down alignment
In addition to the ongoing discussion about the substance of CSS research ethics principles, another difficulty arises when we consider how fragmented research ethics principles are within the EU. Historical differences influence ethics assessment approaches, with proactive stances in Germany and Austria and reactive approaches in English-speaking countries such as the United Kingdom or Ireland (Brey et al., 2015). These differences in values are evident in ethics governance, for instance in medical research ethics. Veerus et al. (2014) studied the differences in obtaining informed consent from vulnerable groups among EU Member States. Some countries treat all vulnerable groups the same (e.g. Italy, Portugal and Malta), while others do not (e.g. France, Germany, Spain). Certain countries also waive the requirement for informed consent from vulnerable groups in specific cases (e.g. Denmark, Poland), and others have RECs directly involved in obtaining informed consent from vulnerable groups (e.g. Lithuania) (ibid., 2014).
Such fragmentation raises the question of whether the law should generally aim to align existing REC requirements across the EU, such as the treatment of vulnerable groups discussed above. History offers lessons here: there is evidence of unexpected problems arising from attempts to streamline ethical approval procedures (Hauskeller et al., 2019). For example, the EU Clinical Trials Regulation (EU CTR) aimed to address harmonisation issues related to clinical trials involving multiple EU Member States, such as the need to submit applications separately to each country's competent authorities and ethics committees for regulatory approval (European Medicines Agency, 2022). The harmonisation and streamlining of the approval process in the EU CTR altered the interactions with ethics committees. As a result, the assessment of a planned trial was divided into two parts: the 'technical-scientific aspects' were handled by one 'reporting' Member State with involvement from other relevant Member States. For the second part of the assessment, each Member State assessed specific aspects, including informed consent requirements, compensation of participants, recruitment, and the suitability of individuals and trial sites (Tusino and Furfaro, 2022: 42). The EU CTR aimed to address scarce coordination among RECs by centralising the assessment review; however, scholars warned of the serious issues involved in splitting the approval process into two parts (Lukaseviciene et al., 2021; Tusino and Furfaro, 2022: 42). Tusino and Furfaro (2022) argued that the separation into 'technical-scientific' and 'local' ethical aspects undermined the purpose of RECs in evaluating the scientific merit and ethics of a study. Hence, narrowing the ethics review to only the second part of an assessment can side-line RECs and defeat their purpose (Gelling, 1999; Petrini, 2014).
To avoid the unintended consequences that can follow from aligning research ethics through the law, it is sensible to wait for research ethics principles to gain recognition within the field before intervening (Tusino and Furfaro, 2022: 42).
Implications of legal uncertainty
There are several implications of the legal uncertainty created by the reference to recognised ethical standards in Recital 33 GDPR.
Consent misconception
The verbatim inclusion of Recital 33 in certain national laws creates legal uncertainty. For instance, the risk of a ‘consent misconception’ arises when ethical compliance and legally valid consent for data processing are conflated (Dove and Chen, 2020). Dove and Chen (2020) hence argue against the merging of ethical standards and data protection law in consent matters. However, establishing a strict separation may prove challenging, as legal considerations alone might not adequately address data subjects’ interests without ethical dimensions. Nevertheless, this conflation can hinder research, as participants and researchers may misinterpret consent requirements. Additionally, uncertainties surrounding broad consent might push researchers to seek alternative legal bases that do not require consent for data processing. Indeed, researchers may feel the need to do so even when they would prefer a legal basis relying on explicit consent to respect an individual’s autonomy and control over their personal data. Beyond this confusion, the lack of consistency in relevant terms (particularly concerning broad consent) and varied research exemptions across national laws impede international research collaboration (Mostert et al., 2016).
Liability issues
Non-compliance with GDPR obligations gives individuals whose GDPR rights are infringed the right to claim damages (Article 82). This raises pertinent questions about liability for researchers making use of broad consent. The initial question to ask is who the controller is. In addition to the case law of the European Court of Justice, the EDPB identified several key questions to help assign the different roles available (controller, joint controller and processor). In the area of research, the GDPR designates the organised entity conducting the research, such as a university or research institution, as the default data controller. However, individual researchers can also be considered controllers if they process data for personal purposes outside institutional oversight (EDPB, 2020b). In collaborative projects, if a GDPR violation occurs, each controller involved is individually accountable for the entire damage, allowing the responsible controller to subsequently seek reimbursement from others (Kuner et al., 2021). To ensure efficient legal recourse, the burden of proving responsibility should not be placed on the data subject (Kuner et al., 2021).
Liability for controllers ultimately hinges on three cumulative conditions: breach of a GDPR provision, occurrence of material or non-material damage, and the establishment of causality (Kuner et al., 2021). Nonetheless, the extent to which fault influences liability remains a subject of debate, with varying interpretations available in the scholarly literature. In terms of application, researchers may breach GDPR if they use broad consent in a manner inconsistent with legal requirements. Ambiguities in regulations hence render the implementation of broad consent susceptible to liability cases. While guidelines emphasise the importance of transparency, this may not necessarily absolve controllers from liability (European Data Protection Board, 2021). Thus, the utilisation of broad consent entails significant legal uncertainty, particularly for researchers in the field of CSS.
Impact on scientific freedom
The legal uncertainty surrounding broad consent can also have adverse effects on scientific freedom, which is protected by various national and international legal sources. For instance, Article 13 of the EU Charter of Fundamental Rights guarantees freedom of research, including research involving data. While ethical and legal restrictions are generally compatible with fundamental rights, unclear requirements such as those in Recital 33 may deter researchers due to uncertainty and concerns about liability. This uncertainty could indirectly restrict research activities, leading to self-censorship and potential avoidance of some research projects. Such chilling effects may arise when researchers perceive disproportionate liability risks or encounter numerous liability procedures, impeding their research activities (Pöschl, 2010). While provisions mandating compliance with ethical standards themselves do not infringe on fundamental rights, the practical application of broad consent presents significant legal challenges and risks for researchers and institutions. In conclusion, Recital 33, as it currently stands, leads to uncertainties surrounding broad consent and, subsequently, has a detrimental impact on research by hindering potential projects and raising concerns about compliance with data protection norms.
Conclusion
This article has argued that the GDPR's broad consent is of insufficient use wherever recognised ethical standards, as required by the GDPR, are missing. The GDPR provides no definition of which sources to consult to determine whether an ethical standard counts as recognised. We illustrated this issue using CSS as a case study. In CSS, recognised ethical standards do not currently exist. CSS, a relatively novel field of research at the intersection of social and data science, cannot fully make use of research ethics from the social sciences (originating in the biomedical field) because CSS research challenges key principles such as confidentiality and anonymity, even more than the emergence of biobank research did over two decades ago. While research ethics principles from data science and AI research seem more applicable in principle, we argued that they are still far from being recognised. Notably, while we focus on CSS, this logic is in principle applicable to any research field that cannot fulfil the recognised ethical standards requirement. Such fields are likely either emergent fields, relying on new methods or types of datasets, or interdisciplinary types of research. The Recital's reference to 'recognised ethical standards' therefore ultimately provides CSS researchers with little practical guidance. While we see value in acknowledging the importance of ethics, which may develop and incorporate hardened customs more rapidly than legislated law, this reference is premature in Recital 33 GDPR, at least in the case of CSS. We thus conclude that CSS researchers who search for guidance on how to use broad consent during their research will not be successful.
While this article does not offer a complete solution to the problems raised, it suggests several options. The first option may be the legal establishment of a minimum standard for recognised ethical standards at the EU level, even if there is a risk that this standard evolves into a norm over time. In this case, what was intended as an interim measure becomes the long-term norm. The second option may be a clearer specification of an organisation or body whose recommendations and publications may be consulted by (CSS) researchers, allowing them to identify which ethical standards are viewed as recognised in their field. This could be a professional or academic organisation (e.g. the EU counterpart to the 'Computational Social Science Society of the Americas'), but it could also be the EDPB. The second option offers two benefits. Firstly, it provides CSS researchers with a sense of security regarding the EU's interpretation of 'recognised ethical standards'. Secondly, such a process would ensure that the recognised ethical standards reflect the practice-informed concerns of CSS researchers. A third option would be to refer to either national legislation or RECs (national or local). For instance, recognised ethical standards could be defined as those referred to as such by national legislators in relevant laws or guidelines. Ethics committees (e.g. at universities) or their national umbrella organisations could confirm that individual studies do not (in their view) violate recognised ethical standards. Yet, granting such power to ethics committees is more likely to be an interim solution, as it could lead to a high degree of fragmentation and lack of uniformity. Nonetheless, this would assist researchers, provide more certainty in ethics, and encourage important EU cross-border research.
Promoting a more effective integration of these essential aspects, the EDPB’s role in establishing a bidirectional (instead of a unidirectional) connection between law and ethics in research, as previously discussed, holds promise. Lastly, similar to the lack of legal clarity regarding what constitutes recognised ethical standards in CSS, the absence of these principles is also problematic when viewed on its own, as it leaves researchers uncertain about what is ethically acceptable in their field. For some, this uncertainty might lead to conducting research that should not be pursued, while others may abstain from research that is essential. Therefore, having recognised ethical standards would not only permit the use of broad consent in accordance with Recital 33 GDPR, but it would also provide researchers with greater support concerning the research ethics in their field.
Declaration of conflicting interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
All articles in Research Ethics are published as open access. There are no submission charges and no Article Processing Charges, as these are fully funded by institutions through Knowledge Unlatched, resulting in no direct charge to authors. Funding for this research was provided by Digitize! Computational Social Science in Digital and Social Transformation, funded by the Austrian Federal Ministry of Education, Science and Research.
