Abstract
Open access has moved from the periphery to the mainstream in the last few years, and most recently there have been calls to make research data “accessible, useable and reusable”. While there are many good arguments for this development, including that it makes the research process more transparent and enables others to re-use the data collected, it also has negative implications for social science research in certain contexts. The case addressed here is “elite” interviewing in the context of conducting research in Sweden. In this case there is already a strong legislative focus on openness with implications for research ethics. This suggests that open data access implementation, particularly in the context of specific legislative frameworks, needs to be reviewed to ensure ethically sound protection of interviewees and research subjects in the social sciences.
Introduction
Open access has moved from the periphery to the mainstream in recent years (Suber, 2012). Originating in broader movements towards open access (e.g., Costa and Leite, 2016), in Europe the open access movement has recently been driven by a focus within the European Commission on making “research data that result from publicly funded research publicly accessible, useable and reusable” (2012/417/EU). The aim is to make research more accessible, transparent, and trustworthy, and of greater benefit to society. The underpinning argument is that the provision of open access to research data will allow anyone to examine whether “the right conclusions are drawn” given the research data used for a particular study.
While the basic logic of this argument is hardly disputable, organisation-focused or “elite” interviewing based on strategic selection, which may be fundamental to research on ongoing policy and legislative implementation, could in some cases be made impossible by the implementation of open access to research data. At the same time, these types of data may be crucial for “whistleblowing” and for researching structural or organisational disorders, and such research cannot be undertaken unless interviewees are adequately protected.
It would thereby be of concern if open access policies, as implemented in a specific legal context, were to limit research in areas that themselves fundamentally aim to improve transparency and support a democratic society. While this may not be the case in all legislative contexts, nor for all types of data, this article discusses a specific case in a specific legal context where open access taken together with the legislative situation at large may compromise research. The focus is specifically placed on elite interviewing and the conditions for it in Sweden. Sweden is not a typical case here, but rather stands out as a case where a strong focus in legislative frameworks is already placed on openness and transparency (cf. Griffin & Leibetseder, 2019). The sections below discuss, firstly, the nature of the specific qualitative social science data in focus here, and then the present issues with sufficiently protecting research subjects so as to allow for the research to be ethically conducted, within different national frameworks, as a basis for the discussion of the Swedish case. The final sections discuss the implications and requirements for implementing open access in relation to sensitive qualitative social sciences research data.
The paper builds on a review and summary of the issues seen as impacting the integration of open access to research data in Sweden. These include both the existing legislative situation with a strong focus on transparency, as well as the current implementation of open access to research data in which cataloguing of data is coming to be required. As this is a developing area of interpretation, following upon governmental requirements implemented from 2016 onwards, the selection of material for review and summary with regard to the Swedish case has been informed by public meetings on the issue organised by universities, the Swedish National Data Service, the Swedish Research Council, and the Swedish Association of University Teachers and Researchers, contact with university legal services, and documents developed at university services including university instructions to scholars. The material thus includes both legislation and cases referred to, policy documents such as governmental or agency policies, reports and the like, and university documents attempting to clarify the legal situation and its interpretation. The methodology with regard to material for the article is thus akin to tracing a development – in this case, an issue and the implications of it – through different sources (cf. George & Bennett, 2005; Walgrave & Varone, 2008). As the area is still in development, the documentation drawn upon here constitutes the best information available to the author at the time of writing.
The Nature of Qualitative Social Science Data, in Relation to the Specific Focus of Organisational and Institutional Studies
The open access initiative has developed over a long time, with the purpose of making research publicly available (e.g., Costa & Leite, 2016; Bartling & Friesike (eds.), 2014). It has often been seen as a positive development. For instance, Suber (2012) describes open access as being “easy, fast, inexpensive, legal, and beneficial”, and there are plenty of statements in the literature on the potential benefits of open access to research data and secondary analysis of data (e.g., Tsai et al., 2016; Kuula, 2011; Sherif, 2018). The development of open access across Europe in recent years can be seen as linked to the development of regulation at European Commission level. A 2012 Recommendation, for instance, calls on member states to implement the general aim of making research data “accessible, useable and reusable”. However, it also leaves room for variations in application, as it is very generally phrased, does not discuss different types of data, and notes among other things the need to take into account intellectual property rights (2012/417/EU, see also European Commission, 2021).
Despite a relatively positive assessment of open data in literature (e.g., Suber, 2012; Tsai et al., 2016; Sherif, 2018), the literature in this field from a social science perspective varies. Some of it targets the potential benefits of open access to research data, and notes that acceptance of open access is mainly a cultural question (e.g., Kuula, 2011). However, it is also social science literature that has been perhaps the most critical of it and most clearly states the prerequisites required for, particularly, qualitative social science data to be made open access (e.g., Kirilova & Karcher, 2017; Chauvette et al., 2019; DuBois et al., 2018). For instance, Elman et al. note that “[q]ualitative data sharing presents a range of challenges that differ both in kind and degree from those involved in quantitative data archiving” (Elman et al., 2010: 24). It is also generally noted that the development of open access to research data has been grounded in, and has mainly targeted, data other than qualitative social science data. This potentially also rests on a relatively strong historical emphasis on natural science as “science”, in relation to which a lesser focus has been placed on social science and in particular qualitative studies (see e.g., Snow, 1961; Henderson, 1993; Hollis & Smith, 1991; Worster, 1996 for a discussion of the “two cultures” in research). In addition, the social science literature generally mentions the context dependence, interpretative role, and sometimes also sensitive role of qualitative social sciences data – in short, that it also matters who collected the data and under what circumstances, factors that can be hard to describe in meta code for any anonymised dataset (e.g., Kirilova & Karcher, 2017).
Contrary to a focus on the openness of data, it has also been noted that “[d]ata archives worldwide stipulate that personal information concerning research participants is to be kept confidential and that guarantees of confidentiality and anonymity are to be honoured” (Parry & Mauthner, 2004: 143).
In the social sciences, one common method for collecting research data in focus here is the use of in-depth, for instance semi-structured, interviews. Such interviews are typically conducted about a topic that is highly sensitive – or that may even be so current and contextually dependent that there exists no clear written or “objective” data on it. In some cases, it would not even be possible to develop clear written or “objective” data on a topic. Crucially, social science concerns not only “hard facts” or neutral data (as in fact no data is ever neutral) but often also explicitly opinions, conflicts, barriers to development, crisis management, systemic disorders, issues related to power relations, social structures, and how people think and act in relation to these social structures, barriers, and conflicts (e.g., Schön & Rein, 1994; Macnaghten & Urry, 1998). Such research can be done in multiple areas. For instance, interviewees may be employees at companies or in the government, and may describe issues not in line with company or government policy, such as the organisation in fact not following regulations. Making such statements could be interpreted as the interviewee being disloyal to their employer and could thereby be grounds for their dismissal, for instance (Fransson, 2013). At the same time, it is data that is crucially important for conducting research on, for instance, environmental norms or why environmental regulation is not implemented (cf. Eriksen et al., 2015). Accordingly, the data that is sensitive in these ways is just the type of data that many social scientists seek to collect, and can also be crucially important data for research (cf. Wellstead et al., 2013; Wellstead et al., 2014). (See Appendix 1 for a background on how qualitative social sciences data may be collected and treated.)
Elite Interviewing as a Sensitive Case
The types of issues introduced above fall into several specific branches of research: among others, those involving implementation or organisational working and compliance, which may be undertaken to a great extent in organisational and institutional research, such as business or policy studies. Such studies often, but not always, focus on specific organisations and positions in them. Some may even focus on “elite” interviewing, which selects highly specific groups and individuals who may be seen as placed in a vulnerable position because of the power and position they hold. The case of elite interviewing will here be used as one example of a sensitive case, to illustrate the complexity that may exist even beyond cases that are more commonly seen as sensitive, such as those in criminological or health research (e.g., Israel, 2004; Gibson et al., 2013).
With regard to elite interviewing, it has been noted that even within social science, “[t]he particularities of conducting policy research in a highly politicised domain, and the vulnerabilities of participants usually conceptualized as ‘elite’ within these spheres, have been underexplored in the qualitative methods literature” (Lancaster, 2017: 101, cf. Littig, 2009 on differences and similarities between expert and elite interviews). While elite research participants are often selected because they are actors with certain power, this power in fact makes them sensitive to power plays by others, should their sensitive information get out. As Lancaster notes, “[w]hile policy processes research may not fit the usual categories of ‘sensitivity’ to which ethics committees are attuned (such as traumatic events, violence, death or drug use), the vulnerabilities of participants … were at times significant due to the personal, professional and political issues in play” (Lancaster, 2017: 101). The privileged and specific nature of participants, e.g., in specific high-level panels or organisations, which cannot be anonymised per se as their nature is a crucial focus of the research, means that such interviews or other types of research data (such as observations) require considerable consideration. Knowledge that sensitive information exists (if, for instance, meta data on it is made available) may even lead to this information being particularly sought out and to requests that the data be shared. Thus, Lancaster notes that “‘sensitivity’ is not an ‘unproblematic or commonsensical’ concept but rather relational and emergent” (Lancaster, 2017: 101).
What is more, in the case of elite interviewing in which the selected groups or panels are not large enough to allow for the successful anonymisation/pseudonymisation of data, it could also be expected that the sensitivity of the data may remain over time. Even if confidentiality is applied broadly to certain material, legal confidentiality may in some cases be reassessed or removed after a certain time (such as 10 years). This is a matter that may so far have been comparatively little discussed in the qualitative methods literature, as the focus has often been placed more on best practice (e.g., Bos, 2020), and legal requirements on data may differ considerably between countries (e.g., Corti et al., 2000; Riksarkivet, 2022; OFS, 2009). However, in the increasingly formalised setting of research, these legal frameworks may also come to play an increasing role. With regard to elite interviewing, people on, for instance, high-level panels or boards may at a later point in their careers be found in even more sensitive positions, acting for instance as national or EU politicians or high-level civil servants or on company boards. Elite interviews, for example in the case of international high-level boards, may thereby constitute highly sensitive material for long periods into the future, with considerable ethical implications for researchers in conducting research, and possibly paramount implications for research participants.
This type of elite interviewing is by nature not necessarily a large research area (and the direct contribution of qualitative research to policy-making is sometimes even discussed, cf. Veltri et al., 2014). It is also quite possible to undertake studies in these areas that do not have the problems highlighted here (for instance where the focus is on lower-level organisations that can be de-personified/pseudonymised by removing areal identifiers, such as local councils). However, despite the potentially limited and numerically relatively small role of these specific types of studies on elite type participants, who cannot be easily anonymised or pseudonymised, they can be of great importance indeed, and even play a role in democratic functioning. Research into for instance the development and implementation (or non-implementation or areas of avoidance) in relation to regulation and policy is crucial for understanding the extent to which, for instance, law is applied in specific cases, and what may hinder this. Research on compliance is crucial for understanding, for example, how different companies, authorities, and courts manage – or do not manage – something like environmental demands. It is thereby not only broader society that may support a “whistleblower” function (cf. e.g., Lewis et al., 2014; Hollings, 2011) but also research (Martin, 2014).
In this way, while journalism has often been seen as a means of transmission (sometimes struggling to be recognised as research or working within academic frameworks, e.g., Bacon, 2006; Matheson, 2018), research can also play roles often attributed to journalism: highlighting and publicising a problem. However, here it may become a problem that it is requested that research data be openly available in a way that journalism sources are not. Under strategic selection with a focus on organisations, the selected interviewees may from the outset constitute a more delimited group, and qualitative social science interviewing as well as other data collection becomes highly complicated indeed.
Protecting Organisational or Institutional Study Participants including “Elite” Participants in Studies
Following from the previous section, in sensitive elite interviewing – at national, EU, or international levels, but sometimes also at regional levels where it becomes apparent where the case is situated – data may thus both be a particularly high-level and crucial type of research data, but also constitute high-risk data for participants. At the same time and for the same reasons it might be difficult to share this type of research data openly while properly protecting the interviewee.
With regard to protecting data, a set of this type of personal data has come to be increasingly protected in the EU under GDPR legislation and various types of formal causes for confidentiality that exist in many countries. In GDPR regulation as well as in many countries, data on political, ethnic, or sexual belonging, or similar topics, is by nature regarded as falling under specific ethical grounds for consideration, so as to remove requirements for open access from this specific data (in Sweden PUL 1998:204). However, these grounds may not be applicable in all the cases we discuss here, such as in relation to implementation, organisational working, or compliance. Data may also be made confidential due to the existence of sensitive personal data beyond these categories. However, as will be discussed below, these are not sufficient in all cases to entirely remove interview data from open access. To manage data under the requirements both for data sharing and for protecting sensitive personal data, the focus in much open access literature and supporting documents in different countries has come to lie on the means for de-personifying data through the use of pseudonyms, whereby specific direct identifiers are replaced with more general statements that convey the approximate meaning of a term (for instance, stating “city” instead of the city name; see https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx). Risks of indirect identification remain even under this method, as full pseudonymisation would need to remove not only formal identity markers (name, locations, direct identifiers) but also details like dialectal expressions, typical statements, organisational references, references to social structures, and information that in summary throughout a full interview may in some way identify the interviewee or the social context to which they belong (e.g., organisation, company, etc.).
Removing these details thereby does not necessarily prevent “indirect identification” through the summary information provided in an interview so much as make it more difficult – particularly if the selection of interviewees already targets a smaller identified group. The issue is compounded by the fact that what is formally regarded as “research data” may include not only transcripts from interviews or recordings, but also sound files or any other recordings, which may be considerably more difficult to fully de-personify.
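The replacement of direct identifiers with more general statements, and the residual risk of indirect identification, can be illustrated with a minimal sketch. It assumes a hand-made mapping of identifiers to category labels; all names, places, and the transcript sentence are invented for illustration and do not come from any study or from the guidance cited above.

```python
import re

# Hypothetical mapping from direct identifiers to generic category labels.
# Every name below is invented for illustration only.
REPLACEMENTS = {
    "Anna Svensson": "[interviewee]",
    "Umeå": "[city]",
    "Nordic Timber AB": "[company]",
}


def pseudonymise(text: str, replacements: dict) -> str:
    """Replace each listed direct identifier with its generic label."""
    # Try longer identifiers first, so a full name is matched before any
    # shorter identifier that happens to be contained in it.
    ordered = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


transcript = (
    "Anna Svensson said that Nordic Timber AB in Umeå did not follow "
    "the permit conditions set for the only sawmill on the river."
)
pseudonymised = pseudonymise(transcript, REPLACEMENTS)
print(pseudonymised)
```

In this sketch the name, city, and company are replaced, but the phrase “the only sawmill on the river” is left untouched and could still point to a single organisation. Contextual detail of this kind cannot be caught by a fixed list of identifiers, which is why such replacement makes identification harder rather than impossible.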
How far the protection of interviewees under these circumstances can go, then, seems to differ according to country. Examples in the literature from Canada highlight the possibility to select only some interviews for open access in order to limit risks of identification: for instance in a study including interviews in several areas, only those that were conducted in one area were made open access (after de-personification/pseudonymisation in the interview as well) (cf. Chauvette et al., 2019). As another workaround, research agencies in different countries, such as the UK, advise that one consider such concerns already when designing the study: perhaps sensitive issues that may pinpoint an interviewee can be divided into several interviews, or delimited in topic (https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx). However, this type of procedure may result in issues that may bias an interview, or that may result in the interviewer not being able to freely choose the topic of study to best match concerns. While qualitative interviewing typically strives for rapport and for the interviewee to speak freely (e.g., Obelenė, 2009), dividing a larger relevant area into sub-topics, with separate interviews held for each one, might itself impact the openness of discussion or make the interviews less relevant to the larger topic.
Another possibility is that the research, due to its sensitive nature and potential implications for interviewees, may in its entirety be classed as confidential, in which case data will then on principle not be shared. However, the extent to which researchers themselves are able to decide what data to share may differ between countries. The conditions under which confidentiality may be broken may also vary to a large extent between countries and specific cases (e.g., Israel, 2004).
The Swedish Case
Sweden may stand out in this respect, as one of the countries with the strongest requirements for open access to information at large, as all data that is received at an authority, as a rule, is considered public (cf. Prop. 1994/95:19, p. 461). Under Swedish law, the digital record of, for instance, an interview received at the university as a state body falls under public transparency requirements. These requirements are the same at a public research body as at any other public body (e.g., Borglund & Engvall, 2014), and would include any sound or text file that is received as part of an employee’s (in this case, the researcher’s) work. It is thus not possible to assure participants that only anonymised data will be shared, or to guarantee at all that specific ethical requirements (such as not sharing specific data) will be maintained. This is because confidentiality is not only assessed based on the research itself, but also on who is requesting to see or use it, and for what purpose (e.g., OSL 24 ch. 8 § third para). This means that it will be very difficult for the individual researcher at a public body – where a large part of research takes place – to assess how well any conducted research and the research subjects therein are protected, as well as what protection they can be guaranteed.
The general requirements for transparency are thereby so far-reaching that interviewing in sensitive cases may mainly have been possible due to the fact that interview data has seldom (if ever) been centrally stored, labelled, or made available for easy identification and access. These practical and informal delimitations to the far-reaching formal transparency – and researchers’ limited knowledge of these formal requirements – may thereby have meant that sensitive interviewing could only be undertaken as long as the practical delimitations meant that it was seldom easy to identify or request interview data. As open access is presently implemented as part of a government bill (prop. 2016/17: 50) and related assignments to Swedish authorities and institutions, concerns for research may in particular result from the interaction between existing Swedish regulation and broader open access requirements.
The Swedish policy framework that impacts open access is thus not only open access-specific but rather includes general principles that are gaining increasing impact through a general implementation of open access. Thus, these types of requirements have long been present in Sweden with regard to public institutions, but as research data previously has commonly not been made subject to procedures for public cataloguing or available to the open access movement and its institutionalisation through EU requirements, potential issues regarding interviewee protection and transparency have perhaps most often not been recognised, making this type of research possible to undertake nevertheless (but see Swedish National Agency for Higher Education, 2005). Restrictions to public availability in the Swedish case mainly centre on the specific grounds for confidentiality outlined above (such as political and ethnic belonging), general confidentiality requirements (e.g., OSL 24 ch. 8 § third para), and now also GDPR grounds for confidentiality. However, it has also been noted that GDPR in Sweden does not necessarily prevent data from being made public, or even allow the person from whom the data originated to retract it (Swedish legislation decision 2018-10-02 dnr 327/18, decision 2019-02-13 in court case 5437-18). As Swedish legislation, together with for instance GDPR requirements, constitutes the legal situation against which assessments are made, the “right to be forgotten” in GDPR is not always applicable, to the extent that data cannot be fully retracted (Swedish legislation decision 2018-10-02 dnr 327/18, decision 2019-02-13 in court case 5437-18). This will have effects not least on the right of interviewees to retract statements, for instance if an interviewee would wish to do so late in the process after realising any potential impact of their statements.
Thus, as with all these issues, there are significant impacts on the rights of interviewees – the actual original “data owners”.
This situation also implies that the role of restrictions to data that are stated by researchers at a public institution in any data management plan they develop is delimited. While a data management plan is to include statements by the researcher on the sensitivity of the data and could, for instance, include a decision on the researcher’s part to not make the data open, upon any request data availability is in Sweden assessed first under the law rather than in relation to researcher statements. Assessment in this situation is made not necessarily by the researcher but by those assigned by the university to assess data availability and public access; that is, persons who are typically not active in the particular field of the research and may thereby not necessarily be able to assess what may be sensitive data in the particular situation and case (e.g., with regard to the level of de-personification required to hinder interviewee identification). The transparency requirements in Sweden have already led to legal assessments that sensitive data must be shared, contrary to the researcher’s ethical assessment of it (cf. Swedish National Agency for Higher Education, 2005). In one sensitive court case in which open access was requested for interviews, the researcher even destroyed the data rather than share it (see Swedish National Agency for Higher Education, 2005, for an overview of this development and concerns related to it).
Thus, it has broadly been concluded that a researcher cannot guarantee confidentiality of data in Sweden (e.g., Swedish Research Agency, 2017).
If a request for data is received, the university will assess whether it falls under the legal requirements for confidentiality (general or specific) and is sufficiently protected personal data. Depending on assessment, the university may make data or specific parts of data available, which in the case of sensitive interviewing in limited groups could allow for the indirect identification of interviewees. Even if data is placed under confidentiality requirements, such requirements may be lifted after specific time periods (OFS, 2009:400). For elite interviews, however, as has been noted above, data may not necessarily be less sensitive for interview participants far in the future (as they may potentially at a later time hold even higher positions, thus being more politically vulnerable). This may imply potential risks to interviewees that an ethically minded researcher working on sensitive areas (for instance, implementation or legislation issues relevant to elite interviewing) cannot expose research subjects to. It also means that it is not possible for the researcher to know beforehand exactly how the data use may be delimited, given that some assessments are only made at the point at which data is requested.
As a result, the general discussion upon which international literature on responsible qualitative data sharing has focused (for instance responsible anonymisation, e.g., Chauvette et al., 2019, https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx) is not sufficient for data at public institutions in Sweden, as it is not possible to know which data – including original non-anonymised or non-depersonified files – the institution may be judged legally obliged to share, with whom, and when. This also means that in Sweden you cannot know exactly how data will be treated in the future. At the very least, you cannot promise your research participants confidentiality.
The present requirements for making information available about what data exists as well as for depositing data (archiving) in common university-based repositories thus fundamentally mean that the researcher cannot limit the access to sensitive data. As a result, some research of the type we have discussed here may be both ethically and practically (obtaining agreement from participants) impossible to undertake, as no guarantees can be made to participants.
Discussion and Conclusion
In relation to the specific nature of the data discussed here, the question remains as to whether it is at all ethically possible to undertake specific research dependent on sensitive data under the legal and open access requirements in some contexts. In the Swedish case, the answer for some data may be “no”. However, the implementation of open access is still under development, and a crucial question may be to what extent the requirements of sensitive social science data are taken into account in this implementation.
Crucial research in qualitative social science includes studies that may come the closest to revealing why we are not taking action on environmental, organisational, or other change in institutions, and how this could be incentivised. However, if the integrity of the research subject cannot be protected, social science researchers will likely not be able to collect this kind of research data in the first place, and accordingly social science research will be limited to research topics that do not take up issues like power relations, conflicts, organisational structures, relations, and structural and systemic inequalities in elite bodies; i.e., those that may be the most important to policy research and implementation. Ultimately, this raises the question of whether initiatives towards transparency and open access, despite all good intentions, are in this case in fact limiting rather than extending possibilities for research. In the Swedish case, open access implementation without consideration of pre-existing legal requirements may risk limiting crucial research in qualitative social science and thereby even make the research it intends to make publicly available impossible to conduct in the first place.
Thus, in order to successfully implement open access while avoiding inadvertently limiting qualitative social science, there is a set of issues that needs to be resolved. While the discussion here focuses on the Swedish case, it is, as literature has illustrated, also crucial in other countries and contexts to ensure that “ethical considerations are of paramount importance in the archiving of qualitative research data” and that research can be undertaken even in cases where “some qualitative research material may be intrinsically impossible to archive” (Corti et al., 2000, sections 7 and 5, respectively). The Swedish case regarding research at public institutions can thereby illustrate – potentially in an atypical or even extreme way – the concerns around qualitative data that have been highlighted in literature (e.g., Wessels et al., 2014; Elman et al., 2010; Kirilova & Karcher, 2017), and the need to consider the nature of qualitative social sciences research data and the specific requirements of this data in relation to open access.
This manuscript has noted the ways in which qualitative data has perhaps not been the primary target of regulations regarding open access to research data, and that considerations specifically involving qualitative data, particularly in the types of cases discussed here, must be given more attention (e.g., Chauvette et al., 2019; Wessels et al., 2014). The quality of research that is enabled by a specific research design must take precedence over the adjustments that open access requirements may otherwise prompt, such as revising the study design, dividing up interviews, or even not undertaking certain studies. This means that the researcher must be the one who determines open access to sensitive data, and must hold intellectual property rights to the data so that they can limit its use or availability in different cases. In other words, the researcher, not the university or other body where the research is undertaken, must be able to act so as to, on an ethical basis, limit access or use in cases in which this is clearly motivated. Thus, the importance of according a role to the researcher – and in the Swedish case, potentially treating research differently than other documents created or maintained at public bodies – may be paramount. Such a role does not challenge open access to research data as a general principle – its benefit has been noted widely – but recognises the types of data that have so far been peripheral to discussions of open access in relation to research regulation and policy (cf. Class et al., 2021).
Following this, legislation and policy need to correctly enable and take account of research data generated in the social sciences. This might require review of existing legislation in consideration of sensitive social science data. In Sweden, this is in line with the requirement that researchers, in accordance with the Swedish Higher Education Act, should be able to independently choose their problem and method (Högskolelag (1992:1434) t.o.m. SFS 2021:1282 1 kap. 6 §). This might mean, for instance, heeding the calls that have already been made at university level to review issues in national context (Swedish National Agency for Higher Education, 2005). In Sweden, one way to attend to the issue could be to exempt data gathered at Swedish public institutions for the purposes of research from general transparency requirements, in line with what seems to be the case for data in several other countries (e.g., Corti et al., 2000). In addition, the EU should clarify potential considerations in relation to its existing open access regulation and develop a discussion of different types of research data, taking care to monitor implementation so that EU regulation does not in fact create more problems than it solves – making data less rather than more possible to use or even collect.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
