Abstract
Biobanking involves the assembling, curating, and distributing of samples and data. While relations between samples and data are often taken as defining properties of biobanking, several studies have pointed to the challenges in relating them in practice. This article investigates how samples and data are curated, connected, and made mobile in practice. Building on an analysis of data collected at five hospital-based biobanks in Austria, the article describes and compares biobanking in three types of biobank collections: ‘departmental collections’, ‘project-specific collections’ and ‘hospital-wide collections’. It draws attention to the invisible work going into this infrastructure and highlights the central role of visions in making samples and data travel to a different location and thus support biomedical research. It shows that while visions of future travels are often epistemologically uncertain, they are informed by social ties and relationships between the collectives involved in the curation of samples and data on the one hand and the imagined users on the other. Finally, we argue that policy actors in this domain should consider the aspects we identified and, in particular, reflect on the temporalities inherent in such a research infrastructure.
Introduction
This anecdotal observation points to two interrelated themes that have become salient in the field of biobanking: the ‘underuse’ of samples, i.e., the phenomenon that biobanks might invest efforts and resources to collect and store samples, which are then hardly ever used (Scudellari, 2013); and the understanding that this underuse could be avoided if samples were associated with clinical information and data.
Indeed, the connection of samples (also described as ‘biomedical resources’, ‘biomedical materials’, or ‘biospecimens’) and data is frequently referred to as a defining property of biobanks. For instance, the European ‘Biobanking and Biomolecular Resources Research Infrastructure’ (BBMRI) described biobanks as ‘places’ that ‘store all types of human biological samples, such as blood, tissue, cells or DNA’, as well as ‘data related to the sample’ (BBMRI-ERIC, nd). Similarly, the ‘International Organization for Standardization’ (ISO) defines ‘biobanking’ as the ‘process of acquisitioning (…) storing, (…) and distributing defined biological material as well as related information and data’ (ISO, 2018: 2).
However, the short conference exchange described above suggests that the relations between samples and data are often more easily envisioned in definitions than made on the ground in practice. Studies on biobanking from a social science perspective have also documented this problem and pointed towards numerous challenges. Samples need to be connected to or turned into data in order to make them mobile or amenable to ‘flow’ (Pinel and Svendsen, 2021; Stephens and Dimond, 2015), i.e., to make them reusable in multiple sites. However, ‘increased attention on ensuring [samples’] correct ethical guardianship’ (Stephens and Dimond, 2015: 5) often poses challenges, particularly for collections that were brought into being under different ethical frameworks. In some cases, tensions have been found between the expectations of researchers towards biobanks and those funding these infrastructures (Aarden, 2017). Finally, biomedical professionals’ reluctance to share their samples and data is also perceived to be a significant hurdle to successful biobanking (Tupasela, 2021). Thus, relations between samples and data in biobanking and the mobility of these ‘sample/data packages’ cannot be taken for granted; associations between samples and data and their movements across time and space are achievements that need to be understood in more detail.
In this article, we explore biobanking at five locations of ‘hospital-based biobanks’ in Austria, where biomedical professionals have managed to both assemble samples and data and relate them to one another. We asked what kinds of samples and data were assembled, how and by whom they were linked to one another in practice, and what visions sustained, and materialized within, this form of ‘sample/data labour’, i.e., the often invisible work involved in curating and relating samples and data.
When we write about relations between samples and data, we follow the understanding of our interviewees. They separated samples and data when, for example, emphasizing that samples needed ‘clinical data’ to become useful or ‘sample data’ to support reproducible knowledge claims. They equated data with different sets of information that needed to be curated along predefined standards to function as ‘evidence for (…) knowledge claims’ (Leonelli, 2016: 69), while they interpreted samples as biological materials that could be transformed into multiple datasets in the future. They also deemed different groups of biomedical professionals skilled enough to take care of samples and different sets of data. However, while our interviewees separated samples and data, the boundary between them was not that clear-cut. Interviewees, for example, discussed whether Austrian data protection regulations could also be extended to samples. Or, the boundary blurred when they talked about packages of samples and data. We use the term sample/data to refer to such moments.
In what follows, we will start by outlining three major bodies of writing around biobanking and biomedical research: the social studies of biobanking, critical data studies, and STS work on expectations, visions, and imaginaries. These will help us frame and better understand what makes samples and data mobile in practice. We then outline the methodological approach that informed our research. In presenting our findings, we will follow the temporal order of curation practices in biobanking. We will start with those situations in which biomedical professionals join forces to plan biobank collections and unpack the future sample/data travels imagined in such situations. Subsequently, we discuss the practices involved in bringing various artifacts entangled in biobanking into being, including the documentation of informed consent, the curation of samples, sample data, and patient data, as well as the attachment of data to samples to turn them into mobile resources able to travel beyond the biobank and support biomedical research. We describe the sample/data labour and its distribution of responsibilities, comparing it along the three types of biobank collections that we have deduced from our data, i.e., ‘departmental collections’, ‘project-specific collections’, and ‘hospital-wide collections’. We will show that while the distribution of the sample/data labour differs between these three types of collections, it is sustained by imagined sample/data travels and social ties between the collective of actors performing this labour and the imagined users for whom they perform it. We conclude with reflections on what must be considered when creating a biobank and turning it into a sustainable research infrastructure.
On biobanking, data practices, and futures
Social studies of biobanking
Biobanking assembles the practices of collecting, handling, storing, and distributing samples and related data for biomedical research purposes. However, findings from the social studies of biobanking show that successful alignments of these practices cannot be taken for granted. Relating samples to data has been described as a particular bottleneck. Previous research has shown that the connections between samples and data are vital for making samples amenable to travel beyond the contexts in which they were produced (French et al., 2019; Morrison, 2022; Pinel and Svendsen, 2021; Stephens and Dimond, 2015). Martin French and colleagues noted that while the ‘possession and accumulation’ of samples were defining properties of biobanking, without data ‘the biobanking enterprise would add limited value to health research’ (French et al., 2019: 146). Yet, as Neil Stephens and Rebecca Dimond showed, making these entanglements requires ‘significant labour, networking, and bureaucratic and promissory work’ (Stephens and Dimond, 2015: 6).
Several studies have pointed to tensions in sustainable biobanking. Although there is a rising pressure to bring together, merge and share ever-larger amounts of data, Aaro Tupasela (2021) reflected on the reluctance of biomedical professionals to engage in such sharing practices. Elaborating on ‘data hugging’, he sought to capture the intimate care relationship researchers can develop with ‘their’ samples/data, and underlined that the phenomenon ‘suggests that value creation’ in biobanking is ‘far more complex and multi-faceted’ (Tupasela, 2021: 518) than imagined in policy discourses. Aarden (2017: 759) also stressed the difficulties in aligning institutional expectations of sample exchange and value generation through biobanking ‘with the ways scientific practitioners interpreted the value of tissue collection’.
Thus, social science research has shown that relations between samples and data are not givens but must instead be conceptualized as achievements. This conceptualization invites us to ask what makes samples and data amenable to being assembled, connected, and made mobile in practice.
Connecting with critical data studies
When engaging with the question of how samples/data can be made mobile, insights and concepts from Critical Data Studies offer valuable guidance. Numerous studies have engaged with movements of data, introducing concepts such as ‘circulations’ (Morrison, 2022); ‘flows’, ‘nonflows’, and ‘overflows’ (Hoeyer et al., 2016); ‘movements’ (Prainsack, 2017: 66–67); and ‘travels’ or ‘journeys’ (Bates et al., 2016; Leonelli, 2014, 2016; Leonelli and Tempini, 2020). While the concept of ‘data journeys’ was developed to shed light on different phenomena, in our use we focus on the cross-cutting observation that data mobility is always an effect of the relations in which data are entangled—it is not inherent to their nature, as the metaphors of ‘circulation’ and ‘flows’ might otherwise suggest (Leonelli, 2016; Prainsack, 2017: 66–67).
Critical data studies have underlined that the mobility of data is an effect of their entanglement in practices and infrastructures in which the technical and the normative are enmeshed in multiple ways. Data ‘journeys’, Leonelli observed, are ‘enabled by […] social agency’ (Leonelli, 2020a: 9). Klaus Hoeyer and colleagues noted that making data move depends on people's ‘groundwork’ (Hoeyer et al., 2016: 387). Clemence Pinel et al. (2020) stressed the gendered dimension of data work, which they framed as ‘care work’. Others have underlined that data practices are often ‘outside of the formal job descriptions’ (Morrison, 2022: 151), taking the shape of largely invisible ‘infrastructural work’ (Nadim, 2016). And, as Hoeyer et al. (2016) pointed out, this data work also involves ‘ethics work’ of biomedical professionals who must decide which types of data can flow and which must not overflow while simultaneously caring for relations to research subjects and patients.
Data practices and travels have also been described to be shaped by social ties, hierarchies, and norms while also generating new forms of sociality. Addressing social ties and entanglements that mediate data flows, Aaro Tupasela underlined that biomedical professionals ‘care for the samples and data, as well as the ways in which they are used’ (Tupasela, 2021: 520). Similarly, Clemence Pinel and Mette Svendsen noted that data practitioners make ‘attachments to places and people’ (Pinel and Svendsen, 2021: 2), and ‘communities or institutions.’
Relationships with people were also salient in our material, yet they took different shapes. The social ties we observed related neither primarily to patients nor did they take the shape of ‘national attachments’ and imagined communities described by Pinel and Svendsen (2021: 6). In our conversations with biomedical professionals, the social ties that gained the most visibility were those with potential future users.
How futures matter
To further elaborate on the nature and scope of the values and social ties involved in efforts to make samples and data amenable to move, it is essential to bring the reflections we have made thus far into a conversation with the body of literature on the performative role of future-oriented expectations, visions, and imaginaries in biomedicine (e.g., Borup et al., 2006; Jasanoff, 2015; Martin et al., 2008). This literature describes aligning future-oriented expectations and crafting shared imaginaries of futures worth striving for as a constitutive part of contemporary biomedical practices. Sheila Jasanoff's work points to the key role played by the ‘sociotechnical imaginaries’, which have managed to become ‘collectively held, institutionally stabilized, and publicly performed visions of desirable futures’ (Jasanoff, 2015: 4). Imaginaries have a powerful prescriptive character supporting specific futures that ought to be attained in specific contexts, tacitly prescribing the horizons of actions to be taken. Moreover, imaginaries are both ‘products and producers of networks of humans’ and ‘nonhuman actors’, around which ‘scientific practices and communities are organized’ (Fujimura, 2003: 192). Paul Martin and colleagues introduced the concept of ‘communities of promise’ to shed light on how diverse groupings of actors ‘form and operate in reference to imagined futures’ (Martin et al., 2008: 30) for technoscientific objects. Imagined futures at once define these objects and the ‘relations within and between [the] communities’ (Martin et al., 2008: 30) that assemble around them.
Biobanks have been a prominent example within this body of literature. Several studies have shown how expectations about future biomedical and economic values facilitate investments from funding agencies and policymakers into biobanking infrastructures (Aarden, 2017; Martin et al., 2008; Tarkkala et al., 2019; Tutton, 2007). In this article, we shift attention from the visions of policymakers to those of biomedical professionals who plan, build, and maintain biobank collections on the ground and explore how values, visions, and hopes make the practice of curating samples and data meaningful for those pursuing it. This shift in attention will help us to show that these practices are always performed for someone, i.e., by the collectives that curate particular biobank collections for the future users they imagine.
Methods
We base our arguments on a multi-sited case study in which we engaged with biobanking at five biobank locations in Austria. All of them were associated with the ‘Austrian node’ of BBMRI-ERIC—a pan-European research infrastructure, which connects biobanks in more than twenty member states. Some of the five biobanks in Austria existed before the emergence of BBMRI, whose scope and structure they also helped to shape. For others, the emergence of BBMRI and efforts to contribute to this European research infrastructure with an ‘Austrian node’ acted as catalysts for their establishment. Moreover, notably, all of them were ‘hospital-based’ or ‘clinical’ biobanks. The choice to build an Austrian research infrastructure by connecting hospital-based biobanks reflects the strong role of medical universities in biomedical research and of hospitals managed by regions in the federal public health care system in Austria. However, the choice to disentangle materials from the context of biomedical care practices and to repurpose them as resources for research also generates frictions.
As with other types of biobanks, hospital-based biobanks seek to collect samples and data to make them amenable to be (re)used for (biobank-based) biomedical research in the future (Boeckhout and Douglas, 2015; French et al., 2019). However, hospital-based biobanks do not start from scratch when recruiting research participants or assembling samples and data. In the case of hospital-based biobanking, participants are often already present as patients. Biomedical professionals regularly take samples from patients and use technical devices or ‘platforms’ (Keating and Cambrosio, 2006) to datafy samples, transforming, say, blood samples of patients into values of particular markers to inform decisions on therapies. They also store tissues taken from patients’ bodies during procedures such as biopsies or surgeries. Thus, hospital-based biobanking can build on existing clinical resources, practices, and procedures. However, work and resources are still required to transform samples and data from the worlds of care into resources for biomedical research (French et al., 2019; Stephens and Dimond, 2015). And, as we will argue, this infrastructural work also requires actors to perform it.
Our multi-sited case study involved five hospitals where clinicians, nurses, and technicians provided care to patients. At all sites, medical universities were attached to the hospital and oversaw the clinical research being conducted. Each of the medical universities either had established and staffed biobanks and funded technical devices, such as freezers and liquid nitrogen tanks, or was in the process of establishing such biobanks. While the biobank staff performed a lot of the biobanking-related work, they could not perform all of it. At each location, biobanking also involved the practices of clinicians, nurses, and technicians affiliated with the hospitals. Thus, biobank collections, or specific assemblages of samples and data, emerged at an intersection between hospitals and the universities.
In terms of materials, we draw from the transcripts of 27 semi-structured interviews we conducted with biomedical professionals and policymakers between May 2020 and August 2021, fieldnotes from ethnographic observations at biobanks (in December 2019 and February 2020), and public workshops between September 2019 and August 2021. In total, 24 participants provided consent for the recording and the transcription of the interviews. One interviewee was interviewed twice. Interviews lasted between 25 and 120 min and were about 73 min on average. All interviews but one were conducted in German. We analysed the transcripts in German. Our research was also informed by our involvement in the Austrian node of BBMRI as ELSI experts. We regularly attended meetings of the entire consortium. While we did not use notes or minutes from these meetings as research materials, the questions we asked in our interviews and data analysis were partly shaped by them.
The collection, analysis, and interpretation of our materials were informed by the major tenets of ‘assemblage thinking’ and ‘situational analysis’ (Clarke et al., 2017; DeLanda, 2016; Timmermans and Shostak, 2016). We followed samples and data to several situations along their curation processes in biobanking. We identified these situations deductively (drawing on literature and our own observations) and fine-tuned them inductively. Among others, these situations included the following: those in which biobank collections were envisioned and planned; those where biomedical professionals informed patients about biobanks and asked them for their consent; situations where samples and a variety of data and databases were produced, curated, and taken care of; and those where access to samples and data, and thereby travels of samples and data, were authorized.
When analysing and interpreting data, we compared the situations by unpacking the actors actively involved or passively implicated, relevant practices, and the regulations, constraints, visions and values that shaped these practices, drawing on the descriptions and reflections of our interviewees (Boltanski, 2011). Specifically, we were attentive to the travels of samples and data that biomedical professionals envisioned, adapting the concept of ‘data journeys’ (Bates et al., 2016; Leonelli, 2014, 2016; Leonelli and Tempini, 2020) to the properties of our case study. We did not ‘accompany data in their journeys from the material origin through human interactions with the world (…) to their dissemination (…) and ultimately to their use as evidence for claims’ (Leonelli, 2020b: vi). Rather, we followed the artifacts involved in biobanking along their curation processes, distilling visions of desirable and undesirable samples and data travels from the ‘constant comparison’ (Charmaz, 2006) of the descriptions and reflections of biomedical professionals.
Moreover, we also analysed and compared the uneven ways in which responsibilities for the practices involved in biobanking were shared and distributed between biomedical professionals situated in hospitals and those affiliated with the biobank to make sense of both differences within single biobank locations and similarities that cut across several locations. This helped us distil three types of biobank collections, which were envisioned and materialized by different collectives of biomedical professionals: departmental, project-specific and hospital-wide biobank collections. We specifically analysed differences and similarities across these three types of biobanks to better understand what shaped readiness to engage in transforming artifacts from the worlds of hospital care into resources for future biomedical research.
Findings
We will present our findings following the temporal order of curation practices in biobanking, starting with envisioning collections, followed by documenting consent and curating samples. In the last two sections, we reflect on the information that is added to samples allowing them to travel: (a) sample data that document the production process of a sample and (b) diverse kinds of patient data.
Envisioning biobank collections
Biobank collections began to emerge when collectives of biomedical professionals joined forces to envision infrastructures that would enable biomedical research in the future. They started to form ‘collectives of curators’ who committed to systematically collecting, curating, and distributing samples and connecting them to data. Their visions were never set in stone. They were adjusted to newly emerging opportunities and constraints. Nonetheless, they were performative. Shared visions on how and by whom samples and data would be used in the future and where sample/data packages should travel sustained the sample/data labour, the distribution of responsibilities, and defined entitlements for the subsequent uses for each collection.
Collectives of curators emerged along several lines, took different shapes, and crystallized into three different types of biobank collections. The first were ‘departmental collections’, realized by institutionalized collectives associated with established clinical departments. Such collections focused on a range of diseases and syndromes, which were treated frequently in the respective departments. The samples they ‘banked’ (Interview 2020/10) were routinely produced when diagnosing and treating patients.
The second kind of biobank collection was ‘study/project-specific collections’, which were brought into being by biomedical professionals situated within or across departmental and disciplinary boundaries. These collectives emerged through a shared interest in studying a specific condition and through the related practices of envisioning, materializing, and using the collection they would build together. Unlike departmental collections, the collectives attached to ‘project-specific collections’ did not predate their collection; joining forces to curate sample/data brought these collectives into being.
While visions for these first two types of collections often came from actors situated in the clinic, a third type – ‘hospital-wide collections’ – was mainly envisioned by biomedical professionals associated with the biobank who were not tied to a specific clinical or research-related community. Like departmental collections, hospital-wide collections sought to leverage clinical routines to mobilize samples for research purposes. Yet, they were often driven by institutional actors with a clear vision that such an infrastructure would advance biomedical research more generally. These actors attempted to ‘find all[ies]’ and ‘convince’ clinicians of the ‘advantage’ (Interview 2020/13) of biobanking, inviting them to contribute to hospital-wide collections, for example, by notifying biobank staff when material that was no longer needed for clinical diagnostics was available to be frozen for storage or by providing standardized information material to inform patients about biobank collections.
For all three types, the planning of biobank collections involved aligning future-oriented expectations about which kinds of biomedical research could and should be enabled through these collections. Similar to the ‘communities of promise’, which Martin et al. (2008) described, the collectives of sample/data curators assembled around the futures of sample/data packages they envisioned, i.e., around how, for which purpose and by whom they ought to be used, where sample/data ought to travel and who should accompany their travels. Specifically, the collectives of curators developed shared visions of future sample/data travels.
Most collections assembled materials from particular diseases or subtypes of diseases to facilitate future research on specific conditions. However, when these collections were envisioned, the exact epistemic nature of future research was often underdetermined. Biomedical professionals often did not yet clearly perceive the research questions that should be answered or the hypotheses that ought to be tested. As we will show in the section below, this epistemic under-determinacy of future research was a major reason why biomedical professionals insisted on the value of investing resources to store samples and data instead of datafying the samples immediately and keeping the data alone.
However, while the epistemic nature of future research was often underdetermined, all types of collections were informed by the imagined social ties that would be created through curating and using samples/data. As in other technological realizations (Oudshoorn and Pinch, 2003), biomedical professionals’ relationships to ‘their’ biobank collections were mediated by imaginations about future users. Visions of who the work and effort of assembling resources would be done for and how future users would relate to the collectives of curators sustained the sample/data labour.
Imagined users differed significantly between the three types of biobank collections. In departmental and study-specific biobank collections, those collecting, curating, and storing materials primarily did so for the collective they felt to belong to. This included current as well as future members such as PhD students or post-docs. For instance, one informant explained:
[W]e try to store as much tissue as possible so that material is not only available for this study but also that maybe our grandchildren can still do research on it [laughs]. Because the difficult thing about clinical research is that if you have an idea today, you really often need years until you have an adequate number of samples, (…). And if then, of course, a younger person comes along and already finds 50 samples (…) that are well processed, that have been well treated, where all the information is stored in a digital system, then he saves himself a lot of work and can start with research topics much faster (Interview 2021/3).
This did not exclude external users from having access to the resources assembled in the collections. However, travels of samples and data across the boundaries of a collective of curators were based on the understanding that these would travel to biomedical professionals who were either already part of the collectives’ wider networks (cf. Tupasela, 2021) or would become part of their network through the sharing of samples. When we asked one interviewee if samples from their departmental collection could also be shared with others, they explained: ‘Yes, yes, that's sometimes also the purpose of the whole’, noting that the purpose of the biobank collection should ‘also be a bit (…) an interface for potential collaborations’ (Interview 2020/10). Thus, biomedical professionals envisioned samples as ‘materials’ for their own future research or as ‘resources’ to establish new collaborative relationships. In their visions, biobank collections fulfilled two tasks. First, ‘banking’ samples would link their current clinical practices of patient care with their (future) research practices on the mechanisms of diseases. Second, when samples travelled to other spaces, biomedical professionals expected that some members of the collective of curators would travel along with the samples as collaborating partners and co-authors of publications.
In the case of hospital-wide collections, although specific users were not clearly known during the planning stage, what did already exist were visions about the types of potential future users. The imagined users of hospital-wide collections ranged from early-stage researchers who might struggle to have access to samples to external corporate actors such as start-ups or pharmaceutical companies. In contrast to departmental and study-specific collections, imagined users of hospital-wide biobanks were not necessarily part of the collectives of curators. Making sample/data accessible was conceptualized as either a service to the biomedical research community or a positioning strategy as a potential collaboration node within the wider research landscape.
These different visions of desirable future travels of samples and data were materialized in the distribution of labour and responsibilities, which we will discuss in more detail in the following section, as well as in tacit or explicit procedures involved in authorizing access to samples and data. In the case of departmental collections, (often) tacit ‘access procedures’ entitled heads of departments to authorize or deny requests for the use of samples and data, giving them the authority to control their travels and uses. In the case of ‘study-specific’ collections, explicit ‘access rules’ were often codified when collectives of curators negotiated and aligned their expectations about future uses of samples and data. Most often, the founding members of a collection agreed on procedures to handle requests for access, and also secured the mutual right to veto sample/data travels. In the case of hospital-wide biobank collections, heads of departments or principal investigators did not have the explicit authority or right to control future travels of samples and data; here, the biobank staff was entrusted to facilitate the travels of samples and data. They often drafted explicit rules and procedures that made transparent how users could make requests for samples and data. Still, as we will show in the section on clinical data below, clinicians nonetheless retained some power over the travels of samples and data.
Biobank collections were thus planned by specific collectives of biomedical professionals who joined forces as collectives of curators and assembled around imagined future sample/data travels. While these travels were often epistemologically underdetermined, they were informed by imagined social ties between the collectives of curators and future users. As we will show in the next sections, imagined relationships between the biomedical professionals who took care of biobank collections in the present and the collections’ imagined future users also sustained the practices of curating samples and data and the distribution of sample/data labour.
Documenting consent
Once specific collections were planned, they began to take material shape when biomedical professionals informed research participants or patients about biobank activities and asked them for their consent.
The first element that needed to be produced in biobanking was a document that recorded that a patient had been informed about a biobank collection and had agreed to the storage of samples, their use for biomedical research purposes, and occasionally linking samples with additional health data. Given that this document contained personal data, such as the patient's full name, the document itself was not meant to be mobile. Signed consent forms belonged to the category of ‘stable entities that do not—or should not—move’ (Morrison, 2022) but allowed samples and data to become mobile. They disentangled samples from biomedical care practices, enabled their repurposing for research practices, and also entitled biomedical professionals to envision sample/data travels, without the need to reconsult patients. In turn, without patient consent, a sample was regarded as ‘worthless’ (for biobanking) ‘because you may not use it’ (Interview 2021/6).
In each collection type, clinicians informed patients about biobank collections and asked them to provide their consent to biobanking samples. In departmental and study-specific biobank collections, it was often the responsibility of clinicians to prepare the information sheets and consent forms and to ask the local ethics committee to approve these forms (along with the biobank collection). Biomedical professionals involved in departmental collections described informing and asking patients for their consent as a routine practice and pointed to distinct ‘logistics apparatus’ that ensured signed consent forms would be collected and properly stored. In the case of study-specific biobank collections, it was also primarily the responsibility of clinicians to get informed consent. In hospital-wide biobank collections, biobank staff assisted clinicians with this work, providing standardized information sheets and consent forms, and storing signed forms. Nonetheless, collecting consent documents seemed to be a bottleneck in hospital-wide collections. Cases in which patients, after being informed about a biobank collection, did not then provide their consent were described as rare. An absence of consent was more often seen as ‘up to the doctor […] or the person who is providing the information’ (Interview 2021/1), that is, as the result of someone not providing sufficient information to patients. Indeed, in hospital-wide biobank collections, it seemed to be challenging to convince clinicians of the value of biobanking, as they perceived informing patients and asking for their consent as adding yet further to their workload. They had to be convinced of the vision and value of contributing to making samples amenable to move.
Curating samples
Samples came in many shapes and forms: as frozen native tissue disentangled from patients’ bodies in surgeries and stored in liquid nitrogen tanks, as liquid serum samples kept for storage in ‘minus 80’ (degree freezers), or as ‘formalin-fixed paraffin-embedded’ tissue blocks stored in drawers. Some samples were ‘left-overs’ from clinical routines. Others were ‘add-on’ blood samples, which were taken during routine diagnostic blood draws. In particular, tissue samples that had been removed for therapeutic purposes were ‘banked’ if no longer needed.
Most of our interviewees attributed the value of samples to the fact that they were not yet data but amenable to being datafied in multiple ways in the future—even in ways epistemically unthought or technically unfeasible in the present. Biomedical professionals framed samples as ‘materials’ or ‘raw materials’ (Interview 2021/3) for future research, which situated them between taking care of patients’ bodies in the present and doing research on mechanisms of (subtypes of) diseases in the future. Their motivation to collect, process, and store samples was linked to the expectation that they would be able to return to some of these ‘materials’ in the future. Then, so they hoped, they would be able to combine the samples with samples from other patients or with clinical data on the development or outcome of a patient's condition. Another important supporting rationale for biobanking samples was the ability to wait for novel ‘methods’ and ‘technologies’ that would help to ‘extract’ data from these samples. In that sense, our interviewees’ reflections on the importance of keeping samples are closely related to the expected emergence of ever new ‘inscription devices’, which, in the case of biobanks, will ‘transform pieces of matter’ (Latour and Woolgar, 1986: 51) into data, thus managing to extract information allowing the existing samples to be represented and seen in potentially fundamentally new ways.
Only rarely did interviewees ponder over the idea that the samples could be immediately datafied, which would allow biobanks to store data rather than samples. This was either seen as ‘too expensive’: ‘you rarely know exactly what [kind of research questions] you want to answer in the beginning’ (Interview 2021/6); or, it was argued that there never was ‘a uniform data set’ to be ‘generate[d]’ from samples. The ‘very purpose of [storing samples in a] biobank,’ as one participant explained, ‘is to freeze material with the foresight that I then potentially use completely different technologies as right now’ (Interview 2020/10).
A strong sense of technological progress underpinned biomedical professionals’ reflections on what kind of data might be ‘extracted’ from samples in the future, rendering technical devices and machineries companions of imagined sample travels. This informed the handling of samples in two ways. On one hand, samples were collected and processed in such a way as to make them amenable to be used with newly emerging technologies by ‘adapt[ing] the materials to technologies’ (Interview 2020/10). On the other hand, however, the hope that future technologies might extract new kinds of data from samples also sustained the choice to keep samples for storage even when they seemed to have limited epistemic value in the present. Thus, biomedical professionals deemed the samples to harbour a wealth of ‘bioinformation’ (Parry and Greenhough, 2017), which future technologies might be able to extract.
To become amenable to being datafied in the future, samples needed to be collected, transported, handled, and stored in specific ways. Samples had to be taken from patients’ bodies, transported to laboratories or departments of pathology or histology where they were handled and processed to make them durable over time, and finally, stored. These practices involved the time, labour, and expertise of a broad range of biomedical professionals, including nurses, clinicians, technicians, and biobank staff, as well as technical equipment, such as freezers or liquid nitrogen tanks, which required space and constant maintenance. In the case of departmental biobank collections, nurses and clinicians performed much of this work by building upon routine practices in clinical care. However, they delegated the storing of samples to biobanks, often underlining that the latter already had the technical equipment in place. The ‘huge infrastructure with the right devices’ (Interview 2020/10) needed to store samples, as one clinician told us, also required space. Another informant underlined that the biobank also had systems in place for aliquoting samples in an automated way to minimize mistakes from human labour when preparing samples for storage. Curating biobank samples was more distributed in the case of ‘study-specific’ biobank collections in which staff of biobanks often assisted clinicians in sample logistics and the handling of samples. One of our informants underlined the value of the biobank team's support, ‘which (…) can be called at any time to then pick up the samples and (…) to archive and to process’ them, adding that one ‘wouldn’t have the time as a clinician’ (Interview 2021/3). Finally, in the case of ‘hospital-wide’ biobank collections, much of the work in collecting samples and making them durable was performed by biobank staff.
Thus, in every type of collection, biomedical professionals assembled and stored samples envisioning a future in which these samples could be datafied in new ways. However, to make this future possible, various kinds of data had to be produced and curated. These included sample data, which documented the details of the practices with which samples had been produced, and data from the patients’ bodies from which the samples had been disentangled.
Adding sample data to samples
In every biobank collection type, biobank staff had developed systems that linked samples to a pseudonym and a storage place. Moreover, they also took care of a biobank-specific kind of data, which they referred to as ‘pre-analytic data’, ‘metadata’, ‘sample data’ or ‘quality data’. In particular, interviewees associated with biobanks underlined the importance of this data, which was also reflected in their use of the term ‘quality data’. In their understanding, curating this data distinguished current more future-oriented biobanking practices from previous modes of archiving samples, which they deemed outdated.
Sample data recorded the ‘history of a sample’ (Interview 2020/9), documenting the practices through which the samples were produced in detail—from the moment they were disentangled from a patient's body to the current condition of their storage. The data documented:
when did someone perform the (…) surgery, when did the [sample] leave the body, when was it handed over to the carrier, when did it arrive (…); when was it cryo-asserved, when was it frozen, when did the fixation time start, how long was the fixation time, were there any (…) strange incidents during transport (…) and the storage. (Interview 2020/2)
Sample data made samples amenable to move across contexts, time, and space, allowing the realization of envisioned data travels. The data helped to combine and aggregate samples produced in different contexts while still properly ‘evaluat[ing] the results, no matter which analysis’ (Interview 2020/9). As one of our interviewees explained:
I’m not going to compare a sample where I know (.) we processed it within one hour, [with a sample] (..) it took us eight hours [to freeze]? And this one-hour sample and eight-hour sample, if I then really compare RNA expression patterns, I don’t know whether [different patterns] arose—if they differ (…) – because of the different pretreatment [of samples], or if the samples are really different. (Interview 2020/9)
Data recording external factors, such as ‘the temperature, time, fixation time’, needed to be ‘record[ed] and to be document[ed]’ to make samples ‘comparable’ (Interview 2020/13).
However, several of our informants noted that in the context of hospital-based biobanking, curating sample data was often more easily envisioned within biobanking guidelines than actually achieved in practice. In particular, documenting the lives and histories of samples outside of the biobanks was described as a challenge. It required technicians, nurses, and clinicians to document their routine practices in ways not required within hospital care. They had to make explicit what was often tacit in their practices. Biobank staff reported that they repeatedly had to explain the importance of this kind of data to clinicians and ‘chase’ this kind of data. They also used various devices, such as detailed ‘sample accompanying documents’ (Probenbegleitscheine) or ‘time loggers’, which automatically recorded the time in which tissue was kept in a transporting device. These devices enabled the recording of data that biobank staff considered essential if the envisioned future use of samples and data was to be achieved, but whose value, they felt, their colleagues often failed to appreciate.
Curating and relating samples to patient data
Finally, ‘patient data’ were essential to make samples amenable to moving from the hospital to biomedical research. These consisted of ‘clinical data’, generated during the treatment of patients in hospitals and documenting the development of a disease in a patient's body and its response to treatments. At times, ‘patient data’ also included additional data that biomedical professionals collected with surveys or ‘outcome data’ they retrieved from statistical databases. All of our informants underlined the importance of patient data in hospital-based biobanking. One interviewee noted that biobanking became ‘exciting once sample-associated and patient-associated data flow together’ (Interview 2020/1), and stressed that samples were only really ‘valuable’ if they came with ‘corresponding data’. Another biomedical professional referred to ‘archives’ of samples without ‘clinical annotations’ as ‘dark data’, which could not be ‘exploited’ in research (Interview 2021/6). And another succinctly claimed that a ‘biobank [collection] without the associated clinical data is essentially worthless’—a ‘sample cemetery’ (Interview 2020/3). Thus, taking up the theme of the expert quoted in the introduction, this informant underlined that a sample without associated patient data would not travel beyond the biobank and thus not support the future vision on which biobanking rests.
While patient data were produced and curated at several moments (often long after samples had been taken), their necessity became palpable only when samples were to be used in biobank-based research. On the one hand, patient data informed decisions about which samples should be used to answer a particular research question or test a specific hypothesis. Patient data were crucial for choosing the ‘right raw materials’ for a particular project and to not ‘use material senselessly’ (Interview 2021/3). On the other hand, they were necessary to interpret new scientific data that were ‘extracted’ from the samples, as they made it possible to associate these new data with clinical phenomena and outcomes.
However, as was the case with all other entities in a biobank, patient data first had to be curated in order to be amenable to being connected with samples and their corresponding new scientific data. While clinical data were routinely stored in ‘clinical information systems’, their curation for biobanking involved several challenges. First, there were organizational frictions and access barriers. For example, the complex access systems with specific ‘role concepts’ (Interview 2020/3) that were in place only allowed clinicians who took care of a patient to access respective clinical records. Second, data deemed good for clinical care were not necessarily always good enough for biomedical research purposes. One of our informants explained that much of the information in clinical information systems took a narrative form rather than a coded one: ‘80 percent is free text. If I’m lucky, the diagnoses are coded (…)’ (Interview 2021/1), also adding that (…) there are things like, ‘is in’ (…), or ‘could be related to such and such’ or ‘would be compatible with this or that diagnosis.’ (Interview 2021/1)
In addition, while clinical information systems were set up in such a way as to ensure that all information that relates to a single patient could be retrieved, they were not optimized to query for specific (subtypes of) diseases. It was also not possible to make a statistical evaluation from the [clinical information system], because everything is in [narrative] form. The findings are available, the PDFs are [stored], but then you have to click in again and look, what is the receptor finding for this or that molecule. (Interview 2020/17)
Thus, clinical data, too, had to be curated—that is: transcribed, extracted, and re-assembled in new databases—so it could be attached to samples in meaningful ways and help make samples move beyond the context in which they were produced.
The distribution of responsibility for this data work varied considerably across the three types of biobank collections. In departmental collections, biomedical professionals associated with the departments took care of the curation of patient data. During interviews, they described clinical databases with curated data as central parts of their collections, often highlighting the value of clinical data and clinical databases. Here, we saw some traces of what Aaro Tupasela (2021) described as ‘data hugging’. Curating clinical data and maintaining databases allowed some clinicians to outsource the storage of samples to centralized biobanks while still controlling the future travels of samples. When describing the collaboration between the department and the biobank, a clinician explained that the biobank could not use the samples without the clinical data he oversaw: ‘[t]hey definitely have to cooperate with us if they want to get relevant data’ (Interview 2020/12). Other clinicians took care of the clinical data because they felt they had the necessary resources and expertise and thus only outsourced the storage of the samples.
Similarly, in the case of study-specific biobank collections, databases with patient data were constantly curated. In one case, diagnostic data were retrieved from clinical information systems (to which the entire group had access under the condition that they would not extract personal data). In another case, collaborating clinical partners were responsible for entering clinical data into a shared database to which biobank staff added sample data.
However, assembling clinical data was trickier in the case of ‘hospital-wide’ collections as they were often envisioned without the involvement of clinicians. In one hospital-wide collection, collaborators of the biobanks banked samples whenever they were informed by clinicians that there was leftover tissue from a surgery that was not needed for diagnostic purposes. Collaborators of the biobank also had permission to retrieve data from the clinical information system under the condition that they would not transfer personal data. In another collection, the samples were associated with a ‘minimal data set’, limiting the information to age, gender, and the type of tumour. But these samples were not routinely associated with more detailed clinical data, which was stored in a separate clinical information system to which the collaborators of the biobank did not have access. Thus, connecting samples and data was described as a bottleneck in this hospital-wide collection. For collaborators of the biobank, receiving a request for samples meant that they had to ask clinicians to provide the corresponding datasets. Yet, it was described as hard to convince clinicians to provide data as they tended to ‘sit on th[eir]’ data and felt an ‘emotional ownership’ over the attached samples.
Across discussions, there was an often-unspoken question in the room: How should clinicians be compensated for their temporal investments in biobanking in a competitive and time-scarce environment? While allocating resources for these tasks was seen as one solution, the preferred return on their investment was the facilitation of collaboration between users asking for samples and the clinicians involved in the collection of samples, who made them valuable by adding clinical data. Thus, while hospital-wide biobank collections were often not planned with the vision that samples would mediate relationships between the clinical curators of the samples and sample users, the establishment of such relationships was often necessary to enable connections between samples and data. As in the case of departmental and study-specific collections, clinicians had to ‘travel along with’ (Hoeyer et al., 2016) samples as only their data could make the samples flow.
Concluding discussion
In this article, we started from the insight that samples stored in biobanks need to be associated with data to become mobile, and reflected on the difficulties of relating samples and data in practice. We engaged with biobanking at five locations of hospital-based biobanks to understand how sample/data associations can be achieved. We explored how collectives of biomedical professionals assembled, curated, and connected samples, data, and other artifacts to open up potential journeys of sample/data packages. In this concluding discussion, we want to highlight five key points.
First, our research shows that future visions of desirable sample/data travels, and the power to envision such travels, play a key role in sustaining the often-invisible sample/data labour in biobanking. Shared visions and imaginaries of future sample/data travels were a key part of the infrastructure that rendered biobanking robust, sustaining the practices of curating samples and data and enabling connections between samples and data. Imagined sample/data travels were often epistemologically underdetermined. This indeterminacy of future biobank-based research explains why biomedical professionals insisted on the value of storing samples and data rather than simply extracting the data from samples and storing those alone. They assumed that new technologies would allow for the datafication of samples in ways that were currently neither possible nor imaginable. Storing samples and keeping them stored allowed biomedical professionals to envision the possibility of radically novel insights despite epistemically and technologically uncertain futures. However, epistemic openness was only possible through the co-presence of a vision of social relations that would accompany biobanking. Collectives of curators imagined both possible sample/data travels and who ought to enable and benefit from these travels.
Second, visions of desirable futures are thus closely intertwined with social relations and imagined communities. This turned out to be key for aligning sample and data practices and for sustainable biobanking. Our research thus adds evidence to the body of literature that emphasizes the importance of social ties, emotions and affects, as well as visions and imaginaries in data practices and research infrastructures. Previous research has highlighted that ‘social ties’ between biomedical professionals and patients (Hoeyer et al., 2016) and ‘attachments’ to the imagined community of the nation (Pinel and Svendsen, 2021) shape the directions of data movements. In our research, social ties and relations were key; they sustained sample/data labour and also mediated the connections between samples and data. However, while we saw the sociality involved in biobanking as central, the sociality we observed went beyond the two forms of ties described in the literature. Patients were present in biomedical professionals’ reflections, visions, and imaginaries, albeit not as the most prominent actors. We encountered multiple imaginations of ties between the collectives who curate biobank collections and potential future users. When collectives of biomedical professionals joined forces to plan, build, and maintain biobank collections, they envisioned future users and their relationships with them. We thus speak of imagined communities (Anderson, 1991) as we encountered traces of the dynamic of socially organized imagination with the biobank as its material underpinning. The often-invisible work involved in curating samples and data and connecting them was thus profoundly shaped by imagining epistemic and social communities. Imagining robust social communities allowed the epistemic futures to remain uncertain.
Third, our research also underlines that it is key to reflect on whose visions matter the most and on who has the power to transform visions into sustainable sample/data travels. In each biobank collection type, the visions of clinicians, particularly those sitting atop hierarchies, mattered most. In the case of departmental and study-specific collections, heads of departments and PIs were key in envisioning collections and bringing them to life. They also had both the explicit and partly implicit power to (not) authorize requests for samples and data. Only in the case of hospital-wide collections were clinicians not closely involved in the planning of collections. They could still, however, make or break connections between samples and clinical data, enabling samples to travel or preventing them from doing so. Thus, when biobanks sought to make samples and data mobile, they had to ensure that the visions and expectations of clinicians could be met. While the power of clinicians in envisioning sample/data travels, and in controlling them by enabling or breaking relations between samples and clinical data, might be specific to hospital-based biobanking, which is particularly prominent in Austria, our findings underline the importance of exploring who has a voice in collective efforts of imagining future sample/data travels—and in controlling who and what ought to travel along with sample/data.
This ties to our fourth conclusion, i.e., the multi-layered and often invisible work that was key to turning the envisioned hospital-based biobank collections into reality. We showed how the diverse artifacts (samples, consent documents, sample data, and patient data) had to be both curated and connected in order to make samples and data amenable to travel beyond the world of hospital care. While the hospital-based biobank collections often capitalized on clinical routines, it took work and resources to disentangle samples and data from the worlds of hospital care and curate them in ways that meet the requirements for biomedical research. Biomedical professionals had to document that they had informed patients about biobank collections and obtained their consent. Clinical data needed to be extracted from clinical information systems, standardized, and curated to make samples amenable to be used for research purposes. Biomedical professionals also had to make traceable the actions and practices with which samples were taken, treated, and stored, transforming the tacit knowledge of routine practices into sample data, which traced the history of samples and opened up the possibility to aggregate different kinds of samples in future research. Thus, hospital-based biobanking involved substantial work by biomedical professionals, including nurses, surgeons, clinicians, technical assistants, and collaborators of biobanks, who needed to curate samples and sets of data. While this work was supported by staff hired for the building of research infrastructures in all five locations, a substantial part of the sample/data labour was ‘in-kind’ work which biomedical professionals performed alongside the work captured in their job descriptions. While they were not necessarily paid for this work, they needed to deem this work meaningful and worthy of doing.
Fifth, each of our findings also has normative implications. The value that biomedical professionals vested in the epistemic indeterminacy of samples suggests that institutional actors seeking to secure support for biobanks should understand and embrace an open-future approach and also accept related risks. It is also key to acknowledge the complex temporalities inherent to biobank infrastructures in a research environment that is ridden by short-term thinking (Felt, 2017): return on investment often runs on much longer timeframes than those usually aimed at within these policy circles.
Moreover, our work underlines the importance of reflecting on how deeply working environments matter when the labour involved in curating resources for biobanking remains largely invisible. For example, in universities where the quality of individual achievements is often measured mainly in terms of publications, participants’ reflections tell us that this reward structure may not necessarily support engagement in preparing and documenting samples for the biobank. Thus, at a time when data move strongly to the forefront in biomedical research, it seems key to reflect on the specific kind of sample/data labour that is needed for successful biobanking, i.e., to combine the work of carefully curating biological samples with the work of connecting them to data. This must be made visible in terms of recognition and reward within the professional structures of contemporary research systems in order to increase the motivation to engage in such work.
Last but not least, the power of visions of desirable sample/data travels in sustaining sample/data labour and enabling relations between samples and data underlines the importance of giving those feeding databases and platforms a voice and agency in directing future travels of the samples/data they helped to assemble. To create sustainable biobank infrastructures, space needs to be made for those performing the labour to also participate actively in feeding their visions and values into the processes of building these infrastructures. While policymakers often associate biobanking with the vision of making resources accessible without specifying to whom and for which purpose, in our cases, these policy visions had gained a much different meaning and importance on the ground. Our observations show that when biomedical professionals contribute to the building of research infrastructures, they might reimagine their purposes, which can be different from the top-down envisioned ones. In our case study, biomedical professionals sought to make samples mobile, but not to make them flow unconditionally. Not cutting the ties between curators of sample/data and their users might be a way to assure the quality of sample/data and to make them sustainably mobile over time and space.
Acknowledgements
A previous version of this article was presented at the Annual Meeting of the Society for the Social Studies of Science, 6–9 October 2022. The authors would like to thank the panel's participants and Luca Chiapperino for their helpful feedback. The three anonymous reviewers’ careful reading and commenting helped us to improve the manuscript. Harry Wynands copy-edited the manuscript and also raised questions that led to sharpening our argument. Last but not least, we would like to thank our interviewees and BBMRI project partners for making time and sharing their visions with us.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a grant from the Austrian Federal Ministry of Education, Science and Research (Biobanking and Biomolecular Resources Research Infrastructure Austria, BBMRI.at, grant number 10.470/0010-V/3c/2018, December, 2018–November, 2023).
