Abstract
Various biological resources, such as biobanks and disease-specific registries, have become indispensable resources to better understand the epidemiology and biological mechanisms of disease and are fundamental for advancing medical research. Nevertheless, biobanks and similar resources still face significant challenges to become more findable and accessible by users on both national and global scales. One of the main challenges for users is to find relevant resources using cataloging and search services such as the BBMRI-ERIC Directory, operated by European Research Infrastructure on Biobanking and Biomolecular Resources (BBMRI-ERIC), as these often do not contain the information needed by the researchers to decide if the resource has relevant material/data; these resources are only weakly characterized. Hence, the researcher is typically left with too many resources to explore and investigate. In addition, resources often have complex procedures for accessing holdings, particularly for depletable biological materials. This article focuses on designing a system for effective negotiation of access to holdings, in which a researcher can approach many resources simultaneously, while giving each resource team the ability to implement their own mechanisms to check if the material/data are available and to decide if access should be provided. The BBMRI-ERIC has developed and implemented an access and negotiation tool called the BBMRI-ERIC Negotiator. The Negotiator enables access negotiation to more than 600 biobanks from the BBMRI-ERIC Directory and other discovery services such as GBA/BBMRI-ERIC Locator or RD-Connect Finder. This article summarizes the principles that guided the design of the tool, the terminology used and underlying data model, request workflows, authentication and authorization mechanism(s), and the mechanisms and monitoring processes to stimulate the desired behavior of the resources: to effectively deliver access to biological material and data.
Introduction
Biological resources are an integral part of the biomedical research process, yet researchers still struggle to find and discover biological resources suitable for their research. 1 Some biological resources can upload their own, full datasets to a central repository and ensure the entire dataset is discoverable. However, health care data held in entities such as biobanks2,3 and health care registries4,5 often cannot be uploaded to an external location if it would entail the loss of control over the data. Common reasons for this include data protection regulations and a lack of a priori availability of reliable structured data, for example, coming from a health care system. The consequence is that the data that can be made public are often a poor representation of what is actually available. However, these resources are fundamental for biomedical research, so solutions are needed to ensure researchers can discover and access them with ease.
A primary driver for high level or summary characterizations of resources' holdings is data protection regulations, which consider making detailed individual-level data public a threat to the privacy of the donors providing the material/data. For this reason, distributed data querying and analysis platforms have been developed, such as DataSHIELD, 6 Locator by the German Biobank Node/Alliance and BBMRI-ERIC,7,8 and the commercial BC|RQUEST system. These systems allow retention of personal (pseudonymized) data at the source institutions, only sharing substantially less sensitive aggregate results of querying or analyses, thereby allowing detailed searches to be performed on the full dataset, while providing aggregated statistics to the researcher.
These advanced discovery systems, however, do rely on the data being in a format and within a data structure that would allow a query to be performed. A challenge remains in the extraction of rich, structured, quality controlled, and semantically annotated data from data repositories, such as hospital information systems in clinical biobanking 9 or national disease-specific and health care registries. These hospital systems are defined by the specifics of their national health care system and environment, including national languages and specific coding schemes, in which international interoperability is not of primary importance. Hospital information systems also contain large volumes of unstructured information relevant for the treatment of patients, but difficult to extract reliably into interoperable structured information for research purposes.
All of these aspects result in high costs of a priori data extraction and harmonization to an internationally accepted common data model. Therefore, while technological solutions to facilitate advanced querying of data do currently exist, the underlying data are not readily accessible to the research community. Thus, large collections of biobanking and biorepository resources are often only described by high-level descriptors: from P3G Observatory, 10 BBMRI Preparatory Phase Catalogue, 11 Resource Locator by the International Society for Biological and Environmental Repositories, 12 Maelstrom Repository, 13 and BBMRI Large Prospective Cohorts catalogs, 14 to BBMRI-ERIC Directory, 15 and RD-Connect Biobank/Registry Finder.9,16,17 From within the community, there have been calls for resources to move on from defining themselves solely by the properties of their biosample holdings, and instead to incorporate clinical data as well. Indeed, the research community highlights the importance of associated clinical data as a deciding factor in the selection of a new resource. 1 However, the main challenge identified 18 remains the ability to generate sufficient data on demand to meet the needs of the research community.
An additional barrier to sample and data accessibility is the complexity of conditions governing reuse of the material/data. Structured approaches to describing reuse conditions, such as Data Use Ontology19,20 or Automatable Discovery and Access Matrix 21 do exist. Yet, there are many scenarios in which the conditions of reuse are complex and will always require human assessment. When deciding whether to release depletable material such as biological samples for research purposes, the processes usually prioritize release according to scientific excellence, potential impact of the research, and existing commitments for other purposes.
Problem statement
For the above-stated reasons, it is not realistic to expect that all biobanking resources will provide a detailed description of all their stored material/data. Many resources will describe their holdings by high-level statistical descriptors only, which we call weakly characterized resources. A typical property of aggregate descriptors is they cannot be combined to generate a query that combines multiple statistical descriptors.* The main objective of the BBMRI-ERIC Negotiator is to enable effective access negotiation when some or all resources are only weakly characterized.
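A small illustration may help to see why aggregate descriptors cannot be combined into a joint query. The sketch below uses two invented collections (the diagnosis codes, material types, and sample counts are illustrative assumptions, not data from any real biobank) that publish identical per-attribute counts and yet differ completely in how many samples satisfy a combined criterion.

```python
from collections import Counter

# Two hypothetical collections; each sample is (diagnosis code, material type).
# The values are invented purely for illustration.
collection_a = [("C50", "serum"), ("C50", "serum"), ("C18", "tissue"), ("C18", "tissue")]
collection_b = [("C50", "tissue"), ("C50", "tissue"), ("C18", "serum"), ("C18", "serum")]


def aggregate_descriptors(samples):
    """Per-attribute counts, as a weakly characterized resource might publish them."""
    by_diagnosis = Counter(diagnosis for diagnosis, _ in samples)
    by_material = Counter(material for _, material in samples)
    return by_diagnosis, by_material


def joint_count(samples, diagnosis, material):
    """Number of samples matching both criteria; needs sample-level data."""
    return sum(1 for d, m in samples if d == diagnosis and m == material)


# Both collections look the same through their aggregate descriptors ...
same_descriptors = aggregate_descriptors(collection_a) == aggregate_descriptors(collection_b)
# ... but the combined query gives different answers.
serum_c50_in_a = joint_count(collection_a, "C50", "serum")
serum_c50_in_b = joint_count(collection_b, "C50", "serum")
```

A requester who sees only the published counts cannot tell the two collections apart, which is why the candidate resources have to be asked directly.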
This means that the requester can identify candidate resources without knowing exactly if they contain the requested material/data. Sometimes due to a lack of characterizing data, the requester needs to communicate with all the identified resources (e.g., if the requester searches for radiology imaging data and information on availability of imaging is not available in the set of descriptors in the findability service). The access negotiation must support prioritization of requests as these often require access to depletable materials. Thus, machine-actionable descriptors for access conditions only cover part of the decision process, if they are available at all.
Terminology
We use the following terminology throughout the rest of the article. The Negotiator works with different types of resources—the primary focus is on biobanks, but registries and various types of data repositories are also supported. These resources are expected to provide descriptions of their holdings using collections in findability services. This allows them to provide different levels of granularity in describing their holdings, even if still only weakly characterizing them. When searching through the findability services, the user generates different queries, which are used to identify resources with which to start a request in the Negotiator. The structure of resources and queries is further explained in the Request Design section.
Methods
The BBMRI-ERIC Negotiator implements the access procedure stipulated in the BBMRI-ERIC Access Policy, as shown in Figure 1. After identifying candidate resources (biobanks), the requester negotiates access with them. Once the requester receives availability information on the material/data for the particular purpose, they select their preferred resources and directly contact the biobank with the goal of signing the Material Transfer Agreement (MTA) or Data Transfer Agreement (DTA) (note that BBMRI-ERIC is not involved from this step onward), after which the material/data are shipped to the requester, who confirms receipt. After the project for which material/data have been obtained finishes, the requester notifies the resource and BBMRI-ERIC. If data are created from the biological material, they should be offered free of charge to the source biobank to allow it to enrich the existing data sources. If the biobank is unable to host the resulting data, BBMRI-ERIC should be notified and will help find a suitable hosting service.

Figure 1. Access pipeline based on BBMRI-ERIC access policy. 29 Please note that BBMRI-ERIC only provides the Negotiator platform, but the actual negotiation is done by the biobank/collection representative.
The whole system has been designed with several fundamental principles in mind. Requests coming into the biobanks have become increasingly complex in the last decade in terms of the requirements on inclusion criteria of donors, exact specifications of requested data, and properties of biological material, yet they are often insufficiently specified in terms of the purpose and methods to be applied, and clarifications are needed before biobankers can decide if the material is fit for the given purpose. The complex nature of the requests also suggests that unless very deep structured phenotyping is available with all the resulting data made available for querying, which is unlikely for all the existing collections in large biobanks, it is to be expected that identification of candidate resources followed up by subsequent access negotiation is a suitable access procedure at least for the mid-term future. Access to biological material is further subject to prioritization due to the depletable nature of the material.
Hence, the Negotiator has been designed to work with resources that are only weakly characterized in findability services, yet it should provide incentives for those resources that describe themselves better by providing more fine-grained structured information. Because of complex limitations on reuse imposed by informed consent and other legal restrictions on material/data processing, it can only partially rely on machine-actionable descriptions of access conditions, and representatives of the resources ultimately decide whether the request is allowed.
The system is designed to work in the inherently multinational environment of BBMRI-ERIC, where data come both in different languages and in different coding systems depending on various national standards in health care. Because resources need only be weakly characterized, only a relatively limited amount of information needs to be preharmonized to common data models in findability services, and such basic data are typically already available in structured form in the information systems used by the biobank, based on internationally accepted standards (e.g., Minimum Information About BIobank data Sharing for biobanking data, 22 World Health Organization (WHO) standards for diagnosis, and cancer-specific data such as Union for International Cancer Control (UICC) Tumor-Nodes-Metastases Classification of Malignant Tumors, UICC stage, and WHO grade). More detailed data can be provided if the biobank already has it structured and ready for conversion from their local language and local coding scheme; otherwise, such data extraction may be done ad hoc after a request is issued, hence reducing the entry barrier and overall costs to participate in the system for the resources. Overall communication in the Negotiator is in English, as this is a common language accepted in life sciences and biomedical research communities.
Request design
The design of the Negotiator builds on the “project—request—query” hierarchy. Each project can be affiliated with zero or more requests; projects define a purpose for which the data and/or biological material is being requested. Each request can have one or more queries identifying candidate biobanks in the findability services (e.g., Directory or RD-Connect Biobank/Registry Finder).
Each request starts with a requester identifying candidate resources (biobanks) using a query in a findability service, in which candidate resources are selected based on structured search criteria. The requester can subsequently add additional queries to the request to enrich the set of candidate resources, either from the same or a different findability service. Because of weak characterization of resources in findability services, the candidate set is likely to be an over-approximation of the set of resources that have relevant material/data.
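The hierarchy and the way queries accumulate candidate resources can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names below are hypothetical and are not taken from the Negotiator code base. It encodes the two cardinality rules from the text (a project has zero or more requests, a request has one or more queries) and computes the candidate set as the union of the resources matched by all queries.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    """One search performed in a findability service (names are hypothetical)."""
    service: str                  # e.g. "Directory"
    criteria: str                 # structured search criteria, kept opaque here
    matched_resources: frozenset  # identifiers of resources returned by the service


@dataclass
class Request:
    title: str
    queries: list

    def __post_init__(self):
        # Each request can have one or more queries, never zero.
        if not self.queries:
            raise ValueError("a request needs at least one query")

    def candidate_resources(self):
        """Union of all resources matched by any query of this request."""
        candidates = set()
        for query in self.queries:
            candidates |= query.matched_resources
        return candidates


@dataclass
class Project:
    """Defines the purpose for which material/data are requested."""
    purpose: str
    requests: list = field(default_factory=list)  # zero or more requests
```

Because the union only ever grows as queries are added, the candidate set stays an over-approximation; narrowing it down is left to the negotiation itself.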
However, the main purpose of the query is to exclude those resources that are confirmed to not have any relevant material/data. Those resources that are incorrectly identified in that overapproximation can easily step out of the request. If a resource is approached often with irrelevant requests, it can improve the situation by providing more accurate data about their collections in the findability services—this has been intentionally designed as an incentive mechanism for resources, while keeping the entry barrier low for the resources to start participating in the whole ecosystem (see the Incentive Mechanisms section for discussion of incentive mechanisms). On the other hand, in some rare cases where there have been no relevant data available in any findability service, BBMRI-ERIC has also run successful negotiations across more than 600 biobanks in the Directory to identify those that have a particular material/data available.
The requester can fill in additional details necessary for the request: (1) purpose of requesting the material/data (typically a research project); (2) anticipated analytical methods to be applied to the material/data; and (3) available ethics approvals (so that biobanks may avoid duplicate ethics assessment).
After the request is submitted, it follows a workflow shown in Figure 2 (more detailed information on request state evolution is provided in Supplementary Fig. S1 for reference). Initially, the request is reviewed by the BBMRI-ERIC access manager to avoid requests that might be deemed inappropriate by the resources. The communication phase starts after the request is successfully reviewed by the BBMRI-ERIC access manager.

Figure 2. Overview of the request state. Blue boxes indicate states handled by the Negotiator and gray boxes indicate states managed outside of it. Note that the request check by BBMRI-ERIC only serves to stop requests that do not comply with the terms and conditions of using the Negotiator, and is not a scientific project assessment.
Each request has a “request-wide communication” component, which can be seen by all the resources participating in the given request, and “individual negotiation with biobanks/collections,” which are only visible to the requester and the particular resource representative. For general refinement of the request, it is advisable to use the request-wide communication since a representative of any resource can take the lead, which can result in a much faster process and which does not need to be repeated by each resource. This request-wide communication allows effective refinement of the request, as many requests are underspecified, and further clarification is needed to assess if the purpose is compatible with the reuse conditions for the material/data or if the material/data are fit for the particular analytical method. As a part of this process, any resource can step away from the request, thus indicating that it is irrelevant for them.
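The two communication channels differ only in who may read a message. A minimal sketch of that visibility rule follows; the `Message` type and the function name are assumptions made for illustration and do not reflect the actual implementation. The requester can read everything in their own request, so only the resource side is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    sender: str
    text: str
    # None marks a request-wide message; otherwise the identifier of the
    # single resource taking part in the private negotiation.
    private_resource: Optional[str] = None


def visible_to_resource(message: Message, resource_id: str) -> bool:
    """A resource sees request-wide messages and its own private channel only."""
    return message.private_resource is None or message.private_resource == resource_id
```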
Once the request is detailed enough, each resource can proceed by flagging availability of the material/data: (1) material/data available and accessible for the given purpose; (2) material/data available, but not accessible for the given purpose; (3) material/data are not available, but can be collected (prospective collection); and (4) material/data not available. Additional access conditions can be clarified in the private channel, such as access costs or any specific condition or template of MTA/DTA required by the resource. This process allows resources to use their internal decision mechanism and to prioritize access to depletable resources.
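The four availability flags map naturally onto an enumeration. The sketch below is illustrative only: the member names are our own shorthand for the four options listed above, and the helper assumes that the requester continues only with resources that flagged option (1).

```python
from enum import Enum


class Availability(Enum):
    AVAILABLE_ACCESSIBLE = 1      # available and accessible for the given purpose
    AVAILABLE_NOT_ACCESSIBLE = 2  # available, but not accessible for the given purpose
    PROSPECTIVE_COLLECTION = 3    # not available, but can be collected
    NOT_AVAILABLE = 4             # not available


def selectable_resources(flags):
    """Resources the requester could continue with, given {resource_id: Availability}."""
    return sorted(
        resource_id
        for resource_id, flag in flags.items()
        if flag is Availability.AVAILABLE_ACCESSIBLE
    )
```

Resources offering a prospective collection might also be worth following up in practice; whether to include them is a policy choice that this sketch leaves out.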
After the first set of resources indicates that material/data are available for the given purpose, the requester can select individual resources to continue with. Information about selected resources is stored in the Negotiator, but further communication follows between the requester and each selected resource. Resources then indicate that the material/data have been shipped to the requester and the requester subsequently confirms its receipt. At this point, the access procedure is considered successfully completed.
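The request life cycle described so far can be read as a simple state machine. The sketch below is a deliberately coarse approximation: the state names are hypothetical, the authoritative state scheme is the one in Supplementary Figure S1, and shipping and receipt are in reality tracked per selected resource rather than once per request.

```python
# Allowed transitions between (hypothetical) request states.
TRANSITIONS = {
    "submitted": {"under_review"},
    "under_review": {"in_negotiation", "rejected"},  # review by the access manager
    "in_negotiation": {"resources_selected"},
    "resources_selected": {"shipped"},
    "shipped": {"received"},
    "received": {"project_finished"},
}


def advance(state, new_state):
    """Return new_state if the transition is allowed, otherwise raise ValueError."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"transition {state!r} -> {new_state!r} is not allowed")
    return new_state


def replay(states):
    """Validate a whole sequence of states and return the final one."""
    current = states[0]
    for nxt in states[1:]:
        current = advance(current, nxt)
    return current
```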
Once the requester finishes her project, she should indicate this through the Negotiator. If data were generated as a part of the project, these shall be made available to the resource at no cost to enrich the existing material/data as requested by the BBMRI-ERIC Access Policy. This can be done either by delivery of data to the resource or by depositing the data in a public repository and linking it to (persistent) identifiers, for example, biobankIDs from BBMRI-ERIC Directory. If neither option is available, the data shall be also offered to BBMRI-ERIC to support the resource by seeking alternative storage options.
All the steps in the negotiation process are reported to the relevant people by e-mail. If a request includes a resource not yet registered in the Negotiator, its representatives are invited based on the contact information from the respective findability service.
User roles and interface design
Each user has one or more roles in the system: requester, collection representative, network representative, BBMRI-ERIC access manager, and system administrator. The requester role is available to all users; the availability of the other views depends on the user being assigned to the corresponding role as a result of the authorization process through the BBMRI-ERIC Authentication and Authorization Infrastructure (AAI). The Negotiator supports several views designed for specific user roles in the system:
Requester's view: This view lists all the requests filed by the given user. A new request can be filed from this view; the user first selects one of the available findability services and, on returning to the Negotiator, files the request as usual. The requester can engage in request-wide chat or 1:1 private chat with collection representatives and indicate state changes such as selecting with which collections she prefers to continue, or that the material/data have been received, as indicated in Supplementary Figure S1.
Collection representative's view: Lists all the requests matching a given collection. The representative can engage in the request or step away, can change the state of the request as indicated in the request state scheme in Supplementary Figure S1, and can engage in request-wide chat or 1:1 private chat with the requester on behalf of the collection/biobank. Beyond the request state, the requests can also be flagged by the collection representative as favorite/archived/ignored, which moves them to separate queues in this view.
Network representative's view: This view provides access to aggregate performance metrics for monitoring behavior and performance of biobanks/collections by the network operators. One of the networks is the whole BBMRI-ERIC, the National Nodes of BBMRI-ERIC have their own networks, and biobanks can also form specific networks such as rare disease-focused Telethon 23 or EuroBiobank. 24 These metrics include the number of requests received over the time period, an overview of search queries, a histogram of time to first reaction to the request, and a histogram of time to indicate availability. The aggregate nature of metrics supports maintaining the confidentiality of requests.
BBMRI-ERIC access manager's view: This is a simple view used for initial review of requests by the BBMRI-ERIC access manager.
Administrator's view: This is a view for the Negotiator administrators, allowing them to monitor performance of the platform such as synchronization with all the findability services and with BBMRI-ERIC AAI. It can also provide detailed information about how each request was filed and in what state it is, to support debugging if problems arise.
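The aggregate metrics mentioned for the network representative's view can be derived from very little per-request information. The following sketch assumes, purely for illustration, that the time to first reaction is available as a list of hours per request; the function name, the output keys, and the choice of one-day histogram bins are all assumptions, not the Negotiator's actual reporting format.

```python
from collections import Counter
from statistics import median


def reaction_metrics(hours_to_first_reaction):
    """Aggregate a non-empty list of reaction times (in hours) for one network."""
    # Bin reaction times by whole days elapsed: 0 = within 24 h, 1 = 24-48 h, ...
    histogram = Counter(int(hours // 24) for hours in hours_to_first_reaction)
    return {
        "requests": len(hours_to_first_reaction),
        "median_hours": median(hours_to_first_reaction),
        "histogram_by_day": dict(sorted(histogram.items())),
    }
```

Only counts and summary statistics leave the function, which mirrors the point made above: aggregate metrics can be shared with network operators without exposing the content of individual requests.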
Management of authorization
Authentication and authorization within the ecosystem of BBMRI-ERIC rely on a common BBMRI-ERIC AAI, and thus all the BBMRI-ERIC services can use the same information to perform authorization decisions consistently, for example, who can edit biobank/collection information in the Directory as well as who is entitled to negotiate on behalf of it. BBMRI-ERIC AAI uses federated authentication, that is, users should authenticate with their home organization if this organization participates in the academic eduGAIN federation. 25 By this mechanism, BBMRI-ERIC obtains trusted information on the user's identity and organizational affiliation. Alternatively, the user can use Open Researcher & Contributor ID 26 or LifeScience Hostel in cases where the user's home organization does not participate in eduGAIN. Information necessary for authorization decisions, such as various user attributes and group membership, is stored centrally in the BBMRI-ERIC AAI. It is also used for ensuring users agree to the Terms and Conditions of using BBMRI-ERIC IT services, thus making users aware of the confidentiality principles applied within the BBMRI-ERIC network.
BBMRI-ERIC AAI needs to provide information necessary for Negotiator to decide the role of the user: this is done by assigning users to collections and networks as representatives using groups in the BBMRI-ERIC AAI. Initial assignment of people to collections and networks is done centrally by BBMRI-ERIC after consulting BBMRI-ERIC National Nodes when necessary, and it is based on the registration process shown in Figure 3. Once the initial assignment is done, management of authorization is delegated to the existing collection or network representatives.
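How roles could be derived from group membership is sketched below. The group naming scheme (`collection/<id>/representative`, `network/<id>/representative`, `negotiator/access-manager`) is entirely hypothetical and chosen only to make the example concrete; the real group structure in the BBMRI-ERIC AAI may look different. The one rule taken from the text is that every authenticated user holds the requester role.

```python
def roles_for(groups):
    """Map AAI group names (hypothetical scheme) to Negotiator roles."""
    roles = {"requester"}  # available to every user
    for group in groups:
        if group == "negotiator/access-manager":
            roles.add("BBMRI-ERIC access manager")
        elif group.startswith("collection/") and group.endswith("/representative"):
            roles.add("collection representative")
        elif group.startswith("network/") and group.endswith("/representative"):
            roles.add("network representative")
    return roles
```

Keeping the mapping in one place means that delegating authorization to existing representatives only requires changing group membership in the AAI, not anything inside the consuming service.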

Figure 3. Overview of user registration procedure and registration of a biobank/collection representative.
Interoperability: application programming interfaces
To support interoperability and integration with different services across the biobanking ecosystem, the Negotiator relies on several openly defined application programming interfaces (APIs). All the APIs follow the representational state transfer (REST) approach. 27 Detailed specification of the APIs is provided using Swagger.io OpenAPI and published on GitHub. Beyond the APIs for communication with findability services and for communication with BBMRI-ERIC AAI as described above, the Negotiator also provides an API to import and export requests from/to external services.
Incentive mechanisms
For effective access in the federated ecosystem of BBMRI-ERIC, where biobanks are only loosely coupled to their National Nodes and thus to BBMRI-ERIC, it is important that the whole infrastructure has incentive mechanisms built in. Otherwise, biobanks might have limited motivation to react positively to requests and even to describe their resource adequately in the findability services. Hence, the following incentive mechanisms have been built-in by design:
When some resources consistently step away from the negotiation, this indicates that they may not be well represented in the findability services. This can result either from an insufficient data model of the given findability service or from the resource not having fully utilized the data model to describe itself well in the findability service. Improving resource characterization will decrease the load of processing requests coming through the Negotiator, thus providing an intrinsic incentive for the resources to improve their characterization.
Performance of the resources can propagate into reputation information published as a part of the findability services; hence, well-performing biobanks are likely to be preferred as candidate biobanks. Implementation of this mechanism is anticipated in 2021 in the Directory.
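The first incentive rests on a signal that is easy to compute: how often a resource steps away from the requests it receives. A minimal sketch follows, assuming a hypothetical event log of `(resource_id, stepped_away)` pairs; neither the log format nor the function name comes from the Negotiator itself.

```python
from collections import defaultdict


def step_away_rates(events):
    """Fraction of received requests each resource stepped away from.

    events: iterable of (resource_id, stepped_away) pairs, one per
    resource per request.
    """
    received = defaultdict(int)
    stepped = defaultdict(int)
    for resource_id, stepped_away in events:
        received[resource_id] += 1
        if stepped_away:
            stepped[resource_id] += 1
    return {resource_id: stepped[resource_id] / received[resource_id]
            for resource_id in received}
```

A persistently high rate suggests that the resource is matched by queries it cannot serve, which is the situation in which a richer description in the findability service would pay off.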
Results and Discussion
Implementation
The whole system has been implemented in Java and is available open-source at BitBucket. Overall integration of the Negotiator into the environment of BBMRI-ERIC IT tools is shown in Figure 4.

Figure 4. Integration of the Negotiator into the environment of BBMRI-ERIC IT tools.
Version 1.0 was designed and implemented in 2016 and released in 2017 as a technology preview to collect feedback from the whole access facilitation process. This version only had a simple communication system built in (request-wide messaging and communication with individual biobanks/collections), was closely coupled to the BBMRI-ERIC Directory as the only query source, and was also already integrated with the BBMRI-ERIC AAI.
Version 2.0 was released in August 2020 and is described in this article. The main new features include (1) ability to interface to multiple different query sources to support integration with data sources from the RD-Connect project, namely Registry and Biobank Finder 28; (2) complex structured state tracking for each request, including ability to provide relevant documents for each phase of the negotiation; (3) integration with various query sources; (4) tools for biobank network operators to monitor performance of biobanks/collections participating in the given network; and (5) redesign of user interfaces based on extensive feedback from users as well as dedicated user experience testing performed as a part of the BBMRI-ERIC Common Service IT.
Service usage
Between 2019 and 2020, the number of collections represented in the Negotiator increased from 231 (15% of collections in the Directory as of January 2019) to 1317 (52% of collections in the Directory as of December 2020). The number of registered biobankers increased from 95 to 201 in the same period and the total number of users registered in the platform increased from 177 to 646.
It is a known practice that researchers working with biobanks typically directly approach biobanks with which they have already established working relationships. 1 In practice, the Negotiator is used for requests where the requester is not yet using a specific biobank and seeks to establish a new collaboration, or in difficult situations in which they cannot find relevant resources in the biobanks with which they have already established a collaboration. The uptake of the Negotiator is reflected in the number of requests filed, which grew from 39 in 2019 to 110 in 2020. Median time to review the incoming request by BBMRI-ERIC was 2 minutes and median time for reaction from biobanks was 18 hours from filing the request. Several national nodes of BBMRI-ERIC have also appointed their national Negotiator contacts to support the communication with the biobanks: Austria, Belgium, Czech Republic, Germany, and United Kingdom.
In several cases, the Negotiator has also been used for difficult cases where no specific information could be found in the findability services as discussed in the Request Design section and thus, all available collections of all biobanks were approached as a part of those requests. These cases eventually led to successful identification of at least one biobank that could provide the requested service.
Conclusions and Future Work
Providing access for researchers to biobanking resources poses a particular challenge, as these resources are typically only relatively weakly characterized in the available findability services. This is not due to bad intentions or lack of expertise of the service operators, but it is driven by the overall costs of obtaining and harmonizing all the possible information relevant for requesters of biobanking services. In some countries, such as Germany, biobanks have succeeded in establishing elaborate data integration processes. In several other European countries, however, collecting detailed phenotypical, clinical, -omics and environmental exposure data a priori would be prohibitively costly for the millions or tens of millions of samples stored in the big biobanks, which often interface with various legacy systems containing this information. Hence, weak characterization is a principle that allows all biobanks to advertise their existence and enables an initial assessment of whether the given biobank might have something relevant for the particular purpose.
The BBMRI-ERIC Negotiator presented in this article has been designed for effective communication, clarification of practical availability of material/data for a particular research project, and negotiation of access conditions. After this initial assessment is complete and the list of candidate biobanks and their collections has been compiled, group communication enables avoidance of costly and redundant one-to-one interactions between the requester and each candidate biobank. The Negotiator has already proved itself as an effective tool for dealing with difficult requests where many biobanks need to be contacted to check availability of material/data for a particular purpose. The tool is also effective for monitoring performance of the biobanks.
Beyond operating the platform, BBMRI-ERIC plans further development of the Negotiator. One of the highest priority areas is to support negotiating access to other services beyond access to material/data, from setting up prospective collections and clinical trials to various analytical services. Such services have proved particularly important in face of the COVID-19 pandemic, where availability of BSL-2/3 certified laboratories is crucial for processing particularly infectious sample types such as nasopharyngeal swabs or feces from infected donors.
Another important area is optimization of the behavior of the biobanking ecosystem for requesters: BBMRI-ERIC has started to investigate to what extent performance metrics such as median time to respond to the query or rate of successfully concluding the request could be used to develop the concept of a publicly available biobank reputation, for example, published by the BBMRI-ERIC Directory.
Footnotes
Authors' Contributions
P.H. developed the concept of the Negotiator and led its implementation. R.R., M.A., and S.M. implemented the service. R.P. contributed to the design of the user interface. P.R.Q. helped to develop the concept of the interoperability and designed the usability study process. E.L. and E.B. performed usability studies and provided feedback for the development. D.F.B. designed and implemented integration with AAI. E.v.E. led integration with BBMRI-ERIC Directory. R.R. and H.M. contributed integration with RD-Connect services. M.L. led integration with GBA Directory and Locator services. P.H., R.R., E.L., M.L., and P.R.Q. edited the article.
Acknowledgments
The authors thank Prof. Michael Hummel for usability feedback and comments on the article, and all the contributors from BBMRI-ERIC national nodes who contributed to testing of the service.
Author Disclosure Statement
No conflicting financial interests exist.
Funding Information
This work has been co-funded by ADOPT BBMRI-ERIC project supported by EU Horizon 2020, grant agreement no. 676550; by CORBEL project supported by EU Horizon 2020, grant agreement no. 654248; by EOSC-Life project supported by EU Horizon 2020, grant agreement no. 824087; by LM2015089 (
and by RD-Connect project supported by FP7-HEALTH, grant agreement no. 305444.
References
