Abstract
Privacy-enhancing technologies (PETs) have the potential to revolutionize data sharing by streamlining time-consuming and complex risk assessment processes without sacrificing privacy or increasing risks. To realize this potential, the paper proposes that the use of PETs needs to be better communicated, promoted and legitimized. This paper presents a consensus (Delphi) study with a global panel of experts on PETs, convened by the United Nations in the context of data innovation for the global community of official statistics. The panel evaluated statements and recommendations on the use of PETs in the risk assessment process for data sharing. The panel agreed that the use of PETs can improve the use of the Five Safes framework for data sharing agreements. While best practices for PET deployment are still being established, the potential benefits of PETs should be communicated more effectively by emphasizing objectivity regarding both benefits and limitations rather than over-promising the benefits (30). The panel recommended an institutional assessment of the utility trade-off in the use of PETs and buy-in beyond the technology domain. It was also recommended that regulators should play an active role in guiding how organizations can combine core data protection principles with PETs to supplement other measures. The panel further recommended developing standardized, PETs-specific terminology to facilitate education and communication about PETs with key stakeholders.
Introduction
The sharing of medical records between research institutes nationally or internationally during global pandemics can potentially save lives, as solutions could be discovered earlier and tested better with a larger pool of relevant data. Similarly, the sharing of personal or otherwise sensitive data during rescue operations after natural disasters can help save more lives and increase the capability for rebuilding and restoration. During global economic downturns, it could be essential to share basic economic data across borders. Although data sharing can have clear benefits for society, the privacy of information should also be respected and protected.
In this paper, we are specifically focusing on the Five Safes framework as the basis for the assessment of risk in data sharing. Green and Ritchie 1 explained this framework and attributed its origin to the Office for National Statistics in the UK around 2003. They further showed how this framework was increasingly referenced in academic literature, especially in recent years, due to the need for sharing data during the COVID-19 global pandemic. The Five Safes refer to Safe projects (Is this ethical, lawful and appropriate use of the data?), Safe people (Are users likely to follow procedures?), Safe settings (How much protection does the physical environment afford to the data?), Safe outputs (How much risk is there in breaching confidentiality of the outputs?) and Safe data (Is the level of detail in the data appropriate?). The Five Safes framework can be the basis for getting to an agreement on sharing data while assessing the risks on each of the cited five dimensions. In most practical cases of data sharing, risk assessment is a long and difficult process. This paper will provide the consensus of the expert panel that using modern Privacy-Enhancing Technologies (PETs) offers privacy guarantees that can speed up the risk assessment process.
A clear need for data sharing emerged during the COVID-19 pandemic when time was of the essence and more data were needed to fight a global outbreak. The WHO Guiding Principles for Pathogen Genome Data Sharing 2 speaks to the importance of pathogen genome data sharing across borders and technical interoperability while respecting the rights of data providers. In the aftermath of COVID-19, several data sharing networks were established such as the Canadian COVID Genomics Network and Statistics Netherlands. 3 These are examples of national consortia that were forged in response to the pandemic to increase data sharing for rapid infectious disease surveillance. At the same time, concerns arose around data security, data sovereignty, a lack of incentives and perceived negative consequences of privacy violations. Whereas the above examples show efforts to share data, there are other examples where data sharing did not work, such as Indonesia's decision 4 to withhold samples of avian influenza virus A (H5N1) from the World Health Organization. The Nagoya Protocol 5 formalizes the principle of fair and equitable sharing of benefits, as a legal framework under the Convention on Biological Diversity (CBD). Whether this protocol would inhibit rapid responses by restricting access to pathogen data is an ongoing debate. 6, 7 Data sharing ecosystems (both national and global) are increasingly becoming platform based and are defined as (open or closed) networks (of interconnected systems, processes and data sources) where an orchestrator mediates relationships between a diverse set of stakeholders. Platform-based ecosystems can maximize secondary use of data and bring together multiple contributors to foster knowledge sharing, leading to increased insights, innovation, and competitiveness.
National Statistical Offices (NSOs) around the world are modernizing their operations and services to better meet the demands of society in an ever-evolving data ecosystem with high interdependency between diverse actors, and a critical need for trust and social license. Society needs to trust NSOs in the way they gather and disseminate their information. Without trust the value of NSOs diminishes. NSOs recognize that they must play a significant role as data stewards 8 in the national data ecosystems to enable effective and mutually beneficial cooperation between municipalities, regional and national governmental bodies, scientific institutes, private businesses and civil society organizations. NSOs are therefore uniquely positioned to shape and influence the next generation of data sharing ecosystems. In this context, NSOs themselves are obliged to comply with the statistical law (e.g., statistics act) and corresponding regulations which include strict confidentiality and privacy rules. The statistical law underpins the efforts of the NSOs to keep the public trust and social license while interacting with other players in the data ecosystem. Establishing a federated (e.g., distributed data management) and platform-based data sharing ecosystem at national level requires implementing a generalizable, secure, and efficient computing infrastructure and associated data governance framework, which remains a challenge for many NSOs at this moment. PETs could potentially help to establish such a responsible and safe data sharing platform.
Regulators could advance data governance by considering the potential of PETs as part of foundational digital infrastructures, thereby offering ease of use and advocacy to help implementers navigate regulatory uncertainty and risk management in the use of these technologies. 9
The Infocomm Media Development Authority (IMDA) in Singapore is one such regulator that actively promotes PETs. In addition to the existing suite of compliance tools (e.g., Asia Pacific Economic Cooperation Cross Border Privacy Rules (APEC CBPR) certification), the regulator actively engages with implementers (such as multinational health care companies) to have them use IMDA's “PET Sandbox” and pilot PETs solutions within the boundaries of existing regulations, supported by Singapore's Digital Trust Centre.
However, in general, regulators across the globe lack consensus on the positive role of PETs and how they may impact data sharing risks. Despite breakthroughs in the technological development of PETs, they are still not widely implemented.
PETs have the potential for a generational leap forward in improving data sharing if the introduction of PETs is paired with a clear communication strategy on how PETs reduce risks and increase benefits. Good examples of case studies using PETs have been described in the United Nations PET Guide 10 and the publication “From privacy to partnership” of the UK Royal Society, 11 which demonstrate the role of PETs in data governance and collaborative analysis. The UN PET Guide describes 18 different case studies (see the case study repository at https://unstats.un.org/wiki/display/UGTTOPPT/Case+study+repository), including measuring salary disparities for women in the workforce, sharing data of mobile network operators, and combining student financial aid with educational outcomes. Yet, it remains a challenge to identify how PETs could best be implemented to address the most difficult challenges in global data sharing ecosystems, many of which are contentious and unresolved to date. The responsible use of PETs and their ability to remain robust over time in the face of emerging challenges require further experimentation and closer investigation. With the public demand for timely and efficient responses to emerging national or global issues, maintaining the status quo is no longer an option for many organizations, especially in the public sector. Speeding up data sharing agreements with the support of PETs would necessitate a nuanced risk assessment beyond the technological aspects.
This paper attempts to consolidate experts’ views on the most important factors and conditions to effectively promote, mainstream and implement PETs in risk assessment processes for data sharing agreements. While the primary focus is on the role of PETs for facilitating data sharing, broader applications can be considered, for example data analysis 10 and machine learning. 12
The intended scope of “risk assessment” for data sharing concerns data flows, technologies, controls, processes and/or business and organizational practices involving sharing of microdata as secondary use(s). Secure Multi-Party Computation, Homomorphic Encryption (including but not limited to record linkage), Federated Learning, and Trusted Execution Environments, as described in the UN PET Guide, 10 are among the most prominent PETs for the kind of data sharing that is being envisioned in relation to the Five Safes risk assessment framework.
Methodology
This paper aims to determine the important factors that will enable the use of PETs in the risk assessment decision process for data sharing between institutions. Using Delphi surveys, a group of experts worked through a series of statements and built consensus on whether they agreed with each statement (using a 5-point Likert scale), for example, “do you agree that appropriate use of PETs can provide risk mitigation that lowers the requirements (for risk assessment)? For example, using PETs can lower the requirement for the Safe People principle of the Five Safes framework”. After it was determined which statements the experts agreed with and which they did not, a list of recommendations was generated for various stakeholder communities in the risk assessment process, which should lead to more efficient and better agreements on data sharing. The recommendations also went through a consensus-building process. From the start of the project to the conclusion of the multi-round Delphi surveys, the study ran from March 2023 to September 2024.
The study combined the Delphi technique 13 with expert consultations and dedicated meetings to facilitate the consensus-building process. The Delphi technique develops expert opinion through consensus and is a commonly used group technique for consensus building. 14
The Delphi study first involved the recruitment of a panel of experts. For this purpose, the members of the UN PET task team were approached, all of whom are employed by recognized institutions and are familiar with PETs. Panel members needed to have some direct experience with risk assessment processes and be familiar with known frameworks (e.g., the Five Safes framework). For the expert panel of this study, 15 members of the PET task team volunteered. Just over half (53%) of the panel members were from NSOs, followed by about one third from non-profit institutions or non-governmental organizations, and the remaining members from academia and industry, which reflects the composition of the current UN PET task team. Most of the members were working in high-income countries. Just over half (53%) were implementers of PETs and 26% of the panel were PET developers. The panel composition broken down by sub-groups is provided for reference purposes only. Although it could be interesting to investigate whether PET implementers and PET developers have differing opinions on statements and recommendations, further stratification and sub-group analysis could not be done in this study due to the small sample size.
Secondly, a large pool of statements on various aspects of sharing sensitive data was constructed based on a review of the published use cases in the UN PET Guide, 10 interviews and a review of several known data portals and platforms (see https://www.statcan.gc.ca/en/microdata; https://www.dst.dk/en/TilSalg/Forskningsservice/Dataadgang; https://www.bfs.admin.ch/bfs/en/home/services/recherche.html). These statements were presented to the members of the expert panel, who were asked whether they agreed with each of them. The experts were asked to use a five-point Likert scale to express their level of agreement with each statement (agree, somewhat agree, neutral, somewhat disagree, disagree). Responses were collected through a Google Form.
The study design consisted of three rounds. In the first two survey rounds, Round 1 (R1) and Round 2 (R2), the panel members were asked to evaluate the statements and reach consensus. A majority of at least 63% (Likert scale points 1 and 2 combined) was used as the pragmatic definition of consensus. If there was no consensus in R1, feedback was collected, and some statements were clarified for use in R2. Based on the consensus statements and feedback discussions, recommendations were derived on the implementation of PETs for governments, regulators, industry, and other key stakeholders. These recommendations were evaluated in a final round (R3). The multi-round approach informed the panel of the previous round's results before moving to the next round and allowed for feedback between rounds.
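The consensus rule described above can be illustrated with a minimal sketch. It assumes that Likert points 1 and 2 correspond to "agree" and "somewhat agree" (the order in which the scale is listed); the function names and the example ratings are hypothetical and are not taken from the study data.

```python
from collections import Counter

CONSENSUS_THRESHOLD = 0.63  # minimum share stated in the study design
AGREE_POINTS = {1, 2}       # assumed coding: 1 = agree, 2 = somewhat agree


def agreement_share(ratings):
    """Return the fraction of ratings that fall on Likert points 1 or 2."""
    if not ratings:
        raise ValueError("at least one rating is required")
    counts = Counter(ratings)
    agreeing = sum(counts[point] for point in AGREE_POINTS)
    return agreeing / len(ratings)


def has_consensus(ratings, threshold=CONSENSUS_THRESHOLD):
    """True when the combined share of points 1 and 2 meets the threshold."""
    return agreement_share(ratings) >= threshold


# Hypothetical ratings from a 15-member panel for a single statement.
example_ratings = [1, 1, 2, 2, 1, 3, 2, 1, 4, 2, 1, 5, 2, 3, 1]
print(f"{agreement_share(example_ratings):.2f}", has_consensus(example_ratings))
```

Under this reading, a panel of 15 respondents reaches consensus when at least 10 of them choose point 1 or 2 (10/15 is about 67%, whereas 9/15 is 60%).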
One outlier response was detected and investigated. This resulted in the removal of the responses of one individual in R3. The reason for the removal was that the Likert scale ratings given by the individual conflicted significantly with the comments provided by the same individual, and the ratings were therefore deemed invalid. The total sample size was adjusted accordingly.
Results
A total of 52 statements were used in R1 and another 75 statements in R2. Some similarity and duplication in the statements between R1 and R2 exist due to the inclusion of statements over multiple rounds. Consensus was achieved on 71 statements (see Annex 1). Based on these consensus statements, 44 recommendations were derived and evaluated in R3. In the end, the panel reached consensus on 41 recommendations (see Annex 2).
In the following section, the 71 statements were organized in 4 groups for ease of discussion, namely those statements related (1) to the current situation of existing risk assessment frameworks, (2) to a greater understanding of the roles and implications of PETs, (3) to regulatory uncertainty on standards, threat modeling, and perceived role of regulators, and (4) to other areas for further evaluation, such as barriers, antitrust or market outcomes.
Statements
Current situation: Many organizations are operating under a false sense of safety
The panel agreed with the statement that in the current environment many public organizations contribute to a false sense of safety in data sharing when their communications of risk assessment do not reflect the true privacy risks (Annex 1, statement 47). In addition, they agreed that, while contracts serve as a valid approach for data sharing (Annex 1, statement 1), the ever-increasing demand for data sharing cannot be met, because the processing of data sharing agreements currently cannot be scaled given the manual systems and processes for risk assessment (Annex 1, statement 61). Further, the panel agreed that the current risk assessment complying with Five Safes is insufficient to fully mitigate any privacy threat (Annex 1, statement 52) and that PETs can enable effective implementation of the Five Safes framework (Annex 1, statement 25). In addition, the panel agreed that many existing frameworks, such as Five Safes, do not consider the data life-cycle, whereas the sensitivity of the same data may change over time (Annex 1, statement 6), and that they also do not provide interoperability of privacy regulation across borders (Annex 1, statement 7). In summary, the panel agreed that the use of PETs can improve the use of the Five Safes framework for data sharing agreements.
Understand the role of PETs in strengthening data governance including AI safety
Early adopters of PETs have experienced communication challenges throughout the risk assessment process, from a lack of standardized terminologies (e.g., regarding pseudo-anonymization or de-identification) to describing the appropriate utility trade-offs of PETs (using complex parameters for the implementation of differential privacy). There was broad agreement among panelists that, although best practices for PET deployment are still being established (Annex 1, statement 21), PET deployment risks can be mitigated through clear communications and better understanding of PETs (Annex 1, statement 19). For example, effective communication strategies can vary depending on the type of PETs and the corresponding applications (e.g., statistical disclosure control and/or de-identification). The panel agreed that appropriate use of PETs can provide risk mitigation that lowers the requirement for the Safe People principle of the Five Safes framework (Annex 1, statement 24). This appropriate use entails identifying specific risks in the PETs implementation architecture and layering that architecture with non-PET measures (e.g., human-in-the-loop or audits). The communication of PETs should not over-promise their benefits but should instead emphasize objectivity regarding both their benefits and their limitations. For example, some benefits of PETs may not be an inherent feature (e.g., mitigating bias in AI) but rather arise when PETs are used as a complementary process (Annex 1, statement 29). Another example is that the benefits of cryptographic PETs (e.g., fully homomorphic encryption) depend on good governance of the private keys used to encrypt and decrypt sensitive data.
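To make the utility trade-off of a differential privacy parameter more concrete, the following minimal sketch applies the standard Laplace mechanism to a counting query. It is an illustration only and does not describe any deployment discussed by the panel; the function names, the count of 1000 and the chosen epsilon values are assumptions made for this example, and Python's `random` module is not suitable for production-grade privacy guarantees.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def sample_laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace noise; a counting query has sensitivity 1."""
    return true_count + sample_laplace(laplace_scale(1.0, epsilon), rng)


# Illustration only: a fixed seed keeps the run repeatable, which a real
# deployment would not do.
rng = random.Random(0)
for epsilon in (0.1, 1.0, 10.0):
    released = noisy_count(1000, epsilon, rng)
    scale = laplace_scale(1.0, epsilon)
    print(f"epsilon={epsilon}: noise scale={scale:.1f}, released={released:.1f}")
```

A smaller epsilon gives a stronger privacy guarantee but a larger noise scale, and therefore a less accurate released count. This is the kind of parameter choice that has to be explained to non-technical stakeholders.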
The panel agreed that PETs do not by themselves alleviate ethical risks in data management. However, there are examples of addressing ethical risk where PETs could be useful, e.g., reducing bias in training data of AI models by enabling access to more training data representative of the target population (Annex 1, statement 29). Similarly, emerging new technologies, such as the Internet of Things (IoT), collect, process, and transmit vast amounts of data about users and their environments thereby introducing new security and privacy risks. The panel agreed that PETs can help mitigate risks of IoT (Annex 1, statement 22) and could be useful for federated crowdsourcing initiatives (Annex 1, statement 23).
In summary, while best practices for PET deployment are still being established, the potential benefits of PETs should be communicated effectively to improve understanding. The communication on PETs should not over-promise the benefits but should instead emphasize objectivity regarding both their benefits and their limitations.
Specific guidance is needed in addressing regulatory uncertainty
The panel agreed that no official security standards exist for emerging cryptographic schemes, which means that certification does not exist, leading to regulatory uncertainty for government usage (Annex 1, statement 44). Only a few regulators around the world have weighed in on the use of PETs, for example the Information Commissioner's Office in the United Kingdom (see https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-sharing/privacy-enhancing-technologies). While PETs are being reviewed, for example in the United States by the National Science and Technology Council, 15 the panel reached no consensus on whether related aspects of PETs (such as risk communication, education and training, compliance measures, continuous monitoring, security or ethics by design) should also be actively reviewed by the regulators. Nevertheless, the panel agreed that threat modeling, secure software development practices, and vulnerability assessment are lacking in many existing risk frameworks and that regulators need to be more prescriptive in these areas (Annex 1, statement 43), although some cautioned against over-emphasizing the role of regulators over standards, such as ISO standards.
Some unexplored issues related to PETs, such as open-source development
Some issues remain unexplored, such as the possible market outcomes and economic impacts associated with a broad adoption scenario of PETs. How could the potential benefits (or detrimental outcomes) be distributed across the markets, and what will be the role of small and medium enterprises? The panel did not reach consensus on whether PETs can introduce potential barriers to competition due to their high cost of implementation, while there was consensus that open-source development of PETs should be supported to strike the right balance between privacy and competition concerns (Annex 1, statement 39). Since data are not copied and, in some cases, never revealed when using PETs, the panel agreed that data owners can better retain the value of their data ownership in the market if PETs are being used (Annex 1, statement 40). When asked about detrimental outcomes, some panelists were concerned about big tech concentrating market power through some PET standards. Other topics included affordability and transparency of PETs, increased compliance costs (e.g., lack of maturity and/or increased complexity), and interoperability challenges (e.g., incompatibility with other cryptographic primitives).
Recommendations
Based on the consensus statements of the first two rounds, 44 recommendations were formulated for evaluation in R3 leading to 35 consensus recommendations (using the same consensus definition of a minimum 63%). Similarly to the grouping of statements, the recommendations were grouped by theme for ease of discussion. These thematic areas were (1) Institutional perspective, (2) Regulators and Government, (3) Modernization and (4) Ethical considerations.
Think Institutional, Not Project
The panel encouraged institutional-level thinking, connecting PET deployments at the project level to operational considerations at the organizational level. In that spirit, the panel recommended that data-owner organizations should establish a shared responsibility model that enables different teams, such as software engineering, legal, customer service or finance, to meaningfully contribute their expertise to the risk assessment process (Annex 2, recommendation 1). The panel further recognized the intricate connection between PETs and IT infrastructure, where the ability to leverage existing IT infrastructure and guardrails determines how quickly PET-related solutions can scale. In addition, the panel recommended that the distinction between input privacy (regarding unauthorised access to the inputs to a computation) and output privacy (e.g., protection of sensitive information from the results of a computation) should be clear in the risk assessment and should be widely communicated to key stakeholders such as IT system administrators (Annex 2, recommendation 3). The panel further recommended developing standard terminology for PETs to help with education and communication with key stakeholders (Annex 2, recommendation 17). In this respect, discussions took place to clarify PET-related terminologies. For example, “PET-enabled data sharing” can be described as “data sharing” or “data visiting” or “insight sharing”. More generally, the panel recommended communicating more clearly about outstanding questions and concerns regarding PETs by using standardized terminology (Annex 2, recommendation 20).
Regarding the privacy/utility trade-off, the panel recommended considering an institutional assessment of the utility trade-off in the use of PETs and buy-in beyond the domain of technology (Annex 2, recommendation 5). As social, political and economic sector risks may have spillover effects on data sharing systems (e.g., geopolitical tensions during international conflicts), the panel agreed with the recommendation to update the risk assessment process, when practical, such that a wide range of threats can be contemplated (Annex 2, recommendation 2).
Finally, it was recommended to treat an investment in PETs as part of IT modernization mandates and upgrades (and not as a stand-alone decision) to avoid project-specific risk assessments, which run the risk of redundancy and wasted effort (Annex 2, recommendation 27). More generally, it was recommended to ensure alignment of the use of PETs with data sharing agreements and legal processes, which implies aligning terminology across functions (legal, IT, etc.) and adopting standard terminologies where they exist.
The role of governments and regulators
The panel acknowledged the important role that regulators and governments can play to promote broad adoption of PETs. Specifically, the panel recommended that regulators should play an active role in guiding how organizations can combine core data protection principles with PETs to supplement other organizational and administrative measures (Annex 2, recommendation 11). This will require regulators to adopt the design principles (e.g., continuous monitoring) of PETs for risk assessment frameworks (Annex 2, recommendation 12). Moreover, privacy requirements need to be clearly defined and communicated to enable effective risk assessment where many actors (and many shared responsibilities) are involved (e.g., legal, privacy, security and third-party solution vendors). Given the interoperability challenges between different jurisdictions (e.g., proliferation of national regulations creating uncertainty and increasing compliance costs significantly), it was seen as crucial for regulators and governments to provide comprehensive guidelines, develop useful frameworks, and support the development of certification schemes to facilitate cross-border data sharing. The panel acknowledged that standards hold much potential to guide PET adoption and facilitate effective implementation. In this regard, the panel also recommended that regulators play a role in supporting open-source development of PETs (Annex 2, recommendation 15). A potential partnership model (similar to the Linux Foundation open-source partner program; see https://www.linuxfoundation.org/projects/partnerships) can bring together private and public sectors for joint development and contribution to open-source and broader standards for PETs, which can benefit the ecosystem as a whole.
The panel further recommended seeking broader acceptance of the use of formal privacy guarantees for risk mitigation (Annex 2, recommendation 16) and developing standardized languages and terminologies specific to PETs to help with education and communications with key stakeholders (Annex 2, recommendation 17).
For PETs to be implemented in cross-border data sharing agreements, a broader uptake of PETs in various countries would be necessary. In this respect, the panel recommended that regulators work closely with leading advocates of PETs (e.g., the UN PET Lab) (Annex 2, recommendation 13) and that outstanding questions and concerns about PETs are clearly communicated (Annex 2, recommendation 20).
Modernization
The panel recommended that investments in PETs should be part of IT modernization efforts to ensure alignment of requirements and to avoid the duplications or inconsistencies that could arise if investments in PETs and investments in IT infrastructure were made separately (Annex 2, recommendation 27). In the same sense, the panel recommended modernizing IT infrastructure and including PETs as part of it (Annex 2, recommendation 26). Modernizing IT infrastructure does not necessarily mean new IT infrastructure; it could also mean leveraging and building on existing IT infrastructure (e.g., key management infrastructure). The panel further recommended that organizations investigate their readiness for PETs by using trusted networks, such as trusted third-party data processors (Annex 2, recommendation 7). Some NSOs are transitioning into this new model of trusted exchange in highly connected networks, such as the European data spaces (see https://digital-strategy.ec.europa.eu/en/policies/data-spaces). There are other domain-specific examples, such as healthcare, where well-established standards exist for trusted exchange (SMART on FHIR 16). The panel recommended that organizations be proactive in the planning of their required infrastructure (Annex 2, recommendation 29). In this regard it should be noted that whereas security risks may be mitigated by reusing and repurposing approved IT systems, privacy risks may persist, as data is a contextual resource and implications may vary from project to project. Secure and effective data sharing also requires conditions such as sound data governance beyond the domain of IT infrastructures. The panel agreed with the recommendation that organizations should be aware of blind spots and areas of false sense of security that may arise with existing risk assessment processes (Annex 2, recommendation 30).
The panel recommended that taking copies of data should be discouraged when appropriate use of PETs can be identified. This is especially relevant for existing practices such as data minimization (Annex 2, recommendation 21). The panel also recommended considering PETs beyond data sharing and access. For example, PETs can support AI safety (e.g., using differential privacy for machine unlearning 17) (Annex 2, recommendation 28). Adversarial attacks are not always considered in risk assessments even though they are a real threat. The panel recommended considering PETs to mitigate multi-modality of sensitive data processing, such as IoT, that is difficult to assess due to wide attack vectors (Annex 2, recommendation 34). More generally, the panel recommended that organizations consider how PETs contribute to mitigating risk, and whether the use of PETs can actually replace legacy risk management techniques (Annex 2, recommendation 35).
Ethics, Equity and Ownership, and Social Acceptability of PETs
The panel agreed with some recommendations using PETs to address ethical risks. While PETs cannot be the panacea capable of addressing all ethical issues, PETs could prove helpful in addressing, for example, data bias (Annex 2 recommendation 32). Other dimensions of risk assessment (e.g., data ownership and retention of data value) could benefit from PETs because they offer possibilities to share insights without making copies of the data. A controversial ethical issue is the case when the risk of not sharing data is more harmful than the risk of sharing, as could be the case during a health pandemic. In such case, governments may determine that the right of all individuals to good health overrides the autonomy of organizations to choose not to share data.
PETs can enable access to previously restricted data, such as people's salaries. An example of the use of PETs focusing on pre-existing social inequities is the first use case in the UN PET Guide 10 on the Boston Women's Workforce council: Measuring Salary Disparity Using Secure Multi Party Computation which examined the root cause of the wage gap by deriving insights from real demographic and payroll data from companies and non-profit organizations, large and small throughout the greater Boston area. Despite the sensitive nature of the data and the number of parties involved, all actors agreed on the data sharing facilitated by the PET (Secure Multi-Party Computation). PETs could also offer controlled data access for researchers and local governments to data involving vulnerable populations in scarcely populated areas, which could help mobilize the necessary resources to aid these communities effectively.
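The general idea of computing an aggregate without revealing individual inputs can be sketched with additive secret sharing, one common building block of Secure Multi-Party Computation. The sketch below is a toy illustration under simplifying assumptions (honest participants, no network layer, a hypothetical modulus and hypothetical payroll figures); it is not the protocol used in the case study cited above.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus, large enough for the toy totals below


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo MODULUS."""
    if n_parties < 2:
        raise ValueError("at least two parties are required")
    if not 0 <= value < MODULUS:
        raise ValueError("value out of range for the chosen modulus")
    # The first n-1 shares are uniformly random; the last one makes the sum fit.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Sum private inputs so that only the total is reconstructed."""
    n = len(private_values)
    # Each party splits its input and would send share j to party j.
    distributed = [share(value, n) for value in private_values]
    # Each party adds up the shares it received (one from every party).
    partials = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    # Combining all partial sums yields the total, not the individual inputs.
    return sum(partials) % MODULUS


# Hypothetical payroll totals held by three organizations.
print(secure_sum([52000, 61000, 48500]))
```

In this sketch each organization only ever sees random-looking shares from the others, while the combined result equals the ordinary sum of the inputs.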
In this regard, the social acceptability of PETs is a crucial factor in their adoption and implementation. It involves understanding how different stakeholders, particularly the general public, perceive and accept these technologies. Social acceptability can be influenced by factors such as trust in the institutions implementing PETs, the perceived benefits and risks, and the level of transparency and communication about how PETs work and their impact on privacy and confidentiality. This is in addition to the formal privacy guarantees that are recommended by the panel.
Key use case: pathogen genomic data sharing for public health
The panel agreed that high value use cases of PETs include cross-border data sharing, which could concern collective global well-being (Annex 1, statement 31). Key use cases are data for emergency response to natural disasters, international trade data, fraud detection, smart city data or global health data including data for cross-border infectious disease case management during pandemics. Within the set of papers on PETs in this issue of the Journal, the paper on the private linkage of cross-border trade data 18 details how the NSOs of Canada, Italy and the Netherlands collaborated to link and analyze international trade micro-data within a cloud-based secure enclave, which represents a promising alternative to a private set intersection (PSI) protocol developed under the UNECE project on input privacy preserving techniques. Their improvement focuses on key issues such as lack of interoperability of risk frameworks and global data governance principles, which are currently inconsistent and without the imperatives of multilateralism. Lack of data flow alone is considered enough to halt joint initiatives, undermining the developmental potential of the joint digital economy.
Data is a context dependent resource and governance challenges are also domain specific. For example, pathogen genomic data for public health is a key area of work, where multi-stakeholder digital cooperation and a multilateral approach to govern pathogen genomic data will need to be communicated further. Pathogen genome sequencing is a powerful technology which enables rapid and accurate characterisation of pathogens to support infectious diseases surveillance. Equitable access to this critical data and technology will require strategic investment and global level coordination. Previous advocacy efforts included communication 15 on the lessons learned (from the national consortium efforts to promote rapid sharing of data from all influenza viruses and the coronavirus causing COVID-19, which resulted in global-scale data sharing and development of bioinformatics tools) and the remaining challenges (a lack of infrastructure solutions and governance structures to allow ease of real-time data sharing).
Domain and modality specific risks may pose unique challenges, which will be amplified in a cross-border context (e.g., varying ontologies (see, for example, https://philarchive.org/archive/BABTIDv4), information models, use of different standards, etc.), as seen during the pandemic. There is a growing concern over the use of non-consented health data, especially when combined with complex data modalities such as pathogen genomics. Pathogen genomics-based disease surveillance for public health raised many concerns over sensitive metadata (which is key for better interpretation of the pathogen genomes and targeted interventions) and uncertainty regarding the residual risk of de-identified data and molecular signatures that may be unique (e.g., a rare mutation).
Another key issue is lack of standardization affecting the discoverability of data in the expanded network, enabled by platform-based ecosystems. Metadata are often domain specific and discovery is best guided by ontologies for efficient searching and querying. To date, no national or global standard exists when it comes to preferred ontologies for describing emerging domains such as pathogen genomic data.
PETs offer new opportunities to address complex challenges in high value and high-risk use cases such as the use of pathogen genomic data for public health. The post-pandemic landscape, with a significant acceleration towards digital health, introduced emerging threats and an evolving technology landscape for privacy best practices. With new health care data modalities (e.g., pathogen genomics) and new use cases involving new patterns of information flow (e.g., data linking involving multiple parties), implementers are tasked with decision making for responsible data sharing and use(s) where common best practices do not yet exist. For example, private data matching between two datasets from two different organizations (parties) can be accomplished by applying cryptographic PETs to personally identifiable information (PII). In this way, the linkage information can be obtained without revealing any personal information, which is valuable where PII introduces substantial challenges and the perceived risk may be high (in semi-trusted or untrusted settings). Another example for healthcare is the pandemic modeling conducted in Singapore. In a proof of concept for the Global Partnership for AI, 19 Singapore demonstrated the use of PETs to protect personal data, such as payment transactions and public transport use, while enabling the development of near-real-time epidemiology models.
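The private data matching described above can be sketched with a Diffie-Hellman style private set intersection, in which each party blinds hashed identifiers with its own secret exponent and only doubly blinded values are compared. This is a simplified illustration under a semi-honest assumption, with both parties simulated in one process. The modulus, the function names and the example identifiers are assumptions made for this sketch; a real deployment would rely on a vetted cryptographic group and an audited library.

```python
import hashlib
import math
import secrets

# Toy modulus (a Mersenne prime), chosen only to keep the sketch short.
# Assumption: a production system would use a standardized, vetted group.
P = 2**521 - 1


def hash_to_group(identifier):
    """Normalize an identifier and map it to a nonzero integer below P."""
    digest = hashlib.sha256(identifier.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P or 1


def new_secret_key():
    """Draw an exponent coprime to P - 1, so that blinding is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def private_match(ids_a, ids_b):
    """Return the entries of ids_a that also appear in ids_b."""
    key_a, key_b = new_secret_key(), new_secret_key()
    # Round 1: each party blinds its own hashed identifiers and sends them.
    sent_by_a = blind([hash_to_group(i) for i in ids_a], key_a)
    sent_by_b = blind([hash_to_group(i) for i in ids_b], key_b)
    # Round 2: each party blinds the values it received with its own key.
    double_a = blind(sent_by_a, key_b)       # computed by B, returned in order
    double_b = set(blind(sent_by_b, key_a))  # computed by A
    # Exponentiation commutes, so equal identifiers give equal blinded values.
    return [i for i, d in zip(ids_a, double_a) if d in double_b]


matches = private_match(
    ["alice@example.org", "bob@example.org", "carol@example.org"],
    ["Carol@example.org ", "dave@example.org", "bob@example.org"],
)
```

Neither party sees the other's raw identifiers; each sees only values blinded under a key it does not hold. The size of each dataset is still disclosed, and low-entropy identifiers call for additional safeguards that this sketch does not cover.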
Given the lack of data sharing policies reflecting the diverse patterns of access and computation associated with PETs, current risk assessment processes must adapt and develop further evidence to understand potential outcomes of PETs. Research is needed to determine whether detrimental outcomes of broad adoption may be borne by individuals while creating a competitive advantage for certain companies through the early development of PETs (Annex 1, statement 38). Additional research funding, particularly for open-source development of PETs (including long term maintenance and community building and support) should be prioritized (Annex 2, recommendation 15; Annex 1, statement 39). Multisectoral collaboration (e.g., government, academic and civil society) should accelerate adoption strategies across domains (Annex 2, recommendation 13). Moreover, global cross border data patterns should be standardized (Annex 1, statements 51 and 56).
Echoing some statements and recommendations regarding data sharing and its impact on equity and other key factors, risk assessment processes and corresponding decision making should consider current challenges and limitations amplified for certain cohorts (non-representation based on age, gender, race and indigenous status, and/or other vulnerable populations) (Annex 1, statement 64).
Discussions
This work represents collective efforts of the experts of the UN Privacy Enhancing Technology Task Team and is guided by the belief that improvements in data sharing arrangements can be achieved through a better understanding of the underlying challenges in risk assessment. The multi-round Delphi surveys benefited from the hands-on experiences of the experts to reach consensus views on a long list of statements and subsequent recommendations. The main statements and recommendations are as follows.
The use of PETs can improve the implementation of Five Safes framework for data sharing agreements. While best practices for PET deployment are still being established, the potential benefits of PETs should be communicated effectively for a better understanding, not by over-promising the benefits but by emphasizing instead the importance of objectivity on both their benefits as well as limitations. Best practices can consider broader applications of PETs including but not limited to machine learning and data analysis beyond data sharing. In addition, best practices could establish how different PETs can be combined (e.g., secure Multi-Party Computation and Differential Privacy) to enhance privacy protection and increase utility.
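As a rough illustration of how output privacy can be layered on top of an aggregate (for example, a total produced by a secure multi-party computation), the sketch below applies the Laplace mechanism from differential privacy to a single statistic. The function names, the parameter values and the example count are assumptions for illustration. Naive floating-point noise sampling of this kind has known weaknesses, so a production pipeline would normally use a vetted differential privacy library.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_release(true_value, sensitivity, epsilon, rng=None):
    """Release true_value plus Laplace noise of scale sensitivity / epsilon.

    `sensitivity` is the most that one individual's data can change the
    statistic; `epsilon` is the privacy budget (smaller means more noise).
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.SystemRandom()
    return true_value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: a count of matched records; one person changes it by 1.
noisy_count = dp_release(true_value=1280, sensitivity=1.0, epsilon=0.5)
```

The choice of epsilon is where the privacy and utility trade-off discussed by the panel becomes concrete: a smaller epsilon gives a stronger formal guarantee at the cost of a noisier published figure.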
Regarding the privacy and utility trade-off, the panel recommended an institutional assessment of the utility trade-off in the use of PETs and buy-in beyond the domain of technology. More generally, it was recommended to ensure alignment of the use of PETs with data sharing agreements and legal processes, which implies aligning terminology across functions (legal, IT, etc.) and adopting standard terminologies where they exist. It was also recommended that regulators should play an active role in guiding how organizations can combine core data protection principles (e.g., those of the GDPR; see https://www.dataprotection.ie/en/individuals/data-protection-basics/principles-data-protection) with PETs to supplement other organizational and administrative measures. This will require regulators to adopt the design principles of PETs for risk assessment frameworks. The panel further recommended seeking broader acceptance of the use of formal privacy guarantees for risk mitigation and developing standardized terminologies specific to PETs to help with education and communications with key stakeholders. Co-creation of standards, best practices and guidelines with inputs from diverse stakeholders (e.g., policy makers, data scientists and IT professionals) will ensure capacity building across stakeholder communities, which is essential for the successful adoption and implementation of PETs. This can involve providing training and resources to key stakeholders to equip them with the necessary knowledge and skills for effective assessment and use of PETs.
On a practical level, it was recommended that investments in PETs should be part of IT modernization efforts to ensure alignment of requirements and avoid duplications or inconsistencies. While PETs cannot be the panacea capable of addressing all ethical issues, PETs could prove helpful in addressing, for example, data bias.
In conclusion, the urgency of data sharing on many relevant societal issues requires a systematic implementation of appropriate PETs into risk assessment processes to reach meaningful, trust-building, and speedy data sharing agreements grounded in respect for people's privacy. Our collective well-being could depend on it, as was evident from the ticking “molecular clock” associated with the mutations of the SARS-CoV-2 virus in 2020. Recognizing that various stakeholder communities are involved in the growing PETs ecosystem, the panel emphasized the importance of outreach and coordination with regulatory and government bodies. The next stage of the adoption of PETs will best be tackled via improved coordination and active collaboration among the relevant stakeholders.
Future work
This paper provides insights and general guidance on risk assessment associated with PET implementation (including “net new” risks exposed thanks to PETs). However, use case specific risks exist beyond the scope of general guidance. For example, AI safety concerning privacy risks may vary depending on whether you are training a model (e.g., privacy of the individual involved in training) or doing inference (e.g., privacy of the individual involved with inference). Other considerations include protecting intellectual property and/or conducting safety audits (e.g., OpenMined AI evaluation; see, for example, https://blog.openmined.org/secure-enclaves-for-ai-evaluation/). The potential benefits extend beyond personal data. For example, PETs provide a novel means to protect intellectual property and keep harmful information private during third-party AI safety audits. Ongoing updates will be required based on evolving PET use cases. Global convening, such as the UN PET lab, will continue to serve as a key forum to educate regulators and key decision makers. Outreach to the broader stakeholder groups should continue to leverage existing international and regional guidelines for data governance. 20
NSOs, as trusted entities, can play a role in clarifying and demystifying the use of PETs to all stakeholders in the broad data landscape. Key allies in this effort could include sectors that traditionally rely on NSO data and hold significant influence over public opinion, such as government, academic researchers, civil society or data journalists. Innovative analyses or studies made possible by PETs can showcase the value these technologies bring to society, thereby helping illustrate the balance between data utility and privacy. Recognizing the social acceptability of PETs as a crucial factor, NSOs can continue to play a key role in ensuring that diverse stakeholders, particularly the general public, provide input into the promotion and implementation of PETs.
NSOs may also partner with private companies or research institutes to promote the adoption of PETs by working towards a common practice for high priority data flow challenges. Data intermediation services facilitated by NSOs can be a model to accelerate bilateral or multilateral sharing of data, especially for research institutes, small and medium enterprises and start-ups with limited financial means. For example, the Lomas 21 platform developed by the Swiss Federal Statistical Office (FSO) is capable of onboarding third party researchers for microdata access while upholding the highest standards of data confidentiality (e.g., facilitating differentially private data pipelines via custom code requirement for end users).
As for the key high-value use cases (for example, cross border data sharing for public health), additional coordination and funding can accelerate the beneficial use of PETs. No public consultation has been done so far on the findings of this paper. Broader consultation may include dialogue with civil society organizations and public consultation to gather public opinions on how PETs may satisfy privacy needs. This will inform additional factors for consideration in the context of risk assessment. In our study, growing concerns were expressed concerning individual autonomy, and yet diverse opinions were seen regarding the impact of PETs on individual autonomy. Further investigation of PETs with public consultation inputs can supplement risk assessment and decision-making.
A specific area of future work concerns privacy issues that may impact vulnerable or underserved communities (e.g., minorities, groups with limited access to education). 22 Overlap between relevant governance and risk assessment frameworks for PETs and community specific principles 23 (for example the First Nations Principles of Ownership, Control, Access, and Possession (OCAP), the CARE principles (Collective Benefit, Authority to Control, Responsibility, Ethics), or “the grandmother perspective”) is not well understood. More research is needed for this problem domain including better understanding of legal and policy barriers 24 (e.g., use of trusted third party).
Supplemental Material
Supplemental material, sj-docx-1-sji-10.1177_18747655251355706 for A Delphi study on the role of privacy enhancing technologies (PETs) in data sharing ecosystems by Soyean Kim, Ronald Jansen, Matjaz Jug, Dave Buckley, Raphaël de Fondeville, Augusto Cesar Fadel, Adhiraj Saxena, Jess Stahl and William Hsiao in Statistical Journal of the IAOS
Footnotes
Acknowledgments
We gratefully acknowledge Dr. Salil Vadhan at Harvard University / OpenDP, Chuck Mccallum at Harvard University / OpenDP, Dr. June Brawner and Dr. Mahi Hardalupas at Royal Society, Dr. David Archer at Galois, Inc., Dr. Bradley Malin at Vanderbilt University, Luke Keller at the U.S. Census Bureau, and Curtis Mitchell at the U.S. Census Bureau for reviewing this manuscript.
Ethical considerations
The study underwent ethics review and was approved by the Simon Fraser University Office of Research Ethics (REB#30001992). For any questions regarding the ethics approval process, please contact the SFU Office of Research Ethics.
Funding
This work was supported by the Canadian Institutes of Health Research (Project Grant Number: PJT-159456) and by Genome BC and Genome Canada (Grant Number: 286GET).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
