Abstract
India’s Digital Personal Data Protection Act (DPDPA) adopts a notice-and-consent-based framework for data protection; it treats all personal data, including genetic data, as a single category without accounting for its unique characteristics. Unlike ordinary personal data, genetic data is inherently relational; it reveals information not just about an individual but also about their biological relatives. Moreover, the risks associated with the processing of genetic data extend beyond identifiability and include its potential misuse in law enforcement or for discrimination in matters of employment or insurance. Despite these concerns, the DPDPA fails to offer a nuanced regulatory approach, lacks a clear definition of genetic data, and does not impose heightened safeguards for its processing.
This article identifies the limitations of the DPDPA’s notice-and-consent-based model in regulating genetic data processing and argues for a shift toward a harm-based framework. It proposes key reforms, such as the classification of genetic data into categories based on sensitivity, an expanded definition of the data principal to include affected blood relatives, and risk-based processing guidelines that categorize genetic data processing into prohibited, high-risk, medium-risk, and low-risk processing. Additionally, this article advocates for stronger privacy by design and by default requirements, mandatory data protection impact assessments (DPIAs), and the introduction of rights such as data portability and the right to restrict processing. Further, to ensure effective enforcement, it recommends strengthening grievance redressal mechanisms, introducing compensation for privacy harms, and imposing proportionate criminal liability for negligent handling of sensitive genetic data.
By addressing these gaps, this article underscores the need for a strong legal framework that moves beyond notice and consent to provide meaningful privacy protections for genetic data in India’s evolving digital landscape.
Introduction
In 2018, publicly available genetic databases were used by U.S. law enforcement authorities to identify the Golden State Killer. 1 Investigators matched DNA from crime scenes with samples voluntarily submitted to genealogy websites. However, this raised significant privacy concerns, as many individuals were unaware that their genetic data, shared for ancestry research, could be accessed by law enforcement authorities for unintended purposes. This case highlights broader challenges in genetic data privacy and the need for clear legal and ethical safeguards.
Genetic data is widely used across various fields, including healthcare, forensic science, genealogy, and personalized medicine. For instance, in healthcare, it enables early diagnosis of genetic disorders, informs treatment plans, and supports the development of precision medicine tailored to an individual’s genetic makeup. 2 In forensic science, DNA analysis aids in criminal investigations by identifying suspects and exonerating the innocent. 3 Genealogical services, such as ancestry tracing, allow individuals to explore their heritage and familial connections. 4 Genetic data also plays a crucial role in pharmacogenomics, helping to determine how individuals respond to specific medications, thereby improving drug efficacy and reducing adverse effects. 5 Similarly, it may be used for private eugenics that would allow parents to choose the genetic characteristics of their offspring and reduce the possibility of them inheriting diseases or negative traits like dwarfism. 6
While the use of genetic data analytics seems promising, it also raises critical legal and ethical questions pertaining to use and ownership of genetic data, and individual rights such as privacy, dignity, and equality. The misuse of genetic information can lead to discrimination, stigmatization, and the unfair allocation or denial of opportunities, including health insurance, employment prospects, and other social benefits. 7 Likewise, while private eugenics may be helpful in preventing genetic abnormalities in the offspring, it may also be misused to perpetuate gender disparity or traits like fair skin color that may be perceived to be valuable by society. 8
In light of these complexities, legal frameworks must balance the benefits of processing genetic data with strong privacy protections. Several jurisdictions, such as the European Union (EU), recognize the sensitivity of genetic data and allow its processing only in specific circumstances with heightened safeguards. 9 However, most data protection frameworks, like the European Union’s General Data Protection Regulation (GDPR) and India’s Digital Personal Data Protection Act, 2023 (DPDPA), rely on the notice-and-consent-based model of privacy, which overlooks the relational nature of genetic data and the unique risks associated with its processing. 10
This article examines India’s data protection framework under the DPDPA in this context, arguing for a more nuanced harm-based and relational approach to genetic data privacy. A clearer understanding of the risks associated with the processing of genetic data will enable stronger privacy protections while allowing responsible use for societal benefit.
Definitions and Scope
Before addressing the arguments of this article, it is important to clarify the term genetic data as used in this discussion. For the purposes of this article, genetic data refers to information derived from biological samples that identifies an individual’s inherited or acquired genetic traits, including insights into their health and physiology. Genetic data are both unique, as they can identify an individual, and inalienable, as they cannot be fully transferred or fabricated. 11 They differ from other personal data in that they affect not just the proband from whom they are derived but also the proband’s biological relatives, and they remain relevant long after the death of the proband. 12
Genetic data include genomic sequence, phenotype information, inheritance patterns, and nongenetic data linked to genome expression. 13 The genomic sequence is central to heredity and phenotype determination. 14 Phenotype information allows genetic inferences; for instance, a single variation in a gene may affect multiple traits through pleiotropy. 15 Inheritance patterns reveal genetic predispositions. 16 Further, nongenetic data linked to genome expression, such as the environmental conditions, also influence phenotypic outcomes. 17
This article considers genetic data as a category of biometric data, based on legal definitions and judicial interpretations. 18 Biometric data include personal data processed through specialized techniques to identify individuals based on physical, physiological, or behavioral traits. 19
It must, however, be clarified that not all genetic data are identifying, and that different types of genetic data affect privacy in different ways. Hence, the GDPR classifies genetic data into three categories: first, genetic data that requires special protection because it is identifying and is sensitive due to its association with health or physiology; 9 second, genetic data that is identifying but is like any other ordinary personal data; and third, nonpersonal genetic data. 20
On the other hand, the DPDPA and the Draft Digital Personal Data Protection Rules, 2025 (DPDPR) apply only to genetic data that is identifying. This is because they define “personal data” as “any data about an individual who is identifiable by or in relation to such data.” 21
Understanding Genetic Data Privacy
Due to the predictive nature of genetic data, early discussions on genetic data privacy emphasized the ownership rights of the proband. 22 Fears regarding the misuse of such data led to “genetic exceptionalism,” 23 emphasizing the need for strict privacy protections. The “right not to know” one’s genetic predispositions further reinforced ownership and autonomy-based models of genetic data protection. 24 Consequently, laws in the United States prioritized individual autonomy, allowing only limited exceptions, such as processing for paternity tests. 25 Likewise, the EU allowed limited research exceptions under the EC Data Protection Directive 95/46/EC. 26
Gradually, large-scale genomic sequencing weakened the ideas of individual ownership and autonomy. The 1996 Bermuda Principles promoted open data sharing, 27 and UNESCO’s 1997 Universal Declaration on the Human Genome and Human Rights symbolically recognized that the genome belongs to humanity. 28 Nevertheless, individual autonomy remained at the core of genetic data protection due to concerns about reidentification from genetic data. 29
Most frameworks, like the GDPR, have hence adopted a notice-and-consent model, emphasizing transparency, data minimization, and purpose limitation. 30 Similarly, though ethical guidelines like the Helsinki Declaration aim to prevent harms like discrimination, they rely upon informed consent and confidentiality. 29 However, this model is inadequate for protecting genetic data, as such data pertains to more than one individual and remains relevant indefinitely. Further, advancements in technology allow genetic data to be combined with other information, creating highly detailed profiles. 31 Therefore, an individual cannot fully comprehend the implications of consenting to the processing of their genetic data. Hence, obtaining informed consent for the processing of genetic data is nearly impossible.
In light of these limitations, some frameworks like the HUGO Ethics Committee’s Statement on Pharmacogenomics, 2007 32 and the International Declaration on Human Genetic Data, 2003 33 have adopted a relational model for protecting genetic data. Further, the Genetic Information Non-Discrimination Act, 2008 in the United States has adopted a harm-based approach to prevent discrimination. 34 Other laws too must adopt a relational and harm-based model for protecting genetic data. They should categorize genetic data by risk, enforce privacy-by-design rules, and hold data processors accountable.
DPDPA and Genetic Data Privacy
As noted above, genetic data is unique as it relates to both the proband and their relatives. Sensitive attributes like race may be inferred from it, raising concerns about discrimination. However, despite its unique nature, the DPDPA and DPDPR treat identifying genetic data as ordinary personal data, and do not extend protection to deidentified genetic data or nongenetic data linked to genomic expression. 35 As per Section 3(c)(ii) of the DPDPA, the provisions of the DPDPA do not apply to publicly available personal data, leaving genetic information from public sources vulnerable to misuse.
Like the GDPR, the DPDPA follows an individualistic approach to privacy. Accordingly, the DPDPA grants rights to the “data principal,” defined under Section 2(j) as “the individual to whom the personal data relates.” This definition leads to legal ambiguity in respect of genetic data, since it does not address whether biologically related individuals have rights equivalent to those of the proband with respect to the proband’s genetic data. Therefore, it is unclear who should be regarded as the “data principal” or whose consent must be taken under Section 4 for the processing of genetic data. Further, under Section 2(p), the DPDPA defines “loss” only in material terms, such as loss of property or financial opportunity. As a result, harms like loss of autonomy or discriminatory treatment are not acknowledged under the DPDPA or DPDPR.
Furthermore, the rights granted under the DPDPA guarantee limited autonomy to the data principal and are narrower than those under other data protection laws like the GDPR. For instance, the right to access under Section 11 allows individuals to obtain information on data collection and processing but does not mandate disclosure of data sharing with third parties. Similarly, the right to correction and erasure under Section 12 does not require data fiduciaries to inform third parties of these changes, reducing the effectiveness of these measures. A major limitation is that these rights apply only when data is processed based on consent. 36 If genetic data is processed under broad legitimate uses like research or public interest, individuals cannot exercise these rights. In contrast, the GDPR allows individuals to object to such processing under Article 21 and, in certain cases such as disputes over data accuracy, to request restriction of processing under Article 18. The absence of similar protections in the DPDPA significantly limits individuals’ ability to control the processing of their genetic data. Further, the DPDPA does not grant the right to data portability, limiting individuals’ ability to transfer their genetic data between service providers, researchers, or healthcare institutions. 37 It also lacks safeguards against decisions based solely on automated processing. This omission is a serious cause for concern owing to the growing use of algorithms in genetic research, healthcare, and insurance. 38 Without such protections, individuals may easily be subjected to discriminatory treatment.
The DPDPA imposes additional obligations on significant data fiduciaries (SDFs), as they handle sensitive personal data. However, neither the DPDPA nor the DPDPR defines sensitive personal data. The Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules, 2011, framed under the Information Technology Act, 2000, classify biometric information and medical records as sensitive under Rule 3. Hence, it is likely that these additional obligations would apply to data fiduciaries that process genetic data. As per Section 10 of the DPDPA, SDFs must conduct independent audits, carry out periodic data protection impact assessments (DPIAs), and appoint a data protection officer (DPO) based in India. DPIAs assess data processing purposes, risks, and mitigation measures. The DPO would ensure compliance with the provisions and effective redressal of grievances. While the DPDPA contains these safeguards, it offers minimal guidance regarding their implementation. For example, neither the DPDPA nor the DPDPR specifies when DPIAs must be conducted or who should be involved. There is also no requirement to consult affected individuals, raising concerns about transparency and procedural safeguards.
Unlike Article 25 of the GDPR, the DPDPA does not mandate “privacy by design and by default.” Privacy by design and by default means that data fiduciaries must build protections into their systems from the start, rather than treating privacy as an afterthought. 39 It moves beyond the limitations of the notice-and-consent model by focusing on reducing risks before they arise. This approach rests on seven key principles. First, organizations must anticipate privacy risks and put safeguards in place. Second, they must ensure the highest level of privacy by default by limiting data collection, defining clear purposes, setting strict retention policies, and preventing unauthorized access. Third, privacy must be part of their core policies and decision-making. Fourth, security should be both effective and practical, ensuring protection without making systems unusable. Fifth, privacy protections must last throughout the entire data lifecycle, from collection to deletion. Sixth, organizations must be transparent and accountable, clearly communicating their data practices and allowing oversight. Finally, data processing must prioritize the rights of individuals, not just compliance with the law. For genetic data, the absence of this requirement in the DPDPA leaves a major gap. Without these safeguards, the risk of misuse, unauthorized access, and lasting harm to the proband and their biological relatives becomes much higher.
Further, under Section 17, the DPDPA provides broad exemptions for research and crime prevention but does not clearly define key terms. It allows research-related data processing if conducted lawfully, with reasonable security safeguards, and in compliance with ethical standards. However, it does not specify what constitutes a lawful research purpose. Similarly, while the law permits data processing for crime prevention and investigation, it lacks guidance on what qualifies as proportionate processing. These omissions create significant risks, as genetic data can impact the privacy, dignity, and equality of many individuals. Without stronger protections, these individuals would remain vulnerable to harm.
Suggestions and Recommendations
As discussed above, the DPDPA treats genetic data like any other personal data, ignoring its unique risks. It fails to recognize its impact on biological relatives or its lasting relevance. A clear legal definition is needed to reflect its relational nature and ensure broader privacy protections. Genetic data should be categorized based on its sensitivity and identifiability. Identifiable genetic data must be classified as sensitive personal data and should be subjected to stricter safeguards. Deidentified genetic data, which does not per se identify an individual, and nongenetic data linked to genomic expression, may be treated as ordinary personal data.
It is equally important to refine the definition of a data principal in this context. The DPDPA currently defines a data principal as “the individual to whom the personal data relates,” but this approach is unsuitable for genetic data due to its shared nature. Hence, the definition should explicitly include both the proband and their biological relatives.
Rather than relying solely on a notice-and-consent model, the law should take a harm-based approach to privacy. This would shift responsibility from data principals to data fiduciaries, requiring them to assess and mitigate risks. Unlike a consent-based model, a harm-based approach acknowledges that genetic data affects not just one person but their biological relatives as well. It also accounts for both material harms, like financial loss, and nonmaterial harms, like discrimination and loss of autonomy. Another advantage of this approach is that it ensures that grounds for processing are clearly defined, preventing vague and broad exemptions like “research purposes” from being misused.
To implement this approach, genetic data should first be categorized based on its sensitivity and its connection to other personal data, such as health information. Then, processing activities should be classified into four risk levels: prohibited, high risk, medium risk, and low risk. This classification should rely on clear evidence of risks to privacy and other rights, considering the sensitivity of the data involved. Processing that poses extreme risks, such as genetic profiling for employment purposes, should be prohibited outright, as it can enable discrimination. High-risk activities, like large-scale genetic research or government use of genetic data, should be subject to strict oversight. This includes independent audits, DPIAs, and additional security measures, such as maintaining detailed records of processing. Medium-risk activities, such as genetic analysis for pharmacogenomics, should involve regulated access with safeguards like encryption and strict purpose limitations. Low-risk activities, such as basic genetic testing for paternity, may require fewer restrictions. In such cases, notice and consent may suffice, provided that the data is deleted immediately after serving its purpose.
Accordingly, the obligations of SDFs must be clearly outlined. Particularly, DPIAs must be tailored to the risk level of the processing activity. Further, a DPIA must be mandatory before processing begins, with periodic reviews, especially for high-risk activities. Additionally, affected individuals must be consulted to ensure their rights and concerns are fully considered. Similarly, for medium-risk processing, DPIAs must involve consultation with relevant oversight bodies, such as ethics committees. This would enhance accountability and strengthen privacy protections for genetic data.
The DPDPA and DPDPR should incorporate the principles of privacy by design and by default to embed data protection at every stage of processing. Privacy safeguards must align with risk levels, ensuring the highest protection for high-risk processing. These principles would also enhance transparency, enabling data principals to exercise rights like access to their data. Additionally, they would standardize accuracy, completeness, relevance, and timely deletion of data, strengthening overall accountability and privacy protections. Further, failure to ensure the highest levels of privacy for high-risk processing should carry serious consequences. In cases of gross negligence or reckless processing, criminal liability may be imposed to prevent privacy harms.
Finally, the DPDPA must extend the rights of the data principal beyond consent-based processing. Individuals should have explicit rights to data portability and the ability to restrict processing, ensuring greater control over their genetic information regardless of the legal basis for processing. The law must also strengthen grievance redressal mechanisms. In addition to the fines imposed on data fiduciaries, a compensation mechanism should be introduced to provide direct remedies to affected individuals. This would enhance accountability and ensure that individuals have meaningful recourse in cases of privacy violations.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Approval and Informed Consent
Not applicable.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
