Abstract
Including underrepresented population groups in databases and initiatives integral to assessing hereditary cancer risk presents several challenges. Data and knowledge from genome-wide association studies (GWASs) and clinical genomics are based largely on people with predominantly European ancestry. Individuals who do not have this ancestry are underrepresented in resources used to make clinical assessments. They are therefore more likely to receive results of “variant of unknown significance” (VUS) from genetic testing. Efforts to broaden representation of population groups have created additional challenges in grouping those with mixed ancestry. Further, the focus on genetics can lead to downplaying the impacts of racism. We describe challenges that must be overcome to include underrepresented population groups, including insights from our ongoing work within each of 3 population groups that focus on: health care access for heterogeneous Hispanic/Latino populations, engagement with Black, African, and African Diasporic communities, and Indigenous Data Sovereignty. Using empirical data derived from population databases, we demonstrate gaps in hereditary cancer gene variant representation. Our interviews with experts in genetic testing and data sharing suggest ways to create inclusive hereditary cancer data resources. The interview data highlight the inequities that have led to underrepresentation, ranging from patient access to research and health services (especially those that are culturally sensitive), to funding available for professional training and grants to those who belong to underrepresented groups. Our goal is to ensure the realities of how different excluded groups experience genetic testing are not overlooked in the development of policy to address the broad challenge of hereditary cancer gene variant data sharing.
There is no one simple solution to making more inclusive hereditary cancer data resources, but there are many opportunities to improve them and better serve these populations.
Introduction
Data and knowledge built on genome-wide association studies (GWASs) and clinical genomics are based largely on populations of people with predominantly European ancestry. 1 Although there is only one human race of unique individuals who are nearly genetically identical, tiny differences can tell a big story and provide context for those seeking to reduce disparities in cancer disease burdens and personalized medicine. In this paper, we discuss broad factors that influence inequitable hereditary cancer risk assessment and provide a more nuanced examination of factors impacting inclusion in cancer genomic resources for 3 specific (though overlapping) racial and ethnic groups: Hispanic/Latino populations, Black, African, and African Diasporic communities, and Indigenous peoples.
Many individuals belonging to these racial and ethnic groups have ancestral genetic markers geographically associated with Africa, Asia, or the Americas. These groups are underrepresented in cancer genomic resources. This underrepresentation creates major problems for assessing hereditary cancer risk, as individual risk assessments rely on vast amounts of variant data linked to clinical outcomes to determine the clinical significance of each variant. Individuals whose genomic characteristics are not represented in resources used to make clinical assessments are more likely to receive results of “variant of unknown significance” in genetic testing.2-5 Since sequencing of the human genome was completed, there have been major recruitment initiatives to diversify datasets by increasing participation in genomic research, although sometimes marred by controversy.6,7
Even when genomic researchers make strides toward more representative data, they struggle with social concepts of race when describing populations of people. Groupings made based on outward physical features like skin color, hair texture, and body form are not based on valid science. Some have shifted to simply rebranding racial categories using geographic ancestral indicators, using false categories of “European” and “non-European”, 8 which conjures troubling imagery of “White” and “Colored”. This shift creates a problematic system that uses percentages of ancestry as a convenient way to categorize admixed people, and it risks ascribing racial meaning to DNA ancestral markers in the translation of genomic research to clinical practice. Genetically, there is no such thing as race, but clinically there are grave inequities associated with racism. Individuals who have ancestry from Africa or Asia, or who are Native to the Americas, experience societal discrimination and marginalization. Discrimination and marginalization create barriers to the most advanced health care by influencing underrepresentation in genetic databases, exclusion from data governance decision making, and failures of referral to precision medicine and clinical trials.
Using our expert knowledge, membership within communities, review of the literature, and insights from ongoing work with excluded populations, we describe a key challenge for each population group. We begin with descriptions of these populations to help remind the reader that each of the broad challenges must be contextualized within distinct populations and communities with their own cultures, histories, and nuanced experiences with health care and medical research.
The next 2 sections use empirical analysis to describe broad issues. In the first section, we take a step back to review the representation of different population groups in a commonly used data resource, gnomAD, to demonstrate some of the practical challenges resulting from the Eurocentric bias in genetic repositories. Then, we analyze interviews with experts who shared their perspectives on challenges that inhibit developing inclusive genetic data repositories.
Equity and Inclusion Challenges in Specific Communities
Simply “recruiting” populations that have been continually underrepresented in genomic efforts is insufficient to ensure data resources can serve everyone equitably. Here, we describe some of the challenges in addressing these inequities through more nuanced and in-depth discussion of 3 large but underrepresented population groups: Hispanic/Latinos, Black, African, and African Diasporic communities, and Indigenous Peoples. While these population groups overlap with each other in terms of both who identifies with them and what barriers they face in cancer genetic resource inclusion, each has unique histories and experiences that can guide efforts to improve inclusion. In each section, we describe relevant historical context and existing inequities that shape how these groups access health services and participate in research. We then discuss how these unique factors influence how each community participates in hereditary cancer testing, producing the data housed in databases used to interpret hereditary cancer risk. We specifically discuss the barriers to accessing health care that heterogeneous Hispanic/Latino populations face, partnership with Black communities, and the need for Indigenous data sovereignty.
Hispanic/Latino Populations
As a whole, the diverse and fast-growing Hispanic/Latino population is the largest ethnic or racial minority group in the U.S., making up 18% of the country’s population in 2019. 9 In contrast, only 0.5% of genetic samples analyzed in GWAS studies so far are from individuals with Latin American ancestry. 10 The pan-ethnic terms “Hispanic” and “Latino” are commonly used in the U.S. to refer to heterogeneous populations with heritage from dozens of countries across Latin America, the Caribbean, Iberia, and sometimes regions in which surnames reflect Spanish or Portuguese influence, such as the Philippines and parts of Africa. These populations, often categorized together as “Hispanic/Latino”, are both ancestrally and socially diverse. Genetic admixture of Native American, African, and European ancestries among Hispanic/Latinos varies between 11,12 and within their countries of origin. 13 Acknowledging and accounting for the diverse experiences of Hispanic/Latino groups in the health care system are key to more equitably including them in genomic resources.
Breast cancer is the most common type of cancer among Hispanic/Latina women, making up 29% of cancer cases. It also causes the most cancer-related death, 14 and Hispanic/Latinas with breast cancer are more likely to be diagnosed in later stages 14 and with worse prognoses than non-Hispanic Whites (NHW). 15 Studies have found BRCA mutations in about one quarter of breast cancer cases in U.S. Latinas. 16 Both cancer prevalence 17 and hereditary cancer predisposition syndromes 16 differ by Hispanic/Latino sub-ethnicity. For example, studies found Latinas with more European ancestry as well as Latinas born in the U.S. to be at higher risk for breast cancer, while Latinas of Mexican descent have been found to have lower risk than those of Cuban or Dominican descent. 17 Studies have also detected BRCA1 mutations more commonly in Argentine and Mexican women and BRCA2 mutations more commonly among Cuban and Puerto Rican women. 16 Despite the prevalence of hereditary cancer across multiple Hispanic/Latino populations, Hispanic/Latina cancer patients and those without cancer but at high-risk are less likely to receive genetic testing for hereditary cancer risk when indicated compared to NHW, and thus less likely to reap its potential downstream benefits for their overall health. 18
Patients from underserved communities are less likely to access sites that conduct genomic research or have genetics providers such as large academic medical centers. 19 This suggests the same systemic barriers that have prevented underserved Hispanic/Latino groups from accessing basic health services 20 also hinder their participation in genomic medicine and research. Lynce et al. (2016) point out that: “although genetic counseling and testing for inherited susceptibility to breast cancer has been clinically available for nearly 20 years, disparities in awareness, referral to services, and access persist.” 15 Compared to NHW, Hispanics/Latinos are more likely to be uninsured and low-income, particularly those of Mexican and Central American descent. 21 Indeed, the high costs of genetic testing and counseling services coupled with a lack of insurance coverage of those services has been shown to be a major concern among Hispanic/Latinos broadly. 16 Inadequate culturally and linguistically competent medical care has also been found to be connected to worse health outcomes among Hispanic/Latino patients. 22 These disparities extend into genetics 23 where there are few bilingual bicultural genetic counselors in the U.S. to provide services in languages other than English 24 for the 28% of U.S. Hispanics/Latinos who are limited English proficient, 25 particularly Central Americans. 21
Despite these systemic barriers, Hispanic/Latinos are enthusiastic about and interested in genetic services for hereditary cancer once made aware of them. 16 However, a gap persists in Hispanic/Latinos’ awareness and knowledge of genetic services for hereditary cancer risk. 16 Almeida, et al. (2021), for example, document “great need and desire for education on genetics of breast cancer” among Hispanic/Latinos. 26 This gap between awareness and interest is likely due to historically inadequate community engagement with diverse Hispanic/Latino communities.
Together, these disparities suggest that genomics researchers, leaders, and funders must discern why efforts to include diverse Hispanic/Latino populations in genomic medicine and research have fallen short. Almeida, et al. (2021), emphasize that education is “only the first step” to reducing disparities in access to hereditary cancer services among Hispanic/Latinos, and that Hispanic/Latino populations also need “support navigating the complex health care system, so they are able to access screening with full understanding and decision-making ability.” 26 The field can learn from the example of genomics research groups working in hereditary cancer who have created long-lasting relationships with Hispanic/Latino communities through establishing partnerships, employing patient navigators and community health workers,27,28 and hiring bilingual genetic counselors. 29 No single approach may work to engage different Hispanic/Latino communities with diverse socioeconomic and sociolinguistic contexts in genomics. Instead, community outreach for genomic research and services should be tailored to the language, literacy, and communication mode preferences of the specific Hispanic/Latino communities involved. Including Hispanic/Latino communities in genomic initiatives will involve working to simultaneously lessen barriers to health care and expand access to needed medical services among Hispanic/Latino populations, especially given the currently limited clinical benefit of genomic sequencing for these populations. Otherwise, genomic initiatives will neither be appropriately aligned with communities’ needs nor serve to lessen the pressing health disparities that prevent equitable inclusion of underserved communities in genomics in the first place.
Black, African, and African Diasporic Communities
Black individuals face the greatest cancer burden of any racial or ethnic group. They experience the highest mortality and the lowest survival for most cancers. 30 In Black men, the cancers most likely to cause death are lung, prostate, colorectal, pancreatic, liver, stomach, and myeloma, while in Black women, breast, lung, colorectal, pancreatic, uterine, stomach, cervical, liver, and myeloma are the leading causes of cancer-related deaths. 31 Black people represent vastly greater genetic diversity than descendants of European populations, 32 yet people with predominantly African ancestry comprise only about 2.5% of GWAS participants globally. 33 Millions of genetic variants have yet to be characterized, 34 and scientists must be careful not to widen disparities with unrepresentative data. The genetic variants most strongly associated with breast cancer are found in the BRCA1 and BRCA2 genes, the majority of which are associated with hereditary breast cancer and the triple negative breast cancer (TNBC) subtype. 35 Across African nations, TNBC accounts for approximately 33% of breast cancer diagnoses, with the highest incidence of TNBC observed in West African nations when compared with East African nations. 36 Researchers have reported a higher risk of TNBC associated with West African ancestry.37,38
Globally, these patterns of increased TNBC incidence are pronounced within the African diaspora. Genetic diversity among Black people in the Western Hemisphere is heavily influenced by historical features of the transatlantic slave trade and racialization. 39 Racialization is a process by which people are grouped or classified by social relationship and then scientifically or medically the grouping is used as a predictor, indicator, or proxy for biological outcomes or clinical causality, 40 but racial groups are not genetically valid because the differences between people are not associated with any identifiable exclusive “race” genes. 41 Socially, the word “race” has been used to represent an individual’s racial identity and self-classification; however, in some instances, “race” refers to how others may observe or perceive an individual based on how the observer characterizes and interprets race. 42 Racialization has resulted in a caste system intertwined with the pseudo-biological eugenics premise of “fitness”. The terms “Black” or “African American” are used in the United States to describe people who have origins in any of the darker skin sub-Saharan peoples of Africa. 43 In this paper, the terms Black, African American, people of African descent, and people with predominantly African ancestry are used interchangeably as a reference to the socially constructed racial category. Racializing human beings as Black has been an effective tactic to socially and economically oppress Black communities and terrorize Black women in the name of medical research.44-46
The transatlantic slave trade forced migrations of millions of people from Africa to the United States, Brazil, and the Caribbean. The generational enslavement shattered families, disrupted established social norms, shaped cultural beliefs, and drove policy. Gene-environment interactions related to African ancestry may be shaped by specific cultural and social experiences of different Black communities in the Americas. George and colleagues found diversity in cancer risk factors and genetics between U.S.-born, Caribbean-born, and African-born cancer patients. 47 Carvalho and colleagues report the racial composition of Brazil is very heterogeneous with 5 centuries of admixture. Yet they observed that the region of Brazil with the highest TNBC incidence (20.3%) is the northern region, which also has the highest African influence (77.8%). 48 Due to the imprecise and inconsistent definitions of race, researchers are developing new approaches for studying specific gene expression differences in TNBC tumors using quantified genetic ancestry. These promising new techniques have been designed to isolate African ancestry–associated genes and indicate the functional influence of the genetic ancestry background upon gene expression. However, our understanding of the influences of ancestry is limited, and these influences remain understudied. Projects like SAMBAI (Social, Ancestry, Molecular and Biological Analysis of Inequalities), a UK Cancer Grand Challenge funded global initiative to tackle the cancer inequities challenge, are pioneering this research area. 49
Inequities in cancer prevention, screening and treatment also factor into health inequities. For example, younger Black breast cancer patients have been found to be more likely to report higher intention for genetic testing and greater information needs related to genetic services and breast cancer screening. 50 However, in the United States, Black cancer patients at risk for hereditary breast and ovarian cancer (HBOC) report greater intention of getting genetic testing and greater information needs related to genetic services and breast cancer screening 50 but are less likely to be referred for genetic testing.51,52 When studying racial differences in access to genetic counseling services, some studies have demonstrated underutilization 53 by Black patients while others have found no significant racial differences. 54 Yet, Black women do not meet criteria for referral to testing due to truncated family trees and incomplete health history information. 55 Also, Black women may lack the necessary insurance coverage and have higher co-pays for genetic testing. 56 Access to care with adequate insurance is associated with earlier diagnosis, and nearly half the disparity between Black and white patients in breast cancer stage at diagnosis is explained by lack of insurance or having Medicaid coverage. 57 While it is true that economic disparities (lower income, insurance coverage, employment benefits, and wage gaps) present barriers to genetic testing uptake 58 and create financial burdens when identifying high-risk individuals eligible for treatment with precision medicine, 59 the literature consistently reveals persistent racial health disparities despite improved income and educational achievement.60,61 Modern approaches must combine social science with genomics and epigenetics to address these old problems with more nuance.
If people with African ancestry face too many barriers to genetic testing and counseling, there will be inaccurate representation in both clinical and research data leading to worse health outcomes and possible mortality. 62
In a 2019 report from the National Academies, Williams and colleagues wrote, “Disparities in access to genomic health care are also intertwined with under-representation of genetically diverse populations in genomic research.” 55 Genomic health care includes pre- and post-test genetic counseling, which may be of the utmost importance for Black patients requiring additional support when interpreting results due to increased likelihood of uncertain results. Racial differences in access to genetic counseling services are also clear. Sheppard and colleagues (2014) observed underutilization of genetic counseling services by Black patients, 53 while Peterson and colleagues (2020) reported significant racial differences in referrals to genetic counseling but no significant racial differences in uptake. 54 Systemic and structural barriers to genetic research and health care are not explained by biological differences between “races”; rather, they are a reflection of a specific form of anti-Black racism that is deeply rooted in the history of enslavement and oppression. 63 As Sederstrom and Lasege state, “Being born Black in America yields, because of America’s foundation of anti-Black racism, an identity that generates the consequences of a chronic condition.” 64 By using tools like the National Institute on Minority Health and Health Disparities model of health disparities, researchers can examine the inter-relationships between downstream (individual risk) and upstream factors (racism and discrimination).
Policymakers can resist the normalization of structural and systemic inequity by recognizing race as a social construct in both research funding opportunities and ongoing institutional outreach. There is no one Black community or monolithic Black experience. Multi-level strategies show promise for improving racial health equity, 65 but in order to be effective, community engagement strategies and initiatives should represent greater diversity within the Black racial group. For example, the Human Heredity and Health in Africa (H3Africa) initiative supports projects across 34 countries to bridge gaps in population-scale whole genome sequencing, which include many previously understudied populations. 66 Also, partnerships with Historically Black Colleges and Universities (HBCUs) across the US have been promising. Ewing and colleagues partnered with Howard University to improve representation in the genetic counseling workforce and increase participation of Black participants in genomics research. 67 Solberg and Taylor piloted a bioethics program uniquely designed for use at HBCUs when Supreme Court hearings were considering the patentability of BRCA genes and tests, 68 and the Department of Defense has sponsored undergraduate HBCU summer training programs to improve representation in cancer disparities research. The WISDOM Study (Women Informed to Screen Depending On Measures of risk) is a pragmatic randomized clinical trial exploring personalized breast cancer screening using an adaptive study design to allow continued refinement. The WISDOM Study empowers patients by letting them either choose to be randomized to risk-based screening or annual screening or join an observational cohort. 69 Well-established trusted relationships with a diverse portfolio of community-based organizations as partners in research should increase awareness, improve our predictive models, and support the equitable distribution of benefits from genetic research.
In these ways community-based participatory research (CBPR) can help improve racial diversity in genomic research participation. 70
Indigenous Populations
The Indigenous population group is also very diverse. It includes the 573 federally recognized tribal nations of American Indians and Alaska Natives (AI/AN), 71 state recognized tribes, 72 members of other non-recognized tribes including Native Hawaiians, descendants of these tribes who do not have official status, and those from Indigenous communities outside of the US, particularly those who have migrated from Central and South America. 73 We use the term Indigenous to refer collectively to all of these groups, using more specific terms when warranted. The Indian Health Service is a federally funded program that provides health service to many AI/ANs, primarily to those living on reservations and in rural areas. 74 For those seeking care from other providers, obtaining culturally-appropriate care is difficult as Native Americans are underrepresented in health care service professionals, including genetic counsellors. 75
The health of Indigenous Peoples living in the United States is a product of the systems of historical and ongoing colonialism and racism that this group experiences.76,77 Indigenous Peoples experience many health disparities 78 and health care funding shortfalls mean that AI/AN each receive only about $4078 in federal funding per person, while the rest of the population receives $9726. 79 They also experience many health disparities related to cancer, despite having a lower overall cancer mortality rate. 80 Native Americans have a higher incidence of pancreatic, cervical, gastric, and colorectal cancers,81-84 and while AI/AN women have a lower incidence of breast cancer than other population groups, they are experiencing incidence increases faster than others, and get diagnosed at later stages for breast cancer. 85 Finally, AI/ANs experience higher levels of mortality for colorectal, 84 liver, and stomach cancers. 86
Much as health systems have failed to provide appropriate and sufficient care to Indigenous Peoples, research conducted about but not by Indigenous Peoples continues to cause harms to individuals and sovereign Indigenous nations and tribes. The case of DNA being collected from the Havasupai tribe in Arizona for diabetes research and then used to conduct harmful and stigmatizing research is one of the most infamous examples.87,88 Large-scale projects like the Human Genome Diversity Project, the International HapMap Project, and 1000 Genomes not only failed to produce medical benefits for Indigenous communities, but led to commodification of their genetic information through the use of such information in commercially available ancestry tests. 7
Non-Indigenous approaches to research guided by non-Indigenous institutions are also harmful, characterized by an inherent power imbalance that fails to provide Indigenous peoples with benefits from research while simultaneously providing very little control over processes and outputs. Indigenous scholars have written about the “cycle of victim blaming and coercion” 89 that Indigenous People face when engaging in research, and the harms caused by standard research practices like individually-focused informed consent. 90 This focus on individual consent and patient privacy causes issues when non-identifiable research is conducted on groups that can be identified, as illustrated by a controversy in Canada. Non-Indigenous researchers accessed HIV sequences derived from HIV-infected people in Saskatchewan that were sent to a research center in British Columbia for the purpose of conducting HIV drug resistance genotyping. All identifying information was removed, but the authors noted that 80% of those infected with HIV in Saskatchewan identified as having Indigenous ancestry. Reasoning that most of the sequences they analyzed were therefore from those with Indigenous ancestry, they used published information about Indigenous immune system serotypes to make inferences about the immune adaptations they reported in their study. This work was conducted without the knowledge and consent of the Indigenous populations affected.91,92 The authors stated in their discussion: “Though the lack of sociodemographic and ethnicity data precludes us from investigating this directly, it is tempting to speculate that the HIV strains that are being widely transmitted among Indigenous communities are the ones that harbor the highest levels of adaptation to HLA alleles expressed in these populations.” This study created an “ethical storm” in Saskatchewan Indigenous communities and “further stigmatized HIV-1 in a province already dealing with the ongoing effects of colonialism and racism”. 92
As genomic research and precision medicine have increased over the years, so too have calls from Indigenous scholars and communities to increase the control that Indigenous groups themselves have over what data is collected from their communities and how it is used. Rather than advocate for seats at non-Indigenous tables and governance structures, Indigenous Peoples have pushed for what is referred to in the United States as Indigenous Data Sovereignty. Other jurisdictions have moved toward increasing Indigenous control of genomic data collected from Indigenous Peoples, including Canada, Australia, and New Zealand. Canada’s Tri-Council Policy Statement requires that researchers engage with relevant Indigenous communities if their results so much as refer to Indigenous Peoples. 93 Australia legislated the creation of the National Centre for Indigenous Genomics to create a “safe, permanent, national keeping place for biological samples and for genomic and related data obtained from Aboriginal and Torres Strait Islander Peoples”. 94 Finally, New Zealand has developed Genomics Aotearoa with a major goal to “place Te Ao Māori at the center of these activities, through research undertaken by, for and with Māori and embedding Māori management of indigenous genomics research practice and data”. 95
Recent genomic research efforts in the United States highlight how Indigenous Data Sovereignty has not yet fully permeated research policy. Asking “Indigenous Peoples to continue participating in newer-larger scale precision health projects… is to ask for their trust in a system that has historically exploited systemic inadequacies and anti-Indigenous politicking.” 90 The All of Us program through NIH was criticized by Indigenous scholars for not engaging tribes at the outset, yet still recruiting participants for the diverse genomics initiative from urban settings known to have large Indigenous populations, ostensibly to avoid asking Tribes for permission. 96 Another effort by the NIH, the Rapid Acceleration of COVID-19 Diagnostics for Underserved Populations project, announced the intent to develop a Tribal Data Repository, and NIH was criticized for seeking not to allow Tribal Nations the full authority to develop their own data systems. 97 The announcement undermined Tribal sovereignty by requiring data to be shared with the NIH. In light of the criticism and the universal NIH Data Management and Sharing policy, a supplement on Tribal data was released in 2022. This notice clearly states that “data management and sharing must be predicated on tribal sovereignty” and that the NIH “is committed to supporting Tribal data science resources, including data repositories”. 98
While this recent supplement is working towards best practices for working with AI/AN Tribes, the NIH still has work to do on protecting other Indigenous groups within the US. As Tsosie, et al., note, “a publicly funded research agenda with a clear path to commercialization but without a clear path for Indigenous health or economy is fundamentally flawed.” 89 There have also been global efforts to develop principles for Indigenous data sharing, called the CARE Principles (Collective Benefit, Authority to Control, Responsibility, and Ethics), meant to be a complement to the widely cited FAIR Principles (Findable, Accessible, Interoperable, and Re-usable). 99
Tackling the problem of hereditary cancer in Indigenous populations has aspects of a catch-22: We cannot know how significant hereditary cancer is for Indigenous populations or if there are pathogenic variants in this population not found in other groups (which is likely, see analysis below) unless gene variant data linked to phenotypic outcomes are collected. However, it is difficult for Indigenous scholars and communities to prioritize hereditary cancer when the scope of the problem has not been defined. We do know this: Indigenous communities will resist participating in research if it is part of a “cycle of victim blaming and coercion.” 89 For Indigenous communities to receive benefits from advances in hereditary cancer genetic testing, they must have equitable access to health care funding and services. It does not matter how robust a hereditary cancer genetic variant resource is if potential users cannot access care.
Therefore, the way forward out of this catch-22 is funding that supports Indigenous Data Sovereignty, so that Tribal nations and Indigenous organizations can establish their own data commons, and decide how they will partner with existing resources and initiatives. While large initiatives like All of Us and RADx respond to criticisms from Indigenous scholars and seek Tribal engagement, it is not yet enough; “[M]eaningfully engaging Indigenous communities in precision medicine must also entail restructuring research ecosystems from an anti-colonial standpoint.” 89
Population Representation for Hereditary Cancer Genomic Resources
Whether or not an individual will receive a definitive hereditary cancer genetic risk assessment depends on their variant being included in databases and linked to enough clinical outcome data to assess pathogenicity. Accurately determining the clinical classification of each identified variant for hundreds of hereditary cancer genes is a daunting task. Indeed, even the most widely studied hereditary cancer genes, BRCA1 and BRCA2, have 25% and 40% of their variants on ClinVar classified as “variant of unknown significance” as of October 2020. 100 This is despite years of concerted effort to bring together the necessary experts and data needed to advance understanding of the clinical significance of BRCA1/2 genetic variation. 101 Here, we examine the variant classifications of multiple hereditary cancer genes across population groups.
One commonly used resource that supports variant classification is the population-based Genome Aggregation Database (gnomAD), based on exome and genome sequencing data from multiple large-scale sequencing projects; 102 it provides information on the population frequency of variants, clinical classifications, and the population group(s) in which the variant was identified. The 2.1 version of gnomAD uses 8 population groups: African American, Latino/Admixed American, Ashkenazi Jewish, East Asian, European (Finnish), European (non-Finnish), South Asian, and Other. We examined variant data of 13 hereditary cancer genes (chosen based on their inclusion on panels offered by all 17 companies in our recent study). We identified 19 846 hereditary cancer variants, of which 663 were classified as pathogenic or likely pathogenic. The vast majority of all variants (76%) and pathogenic variants (84%) were found in only a single population group (Figure 1). In other words, variants found in each population group are rarely found in another, which underscores the importance of groups being represented in variant databases. Moreover, pathogenic variants found in multiple groups are often found at very different frequencies (Figure 2). Finally, the majority of pathogenic variants are found in the European (non-Finnish) group (Figure 3). The most likely explanation is not that gnomAD is a comprehensive picture of the global distribution of variants associated with inherited cancer risk, but that most populations have not been sampled sufficiently, and so there are many “founder” and other cancer-associated variants still to be found in under-represented groups. Interpreting risk in under-sampled populations will require data from those populations to complete the picture. This analysis helps demonstrate how incomplete data resources are when the resources only include individuals from limited populations. The data file used for this analysis is available on our Open Science Framework project page. 103
Figure 1. Percent of Variants From 13 Hereditary Cancer Genes in the gnomAD Database that are Found in Different Numbers of Population Groups
Figure 2. Population Frequencies of Hereditary Cancer Gene Variants Found in Four or More Different Population Groups in the gnomAD 2.1 Database
Figure 3. Percent of Pathogenic Variants that are Found in Each Population Group
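The tally of how many population groups each variant appears in can be illustrated with a minimal sketch. This is not the analysis code used for the figures; the variant names, the three-group subset, and the allele counts below are invented for illustration, and the helper functions `groups_observed` and `percent_by_group_count` are hypothetical. The sketch assumes that a variant counts as "found" in a group when at least one allele is observed there.

```python
from collections import Counter

# Hypothetical toy records: variant ID -> allele count per population group.
# These values are invented for illustration and are NOT gnomAD data.
toy_variants = {
    "var_A": {"African American": 3, "European (non-Finnish)": 0, "East Asian": 0},
    "var_B": {"African American": 0, "European (non-Finnish)": 12, "East Asian": 0},
    "var_C": {"African American": 1, "European (non-Finnish)": 7, "East Asian": 0},
    "var_D": {"African American": 0, "European (non-Finnish)": 0, "East Asian": 2},
}


def groups_observed(allele_counts):
    """Return the number of population groups with at least one observed allele."""
    return sum(1 for count in allele_counts.values() if count > 0)


def percent_by_group_count(variants):
    """Map 'number of groups' to the percent of variants seen in that many groups."""
    tally = Counter(groups_observed(counts) for counts in variants.values())
    total = len(variants)
    return {n: 100.0 * k / total for n, k in sorted(tally.items())}


summary = percent_by_group_count(toy_variants)
print(summary)  # {1: 75.0, 2: 25.0}
```

In this toy set, three of the four variants are seen in a single group, which mirrors the kind of single-group concentration reported above without reproducing the reported percentages.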


High Level Overview of Issues Related to Equity in Hereditary Cancer Variant Commons
As part of a modified policy Delphi process to identify effective and feasible policy options to expand and improve cancer variant data resources, our research team conducted interviews with domain experts selected for their expertise in previously-identified priority areas. 104 These interviews were round one of 3 in the policy Delphi, and were “intended to elicit policy issues [the experts] perceived as impeding data sharing and the development and sustainability of a cancer gene variant commons” that would be used in subsequent Delphi rounds. The experts interviewed represented 5 groups: data contributors and end-users (eg, patients), data generators (eg, testing laboratories), data facilitators (eg, data curators), data resources (eg, data repositories), and professional data users (eg, genetic counsellors). This group of experts included 7 interviewees with knowledge and interest in challenges to achieving equity and inclusion in hereditary cancer data resources (methods described in Robinson et al 104 ). Here, we detail the main ideas that were shared in interviews about challenges and solutions related to inclusion and equity.
Contributors to Underrepresentation
Experts brought up 2 factors known to be major (and proximal) reasons that some population groups continue to be underrepresented: a lack of trust in research institutions that have harmed their communities, and a lack of access to health services that are necessary entry points for inclusion. As noted above, research has historically harmed many distinct population groups. The legacy of these harms, which are often ongoing, is that some communities may be wary of participating in research that is conducted by untrustworthy institutions and actors. Those who wish to engage in research often rightfully demand approaches that include community members as partners and not simply subjects. However, most of the data comprising hereditary cancer resources like ClinVar are derived from clinical testing. Factors that impact access to health services will affect what data are included. In line with the Hispanic/Latino community experiences described above, experts indicated that genetic testing might only be available to those who can afford to pay for it or have insurance that covers it. Furthermore, even when testing is affordable or covered through insurance, the health service providers that are available might not be aware of the importance of hereditary cancer testing referrals, or there could be a lack of access to culturally appropriate (or any) genetic counselling. One interviewee described access challenges for testing family members, even if the testing is free: “If a hereditary cancer syndrome was found in a family, then providing the information and access to testing for other family members, because it’s a powerless feeling to say, ‘Oh, well, you’re positive.’ And then you have three sisters in another country, who may have the same gene but they don’t have access… I know some of the genetic testing companies are offering free cascade testing, if the samples can be mailed, but then it’s really expensive to mail the samples. So, there’s lots of barriers still.”
Improved Representation Requires Institutional Change
Changing research and medical institutions is obviously a large task, or more accurately, many large tasks. But experts had ideas about how to shift participation in both research endeavors and clinical care to be more inclusive. As it turns out, ensuring representation in the end product (data resources) is highly dependent on ensuring representation in all activities that feed into that product. Experts pointed to the importance of supporting diverse professionals from underrepresented communities, as people respond positively to others who look like they do. The experts mentioned that research institutions should hire diverse staff and faculty, to ensure that there are diverse perspectives on deciding what research questions get asked and how those questions get answered. They pointed to challenges related to which institutions get research dollars, and to a perception that the prestigious institutions with the most funding often have a lower percentage of non-White tenured faculty. Institutions providing training can also ensure diverse individuals are being trained in relevant clinical professions. “[We need] diversity in research teams, leadership, and who gets to ask the questions, and diversity regarding the kinds of questions that get to be asked…Some of these high resource institutions have a low percentage of Black tenured faculty, but they get most of the money.”
Shifting Ideas About Who Institutions Should Engage and How to do It
Changing the dynamics of genetic research, however, requires more than just increased diversity among professionals. Experts discussed factors that permeate research institutions, including the limits to how community partners are empowered to participate. Much like Black, African, and African Diasporic community participation could be supported through better engagement of stakeholders throughout the process of developing data resources, experts discussed the key role of community members and organizations to work as research partners and help identify topics that are important to communities, carry out the research, and participate on advisory boards to keep the research and community interests front of mind. They discussed how too frequently, “engagement” was limited to a goal of recruiting specific research participants, which is insufficient. One interviewee said: “I find that involving community partners and community organizations since the beginning, it is key… [T]rying to get a sense of what are the needs, what are the priorities of the communities that you want to work with, because I feel sometimes researchers have a very top-down approach in which they already have their questions and their design in the study, and then they just go to communities. And communities feel that researchers are just extracting data from them, but not really partnering with them.”
Experts also pointed out the potential role of community members to participate as peer-reviewers and also help identify research practices that could harm communities (like improper consent). They discussed how this work needs to happen before recruitment and continue on an on-going basis throughout the project life-cycle.
Necessary institutional changes are not limited to how communities engage in research, but also how researchers conduct themselves and the institutional structures that shape that behavior. Experts pointed to a lack of institutional awareness of the skill of community-partnership building. They discussed how, too often, funds are provided to researchers from large academic centers whose scientific publication and grant records are impressive but they lack experience in community engagement. Meanwhile, researchers who have done the work to build relationships can get overlooked because community engagement does not directly make money for the institution. One interviewee explained: “Going back to the role of funders in terms of thinking about where the experts are and where the people are who’ve been doing [community-based] work for a long time, they may not be at the top 10 university, but the top 10 university is going to get that funding.”
Experts spoke of the importance of funding the engagement necessary to do equity-related work, but similarly cautioned that funds should be provided to researchers who have proven they have the necessary skills to conduct research with community partners. They discussed how academic reward structures do not work to break down power imbalances and inequity between communities and researchers. Experts warned that often these reward structures and related reporting requirements are easier for well-resourced centers to adhere to. Therefore, those funding community work should be cautious that strict reporting structures do not exacerbate inequities by producing new hurdles that large institutions can easily overcome but would be insurmountable for smaller institutions. In some cases, as discussed regarding Indigenous inclusion above, the current dominant institutional structures do not support the needs of a population group, and funding would best be provided directly to organizations and scholars to conduct the work within their institutions or to build capacity in new institutions built for purpose.
Accountability
Experts discussed the roles of journals and funding agencies to help with accountability. They suggested journals could require descriptions of why populations were chosen, how the research advanced equity, and how the researchers engaged community partners. One interviewee described a recent experience with publishing: “One experience I’ve had recently is, sometimes it’s hard to find a journal that will publish more of the process-oriented work that goes into creating these strong relationships… There’s ways to do it right, and not do it right. And so, it would be helpful to have venues that publish [the process].”
Experts pointed to creating community engagement guidelines for researchers to adhere to in a similar manner as other ethical requirements. They suggested that funders have a responsibility to help ensure they are supporting research that helps build a representative database, and that funders could implement grant review criteria that ensure adequate time and money are invested in diversity-boosting projects. They suggested milestone-based reporting to ensure targets were being met, and to actively manage what data were being produced and deposited rather than passively accepting whatever data are produced. Some, however, suggested that simply asking researchers to report on their equity was too weak of a requirement to produce meaningful results, suggesting that there could be consequences for failing to adhere to certain standards. Many of these factors that impact equity relate to the trustworthiness of the institutions responsible for funding or conducting data collection, and the strength of all the relationships that are entwined in those systems. Research participants need transparency about what will happen with data they contribute, and how the data might benefit their families and communities. There need to be policies in place to ensure that work is carried out consistently and predictably. Power imbalances, and lack of adequate funding to include participants in data governance structures, mean that community members may be reliant on the whims of individual researchers and vulnerable to further exploitation.
The Relationship Between Health Care Access and Research Participation
Finally, facilitating access to health services can be a way to increase participation in genomic research. While health care delivery and access are well known to impact which individuals are included in clinical datasets, the ability to access health care services also impacts decisions about participation in research. Individuals are often motivated to participate in research because of a desire to help their families and communities. If they belong to groups that have difficulties accessing care, there may be less motivation to contribute to research to build data resources that support clinical care. Focusing on research questions that have actionable outcomes for underrepresented communities is crucial to improving participation, highlighting why it is so important to engage communities at the point where funding priorities and research questions are established.
Conclusion
As described in Robinson et al, 104 experts in a policy Delphi ranked funding studies in underrepresented populations, equipping lower-resourced institutions/communities to use data, and conducting needs assessments to ensure that funding priorities aligned with community priorities as the most feasible and effective policies to tackle equity challenges in hereditary cancer resources. 34 While these approaches may have impact, it is clear from our discussion of the 3 population groups that without access to health care, effective community engagement, and data sovereignty, many of these efforts may be insufficient. Furthermore, efforts that target “underrepresented groups” in pan-group approaches may fail to account for the diverse reasons that different groups remain underrepresented and excluded. Outside the United States, some jurisdictions have begun using “distinctions-based” approaches to ensure unique needs of excluded populations are considered in broad efforts to reduce inequities. Despite these challenges, many of the policy options regarding equity and inclusion in the Delphi study were rated as effective and feasible, 104 indicating that there are many opportunities to work towards improving inclusion.
Progressing towards inclusive cancer gene variant resources depends on acknowledging and addressing racism in institutions, and pursuing actively anti-racist actions. Institutional approaches need to enable participation of groups that have been underrepresented while holding those in power accountable for actions that undermine inclusion. Under-representation can only be solved through concerted efforts to ensure the institutions that are responsible for research and clinical care reflect the priorities, interests, and demographics of hitherto excluded groups. Institutions must constantly evaluate their outcomes and ensure that some population groups are not being left out systematically. Efforts to simply “recruit” racialized or ancestrally diverse groups ignore structural racism within our institutions, and fail to provide all peoples the same access and opportunities that have enabled White populations to participate.
Footnotes
Ethical Approval
The modified policy Delphi process described within was reviewed by the Arizona State University Institutional Review Board and was approved (EXEMPT) on January 18, 2019 (Study 00009507).
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper is one of 5 thematically related papers that have emerged from a National Cancer Institute grant to Arizona State University and Baylor College of Medicine: “The Sulston Project: Making the Knowledge Commons for Interpreting Cancer Genomic Variants More Effective” (R01 CA237118). Gerido received support from the Case Comprehensive Cancer Center (P30 CA043703) and the University of Michigan ELSI Research Training Program (T32 HG010030) for the preparation and publication of this manuscript.
Disclaimer
This information or content and conclusions are those of the authors and should not be construed as the official position or policy of, nor should any endorsements be inferred to come from, the National Institutes of Health, the National Cancer Institute, or the U.S. government.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
