Abstract
African American women are 43% more likely to die from breast cancer than white women and have an increased risk of tumor recurrence despite lower incidence. We investigate variations in microsatellite genomic regions (a type of repetitive DNA) and possible links to the breast cancer mortality gap. We screen 33 854 microsatellites in germline DNA of African American women with and without breast cancer: 4 are statistically significant. These are located in the 3′ UTR (untranslated region) of gene ZDHHC3, an intron of transcribed pseudogene INTS4L1, an intron of ribosomal gene RNA5-8S5, and an intergenic region of chromosome 16. The marker in ZDHHC3 is interesting for 3 reasons: (a) the ZDHHC3 gene is located in region 3p21, which has already been linked to early invasive breast cancer, (b) the Kaplan-Meier estimator demonstrates that ZDHHC3 alterations are associated with poor breast cancer survival in all racial/ethnic groups combined, and (c) data from cBioPortal suggest that ZDHHC3 messenger RNA expression is significantly lower in African Americans compared with whites. These independent lines of evidence make ZDHHC3 a candidate for further investigation.
Introduction
For decades, African American women have had a higher breast cancer mortality rate than any other racial/ethnic group.1,2 This health disparity likely stems from a complex combination of mammography rate and frequency, segregation, socioeconomic status, and differences in biology.3,4 Adding urgency, the racial disparity in breast cancer mortality is widening. 5 In 2014, African American women were 43% more likely to die from breast cancer than white women; in 2009, the difference was 39.7%. 5 In addition, young African Americans tend to have more aggressive forms of breast cancer. 4
Some progress has been made to understand racial/ethnic differences in breast cancer biology. A landmark 2006 study demonstrated that basal-like and triple-negative breast cancer is more common among young African American women compared with white women. 4 Triple-negative breast cancer emerges in the absence of the estrogen receptor (ER), progesterone receptor (PgR), and human epidermal growth factor receptor 2 (HER2). These receptors are well-established biomarkers that help guide treatment and predict survival. 6 Indeed, studies have shown that survival is lower for women with triple-negative tumors compared with other breast cancers. 4 Thus, racial/ethnic differences in breast cancer mortality likely have a biological component. Many more established and emerging biomarkers have been linked to breast cancer survival. Mutations in the BRCA1 and BRCA2 genes are associated with hereditary breast cancer and survival. 7 In fact, some studies suggest that the position of mutation within BRCA1 correlates with survival. 8 Markers such as Ki67, cyclin D1, and cyclin E show promise but have not yet found routine clinical use. 6
The growing number of genes implicated in breast cancer has given rise to multigene assays designed to predict various aspects of the disease. For example, the well-established Oncotype DX assay uses the expression of 21 genes to predict breast cancer recurrence in node-negative disease. 9 The MammaPrint assay uses a larger signature of 70 genes to identify patients with good prognosis and poor prognosis. 10 Despite this progress, it is estimated that the genes with established links to breast cancer only account for about 30% of the familial risk. 11 What other gene mutations affect breast cancer survival and are any of these mutations common in African Americans?
To contribute to what is known about breast cancer biology, we investigate microsatellites—a type of repetitive DNA. Microsatellites are understudied compared with single-nucleotide polymorphisms and have the capacity to affect gene expression; moreover, they are already linked to breast cancer 12 and self-identified racial/ethnic groups. 13 Microsatellites consist of a 1- to 6-base-pair unit repeated in tandem to form an array: more than 1 million exist in the human genome often embedded in gene introns, gene exons, and regulatory regions. 14 Interestingly, the length of microsatellite arrays frequently changes due to strand slip replication and heterozygote instability. 14 These changes can influence gene expression by inducing Z-DNA and H-DNA folding, 15 altering nucleosome positioning, 16 and changing the spacing of DNA-binding sites. 14 For these reasons, microsatellites have been called the “tuning knobs” of gene expression. 17 In this study, we use a generalized Fisher exact test to identify microsatellite markers in breast cancer: this improves on our older approach.12,40 We use this updated approach to compare microsatellite genotypes from germline DNA belonging to 2 groups of samples: African Americans with breast cancer (cancer group) and African Americans without breast cancer (healthy group). We found that 4 microsatellites have a significantly different distribution of genotypes in the 2 groups; 1 of these is located in a gene region.
Methods
Microsatellite list generation
A list of microsatellites in version 38 of the human reference genome was generated with a custom Perl script “searchTandemRepeats.pl” using default parameters. This script has been used in previous microsatellite studies and is freely available online at http://genotan.sourceforge.net/#_Toc324410847 (see supplementary material for additional details).
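As an illustration of what a microsatellite list contains, the following minimal Python sketch scans a sequence for perfect tandem repeats of 1- to 6-base-pair units. It is not the algorithm used by searchTandemRepeats.pl; the function name, the minimum copy number, and the restriction to perfect repeats are assumptions made only for this example.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # ([ACGT]{n}) captures one unit; \1{m,} demands at least m further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter unit (e.g. "GG").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)

# Toy sequence: an 11-copy guanine run followed by a 4-copy CA repeat.
toy = "AT" + "G" * 11 + "CAT" + "CA" * 4 + "T"
print(find_microsatellites(toy))
```

Start positions in this sketch are 0-based offsets into the input string, not genomic coordinates.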
Nomenclature
We adopt a nomenclature designed to emphasize the length of the 2 microsatellite alleles that make up a genotype. For example, the genotype “14|15” indicates a heterozygote genotype with 14- and 15-base-pair alleles, respectively. The genotype “15|15” indicates a homozygote genotype with 2 copies of the 15-base-pair alleles.
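The nomenclature can be made concrete with a short Python sketch. The helper names below are hypothetical and only show how a genotype string such as "14|15" maps to allele lengths and zygosity.

```python
def parse_genotype(genotype):
    """Split a genotype such as '14|15' into a sorted tuple of allele lengths."""
    first, second = genotype.split("|")
    return tuple(sorted((int(first), int(second))))

def zygosity(genotype):
    """Classify a genotype as 'homozygote' or 'heterozygote'."""
    a, b = parse_genotype(genotype)
    return "homozygote" if a == b else "heterozygote"

def carries_allele(genotype, allele_length):
    """True if at least one of the two alleles has the given length."""
    return allele_length in parse_genotype(genotype)

print(parse_genotype("14|15"), zygosity("14|15"))
print(parse_genotype("15|15"), zygosity("15|15"))
```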
Microsatellite genotyping
We used the program RepeatSeq 18 to determine the genotype of microsatellites in next-generation sequencing reads. RepeatSeq has been used in previous studies of microsatellites and is freely available: https://github.com/adaptivegenome/repeatseq (see supplementary material for additional details).
Statistics
For each microsatellite, we check whether the distribution of genotypes differs in the germline DNA from 2 groups of African Americans: 37 with breast cancer and 40 healthy controls. In each case, statistical differences are quantified using a generalized Fisher exact test. This test is appropriate because it is specifically designed for small sample sizes with sparsely populated tables.19,20 For each microsatellite, a contingency table populated with genotype counts is constructed for the 2 groups; then, the P value for each contingency table is calculated using the fisher.test function in R. The Bonferroni multiple testing correction (n = 33 854) is applied to control the family-wise error rate. Relative risk scores are calculated using MedCalc online statistical software (www.medcalc.org) after classifying genotypes as cancer modal or nonmodal (see supplementary material for additional details).
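The logic of the test can be sketched in Python for a table with 2 rows (cancer, healthy) and any number of genotype columns. This is an illustrative exact enumeration in the spirit of the generalized Fisher (Freeman-Halton) test, not a reproduction of R's fisher.test; the function names and the toy counts are assumptions, and the small relative tolerance used when comparing probabilities is a choice made for this example.

```python
from math import comb

def first_row_fillings(col_totals, remaining):
    """Yield every first row that is compatible with the fixed table margins."""
    if not col_totals:
        if remaining == 0:
            yield ()
        return
    head, tail = col_totals[0], col_totals[1:]
    lowest = max(0, remaining - sum(tail))
    for count in range(lowest, min(head, remaining) + 1):
        for rest in first_row_fillings(tail, remaining - count):
            yield (count,) + rest

def fisher_exact_2xk(cancer_counts, healthy_counts):
    """Two-sided exact P value for a 2 x k table of genotype counts."""
    col_totals = [c + h for c, h in zip(cancer_counts, healthy_counts)]
    n_cancer = sum(cancer_counts)
    denominator = comb(sum(col_totals), n_cancer)

    def probability(first_row):
        # Multivariate hypergeometric probability of a table with fixed margins.
        ways = 1
        for count, total in zip(first_row, col_totals):
            ways *= comb(total, count)
        return ways / denominator

    p_observed = probability(cancer_counts)
    p_value = 0.0
    for row in first_row_fillings(col_totals, n_cancer):
        p = probability(row)
        if p <= p_observed * (1 + 1e-7):  # tables at least as extreme as observed
            p_value += p
    return min(1.0, p_value)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical genotype counts for one microsatellite (not study data).
p = fisher_exact_2xk([2, 1, 0], [0, 0, 3])
print(p, bonferroni(p, 33854))
```

Exhaustive enumeration is practical here because the groups are small (37 and 40 samples) and each microsatellite has only a few observed genotypes.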
Microsatellite genotyping samples
Breast cancer samples were downloaded from The Cancer Genome Atlas (TCGA); healthy controls were downloaded from the 1000 Genomes Project (KGP). The 40 healthy controls were identified using the 1000 Genomes Project phase 3 exome alignment index file “20130502.phase3.exome.alignment.index”: all female African American samples were included in the analysis. Cancer samples were downloaded from TCGA: all 37 germline African American female samples were included in the analysis.
Genotyping was also performed on 2 sets of white samples with respect to the microsatellite in ZDHHC3 (see section “Results”); for this analysis, we used 136 breast cancer samples and 49 healthy control samples. Cancer samples were downloaded from TCGA: all 136 germline female samples were included in the analysis. The 49 healthy controls correspond to all the female 1000 Genomes Project samples with European ancestry.
Samples used in Kaplan-Meier analysis
To investigate ZDHHC3 links to survival, we considered all samples available in TCGA database: 757 white, 183 black or African American, 1 American Indian or Alaska Native, 61 Asian, and 95 not reported. We use the 151 deceased samples for Kaplan-Meier analysis. ZDHHC3 alterations—which include amplifications, deletions, and missense mutations—are found in 14 of these samples; the remaining 137 samples have a normal ZDHHC3 gene.
Results
Four microsatellite markers found in African American breast cancer samples
We use a generalized Fisher exact test to screen 33 854 microsatellites in germline sequencing data from African American women with and without breast cancer (see section “Methods” for details). We use the Bonferroni multiple testing correction (n = 33 854) to mitigate false discoveries and find that 4 microsatellites have a significantly different distribution of genotypes in the 2 groups. These microsatellites are located on chromosomes 3, 7, 16, and the unplaced contig GL000220v1. None of these microsatellites has been linked to breast cancer previously, and the risk ratio suggests a significantly increased risk of cancer in women with the modal genotype (see Table 1 for a summary of the potential markers found in this study).
Table 1. Summary of the microsatellite markers found in this study.
Abbreviations: CI, confidence interval; RR, relative risk; UTR, untranslated region.
The q values listed here are P values adjusted with the Bonferroni multiple testing correction (n = 33 854). RR indicates risk of cancer of subjects with modal genotype when compared with nonmodal genotypes.
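For readers unfamiliar with the relative risk calculation, the following Python sketch shows the standard formula with a 95% confidence interval computed on the log scale. It is not MedCalc's implementation; the function name, the 2 x 2 layout, and the 0.5 continuity correction applied when a cell is zero are assumptions for illustration, and the counts in the usage line are hypothetical.

```python
from math import exp, log, sqrt

def relative_risk(a, b, c, d, z=1.96):
    """Relative risk of cancer for modal vs nonmodal genotypes.

    a, b: cancer and healthy counts with the modal genotype
    c, d: cancer and healthy counts with nonmodal genotypes
    Returns (RR, lower 95% limit, upper 95% limit).
    """
    if 0 in (a, b, c, d):
        # Assumed continuity correction so that the ratio and its log are defined.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)

# Hypothetical counts (not study data).
print(relative_risk(20, 10, 10, 20))
```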
Interestingly, the statistically significant microsatellite on chromosome 3 is the only one located in a gene region. It is a monomeric guanine repeat located at base pair 44 918 234 embedded within the 3′ UTR (untranslated region) of gene ZDHHC3. This gene is a member of the zinc fingers DHHC-type gene family having protein-cysteine S-palmitoyltransferase activity. We found 9-, 10-, and 11-base-pair alleles in the African American samples. The 11-base-pair allele is completely absent in the healthy controls; furthermore, all 37 of the cancer samples are homozygous or heterozygous for the 11-base-pair allele (25 homozygous and 12 heterozygous). However, the 11-base-pair allele does not appear to be unique to African Americans. Genotyping of 136 white samples with respect to the same ZDHHC3 microsatellite (see section “Methods”) revealed that 131 are homozygous or heterozygous for the 11-base-pair allele (103 homozygous and 28 heterozygous); 3 samples were homozygous for the 9-base-pair allele. Genotyping of 49 healthy white women (see section “Methods”) was largely unsuccessful due to low sequencing coverage. Thus, statistical significance for this microsatellite could only be demonstrated among African Americans (see Table 1) but the 11-base-pair allele appears to be common in both African American and white germline cancer samples.
The ZDHHC3 gene is located at cytogenetic band 3p21.31, a region that has been tied to early invasive breast cancer. Several older studies21,22 first indicated the presence of a tumor suppressor locus within human chromosome 3p21-p22; follow-up studies23,24 narrowed down the region to 3p21.3 and established links to clinically early-stage sporadic breast tumors. 25 The Semaphorin 3F gene from this region was shown to have a role in tumorigenicity in mice; however, its expression had no effect in the lung cancer line GLC45. 26 Thus, a definitive conclusion has not been reached regarding the role of 3p21 in human cancer, particularly breast cancer. Given the interest in this region, it is surprising that relatively little attention has been given to ZDHHC3.
We recognize that the genetic background of the samples is a potential pitfall of our microsatellite analysis; the risk is that the ZDHHC3-embedded microsatellite could be in linkage disequilibrium with other well-established markers, making it appear significant when it is in fact not the causal site. However, this is unlikely because ZDHHC3 is on chromosome 3 and most established markers are on different chromosomes altogether: BRCA1 on chromosome 17, BRCA2 on chromosome 13, HER2 on chromosome 17, ER on chromosome 6, PgR on chromosome 11, MUC1 on chromosome 1, and P53 on chromosome 17. To be sure, we went a step further and cross-referenced the location of the ZDHHC3 gene with the 70 genes in the well-established MammaPrint assay 27; among these, the nearest gene to ZDHHC3 is RAB6B, located nearly 90 Mb away. By comparison, linkage disequilibrium is expected to be relevant only in the range of 10 to 30 kb for European populations and perhaps less for African populations. 28 We reiterate that only germline DNA samples (for which genotypes are unchanged throughout the lifetime of the individual) were used for this analysis.
ZDHHC3 alterations are linked to mortality in African Americans and whites
Does ZDHHC3 affect breast cancer survival? To answer this question, we performed Kaplan-Meier analysis using the 151 deceased samples available in the TCGA database. Alterations in the ZDHHC3 gene (which include amplifications, deletions, and missense mutations) are found in 14 of these samples; the remaining 137 samples have a normal ZDHHC3 gene. The Kaplan-Meier estimator shows that patients without ZDHHC3 alterations live significantly (log-rank P < .03) longer than patients with ZDHHC3 alterations (see Figure 1). The mean overall survival for patients without ZDHHC3 alteration (150.2 ± 7.3 months) is more than 2-fold higher than the mean survival for patients with alteration (74.1 ± 5.3 months). We propose that ZDHHC3 alterations may be an important biomarker for breast cancer survival.

Figure 1. Kaplan-Meier curve comparing breast cancer survival in the presence and absence of ZDHHC3 alteration. All racial/ethnic groups are combined.
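The Kaplan-Meier (product-limit) estimator behind such a curve can be sketched in a few lines of Python. This is a generic illustration rather than the software used for the analysis; the function name and the toy follow-up times are assumptions. An event flag of 1 marks a death and 0 marks a censored observation; when every sample is deceased, as in the cohort described above, the estimate reduces to the fraction of patients still alive at each time.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each time where at least one death occurs."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Gather every sample whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times in months (not study data).
for time, s in kaplan_meier([12, 30, 30, 75], [1, 0, 1, 1]):
    print(time, round(s, 3))
```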
Most of the factors that potentially influence this analysis were balanced in the altered and unaltered ZDHHC3 cohorts (see Table 2). In particular, patients in the 2 cohorts did not significantly differ in age at pathologic diagnosis (F > 0.33), menopause status, therapeutic history, or histologic type, and most patients (41% in the ZDHHC3 altered group; 43% in the unaltered ZDHHC3 group) were diagnosed with AJCC neoplasm disease stage II group a/b breast cancer. Mutations were not found in KRAS, BRAF, EGFR, or ALK with the exception of an in-frame KRAS deletion (0.8%) and a missense ERBB2 mutation (2%) in the unaltered ZDHHC3 group. The ER, PgR, and HER2 status were also recorded for most of the patients and did not considerably differ (see Table 2). BRCA status and lifestyle factors were not available in the data retrieved from TCGA cohorts; this remains an acknowledged weakness of this analysis.
Table 2. Comparison of cohorts used for Kaplan-Meier analysis.
ZDHHC3 messenger RNA expression is lower in African American than white patients with cancer
We compared messenger RNA (mRNA) expression in breast tumor tissue from 182 African Americans and 751 whites with breast cancer. Using TCGA microarray breast cancer data from cBioPortal, 29 we found that African Americans with breast cancer have significantly lower mRNA expression of ZDHHC3 than whites with breast cancer (P < .0001; t-ratio: 3.83); the groups were compared using a t test, as in other studies.30,31 The cause and effect of this difference remain elusive. The microsatellite harbored by ZDHHC3 does not appear to be the causal site because we see no correlation between the various microsatellite alleles and mRNA expression levels. In particular, we are not able to link the 11-base-pair allele to aberrant mRNA expression levels. However, microsatellite mutations are complex and we cannot rule out the possibility that the 11-base-pair allele may have causal effects on splicing, translation, and DNA-protein binding, to name a few.
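A two-sample comparison of this kind can be sketched in Python as follows. The text does not state which variant of the t test was used, so the unequal-variance (Welch) form shown here is an assumption, as are the function name and the toy expression values. The sketch returns the t statistic and the approximate degrees of freedom; a P value would then be read from the t distribution.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of the first mean
    vy = variance(y) / len(y)  # squared standard error of the second mean
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical expression values (not study data).
print(welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```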
Discussion
Our results suggest that ZDHHC3 is a potential marker for breast cancer. We provide 4 lines of evidence. First, an 11-base-pair monomeric guanine repeat embedded within the 3′ UTR is common in germline samples of African Americans with breast cancer and absent in African American controls. The 11-base-pair allele also appears common in germline samples of whites with breast cancer. Second, Kaplan-Meier analysis shows that patients without ZDHHC3 alterations live significantly longer than patients with ZDHHC3 alterations. Third, African Americans with breast cancer have significantly lower mRNA expression of ZDHHC3 than whites with breast cancer. Fourth, previous studies have linked loss of heterozygosity at human chromosome 3p21-p22 to early invasive breast cancer. 25 So far, these lines of evidence are mutually independent; that is, we are not yet able to link the 11-base-pair microsatellite allele to aberrant mRNA expression levels, and we do not show that patients with breast cancer and lower ZDHHC3 expression have worse overall survival. These questions will be addressed in future studies.
African American women continue to have higher rates of breast cancer mortality than any other racial/ethnic group. We add to what is known about this problem by analyzing microsatellite variations in African Americans with and without breast cancer and found a promising new microsatellite marker located on chromosome 3 at base pair 44 918 234. The microsatellite is located in the 3′ UTR of the ZDHHC3 gene. Homozygotes for the 11-base-pair allele are common in breast cancer samples of both African Americans and whites. We found additional microsatellite markers for breast cancer located on chromosome 7 and the unplaced contig GL000220v1. These markers are located in an intron of the transcribed pseudogene INTS4L1 and an intron of the ribosomal gene RNA5-8S5, respectively. A fourth marker is located in an intergenic region of chromosome 16. Survival data and mRNA expression could only be obtained for the microsatellite marker in ZDHHC3 (see section “Results”). These results build on a previous study that identified 55 microsatellites capable of distinguishing breast cancer and healthy individuals with a sensitivity of 88.4%.12
We show that the presence of any alteration in ZDHHC3 is linked to breast cancer mortality in both African Americans and whites. Although ZDHHC3 mRNA expression is significantly lower in African Americans compared with whites, we were unable to reach a conclusion about its role in the breast cancer mortality gap. Nevertheless, ZDHHC3 has not previously been identified as such a biomarker and adds to those known to be associated with breast cancer mortality. It is difficult to speculate about the role of the 3 additional microsatellite markers in African Americans. Two of these are found in introns and could conceivably affect splicing or DNA-protein binding. Given how little is known about the function of microsatellites, the mechanism for their effects may simply be unknown.
ZDHHC3 expresses an enzyme that contains a DHHC domain. The human DHHC proteins, encoded by 23 genes, form a family of palmitoyltransferases. Palmitoylation, or more specifically S-palmitoylation, affects protein stability, function, and trafficking. 32 Interestingly, the process of palmitoylation affects the function of genes dysregulated in breast cancer, such as epidermal growth factor receptor and ERs. 33 In fact, DHHC genes have been associated with numerous cancers 34 including breast cancer. Other evidence has not yet implicated ZDHHC3 in breast cancer, but our results underscore the importance of this gene. We found that alterations in ZDHHC3 have a significant effect on patient survival. We also found that African Americans have a significantly lower expression of ZDHHC3; as mentioned above, African American patients have more aggressive forms of breast cancer. The loss of ZDHHC3 has already been shown to be associated with squamous cell cervical carcinoma through the downstream effects of the ZDHHC3 substrate DR4 (TRAIL-R1). 35 Palmitoylation of DR4 localizes it to the plasma membrane where it can be bound by TRAIL (tumor necrosis factor–related apoptosis-inducing ligand) to induce apoptosis. In this context, ZDHHC3 is proposed to function as a tumor suppressor in squamous cell cervical carcinoma. 36 However, ZDHHC3 has also been suggested to function as a potential oncogene in cancer. Its oncogenic potential is thought to derive from regulating laminin-binding α6β4, which is involved in cell motility and invasion through expression of Src. 37 ZDHHC3 has also been suggested to be involved in the effects of curcumin treatment on invasive breast cancer cells. Curcumin, which has been investigated as a treatment in cancer and metastasis, 38 has been shown to block acylation of DHHC3, which is responsible for integrin β4 palmitoylation, and subsequently suppresses breast cancer cell signaling. 39 Collectively, these findings make ZDHHC3 a compelling target that should be further investigated.
Overall, ZDHHC3 appears to be intriguing for 3 reasons. First, the presence of an 11-base-pair microsatellite allele embedded in its 3′ UTR is statistically significant in African American breast cancer samples. This allele may also be important in whites; however, statistical significance could not be demonstrated due to sample size. Second, ZDHHC3 alteration is linked to breast cancer mortality in all racial/ethnic groups. Third, ZDHHC3 mRNA expression levels are significantly lower in African American breast cancer samples compared with white breast cancer samples. More work needs to be done to investigate the hypothesized role of ZDHHC3 in the breast cancer mortality gap.
Footnotes
Acknowledgements
The authors thank Liang Shan for assisting with statistical analysis.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by a grant from the Bradley Engineering Foundation to the Edward Via College of Osteopathic Medicine.
Declaration of conflicting interests:
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: HRG is the founder and co-owner of Orbit Genomics which may be interested in licensing these findings. Orbit Genomics was not involved with any aspect or in funding of this research. All authors have no competing financial interest.
Author Contributions
NK and HRG contributed to the conceptualization of this project, experimental design, data analysis, and the writing of this manuscript. NK, RA, and RTV were responsible for software writing and execution, data analysis, and manuscript preparation. NK, RTV, RA, and HRG conceived and designed the experiments; analyzed the data; wrote the first draft of the manuscript; contributed to the writing of the manuscript; agree with manuscript results and conclusions; jointly developed the structure and arguments for the paper; made critical revisions and approved final version, and reviewed and approved the final manuscript. All authors read and approved the final manuscript.
References
Supplementary Material
