Abstract
Background:
The aging Mexican American (MA) population is the fastest growing ethnic minority group in the US. MAs have a unique metabolic-related risk for Alzheimer’s disease (AD) and mild cognitive impairment (MCI) compared to non-Hispanic whites (NHW). This risk for cognitive impairment (CI) is multifactorial, involving genetic, environmental, and lifestyle factors. Changes in environment and lifestyle can alter patterns of DNA methylation (a form of epigenetic regulation) and may even reverse aberrant methylation.
Objective:
We sought to identify ethnicity-specific DNA methylation profiles that may be associated with CI in MAs and NHWs.
Methods:
DNA obtained from peripheral blood of 551 participants from the Texas Alzheimer’s Research and Care Consortium was typed on the Illumina Infinium® MethylationEPIC chip array, which assesses over 850K CpG genomic sites. Within each ethnic group (N = 299 MAs, N = 252 NHWs), participants were stratified by cognitive status (control versus CI). Beta values, representing relative degree of methylation, were normalized using the Beta MIxture Quantile dilation method and assessed for differential methylation using the Chip Analysis Methylation Pipeline (ChAMP), limma and cate packages in R.
Results:
Two differentially methylated sites were significant: cg13135255 (MAs) and cg27002303 (NHWs) based on an FDR p < 0.05. Three suggestive sites obtained were cg01887506 (MAs) and cg10607142 and cg13529380 (NHWs). Most methylation sites were hypermethylated in CI compared to controls, except cg13529380 which was hypomethylated.
Conclusion:
The strongest association with CI was at cg13135255 (FDR-adjusted p = 0.029 in MAs), within the CREBBP gene. Moving forward, identifying additional ethnicity-specific methylation sites may be useful to discern CI risk in MAs.
INTRODUCTION
Alzheimer’s disease (AD) is currently the 7th leading cause of mortality in the United States (US), and its prevalence is expected to triple over the coming decades as the Baby Boom generation continues to age [1]. Demographic shifts in the US population, however, will affect health outcomes unevenly across racial and ethnic groups [1]. For example, Mexican Americans (MA), who are the largest and fastest growing aging minority group in the US, have a unique predisposition for cognitive impairment (e.g., AD and mild cognitive impairment (MCI)) [2–4]. Existing protein biomarker panels for predicting AD risk in MAs include various metabolic factors such as FABP and GLP-1 [5], suggesting that MAs may suffer from a distinct metabolic-related form of AD. In contrast, non-Hispanic whites (NHW) have predominantly inflammation-based AD-associated protein biomarkers [5]. Further, MAs are often diagnosed with more severe forms of AD at a younger age relative to NHWs [3, 6]. Despite this, MAs are still likely to live 3 years longer (on average) than their NHW counterparts, a phenomenon referred to as the “Hispanic Paradox” [7, 8]. The frequency and effect of the Apolipoprotein E (APOE) ɛ4 allele [3, 4], known to confer 3- to 12-fold risk for AD in NHWs [9], are much lower in MAs [3]. Although the reasons remain poorly understood, exploring the intersection of genetic and environmental/lifestyle factors via epigenetic studies may help elucidate the root cause of these differences in AD etiology and presentation.
The multifactorial risk for cognitive impairment (CI) involves both genetics and epigenetics, the latter of which is influenced by environmental and lifestyle factors [4, 11]. The genetic risk for AD has been estimated to be 58% (based on twin studies) [12]. While over two dozen single nucleotide polymorphisms (SNPs) have been associated with AD risk [13, 14], including the ɛ4 allele of the APOE gene [15], this heritability fails to explain the heterogeneity of AD onset and presentation across individuals and racial/ethnic groups, a gap referred to as ‘missing heritability’ [13, 16]. Twin studies have shown that monozygotic twins subjected to different environments experience different disease outcomes despite sharing the same genetic risk for disease [12]. This suggests that genetic, environmental, and lifestyle factors all play important roles in conferring risk for AD. Examples of lifestyle factors that increase risk for CI are smoking and inactivity [17, 18]. Some factors, however, can decrease CI risk, such as education. Education has been reported to protect against dementia, as having no education doubles the risk of developing dementia [19]. As such, investigations of the interplay between genetics and environment are pivotal for understanding AD etiology.
Epigenetic factors regulate gene function through chemical modification of DNA (most often via methylation) that does not alter the sequence [20]. DNA methylation occurs through the addition of a methyl group to the cytosine base of DNA [21]. This process is catalyzed by DNA methyltransferases and typically occurs at CpG dinucleotides, where a cytosine is immediately followed by a guanine [21]. The degree of CpG methylation at sites involved in gene regulation is modifiable [20] and potentially reversible [22], depending on environmental factors such as diet, exercise, and lifestyle [20]. Environmental factors can also impact the rate of biological aging [23]. The human aging process can be quantified epigenetically through a collection of CpG sites that show varying degrees of methylation, such as hyper- (increased) or hypo- (decreased) methylation [23, 24]. Certain patterns of methylation, however, have been associated with age-related diseases such as AD [25, 26]. The rate of epigenetic aging can also differ across ethnic groups depending on the burden of co-morbidities and lifestyle factors [27].
In this study we sought to identify differentially methylated positions of the genome that may be associated with CI (diagnosed as AD/MCI) in MAs and NHWs. We hypothesized there would be differentially methylated genomic sites or regions associated with CI in both groups (MAs and NHWs) but that methylation profiles would differ across ethnic groups. Differential DNA methylation may explain differences in racial and ethnic susceptibility for CI since environmental factors influencing methylation levels can vary from one ethnic group to another. Understanding how DNA methylation might impact age associated diseases/phenotypes such as AD and MCI among varying ethnic groups could inform future race- and ethnicity-specific risk assessments. Targeted risk assessments may also aid the development of more-efficacious interventions/therapeutics for deterring/delaying onset of AD in the future.
METHODS
Dataset and study design
Participants were selected from the Texas Alzheimer’s Research and Care Consortium (TARCC), a collaborative effort aimed at understanding the etiology, pathophysiology, treatment, and prevention of AD. Specifically, TARCC aims to bridge health disparities by incorporating minority populations in Texas such as the Hispanic community, in particular MAs, into research. Peripheral blood samples and accompanying demographic, clinical, and cognitive data from 600 individuals (300 NHWs and 300 MAs) were provided by TARCC. Participants were stratified into two groups based on cognitive status: those with CI (diagnosed as either AD or MCI) and normal controls. All participants were ≥50 years old and matched based on age and sex. Following removal of duplicates and participants with missing data, the final cohort (n = 551) consisted of 252 NHW and 299 MA participants (Table 1). This study was approved by the North Texas Regional Institutional Review Board # 1330309-1. Informed consent was obtained by TARCC from all participants.
Demographic table of the TARCC cohort
Standard two-sample t-tests assuming unequal variances were used to evaluate any significant difference between normal controls (NC) and participants with cognitive impairment (CI) for age (in years), education level (in years), Mini-Mental State Exam (MMSE), and Clinical Dementia Rating (CDR) sum.
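The unequal-variance (Welch) comparison described above can be sketched in a few lines. This is a minimal illustration rather than the code used in the study: the function name `welch_t` and the education values are hypothetical, and only the test statistic and the Welch-Satterthwaite degrees of freedom are computed (a p-value would additionally require the t distribution, e.g., from a statistics library).

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; no equal-variance assumption.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df

# Hypothetical years of education, for illustration only (not TARCC data).
controls = [16, 14, 12, 18, 16, 12]
impaired = [12, 10, 8, 12, 14, 9]
t_stat, df = welch_t(controls, impaired)
```

Because the variances are not pooled, the degrees of freedom are generally non-integer and lie between the smaller group size minus one and the total sample size minus two.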
Cognitive function diagnosis
Participants were classified as cognitively normal or diagnosed with AD/MCI based on a consensus by a review committee consisting of healthcare professionals such as a neuropsychologist and a clinician [28]. Each participant underwent a battery of neurocognitive tests (SF1), such as the Clinical Dementia Rating (CDR) and Mini-Mental State Exam (MMSE), to determine cognitive status. An English or Spanish version of the assessments was provided to participants depending on their native language in order to obtain a fair assessment of cognitive status. Individuals with other neurological diseases or conditions, such as Parkinson’s disease or severe depression-related cognitive dysfunction, were excluded.
DNA extraction, genotyping, and methylation analysis
DNA from the buffy coat of peripheral blood was extracted using the MagBind® Blood and Tissue DNA HDQ Kit (Omega Bio-tek, Norcross, GA) and a Microlab STAR liquid handling system. DNA extracts were quantified using the Qubit® dsDNA BR Assay kit (Thermo Fisher, Waltham, MA). All blood samples were handled according to standard regulations and protocols (IBC/p/NP-2018-2). DNA samples (200 ng at a concentration of ≥10 ng/μL) were genotyped using the Infinium® HTS Global Screening Array v.2 (Illumina, San Diego, CA). Extracts below 10 ng/μL were concentrated using Microcon® DNA Fast Flow Filters (Sigma-Aldrich, Milwaukee, WI). The same ≥10 ng/μL concentration criterion was applied to select suitable samples for methylation typing. DNA samples were bisulfite converted using the EZ DNA Methylation™ kit (Zymo Research, Irvine, CA) and subsequently processed on the Infinium® MethylationEPIC BeadChip array (Illumina), which assesses the methylation status of over 850,000 CpG sites. Although spike-ins were not included, control probes within the EPIC array were used to confirm efficient bisulfite conversion. According to the Illumina Infinium HD Assay Methylation Protocol Guide (Document # 15019519 v01), both Infinium I and Infinium II probes have bisulfite conversion controls whose signal intensity depends on whether the bisulfite conversion was successful. Technical replicates were run with each batch of EPIC BeadChips, and all had an R² value of ≥0.98. Signal intensity data (IDAT) files containing raw data obtained from the array [29] were uploaded and assessed in RStudio. Beta values derived from these IDAT files, representing the level of methylation detected by array probes at the respective CpG sites (where 0 is an unmethylated site and 1 is a fully methylated site), were analyzed [30].
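To make the 0-to-1 beta scale concrete, the sketch below computes a beta value from methylated and unmethylated probe intensities using the commonly cited Illumina definition. The function name, the probe intensities, and the stabilizing offset of 100 are assumptions for illustration; the study itself obtained beta values through ChAMP rather than with code like this.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta = M / (M + U + offset); the result lies between 0 and 1.

    `offset` stabilizes the ratio when both intensities are low.
    The value 100 is an assumed, commonly used default.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)

# Hypothetical probe intensities, for illustration only.
mostly_methylated = beta_value(9000.0, 900.0)    # 9000 / 10000
mostly_unmethylated = beta_value(400.0, 9500.0)  # 400 / 10000
```

A beta near 1 therefore indicates that most copies of the site in the sample are methylated, and a beta near 0 indicates that most are unmethylated.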
Methylation data and statistical analysis
Differences between cognitively impaired participants and normal controls were assessed using the standard two-sample t-test assuming unequal variances. The analysis found significant differences (p < 0.05) between cognitively impaired participants and normal controls for education, MMSE, and CDR sum in both the MA and NHW groups. The education level among cognitively impaired participants was significantly lower than among normal controls in both MAs and NHWs. Education has been reported to protect against dementia [19] and might confer some protective effect against CI in normal controls here too. As expected, MMSE scores were significantly lower, and CDR sum ratings significantly higher, among cognitively impaired participants compared to normal controls in both MAs and NHWs. Age differed significantly between cases and controls in the NHW group only. Beta values obtained from raw IDAT files were analyzed using the ChAMP Bioconductor package in R [31–34]. Data were normalized using the BMIQ method [35], and batch effect correction was undertaken using the ComBat function [36, 37]. Covariates contributing to significant variation in results were visualized using singular value decomposition (SVD) plots within ChAMP [38].
Covariates that were adjusted for include age, sex, education level (in years), APOE ɛ4 allele status (present or not), participant recruitment site, PCA eigenvector values 1 & 2 and the minfi package generated white blood cell type proportions (CD8 T-cells, CD4 T-cells, natural killer cells, B-cells, monocytes, and neutrophils) [32]. The recruitment site was selected as a covariate since TARCC is a collaborative research effort with seven sites used to recruit participants. Genetic ancestry was accounted for by using PCA eigenvector values 1 & 2 derived from linkage disequilibrium pruned SNP data. These eigenvector values were generated using smartPCA for population stratification and thereby to verify self-reported ethnicity. The first two eigenvectors of the genetic relatedness matrix were used as covariates since they accounted for the most genetic variation in the dataset.
The covariates displaying the strongest significant association with CI in the SVD plots were chosen for further downstream analysis using the cate R package [39]. The cate R package was used to test CpG sites for differential methylation between cognitively impaired participants and controls while controlling for potential unmeasured confounders and adjusting for the covariates identified from the SVD plots. Covariates selected for adjustment using cate in MAs were sample group, age, sex, CD8 T-cells, CD4 T-cells, NK cells, B-cells, monocytes, neutrophils, EV1, EV2, and recruitment site (SF2c). In NHWs the covariates were sample group, sex, CD8 T-cells, CD4 T-cells, NK cells, B-cells, monocytes, neutrophils, EV2, and recruitment site (SF3c). Limma was used to convert beta values to M-values, as M-values are better suited for statistical analysis of differential methylation from microarray-based data [30, 41]. The genomic inflation factor (lambda) of the p-values was calculated using QCEWAS [42]. A false discovery rate (FDR) approach was used to adjust for multiple testing. An FDR-adjusted p-value <0.05 was used to identify CpG sites that were significantly differentially methylated between cognitively impaired participants and controls.
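Two of the steps described above, the beta-to-M-value conversion and FDR adjustment, can be illustrated with a short sketch. The M-value is the base-2 logit of the beta value. The FDR procedure is not named in the text, so the Benjamini-Hochberg method is assumed here as the most common choice. The function names and input values are hypothetical, and the study performed these steps with R packages rather than with this code.

```python
import math

def beta_to_m(beta):
    """Base-2 logit transform of a beta value into an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical inputs, for illustration only.
m_values = [beta_to_m(b) for b in (0.2, 0.5, 0.8)]
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Under this assumed procedure, a site would be called significant when its adjusted p-value falls below 0.05 and suggestive when it falls below 0.1, matching the thresholds used in the text.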
RESULTS
The cate package was used to adjust for confounders and the qqman package was used to check for genomic inflation (SF4) in both MAs and NHWs [43]. Q-Q plots of p-values (Fig. 1) generated thereafter displayed little sign of genomic inflation.

Q-Q plots obtained from p-values after adjusting for confounders using cate. A) p-values from the Mexican American (MA) cohort. B) p-values from the non-Hispanic white (NHW) cohort. Most of the p-values observed for the CpG sites investigated fall within the range expected for methylation differences between cognitively impaired individuals and normal controls; however, the outlying data points at the upper end of each plot suggest that those CpG sites are differentially methylated between these groups.
Two CpG sites in MAs (cg13135255 and cg01887506) and three in NHWs (cg27002303, cg10607142, and cg13529380) displayed differential methylation between cognitively impaired participants and normal controls (Table 2). Based on an FDR-adjusted p-value threshold of 0.05, two significant differentially methylated CpG sites were obtained: cg13135255 in MAs and cg27002303 in NHWs. Three suggestive CI-associated sites, based on an FDR-adjusted p-value threshold of 0.1, were cg01887506 (MAs) and cg10607142 and cg13529380 (NHWs). Methylation in cognitively impaired participants versus normal controls tended to be slightly elevated (hypermethylation) at all the CpG sites reported, except for cg13529380 in NHWs, which showed a slight decrease in methylation (hypomethylation) instead (Fig. 2).

Average beta values at significant and suggestive CpG sites in cognitively impaired and normal controls among (A) Mexican Americans and (B) non-Hispanic whites. The y-axis displays the beta values between 0 and 1 reflecting level of methylation at the CpG site, where 0 represents an unmethylated site and 1 is a fully methylated site [30].
Differentially methylated CpG sites obtained among Mexican Americans and non-Hispanic whites from the TARCC cohort with the associated gene and FDR adjusted p-values
DISCUSSION
Despite the well-established pathological features of AD, there are a number of remaining questions and challenges related to understanding AD pathogenesis. It remains relatively unknown how late-onset AD pathology begins, and diagnosis of AD is often not confirmed until autopsy [44]. AD-related changes in the brain are detectable prior to diagnosis, despite a lack of functional or psychiatric change [1, 46]. Identifying molecular factors associated with AD within easily accessible tissue such as peripheral blood could help identify individuals at risk for AD early. Methylation-based studies can provide necessary insight into these issues since AD risk is also influenced by environmental factors.
In this study, five differentially methylated sites in peripheral blood were ascertained to be associated with CI, either significantly (FDR p < 0.05) or suggestively (FDR p < 0.1). These were CpG sites cg13135255 (FDR-adjusted p = 0.029) and cg01887506 (FDR-adjusted p = 0.074) in MAs, as well as cg27002303 (FDR-adjusted p = 0.037), cg10607142 (FDR-adjusted p = 0.095) and cg13529380 (FDR-adjusted p = 0.095) in NHWs. Altogether, three CpG sites (cg01887506, cg10607142, and cg13529380) were identified as suggestive based on an FDR p-value threshold of 0.1.
Notably, out of the five CI-associated CpG sites, two hits fell below the traditional FDR threshold of 0.05: cg13135255 in MAs (FDR-adjusted p = 0.029) and cg27002303 in NHWs (FDR-adjusted p = 0.037). The strongest association was observed for the CpG site cg13135255 in MAs, which is situated within an intron of the CREB binding protein (CREBBP) gene. CREBBP is a histone acetyltransferase that regulates acetylation impacting downstream gene transcription (Fig. 3) [47]. It has an important role in forming memory, and mutations in this gene can cause cognitive dysfunction (e.g., Rubinstein-Taybi syndrome) [47]. Regulation of CREBBP through an upstream non-coding RNA and a miRNA causes hyperacetylation at the CREB1, GATA2, NFKB1, and FOXA1 genes, all of which have been implicated in AD associated with exposure to toxic metals [48]. A genetic variant in this gene has also been associated with episodic memory loss in healthy aging individuals [49]. The cg13135255 site has also been previously associated with post-traumatic stress disorder severity following social adversity and stress in a predominantly female African American cohort [50]. CREBBP plays an indirect role in some proinflammatory pathways through co-activation of transcription factors such as CREB and NF-kappaB. AD-associated protein biomarkers are predominantly non-inflammation based in MAs, and no studies have yet demonstrated a direct association between CREBBP and CI-based inflammation. Therefore, it is unsurprising that this hit was significant in MAs and not in NHWs within this TARCC cohort [5]. In this study, hypermethylation within CREBBP at a global level in peripheral blood was associated with CI in MAs.

Loci of differentially methylated significant or suggestive CpG sites among Mexican American and non-Hispanic white TARCC participants (site of gene/loci derived from UCSC genome browser [47]).
The cg27002303 site (NHWs) is located within the LOC100188947 locus, which encodes a non-coding RNA (Fig. 3) [47]. A SNP variant within this locus has been associated with increased ADHD risk among children in a family-based study set in Montreal [52]. This CpG site was reported in Han et al. (2021), which examined fetal malnutrition and its impact on pineal development by observing patterns of melatonin production and assessing DNA methylation levels in a Chinese cohort. Network analysis found this CpG site to be involved in processes such as nervous system development and synapse assembly, among others [53]. Further information regarding its function or role in influencing gene expression is yet to be elucidated. Both highly significant sites (cg13135255 in MAs and cg27002303 in NHWs) have been associated with either cognitive function or development; therefore, environmental factors influencing methylation at these sites could possibly contribute to AD pathogenesis collectively alongside other AD risk-associated CpG sites.
In addition to the aforementioned significant CpG sites with FDR-adjusted p < 0.05, the other suggestive hits (FDR-adjusted p < 0.1) associated with CI were CpG sites cg01887506 (MAs), cg10607142 (NHWs), and cg13529380 (NHWs). The site cg01887506 is situated within the EPHA4 gene, which is most highly expressed in the hippocampus and plays an important role in the formation of dendritic spines within the hippocampus (Fig. 3) [47, 54]. EPHA4 was found to regulate amyloid-β production [55]. A decrease in EPHA4 expression has been associated with an increase in BACE1 expression, resulting in increased levels of amyloid-β [55]. One study in a cohort from the Netherlands found protein levels of the gene to be the same in AD and control participants; however, the distribution of the protein in the hippocampus differed between cases and controls, with the protein found alongside plaques in AD patients [54]. Of the five CpG sites reported in this study, cg01887506 is the only CpG site that is already associated with AD pathogenesis. Hypermethylation at EPHA4 globally in peripheral blood might either reflect methylation changes in the hippocampus, where the gene is highly expressed, or possibly indirectly influence methylation in the hippocampus via the blood-brain barrier.
The site cg10607142 (NHWs) is within the ADAMTS14 gene (Fig. 3) [47]. The A Disintegrin and Metalloproteinase with Thrombospondin Motifs (ADAMTS) genes play important roles in processes such as angiogenesis and inflammation and have been associated with AD [56]. The ADAMTS14 gene, however, is normally involved in producing collagen, and a polymorphism within it is associated with osteoarthritis [57]. The ADAMTS14 gene associated with this CpG site has the least established association with CI out of the genes reported in this study.
Whether the five reported CpG sites confer early CI risk or are modulated as a result of CI presence is yet to be determined. A longitudinal study may be best suited to identify whether methylation patterns at these CpG sites are a cause or consequence of CI. The results of this study demonstrate, however, that the methylation landscape associated with cognitive impairment is vastly different between MAs and NHWs. Further comparative analysis between cognitively impaired NHWs and MAs displayed hyperinflation on Q-Q plots even after adjusting for confounders using cate, which can be explained by population stratification [58] since the groups compared differ genetically (SF5). Similar hyperinflation was observed when comparing normal control NHWs and MAs (SF5).
Associations between methylation and AD have been reported in various studies, including methylation in the APOE region [59, 60] and a set of 71 methylation sites in the genome that collectively conferred higher risk than already established single-allele risk variants [25, 61]. Many studies report a mixture of hypermethylated and hypomethylated regions or sites that are associated with CI, and these can vary depending on the ethnic population in focus. DNA methylation levels can be specific to certain populations, such as increased global methylation in peripheral blood from Caucasian AD patients with poor cognitive performance [26] and altered methylation at 48 CpG sites within the APOE region associated with aging and age-related cognitive decline in a healthy African American population [61]. In Caucasian-based population studies, many of the CI-associated alterations in methylation are at genes involved in inflammatory pathways [25, 62]. For example, the Religious Orders Study and the Memory and Aging Project found AD-associated methylation in brain tissue within 5 genes involved in lipid metabolism, tau tangles, and the inflammatory pathway (ABCA7, BIN1, HLA-DRB5, SLC24A4, and SORL1) [63]. Japanese population-based studies have also found similar associations between cognitive impairment and methylation levels in genes within inflammatory pathways, such as reduced TREM1 methylation at three CpG islands in peripheral leukocytes of AD patients [64].
In other ethnic populations, differentially methylated genes have been reported in pathways that are not associated with inflammation in patients with cognitive decline. In Chinese populations, heavily methylated CpG islands in the KLOTHO gene promoter were associated with MCI [65], and heavy methylation of BDNF, which has neuroprotective functions, was associated with AD [66]. A study evaluating methylation levels at 48 CpG sites within the APOE region in peripheral blood from African Americans found lack of methylation in 8 CpG islands across the APOE, PVRL2, and TOMM40 genes to be significantly associated with delayed recall cognitive function [61]. Studies in Mexican Americans also show an association between cognitive decline and non-inflammatory pathways that lean more towards metabolic processes [5, 68].
A previous study conducted by our group identified 10 CpG sites and 4 regions that were differentially methylated in association with MCI in 90 MA participants from the Health and Aging Brain among Latino Elders (HABLE) cohort [69]. Notably, some of the previously identified genes associated with CI (SEPT9, CCNY, and KLHL29) were replicated here (based on unadjusted nominal p < 0.005). However, the individual CpG sites varied between studies. The CpG sites identified here were cg09029294 in SEPT9 (NHWs), cg16922732 and cg16015468 in SEPT9 (MAs), cg02524725, cg11958675, and cg20772106 in CCNY (MAs), and finally cg10755016 in KLHL29 (MAs) (Table 3).
Significant CpG sites (nominal p-value <0.005) at replicated genes from previous study [69]
Though informative, this study did have some limitations. The focus of this study was to identify differential methylation associated with CI; however, next steps could involve adjusting for type 2 diabetes comorbidity as its presence is known to increase the risk of developing AD [1, 70]. Methylation studies based on peripheral blood provide insight into global methylation levels that might influence CI risk; however, methylation signatures in peripheral blood may not accurately reflect epigenetic changes in brain tissue. Despite this, there are a plethora of studies showing correlation between methylation changes in the blood and cognitive changes in the brain as discussed earlier. Given this correlation, peripheral epigenetic markers such as DNA methylation may be useful for predicting ethnicity-specific risk for AD. Early signs of cognitive dysfunction can be indistinguishable between MCI and AD patients in the general population, and often MCI patients develop AD later in life [1, 71]. A biomarker that indicates risk at preclinical stage for progression to AD (e.g., MCI) would have great utility since there is still no cure or method of slowing neuronal degeneration [1]. Methylation based biomarkers could potentially be more useful since methylation can be reversed depending on lifestyle choices and changes.
Footnotes
ACKNOWLEDGMENTS
The research team thanks the Texas Alzheimer’s Research and Care Consortium team of investigators.
FUNDING
This research project was supported (in part) by funding provided to the Texas Alzheimer’s Research and Care Consortium by the Darrell K Royal Texas Alzheimer’s Initiative, directed by the Texas Council on Alzheimer’s Disease and Related Disorders.
This work was also supported by the Office of Vice President for Research and Innovation, the Institute for Healthy Aging, and National Institutes of Health/National Institute on Aging (T32 AG020494) (2018 and 2020) including half predoctoral international fellowship through the Neurobiology of Aging and Alzheimer’s Disease (NBAAD) Training Program 2021-2022 and R01AG070862 to RCB.
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
DATA AVAILABILITY
The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
