Abstract
The exposome characterizes all environmental exposures and their impact on a disease. To determine the causally-associated components of the exposome for cerebral small vessel disease (CSVD), we performed Mendelian randomization analysis of 5365 exposures on six clinical and subclinical CSVD measures. We found statistically significant evidence (FDR-corrected P < 0.05) that hypertension, high cholesterol, longer television-watching time, lower educational qualifications, younger age of first sexual intercourse, smoking, reduced pulmonary function, higher subjective overall health rating, and frequent tiredness were associated with increased risk of intracerebral hemorrhage or small vessel stroke. Adiposity, diabetes, frequent alcoholic drinks, higher white blood cell count and neutrophil count were significantly associated with higher risk of non-lobar hemorrhage or small vessel stroke, but not lobar hemorrhage. Hypertension, higher arm or leg fat-free mass and higher sitting height were significantly associated with higher white matter hyperintensities. The results were robust to sensitivity analyses and showed no evidence of horizontal pleiotropy. We also identified 41 exposures suggestively associated (uncorrected P < 0.05) with multiple CSVD measures, which together constitute “the CSVD exposome”. This exposome-wide association study provides insight into CSVD development and prevention.
Introduction
Cerebral small vessel disease (CSVD) refers to all pathological processes affecting the small vessels of the brain. 1 Neuroimaging findings resulting from CSVD encompass recent small subcortical infarcts, lacunes, white matter hyperintensities (WMH) and microbleeds. 2 CSVD contributes to 25% of ischemic strokes, namely small vessel stroke (SVS), and most hemorrhagic strokes, 3 mainly intracerebral hemorrhage (ICH). Except for some gene-driven subtypes, acquired environmental exposures play a critical role in its development. Despite a few well-recognized risk factors, other causally-related environmental exposures remain unclear. 4
Observational studies have suggested some CSVD-related exposures, 5 but those results were limited by confounding and reverse causality biases. Using genetic variants associated with exposures, Mendelian randomization (MR) investigates causal relations between exposures and outcomes. It is less susceptible to confounding and reverse causality bias. 6 The “exposome” introduced by C. P. Wild 7 characterizes the whole spectrum of environmental exposures and their influences on a disease. 8 It refers to the totality of exposures from a variety of external and internal sources including chemical agents, biological agents, or radiation, from conception onward, over a complete lifetime. 9 The internal exposome comprises exposures that are unique to the individual, whereas the external exposome includes occupational exposures, lifestyle factors, education level and financial status. 10 Exposome-wide association screening for CSVD via MR has been made possible by publicly available genome-wide association studies (GWAS) of various exposures and CSVD clinical and subclinical measures.11,12
Here, we appraise the associations and causal relations between 5365 environmental exposures and CSVD clinical (ICH by location or SVS) and subclinical (WMH) measures through MR analysis. Two diffusion tensor imaging measures, fractional anisotropy (FA) and mean diffusivity (MD), were used as supplementary outcome measures as they capture early microstructural lesions of white matter attributable to CSVD. 13 Our aim was to determine the causally-associated components of the exposome for cerebral small vessel disease.
Methods
Exposure data, outcome data, and data sources
We collected summarized data of 4587 exposure GWASes from the Neale lab (http://www.nealelab.is/uk-biobank), and 778 exposure GWASes from Gene Atlas (http://geneatlas.roslin.ed.ac.uk/), 14 obtained from 361,194 and 452,264 participants from UK Biobank, respectively. The UK Biobank conforms to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a prior approval by the North West Multi-Centre Research Ethics Committee and all participants provided written informed consent. Six newly-published GWASes of “all location ICH or SVS”, “lobar hemorrhage or SVS”, “non-lobar hemorrhage or SVS”, “WMH volume”, “FA” and “MD” served as our outcomes. Their summary-level data (summary statistics), which consist of full sets of association results by SNP, are available from the Cerebrovascular Disease Knowledge Portal (www.cerebrovascularportal.org).11,12,15 The three cross-phenotype outcome GWASes were produced by meta-analysis of the SVS GWAS from MEGASTROKE 16 with the ICH GWASes by location. 11 These cross-phenotype analyses estimated the genetic overlap shared by two diseases, and the genetic associations of variants from the two diseases were combined and weighted according to the degree of overlap. 11 Samples included 241,024 participants, consisting of 6255 ICH or SVS cases and 233,058 control subjects. 11 The GWASes of WMH volume, FA and MD were conducted in 18,381, 17,663 and 17,467 participants, respectively. 12 MRI was performed on two identical scanners (Siemens Skyra 3.0 T, software VD13A SP4) using the standard Siemens 32-channel receive head coil. Identical acquisition parameters and careful quality control were used for all scans. 12 WMH volume was log-transformed and normalized for brain volume. FA and MD in 48 brain regions were reduced to their first principal components, which accounted for 38% (FA) and 41% (MD) of the variance in these measures. 12 More information about participant enrollment and sample demography is available in the original works.
All studies underlying the data sources received ethical approval from institutional review boards and obtained informed consent from all participants. All data that support the findings of this study are available from the corresponding author upon reasonable request.
Exposure filtration and instruments selection
MR uses genetic variants associated with an exposure of interest as instruments and then tests whether genetic predisposition to the exposure, or the genetically predicted level of the exposure phenotype, is associated with disease outcomes. The following criteria were used to include exposures in the final analyses: (1) binary and categorical variables must have ≥250 cases, and the case/control ratio must be greater than 0.1; 17 (note that rank-normalized versions of continuous variables were used instead of the raw versions); (2) the GWAS must have single nucleotide polymorphisms (SNPs) of p < 1e-6 as potential instrumental variables that are associated with the risk factor of interest, that are not related to confounders, and that affect the outcome only through the risk factor; 18 (3) at least three independent SNPs (r2 < 0.001) of p < 1e-6 had to be found in the outcome GWAS. We primarily selected SNPs of p < 1e-6 as instruments, which were used as multiple instrumental variables. 6 For each exposure possessing genome-wide significant SNPs, we constructed another set of instruments of p < 5e-8 and repeated the MR analysis. To address possible weak-instrument bias, we calculated the F-statistic for each set of instruments and discarded the results when the F-statistic was less than 10. These measures ensured that the most relevant instrumental SNPs were used and avoided violating the relevance assumption of MR.
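The instrument filter and weak-instrument check described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record and all function names are hypothetical, the SNPs are assumed to be already clumped to independence (r2 < 0.001), and the per-SNP F-statistic is approximated as (beta/se)^2 averaged over the set, which is one common convention and not necessarily the formula used in this study.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Snp:
    rsid: str
    beta: float  # SNP-exposure effect estimate
    se: float    # standard error of beta
    pval: float  # SNP-exposure association p-value


def select_instruments(snps, p_threshold=1e-6, min_snps=3):
    """Keep SNPs below the p-value threshold; return [] if fewer than min_snps remain.

    Assumes the input SNPs are already independent (LD-clumped).
    """
    kept = [s for s in snps if s.pval < p_threshold]
    return kept if len(kept) >= min_snps else []


def mean_f_statistic(instruments):
    """Approximate each SNP's F as (beta/se)^2 and average over the set (assumed formula)."""
    return mean((s.beta / s.se) ** 2 for s in instruments)


def passes_weak_instrument_check(instruments, f_cutoff=10.0):
    """True when the instrument set is non-empty and its mean F is at least f_cutoff."""
    return bool(instruments) and mean_f_statistic(instruments) >= f_cutoff
```

Calling `select_instruments(snps, p_threshold=5e-8)` would give the stricter secondary instrument set under the same assumptions.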
Statistical analysis and quality control
We performed two-sample MR analyses using 1352, 1402, 1338, 1445, 1143 and 1147 exposures that survived the filtering process for the six outcomes, respectively. Of these, 620, 619, 621, 658, 658 and 658 exposures qualified for secondary analyses with instruments of p < 5e-8, respectively. We conducted analyses using the TwoSampleMR package (version 0.5.3) in the R environment (version 3.6.3). Every exposure was studied separately on every corresponding outcome. Normality was tested with the Shapiro-Wilk test when the sample size was less than 2000 and with the Kolmogorov-Smirnov test when it was more than 2000. We used inverse variance weighted (IVW) MR as our primary approach, supplemented by weighted median and MR Egger methods for sensitivity analyses. These two methods revalidated the results while allowing pleiotropy to be present in up to 50% (weighted median) or 100% (MR Egger) of instruments. 6 The presence of significant horizontal pleiotropy was assessed by testing whether the intercept from MR-Egger regression was significantly different from zero. 19 Leave-one-out analyses were performed to determine whether the effect estimates were driven by a single SNP that was likely not independent of (or had a direct effect on) the outcome. Heterogeneity in SNP Wald ratios was evaluated with Cochran’s Q test to appraise the internal homogeneity within each set of instruments. We performed FDR correction for IVW P-values with the p.adjust() function in R. The correction was done separately for each outcome and separately for analyses using instruments of p < 1e-6 and p < 5e-8, with PFDR<0.05 defining significance. (The uncorrected and FDR-corrected IVW P-values are denoted “P” and “PFDR”, respectively, in this article.) For significant exposures robustly qualified for causal inference, we performed reverse causation analysis to confirm the direction of the causal relations.
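As an illustration of the estimators named above, the following Python sketch computes a fixed-effect IVW estimate from per-SNP Wald ratios, Cochran's Q, and the MR-Egger intercept and slope. It is a simplified stand-in, not the TwoSampleMR implementation: function names are hypothetical, standard errors of the Wald ratios use the first-order approximation se_y/|b_x|, and no P-values are derived.

```python
import math


def ivw(bx, by, se_y):
    """Fixed-effect IVW estimate from Wald ratios by/bx (bx must be non-zero).

    Returns (estimate, standard error, Cochran's Q, degrees of freedom).
    """
    weights = [x * x / (s * s) for x, s in zip(bx, se_y)]  # 1 / var(Wald ratio)
    ratios = [y / x for x, y in zip(bx, by)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_err = math.sqrt(1.0 / total)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, std_err, q, len(bx) - 1


def egger_regression(bx, by, se_y):
    """Weighted regression of by on bx with an intercept; returns (intercept, slope).

    SNPs are first oriented so that every SNP-exposure effect is positive.
    An intercept far from zero suggests directional horizontal pleiotropy.
    """
    oriented = [(-x, -y) if x < 0 else (x, y) for x, y in zip(bx, by)]
    xs = [p[0] for p in oriented]
    ys = [p[1] for p in oriented]
    w = [1.0 / (s * s) for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

For a binary outcome, exponentiating the IVW estimate and its 95% interval (estimate ± 1.96 × standard error) gives an odds ratio with a confidence interval.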
In order not to overlook possibly meaningful results, we also considered the uncorrected IVW P < 0.05 as evidence of a suggestive association. We performed overlap analyses of results for all outcomes, and constructed the CSVD exposome with exposures suggestively associated (i.e. P < 0.05) with at least one clinical measure and at least one MRI measure (WMH, FA or MD), or with the macrostructural measure (WMH) and at least one microstructural measure (FA or MD). Exposures associated with increased risk of clinical outcomes, higher WMH volume, lower FA and higher MD 13 were considered hazardous; otherwise they were considered protective. Only exposures showing consistent directionality for all their associated outcomes were included in the constructed CSVD exposome.
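The inclusion rule for the CSVD exposome can be written as a small decision function. The sketch below is only an illustration of the rule as stated in the text: the outcome labels, the function name and the input layout (a mapping from outcome to an effect estimate and its uncorrected IVW P-value) are assumptions made for the example.

```python
CLINICAL = {"ICH_or_SVS", "lobar_or_SVS", "nonlobar_or_SVS"}  # hypothetical labels
MACRO = {"WMH"}
MICRO = {"FA", "MD"}


def exposome_direction(results, alpha=0.05):
    """Classify one exposure as 'hazardous', 'protective', or None (excluded).

    results maps outcome label -> (effect estimate, uncorrected IVW P-value).
    """
    hits = {o: beta for o, (beta, p) in results.items() if p < alpha}
    has_clinical = bool(hits.keys() & CLINICAL)
    has_mri = bool(hits.keys() & (MACRO | MICRO))
    has_macro_and_micro = "WMH" in hits and bool(hits.keys() & MICRO)
    if not ((has_clinical and has_mri) or has_macro_and_micro):
        return None
    # Hazardous means higher clinical risk, higher WMH, higher MD, but LOWER FA,
    # so the sign of the FA estimate is flipped before comparing directions.
    harmful_flags = {(-beta if o == "FA" else beta) > 0 for o, beta in hits.items()}
    if len(harmful_flags) != 1:
        return None  # inconsistent directionality across associated outcomes
    return "hazardous" if harmful_flags.pop() else "protective"
```

An exposure with a positive estimate for a clinical outcome and for WMH would be returned as hazardous, whereas one that raises clinical risk but lowers MD would be excluded for inconsistent directionality.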
Results
Overall screening results
In the analyses with instruments of p < 1e-6, we recognized 51 exposures that had P < 0.05 for all studied clinical outcomes (Figure 2), 41 and 46 exposures that were more suggestively associated with lobar hemorrhage and non-lobar hemorrhage (Supplement 1 and 2), as well as 92, 49 and 68 exposures suggestively associated with WMH, FA and MD, respectively (see legend of Figure 1 for details of classification). We also constructed the CSVD exposome following the criteria described in “Methods”, and found 41 exposures presenting suggestive associations widely among outcomes (Figure 3). Moreover, the Venn diagram has demonstrated that hypertension and high cholesterol were risk exposures for all the studied outcomes of CSVD (Figure 4).

Flow chart of the current study. Exposome extracted from UK Biobank were studied for association with clinical and subclinical CSVD outcomes in Cerebrovascular Disease Knowledge Portal via Mendelian randomization. *In analyses with instruments of p < 1e-6, this number is 449, 399, 463, 356, 658 and 654 for outcome measures 1 to 6, respectively; correspondingly in analyses with instruments of p < 5e-8, this number is 1181, 1182, 1180, 1143, 1143 and 1143. †The quantities of exposures were summarized from analyses with instruments of p < 1e-6: 1. exposures with P < 0.05 for “intracerebral hemorrhage or small vessel stroke”, “lobar hemorrhage or small vessel stroke”, and “non-lobar hemorrhage or small vessel stroke”; 2. exposures with P < 0.05 for “lobar hemorrhage or small vessel stroke”, but not for “non-lobar hemorrhage or small vessel stroke”; 3. exposures with P < 0.05 for “non-lobar hemorrhage or small vessel stroke”, but not for “lobar hemorrhage or small vessel stroke”; 4. white matter hyperintensities; 5. fractional anisotropy; 6. mean diffusivity: exposures with P < 0.05 for corresponding MRI outcomes. We converged completely identical exposures and preserved the one with the smaller PFDR. For exposures with only weak instruments (F-statistic <10), we precluded their results due to concern for weak instruments biases. CSVD: cerebral small vessel disease; IVW: inverse variance weighted; FDR: false discovery rate.
Significant exposures for clinical outcomes
After FDR correction of all analyzed exposures, forty-two exposures retained statistical significance (Table 1). Hypertension had the strongest association (OR = 3.98, 95%CI: 2.65 to 5.98), and the quantitative increase of both systolic (1.44, 1.17 to 1.77) and diastolic (1.48, 1.24 to 1.76) blood pressure was associated with higher risk of clinical outcomes. Although there was significant heterogeneity in SNP estimates of hypertension-related exposures, the Egger intercepts did not suggest horizontal pleiotropy. High cholesterol (3.37, 1.81 to 6.29), no qualifications (4.13, 2.23 to 7.67), and longer time spent watching television (1.29, 1.13 to 1.46) were significantly associated with increased risk of ICH or SVS, while A/AS levels qualifications (0.24, 0.12 to 0.49) and older age of first sexual intercourse (0.68, 0.57 to 0.83) were significantly associated with decreased risk. Diabetes (3.92, 1.77 to 8.68) was significantly associated with higher odds of non-lobar hemorrhage or SVS. The above significant results showed no evidence of horizontal pleiotropy or heterogeneity (intercept p > 0.05, Q test p > 0.05). The significant associations were further confirmed in analyses using instruments of p < 5e-8.
Exposures with PFDR < 0.05 for clinical outcomes.
Exposures under the item “ICH or SVS” might be significant for more than one clinical outcome, but the data shown here are derived from analyses for “all location ICH or SVS”. Exposures under the item “Lobar/non-lobar hemorrhage or SVS” were significant for “lobar hemorrhage or SVS” and “non-lobar hemorrhage or SVS” (but not for “all location ICH or SVS”), and the data shown here are derived from analyses for “lobar hemorrhage or SVS”. Exposures under the item “Non-lobar hemorrhage or SVS” were not significant for the other two clinical outcomes, and the data shown here are derived from analyses for “non-lobar hemorrhage or SVS”. † Data derived from analyses with instruments of p < 1e-6; ‡ Data derived from analyses with instruments of p < 5e-8; †,‡ Data derived from analyses with instruments of p < 1e-6, but this exposure showed PFDR<0.05 with both sets of instruments. Four exposures in the category of medication were not included because of limited space and because their indications for use were unavoidable confounders (see Supplement 3).
SNP: single nucleotide polymorphism; IVW: Inverse Variance Weighted; FDR: false discovery rate; Q test: Cochran’s Q test; ICH: intracerebral hemorrhage; SVS: small vessel stroke.
Other exposures only reached significance when analyzed with one set of instruments. Among results without obvious horizontal pleiotropy or heterogeneity, smoking (3.75, 1.88 to 7.48), higher body mass index (1.06, 1.03 to 1.10), higher leg fat percentage (right: 1.05, 1.02 to 1.08; left: 1.07, 1.03 to 1.10), variation in diet (13.15, 2.70 to 64.18), greater overall health rating (2.40, 1.45 to 3.97), increased frequency of tiredness/lethargy in last 2 weeks (3.82, 1.65 to 8.87), and more days/week of moderate physical activity >10 minutes (2.21, 1.44 to 3.39) were significantly associated with higher risk of ICH or SVS, while never smoked (0.43, 0.26 to 0.71), faster usual walking pace (0.42, 0.26 to 0.68), college or university degree (0.45, 0.30 to 0.68), greater forced vital capacity (FVC) (0.74, 0.63 to 0.88), and greater forced expiratory volume in 1-second (FEV1) (0.73, 0.61 to 0.88) were associated with decreased risk. Frequent monthly intake of other alcoholic drinks (4.01, 1.82 to 8.84) and higher white blood cell (1.08, 1.03 to 1.13) and neutrophil (1.10, 1.04 to 1.17) counts were significantly associated with increased odds of non-lobar hemorrhage or SVS, while more cheese intake (0.66, 0.52 to 0.82), greater peak expiratory flow (0.77, 0.66 to 0.90), and no illness of siblings (0.16, 0.06 to 0.42) were significantly associated with decreased odds of non-lobar hemorrhage or SVS. Long-standing illness, disability or infirmity (5.77, 2.57 to 12.96) and standing height (0.96, 0.94 to 0.98) were significant for lobar hemorrhage or SVS and non-lobar hemorrhage or SVS, and they nearly reached significance for all location ICH or SVS.
Other than body mass index, leg fat percentage, FVC, FEV1, and cheese intake, all significant exposures passed at least one sensitivity analysis, indicating that the associations remained robust even if half (for weighted median) or all (for MR Egger) of the SNPs had pleiotropic effects. 6 Leave-one-out analyses suggested that none of the results were driven by a single SNP (Supplement 3). We then performed reverse causation analyses for 11 modifiable exposures pertaining to television-watching time, educational qualifications, pulmonary function, physical activity, mood, and diet with clinical outcomes, knowing that CSVD may influence physical activity 20 and mood, 4 and cause changes in diet (Supplement 6). The analyses revealed possible reverse causation between “days/week of moderate physical activity >10 minutes” (1.10, 1.04 to 1.16; IVW P = 0.001), “usual walking pace” (1.04, 1.01 to 1.07; IVW P = 0.009), and “variation in diet” (0.98, 0.96 to 1.00; IVW P = 0.016) and “ICH or SVS”, and between “FEV1” (1.03, 1.00 to 1.05; IVW P = 0.045) and “non-lobar hemorrhage or SVS”. There was no evidence of reverse causation for the other exposures.
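The leave-one-out check mentioned in this paragraph amounts to re-estimating the IVW effect with each SNP removed in turn and looking for any single omission that shifts the estimate markedly. The Python sketch below illustrates the idea under simplifying assumptions (fixed-effect IVW, hypothetical function names); it is not the routine used to produce Supplement 3.

```python
def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW estimate: weighted mean of Wald ratios by/bx."""
    weights = [x * x / (s * s) for x, s in zip(bx, se_y)]
    ratios = [y / x for x, y in zip(bx, by)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(bx, by, se_y):
    """Return the IVW estimate obtained after dropping each SNP in turn."""
    estimates = []
    for drop in range(len(bx)):
        keep = [i for i in range(len(bx)) if i != drop]
        estimates.append(
            ivw_estimate(
                [bx[i] for i in keep],
                [by[i] for i in keep],
                [se_y[i] for i in keep],
            )
        )
    return estimates
```

If one entry of the returned list differs sharply from the others and from the full-sample estimate, the omitted SNP at that position is a candidate driver of the result.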
Significant exposures for subclinical outcomes
None of the exposures remained significant after correction in analyses with instruments of p < 1e-6 for WMH. In analyses with instruments of p < 5e-8, ten of the 658 analyzed exposures maintained significance after correction (Table 2). Hypertension (OR = 1.60, 95%CI: 1.23 to 2.08), higher fat-free mass (arm: 1.41, 1.17 to 1.69; leg: 1.22, 1.09 to 1.36) and higher sitting height (1.04, 1.02 to 1.06) were significantly associated with higher white matter hyperintensities, while no vascular/heart problems (0.61, 0.47 to 0.78) and no medication for cholesterol, hypertension or diabetes (0.62, 0.48 to 0.80) were significantly associated with lower white matter hyperintensities. Though heterogeneity in SNP estimates was observed in eight exposures, this did not necessarily indicate pleiotropic pathways or bias the pooled estimates. 21 Robustness in sensitivity analyses was noted without evidence of horizontal pleiotropy from the MR Egger intercept. Leave-one-out analyses suggested that none of the results were predominantly driven by a single SNP (Supplement 4).
Exposures with PFDR<0.05 for white matter hyperintensities.
This table derived from analyses with instruments of p < 5e-8.
SNP: single nucleotide polymorphism; IVW: Inverse Variance Weighted; FDR: false discovery rate; Q test: Cochran’s Q test.
Because only the first principal components were used in GWASes of FA and MD, 12 the estimates and directionality of MR results for them were not as reliable as for other outcomes. Nevertheless, we noticed thirty exposures presenting significant associations with MD after correction, using either set of instruments (Supplement 5 table I). No significant exposures for FA were noticed.
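Throughout these results, "after correction" refers to FDR adjustment of the IVW P-values within each outcome and instrument set. The sketch below shows how such an adjustment works, assuming the Benjamini-Hochberg procedure (the "BH"/"fdr" option of R's p.adjust); the text does not state which method argument was passed, and the function name here is hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Exposures whose adjusted value falls below 0.05 would be the ones reported as significant (PFDR < 0.05) under this assumption.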
Discussion
This Mendelian randomization study supports the conventional understanding that hypertension, high cholesterol, smoking and diabetes are significant risk factors for CSVD. We also provide significant evidence for associations between longer time spent watching television and no qualifications and increased CSVD risk, and between A/AS levels qualifications and older age of first sexual intercourse and decreased CSVD risk. The above results were robust to sensitivity analysis with more rigorous instruments and other MR methods, and showed no evidence of horizontal pleiotropy, thus supporting causal inference.
Longer television-watching time was reported to increase risk of all stroke with a hazard ratio of 1.37, which mirrored the causal relationship between sedentary behaviors and CSVD. 22 The estimate of causal effect was 1.29 on ICH or SVS here, which is less than the observational value because ICH and SVS do not represent all strokes. Considering this exposure as a measure of sedentariness, we noticed that a protective effect against CSVD was previously reported in its opposite, higher level of physical activity. 23 In this study, we found frequent moderate physical activity >10 minutes was a significant risk exposure, while increased usual walking pace was significantly protective. However, in the reverse causation analyses, CSVD was associated with more days/week of moderate physical activity >10 minutes and increased usual walking pace (Supplement 6), indicating that the genetic susceptibility to CSVD may have a weak effect on increasing people’s physical activity (OR = 1.10 and 1.04). This suggested possible reverse causation bias in the finding that more days/week of moderate physical activity >10 minutes were a risk factor for CSVD, as this result is based only on 10 SNPs (Table 1). However, the result that increasing usual walking pace may have a protective effect against CSVD was based on 127 SNPs (Table 1), and was highly unlikely to be affected by the contradictory reverse causation result. Increasing walking pace, and suggestively walking for pleasure (see protective exposures in Figure 3), may be safer than frequently taking prolonged moderate exercise. Overall, our findings imply that sedentariness should be avoided in order to prevent clinical CSVD, but the most beneficial modality of physical activity requires further investigation.
No qualifications and A/AS levels qualifications reflect educational levels, as does another significant protective exposure, college or university degree. A previous study found that heavier CSVD burden at age 73 was associated with lower age-11 intelligence, and suggestively associated with lower educational level. 24 Our CSVD exposome also revealed higher fluid intelligence score, a trait known to peak early in life, as a suggestively protective exposure (Figure 3). 25 As the MR study is methodologically less susceptible to confounding or reverse causation than conventional observational studies, 6 the causal effect we found may suggest unknown mechanisms connecting brain features during early development with CSVD risk in mid or late life.
The result for age of first sexual intercourse came from two robust sets of instruments (F-statistic = 30.83 and 36.80) that explain much of the variance of this exposure. A recent study suggested that this trait was associated with hundreds of genetic loci and that older age at first birth was associated with longevity and decreased incidence of type 2 diabetes and cardiovascular disease. 26 In combination with those findings, we hypothesize that a proper delay of first sexual intercourse may decrease CSVD burden in later life. Another exposure, higher overall health rating, surprisingly turned out to be a risk exposure. The data behind this exposure were collected from the question “in general how would you rate your overall health”. We speculate that this unexpected result arose because the trait was scored by participants themselves and therefore may not objectively reflect their real health status.
Greater pulmonary function measures were significantly protective against CSVD in this study. Reduced FVC 27 , FEV1 28 and chronic obstructive pulmonary disease 29 were previously reported to increase CSVD risk. Reduced midlife lung function was associated with increased WMH in late life. 30 The underlying mechanism may be that hypoxia exacerbates ischemia in areas of hypoperfusion. 31 Besides, reduced pulmonary function was associated with systemic inflammation and higher C-reactive protein level, 32 which was subsequently associated with presence and progression of CSVD. 33 The association between systemic inflammation and CSVD was also reflected by our findings that higher white blood cell and neutrophil count were significant risk exposures for non-lobar hemorrhage or SVS.
Adiposity-related exposures and diabetes were more associated with non-lobar hemorrhage than lobar hemorrhage, reflected by both significant results and trends in the suggestive results (Figure 2, Supplement 1 and 2). Adiposity was previously reported to increase the risk of deep but not lobar ICH. 34 Conversely, a protective effect was possible for adiposity on lobar hemorrhage 35 and this may balance the hazardous effect of adiposity on SVS 36 and explain why associations were not found between adiposity-related exposures and lobar hemorrhage or SVS. Large meta-analysis supported that diabetes was associated with increased occurrence of ICH 37 , and the trend towards non-lobar hemorrhage was also observed. 19 All these findings suggest that differentiation of anatomical location is warranted in future studies about adiposity or diabetes and ICH.

Exposures suggestively associated with all clinical outcomes. Heatmap of fifty-one exposures with IVW P < 0.05 for three clinical outcomes under instrument with p < 1e-6, and those ended with “*” were sensitive to analysis with instruments of p < 5e-8. Darker color suggests stronger association. * indicates PFDR < 0.05, while ** indicates PFDR < 0.01. ICH: intracerebral hemorrhage; SVS: small vessel stroke; IVW: inverse variance weighted; FDR: false discovery rate.
The constructed CSVD exposome revealed exposures suggestively associated with CSVD (Figure 3). It included some previously-reported relevant exposures, such as intraocular pressure 38 and depression. 39 But we did not notice suggestively significant results for other previously-reported exposures, like sleep quality 40 and chronic renal failure. 41 Exposures had different association profiles with different CSVD measures. SVS and WMH are ischemic consequences of CSVD due to acute localized loss of perfusion and chronic diffusely reduced perfusion, respectively. 1 Exposures causing ischemia usually were associated with the two measures together, such as smoking and higher hemoglobin concentration (Figure 3, 4). ICH is the hemorrhagic consequence due to vessel wall damage, which is often caused by cerebral amyloid angiopathy (for lobar hemorrhage) or arteriolosclerosis (for non-lobar hemorrhage), with the latter presenting more associations with traditional vascular risk factors that also lead to ischemia. 42 Thus hypertension and no vascular/heart problems showed the smallest IVW P for “non-lobar hemorrhage or SVS” among three clinical outcomes. Compared with clinical measures, WMH has multifactorial pathophysiology. Hypotheses of WMH development include incomplete infarct, 43 blood-brain barrier damage 44 and inflammation. 45 Interestingly, WMH is also associated with cerebral amyloid angiopathy, the key underpinning of lobar hemorrhage. 46 The underlying mechanism for exposures suggestively associated with WMH is uncertain, but their simultaneous associations with SVS, lobar or non-lobar hemorrhage may give a clue to it. FA and MD measure the loss of white matter integrity or disrupted molecular movement along the axons of any cause. They are sensitive but not specific for CSVD. 13 Presenting associations with them in addition to other CSVD measures makes the exposure more convincing, as injury to microstructure necessarily underlies or precedes all macrostructural lesions.

The CSVD exposome. Heatmap of the cross-outcome association between the CSVD exposome, consisting of forty-one exposures, and six CSVD measures. All the exposome either presented IVW P < 0.05 for at least one clinical and at least one MRI (WMH, FA or MD) measures, or showed IVW P < 0.05 for macrostructural (WMH) and at least one microstructural (FA or MD) measures, with consistent directionality. Darker color suggests stronger association. Blanks (or white grids) indicate no suggestive association (i.e. P ≥ 0.05). *indicates PFDR < 0.05, while **indicates PFDR < 0.01. aData derived from analyses with instruments of p < 1e-6; b Data derived from analyses with instruments of p < 5e-8; a,bData derived from analyses with instruments of p < 1e-6, but this exposure was qualified for the CSVD exposome in analyses using both sets of instruments. CSVD: cerebral small vessel disease; ICH: intracerebral hemorrhage; SVS: small vessel stroke; WMH: white matter hyperintensities; FA: fractional anisotropy; MD: mean diffusivity; IVW: inverse variance weighted; FDR: false discovery rate.

The Venn diagram of various exposures and their associations (IVW P < 0.05) with clinical (stroke outcomes), macrostructural (WMH) and microstructural (FA, MD) measures of CSVD. WMH: white matter hyperintensities; FA: fractional anisotropy; MD: mean diffusivity; CSVD: cerebral small vessel disease. IVW: inverse variance weighted. DTI: diffusion tensor imaging.
Strengths of this study include the exposome-wide spectrum, various outcome measures and large sample sizes of the GWASes. There was no population stratification as participants were all of European ancestry. We primarily selected instruments with relaxed criteria to reveal every possible association, then constructed more rigorous instruments for sensitivity analysis and left out all results derived from weak instruments (F-statistic <10). We adopted strict quality control in results interpretation. Only those that survived FDR correction with robustness in sensitivity analyses, and without evidence of horizontal pleiotropy, were considered to have significant associations or causal effects. Our results were highly consistent with previous MR studies reporting individual risk exposures for CSVD.19,47,48
This study is limited by difficulties in quantitatively interpreting estimates. Our exposures were composed of binary, categorical, ordinal and normalized continuous variables, thus the odds ratios were not comparable to each other and were of limited clinical value. Besides, the exposure GWASes were uniformly performed using least-squares linear models, which were not ideal for the binary variables (especially those with imbalanced case/control ratios) and likely biased the model coefficients, 49 subsequently affecting the MR analysis. We minimized deviation of estimates by excluding exposures with extreme case/control ratios. Secondly, there was possible sample overlap between exposure GWASes and GWASes of MRI measures, as participants in both came from UK Biobank. However, the maximal overlap percentage was only ∼5% due to the large difference in sample size. Extant literature suggests that for continuous outcomes, as in our case, the bias under the null is proportional to the overlap percentage, and the direction of the bias gradually changes from towards the null to towards the confounded association as the overlap increases. 50 A 5% overlap is too small to alter the estimates significantly, and the most likely direction of the bias is towards the null, which does not increase false positive results. Thirdly, some previously reported relevant exposures (e.g. atrial fibrillation, 51 HbA1c, 52 C-reactive protein 33 ) were not analyzed due to unavailability in the dataset or failure in the filtration process. Fourthly, low penetrance SNPs require large cohorts to provide statistically significant results, and then only show a statistical association instead of a mechanism of action. 53 Fifthly, as stated by Wild, accurately measuring the exposome is extremely challenging and the dynamic nature of the exposome may militate against long-term exposure assessment. 7 In addition, SNP associations identified in one ancestral population frequently do not transfer to other populations.
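The ∼5% figure for maximal sample overlap can be reproduced from the sample sizes given in the Methods. The short Python sketch below assumes the worst case, in which every participant of an MRI GWAS is also in the exposure GWAS, and expresses the overlap as a fraction of the larger (exposure) sample; this definition of the percentage is an assumption, since the text does not spell it out.

```python
# Sample sizes as reported in the Methods section.
EXPOSURE_N = {"Neale lab": 361_194, "Gene Atlas": 452_264}
MRI_N = {"WMH": 18_381, "FA": 17_663, "MD": 17_467}


def max_overlap_fraction(n_small, n_large):
    """Worst case: the smaller sample is entirely contained in the larger one."""
    return n_small / n_large


# Largest possible overlap across all exposure-source / MRI-outcome pairs.
worst_case = max(
    max_overlap_fraction(n_mri, n_exp)
    for n_exp in EXPOSURE_N.values()
    for n_mri in MRI_N.values()
)
print(f"maximal overlap: {worst_case:.1%}")
```

Under these assumptions the largest value comes from the WMH GWAS paired with the Neale lab exposure GWASes, at roughly 5%.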
There are advantages and disadvantages in using cross-phenotype GWASes. The sample size of the available ICH GWAS is small 54 and leads to compromised power if used directly, while the combination efficiently leverages the large-scale SVS GWAS and avoids this problem. The ability to reveal associations with ICH in different locations was partially preserved by comparing MR results across the three clinical outcomes. However, we could not differentiate ICH-related from SVS-related exposures, or obtain a specific effect estimate for each manifestation. Our principal objective was to identify CSVD-associated exposures with maximal statistical power. Less emphasis was put on the quantitative interpretation of estimates and fine differentiation of heterogeneous CSVD phenotypes. Although exposures with contradictory effects on ICH and SVS may be masked or only show the dominant directionality (e.g. adiposity), all exposures in the screening results were at least suggestively associated with one CSVD phenotype. Larger GWASes of various CSVD phenotypes in the future would allow the identification of a specific exposome for every CSVD manifestation, with similarly high statistical power.
In conclusion, this study suggests that, among the modifiable environmental exposures, reducing television-watching time, obtaining higher educational qualifications, avoiding early first sexual intercourse and maintaining good pulmonary function may help prevent the development of CSVD.
Supplemental Material
Supplemental material, sj-pdf-1-jcb-10.1177_0271678X221074223 for Validation of external and internal exposome of the findings associated to cerebral small vessel disease: A Mendelian randomization study by Xue-Qing Zhang, Yu-Xiang Yang, Can Zhang, Xin-Yi Leng, Shi-Dong Chen, Ya-Nan Ou, Kevin Kuo, Xin Cheng, Xiang Han, Mei Cui, Lan Tan, Lei Feng, John Suckling, Qiang Dong and Jin-Tai Yu: the OPTIMAS investigators in Journal of Cerebral Blood Flow & Metabolism
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Grants from the National Natural Science Foundation of China (91849126), the National Key R&D Program of China (2018YFC1314700), Shanghai Municipal Science and Technology Major Project (No.2018SHZDZX01) and ZHANGJIANG LAB, Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of the Ministry of Education, Fudan University.
Acknowledgements
We thank the Neale lab, Gene Atlas and Cerebrovascular Disease Knowledge Portal for providing summarized statistics, and all authors of the original works for generous sharing. We also thank the UK Biobank, the North American (USA) multi-centre Genetics of Cerebral Haemorrhage on Anticoagulation (GOCHA) study, the European member sites contributing to the International Stroke Genetics Consortium (ISGC-EUR), the Genetic and Environmental Risk Factors for Haemorrhagic Stroke (GERFHS) I, II and III, and the MEGASTROKE study for providing genotype and phenotype data supporting the original work of all GWASes used in this study.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Authors’ contributions
JTY, LT and QD contributed to conception and design of the study. XQZ, YXY and SDC contributed to acquisition, analysis, and interpretation of data. XQZ drafted the manuscript. CZ, XYL, SDC, YNO, KK, XC, XH, MC, LF, JS and JTY revised it critically for important intellectual content. All authors approved the current version to be published.
Supplemental material
Supplemental material for this article is available online.
References