Abstract
Dementia, Parkinson’s disease, multiple sclerosis, and motor neuron diseases cause significant disability and mortality worldwide. Although the etiology of these diseases is unknown, highly correlated disease prevalences would indicate the involvement of common etiologic factors. Here we used published epidemiological data from 195 countries worldwide to investigate the possible intercorrelations among the prevalences of these diseases. All analyses were carried out using nonparametric statistics on rank-transformed data to ensure the robustness of the results. We found that all 6 pairwise correlations among the prevalences of the 4 diseases were very high (>.9, P < .001). A factor analysis (FA) yielded only a single component, which comprised all 4 disease prevalences and explained 96.3% of the variance. These findings indicate common etiologic factor(s). Next, we quantified the contribution of 3 country-specific factors (population size, life expectancy, latitude) to the common grouping of prevalences by estimating the reduction in total FA variance explained when the effect of these factors was eliminated by using the prevalence residuals from a linear regression in which these factors were covariates. FA of these residuals again yielded only a single component comprising all 4 diseases, which explained 71.5% of the variance, indicating that the combined contribution of population size, life expectancy, and latitude accounted for 96.3% − 71.5% = 24.8% of the FA variance explained. The fact that the 3 country-specific factors above accounted for only 24.8% of the FA variance explained by the original (ranked) disease prevalences, while a single grouping factor persisted, strongly indicates the operation of other, unknown factors jointly contributing to the pathogenesis of the 4 diseases.
We discuss various possible factors involved, with an emphasis on biologic pathogens (viruses, bacteria) which have been implicated in the pathogenesis of these diseases in previous studies.
Introduction
Neurological disorders are the leading cause of disability and the second-leading cause of mortality worldwide. 1 They comprise dementia(s) (DEM, including Alzheimer’s disease [AD]), Parkinson’s disease (PD), multiple sclerosis (MS), and motor neuron diseases (MND, including amyotrophic lateral sclerosis [ALS]), among others. The Global Burden of Diseases (GBD) epidemiologic study has documented the prevalence and increasing global burden of these diseases in 195 countries worldwide,2-5 highlighting the need for additional research aimed at determining their etiology, risk factors, and ultimately advancing their treatment.
Despite phenotypic and neuropathological differences between DEM, PD, MS, and MND, research to date suggests that all 4 of these diseases are similarly driven by the interplay of genetic and environmental factors that lead to central nervous system and immune disruptions.6-9
Although the geographical heterogeneity of these diseases has been well documented,2-5 the extent to which their prevalences covary is unknown: highly correlated prevalences would indicate that these diseases share common etiologic factor(s). In this study, we sought (a) to determine the degree to which worldwide prevalences of DEM, PD, MS, and MND are intercorrelated, and (b) to find out how they are grouped in a factor analysis (FA).
Methods
Disease prevalences were the Prevalence Counts given in the recently published GBD estimates of these 4 disorders for 195 countries worldwide (see original publications for detailed descriptions of how the prevalences were determined).2-5 For each country, we also obtained total population,10,11 life expectancy,10,11 and latitude, 12 defined as the absolute angle from the equator (0°) to either pole (90°).
Statistical analyses
To maximize robustness, all analyses were performed on data converted to fractional ranks. More specifically, fractional ranks for each country were obtained for 7 variables, namely disease prevalences for each one of the 4 diseases, population size, life expectancy, and latitude.
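The rank transformation described above can be sketched in a few lines. This is an illustration only, not the authors' SPSS procedure: it assumes that "fractional rank" means the mean rank (ties averaged) divided by the number of cases, and the function name `fractional_ranks` and the toy counts are hypothetical.

```python
def fractional_ranks(values):
    """Mean ranks (ties averaged) divided by the number of cases.

    Assumption: this matches the usual definition of a fractional rank;
    the paper does not spell out its tie handling.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run over all cases tied with the one at `start`.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based average rank of the run
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank / n
        start = end + 1
    return ranks


# Hypothetical prevalence counts for 5 countries (two of them tied).
toy_prevalence = [120, 45, 45, 300, 80]
print(fractional_ranks(toy_prevalence))
```

Each of the 7 variables would be transformed separately in this way, so that every variable lies on the same (0, 1] scale before any correlation, regression, or FA is run.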
First, all 6 possible pairwise nonparametric Spearman coefficients were computed among prevalences of the 4 diseases. Next, 4 linear regressions were performed for each disease to generate residuals of prevalences controlling for (i) population size, (ii) life expectancy, (iii) latitude, and (iv) population size, life expectancy, and latitude together; in these regressions, the ranked prevalence was the dependent variable and one or more of the 3 ranked covariates were the independent variables. Finally, a FA was used to identify potential groupings of the ranked disease prevalences and to quantify the contribution to such grouping by the 3 ranked covariates (population size, life expectancy, and latitude). For that purpose, 5 FAs were performed, as follows (Table 1). (a) In the first FA, the data entered were the ranked prevalences, yielding a grouping outcome of the original prevalences of the 4 diseases; (b) in the second FA, the data entered were the residuals of the linear regression of ranked prevalence versus ranked population size, yielding a grouping outcome in the absence of a population size effect; (c) in the third FA, the data entered were the residuals of the linear regression of ranked prevalence versus ranked life expectancy, yielding a grouping outcome in the absence of a life expectancy effect; (d) in the fourth FA, the data entered were the residuals of the linear regression of ranked prevalence versus ranked latitude, yielding a grouping outcome in the absence of a latitude effect; and (e) in the fifth FA, the data entered were the residuals of the linear regression of ranked prevalence versus ranked population size, life expectancy, and latitude (together) yielding a grouping outcome in the absence of a combined effect of the 3 covariates above. 
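The sequence above (Spearman correlations, residualization against ranked covariates, then FA on raw ranks and on residuals) can be sketched with NumPy on synthetic data. Several things here are assumptions rather than the authors' method: the analyses in the paper were run in SPSS; principal-component extraction on the correlation matrix is assumed because the text speaks of "components" without naming the extraction method; and all names (`rank01`, `residualize`, `first_component_share`) and the simulated data are hypothetical.

```python
import numpy as np


def rank01(x):
    """Fractional ranks without tie handling (fine for continuous toy data)."""
    return (np.argsort(np.argsort(x)) + 1) / len(x)


def residualize(y, covariates):
    """Residuals of an ordinary least-squares fit of y on covariates plus intercept.

    `y` may be a vector or a matrix whose columns are separate dependent variables.
    """
    X = np.column_stack([np.ones(len(y))] + list(covariates))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def first_component_share(data):
    """Eigenvalues of the correlation matrix (descending) and % variance of the first."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    return eigvals, 100 * eigvals[0] / eigvals.sum()


rng = np.random.default_rng(0)
n = 195
covariate = rng.normal(size=n)  # hypothetical stand-in for one measured covariate
hidden = rng.normal(size=n)     # hypothetical unmeasured shared factor
# Four synthetic "prevalences", each driven by both factors plus noise.
prev = (covariate + hidden)[:, None] + 0.5 * rng.normal(size=(n, 4))

ranked = np.apply_along_axis(rank01, 0, prev)  # rank each column separately
cov_r = rank01(covariate)

spearman = np.corrcoef(ranked, rowvar=False)   # Pearson on ranks equals Spearman
eig_raw, pct_raw = first_component_share(ranked)

resid = residualize(ranked, [cov_r])           # remove the covariate's effect
eig_res, pct_res = first_component_share(resid)

print("variance explained, raw ranks:", pct_raw)
print("variance explained, residuals:", pct_res)
print("attributed to the covariate:  ", pct_raw - pct_res)
```

The last printed value mirrors the logic used in the paper: the drop in variance explained by the first component after residualization is read as the covariate's contribution to the common grouping. Passing several ranked covariates in the list given to `residualize` corresponds to the fifth FA.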
Bartlett’s test of sphericity was used to test the null hypothesis that the correlation matrix of the data entered into a FA is an identity matrix and the Kaiser criterion was applied to drop all components with an eigenvalue <1. All statistical analyses were conducted using the IBM-SPSS statistical package (version 27).
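For readers without SPSS, the two checks named above can be approximated as follows. This is a sketch under stated assumptions: it uses the standard textbook form of Bartlett's statistic, χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, applied to simulated data; the function names are hypothetical and the output is not meant to reproduce the values in the paper.

```python
import numpy as np


def bartlett_sphericity(data):
    """Bartlett's chi-square statistic and degrees of freedom.

    H0: the correlation matrix R of the columns of `data` is an identity matrix.
    """
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(R)  # log|R|, numerically safer than log(det(R))
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df


def kaiser_retained(data):
    """Eigenvalues of the correlation matrix that pass the Kaiser criterion (> 1)."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    return eigvals[eigvals > 1]


rng = np.random.default_rng(1)
shared = rng.normal(size=195)                          # hypothetical common factor
data = shared[:, None] + 0.5 * rng.normal(size=(195, 4))

chi2, df = bartlett_sphericity(data)
kept = kaiser_retained(data)
# With df = 6, the chi-square critical value at alpha = .001 is about 22.46.
print(chi2, df, kept)
```

A statistic far above the critical value rejects the identity-matrix hypothesis, and a single retained eigenvalue corresponds to the single-component outcome reported for all 5 FAs.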
Table 1. Design for testing groupings of the 4 diseases using factor analysis (FA) of the variables indicated.
Abbreviations: LAT, latitude; LE, life expectancy; PS, population size.
See text for details.
Results
All 6 possible pairwise scatterplots of the ranked disease prevalences are shown in Figure 1. All Spearman rank correlation coefficients (Table 2) were very high (>.9) and were highly statistically significant (P < .001, 2-sided).

Figure 1. Scatter plots of ranked prevalences amongst the 4 diseases. Numbers are nonparametric Spearman correlation coefficients. N = 195 countries per correlation. See Table 2 for detailed statistics.
Table 2. Nonparametric pairwise Spearman rank correlation coefficients with their 95% confidence intervals (CI) and P values of statistical significance for the 6 pairs of diseases shown.
Factor analyses
In all 5 FAs, Bartlett’s test of sphericity was highly statistically significant (P < .001), rejecting the null hypothesis that the correlation matrix of the data entered was an identity matrix and thus justifying the performance of a FA. All 5 FAs yielded a single grouping component factor (with eigenvalue >1) comprising all 4 diseases; details of the relevant statistics are given in Table 3. These results are depicted in the scree plot of Figure 2 (eigenvalue against FA component) and Figure 3 (percent of variance explained against FA component, a more intuitive measure). Figure 4 exemplifies the contributions of population size, life expectancy, and latitude to the FA variance explained. We found that (i) population size contributed 7.3% to the variance explained in FA, (ii) life expectancy contributed an additional 7.6% to the population size effect, (iii) latitude contributed an additional 13.2% to the population size effect, and (iv) all 3 factors entered together in the analysis (population size, life expectancy, latitude) contributed 24.8%.
Table 3. Results of the 5 factor analyses outlined in Table 1.
Abbreviations: LAT, latitude; LE, life expectancy; PS, population size.

Figure 2. Scree plot of eigenvalue against FA components for the 5 FAs performed. See text and Table 3 for details.

Figure 3. Scree plot of percent variance explained against FA components for the 5 FAs performed. See text and Table 3 for details.

Discussion
The GBD study represents the largest worldwide research effort to quantify the prevalence and health loss associated with hundreds of diseases. Using data derived from GBD 2016,2-5 here we compared the prevalences of 4 phenotypically distinct neurological diseases. We found that the prevalences of the 4 diseases covaried strongly (Figure 1) and highly significantly (Table 2). The results of the factor analyses documented a single grouping comprising all 4 diseases, even when prevalences were controlled for country-specific population size, life expectancy, and latitude (Table 3; Figures 2–4). These findings indicate the presence of additional common etiologic factor(s) among the 4 diseases studied. In fact, such currently unknown factors account for 71.5% of the variance explained by FA (Figure 3). In what follows, we discuss various possible common pathogenic factors.
The manifestation of a disease is the result of an insult (of external or internal source, environmental or genetic in nature, and combinations thereof) and the reaction of the body to it. The worldwide existence of these diseases indicates that their causative factors are distributed globally. Assuming (a) that such neuropathogenic insults are widely shared, (b) that neural systems are very similar in different populations, and (c) that the diverse disease symptomatologies reflect damage to different neural elements (eg, motor neurons, white matter, subcortical/cortical structures), we hypothesize that the variation in disease prevalence essentially reflects a differential vulnerability of these neural elements (eg, motor neurons vs myelin vs subcortical/cortical neurons), in combination with genetic (eg, sex, genes predisposing to autoimmunity) and other factors (eg, a host of environmental and lifestyle factors). We discuss below potential common etiologic causes and highlight the role of pathogens (more specifically, persistent antigens that result from exposure to pathogens coupled with a lack of immunogenetic protection against them) as one potential shared link among these conditions.
Potential common etiologic links
In light of the present findings documenting very high correspondence between the prevalences of these diseases, we turn to factors that have been linked to these conditions to consider potential common underlying causes. Lifestyle factors such as diet, exercise, smoking, and alcohol consumption have been associated with risk for each of the 4 diseases investigated here, to variable degrees.2-9 However, if such lifestyle factors were causally associated, one might expect the global burden of these conditions to have decreased, commensurate with recent global health efforts aimed at reducing such modifiable risk factors, rather than increased.2-5 Furthermore, such efforts have had variable effects. For instance, while smoking is known to have numerous health consequences and is linked to increased risk for ALS (a motor neuron disease) and MS, 4 recent work suggests that decreased rates of smoking may, counterintuitively, be linked to the increase in PD. 3 While such lifestyle factors may influence disease risk, it is not clear that they play a causal role. Other potential contributors include factors associated with latitude, such as ultraviolet radiation and vitamin D, which have been strongly linked to MS5,9 and to a lesser extent to PD 7 ; however, the current finding of very robust correlations in disease prevalences even after accounting for latitude eliminates such latitude-based factors as common contributors to the 4 diseases investigated here.
Environmental factors such as those associated with industrialization (eg, pollutants, pesticides, and other environmental contaminants) have been widely associated with increased risk for PD and ALS.3,4,6,7 Though not as widely established as risk factors for DEM and MS, evidence implicating such industrialization factors in dementia of the Alzheimer’s type and MS is growing,13,14 raising the possibility that the prevalences of the 4 diseases evaluated here may be linked by relative exposure to industrialization and related environmental contaminants that varies across countries. One final factor that we consider in terms of possible etiologic contributions that may be common to all 4 conditions evaluated here is exposure to pathogens. In light of substantial evidence linking infectious agents to each of the 4 conditions evaluated here,15-20 we focus the rest of the discussion on evidence highlighting the potential shared role of pathogen exposure as a common link among these 4 conditions.
Ordinary causes of universal diseases
As previously noted, the remarkable consistency observed with respect to the prevalence of the 4 diseases globally coupled with the extremely high correlations amongst them suggests an inherent commonality. Notably, viruses and other pathogens have been strongly implicated in all 4 conditions investigated in the present study.15-20 From the perspective of microbial etiology, the robustness of the correlations even after removing the effects of geographical location and life expectancy points to a pathogen or family of pathogens that are common universally (eg, Herpesviridae, Influenza). Indeed, several pathogens are so ubiquitous that nearly all adults are suspected of having been exposed over the course of their lifetime.21,22 Furthermore, the very high correlation between the prevalences of Parkinson’s disease and dementia (r = .985, Figure 1) suggests that these, in particular, are likely due to the same or highly similar pathogens which would account for the fact that many patients with Parkinson’s disease progress to dementia and that the 2 conditions share numerous biochemical, molecular, and genetic mechanisms.23,24
Evolutionary protection against pathogens
Given the near universality of some pathogens, including several implicated in the disorders investigated here, one might expect these disorders to be a near universal outcome of pathogen exposure. Fortunately, however, that is not the case because the human immune system is equipped to eliminate foreign pathogens, a process that critically involves human leukocyte antigen (HLA) genes. The HLA region is the most highly polymorphic region of the human genome, having evolved in parallel with microbial evolution to maximize species protection. Thus, despite broad exposure within a population, variability in disease outcomes is expected at the individual level based on differences in HLA composition, such that those individuals possessing HLA alleles that match the epitopes of offending pathogens will be spared from disease outcomes due to their ability to successfully eradicate the pathogens. Consistent with a protective role, prior research has demonstrated that certain HLA alleles protect against dementia and age-related brain structural and functional changes,25-28 presumably due to successful elimination of foreign antigens that would otherwise gradually cause damage if they persisted. 29 Given the high correlations among the 4 conditions evaluated in the present study, it is reasonable to suppose that HLA alleles that are protective against dementia may similarly confer protection against the other conditions investigated here.
HLA and disease: Persistent antigens and autoimmunity
In spite of the evolutionarily protective role of HLA, circumstances arise in which HLA may not afford protection. First, each individual possesses a limited repertoire of HLA genes which vary in their binding affinity and immunogenicity and, consequently, their ability to successfully eliminate foreign pathogens. The absence of sufficient HLA-antigen binding is presumed to result in viral (or other foreign) antigen persistence, eventually leading to inflammation and cell damage. 29 In the case of brain tissue, the nature of the damage and ensuing disease development may depend on any number of factors, including the specific pathogen, route of entry into the central nervous system, selective preference of certain regions for particular microbes, the brain milieu (eg, the effect of lifestyle factors that may mitigate damaging effects), genetic differences that may promote or moderate disease development, and immunosenescence, among others. That is, we speculate that despite phenotypic and neuropathological differences, all 4 conditions investigated in the present study may be a result of antigen persistence due to HLA-antigen incongruence. The second mechanism through which HLA may contribute to disease is autoimmunity. Such is the case with MS, for which HLA DRB1*15:01 has been shown to be predisposing due to molecular mimicry, in which myelin and/or oligodendrocytes are cross-reactive with viral epitopes that match with DRB1*15:01, resulting in an immune response that not only eliminates the virus but also attacks myelin and/or myelin-producing oligodendrocytes. 30 Autoimmune processes have also been implicated in the other conditions investigated here31-33; the extent to which certain HLA alleles may overlap with offending pathogens, thereby resulting in autoimmunity, remains to be fully elucidated in relation to AD (dementia), PD, and ALS (a motor neuron disease).
Putting it together
The current findings document robust correlations among 4 distinct conditions, even after removing the effects of life expectancy, latitude, and population size. These findings point to shared etiological and/or contributory mechanisms in which pathogens may be implicated. Indeed, the varied post-infection sequelae associated with Streptococcus pyogenes (group A streptococcus) represent a model akin to that considered here. In the case of group A streptococcus, it has been shown that exposure to the bacterial pathogen may result in subsequent damage to any of several organ systems, manifesting as conditions ranging from rheumatic fever, glomerulonephritis, and reactive arthritis to Tourette’s syndrome and attention deficits. 34 Here, we speculate that all 4 seemingly disparate conditions investigated in the present study are a result of exposure to ordinary pathogens that are common across the globe. Typically, HLA-mediated immune responses facilitate elimination of those pathogens; however, some genotypes may predispose to certain conditions, as in the case of autoimmunity, or may be less effective at mounting an immune response, thereby resulting in persistent antigens that ultimately cause neurological damage. Future studies aimed at identifying and eliminating persistent antigens that are associated with these diseases are warranted to curb their increasing global burden.
Limitations, Qualifications, and Future Challenges
While the findings provide compelling evidence of global intercorrelations and a systematic covariation among DEM, PD, MS, and MND prevalences, it is necessary to consider study limitations, qualifications, and future directions. Our study analyzed disease prevalences across the 195 GBD countries; the extent to which the findings here extend to other epidemiological indicators such as incidence and mortality remains to be investigated. Furthermore, the influence of additional factors such as sex, age, and ethnicity was not investigated here, as the aim was to obtain a bird’s eye view of the global associations among the diseases. It is possible that measurement error may have influenced the GBD prevalence data on which the present findings were based; however, since the GBD study is the most comprehensive worldwide epidemiological study to date, the findings here likely provide a best estimate of the global associations and systematic covariation of the prevalences of these 4 diseases. Finally, we hypothesized (a) that common pathogens may constitute a shared etiological mechanism (which does not preclude other shared mechanisms, such as sleep alterations, stress, exposure to chemicals and unhealthy foods, etc.) and (b) that a differential vulnerability of the affected neural elements and systems may underlie the pervasive (across countries) systematic covariation of disease prevalences; future studies aimed at identifying specific shared etiologic links and differential neural vulnerabilities are warranted to mitigate the global burden of these 4 devastating diseases.35,36
Footnotes
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Partial funding for this study was provided by the University of Minnesota (the Anita Kunin Chair in Women’s Healthy Brain Aging, the Brain and Genomics Fund, the McKnight Presidential Chair of Cognitive Neuroscience, and the American Legion Brain Sciences Chair). The sponsors had no role in the current study design, analysis or interpretation, or in the writing of this paper. The contents do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.
Declaration of Conflicting Interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
LMJ and APG contributed to data retrieval and writing the manuscript. APG contributed to data analysis and drafted the figures. LMJ and APG reviewed and approved the paper.
Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.
Data Availability
The datasets generated and analyzed in the current study were retrieved from published papers2-5 and are publicly and freely available from them.
Significance Statement
The global prevalences of dementia, Parkinson’s disease, multiple sclerosis, and motor neuron diseases are highly correlated and grouped as a single factor explaining 96.3% of the variance in a factor analysis. Common country-specific measures that may influence prevalence, such as population size, life expectancy, and latitude, accounted for 24.8% of the variance explained, while diseases were still grouped in a single factor, indicating the involvement of other, unknown pathogenic insult(s). Various such possible insults are discussed with an emphasis on globally distributed biologic pathogens (viruses, bacteria) which have been implicated in the etiology of these diseases in previous studies.
