Abstract
Infectious diseases exert a constant evolutionary pressure on innate immunity genes. TLR4, an important member of the TLR family, specifically recognizes conserved structures of various infectious pathogens. Two functional TLR4 polymorphisms, Asp299Gly and Thr399Ile, modulate innate host defense against infections, and their prevalence across populations has been proposed to be influenced by local infectious pressures. If this assumption is true, strong local infectious pressures would lead to a homogeneous pattern of these ancient TLR4 polymorphisms in geographically close populations, while weak selection or genetic drift may result in a diverse pattern. We evaluated TLR4 polymorphisms in 15 ethnic groups in Iran to assess whether infections exerted selective pressures on different haplotypes containing these variants. The Iranian subpopulations displayed a heterogeneous pattern of TLR4 polymorphisms, comprising various percentages of Asp299Gly and Thr399Ile, alone or in combination. The Iranian sample, as a whole, showed an intermediate mixed pattern when compared with commonly found patterns in Africa, Europe, Eastern Asia and the Americas. These findings suggest a weak, or absent, selection pressure on TLR4 polymorphisms in the Middle East, which does not support the assumption of an important role of these polymorphisms in host defense against local pathogens.
Introduction
The innate immune system is constantly exposed to pressures from infectious diseases, suggesting that infectious pathogens have been one of the evolutionary selective forces shaping our genome during human history.1,2 Among immunity genes, positive natural selection, and especially balancing selection, is relatively common compared with other functional gene classes. 1 The first step in initiating an immune response is the specific recognition of conserved structures of bacteria, viruses, fungi and protozoa by so-called pattern recognition receptors (PRRs). 3 The most studied PRRs are the TLRs. 4 Among them, TLR4, encoded by the gene of the same name located on chromosome 9, is the master receptor for the LPS component of Gram-negative bacteria, but it also recognizes other pathogen-associated molecular patterns (PAMPs) from mycobacteria, fungi, viruses and even parasites, such as the malaria parasite. 5 – 8
So far, more than 35 TLR4 polymorphisms have been described, 9 among which the most studied are two non-synonymous single-nucleotide polymorphisms (SNPs) located in the leucine-rich repeat domain responsible for ligand recognition: an A/G transition at SNP rs4986790 (896A/G) that causes an Asp/Gly amino acid change at position 299 of the molecule, and a C/T transition at SNP rs4986791 (1196C/T) that causes a Thr/Ile amino acid change at position 399. These mutations affect the ligand-binding region (Asp299Gly) and the co-receptor-binding region (Thr399Ile) of TLR4, respectively.10,11 It has been shown that these TLR4 polymorphisms have important functional consequences related to the production of pro- and anti-inflammatory cytokines. In addition, they modulate the systemic inflammatory response syndrome in septic shock 12 and influence susceptibility to Gram-negative infections. 13
Based on their prevalence in various populations around the globe, it has been shown that both polymorphisms are ancient and arose more than 65,000 years ago in Africa, before the migration of Homo sapiens out of Africa. Important differences have been described in the prevalence of TLR4 polymorphisms in various populations, possibly depending on local infectious pressure and population migration. The non-synonymous polymorphism Asp299Gly has a high prevalence in sub-Saharan Africa, and it has been proposed to have protective effects against mortality from malaria. 6 However, because it increases susceptibility to severe bacterial infections, the TLR4 haplotype containing solely this polymorphism seems to have disappeared from Asian and American populations. In contrast, Asp299Gly has been found in co-segregation with Thr399Ile in Europeans, and this haplotype appears to be selectively neutral.6,14
Despite the progress in understanding the biology of TLR4, several important questions remain regarding the factors influencing the prevalence of TLR4 polymorphisms in various populations. One of the most intriguing is the degree to which the Asp299Gly and Thr399Ile SNPs (and especially the haplotype containing both mutations) have influenced susceptibility to infections, especially in Europe and West Asia. Two scenarios are possible: one in which these TLR4 polymorphisms strongly influenced susceptibility to infections and, subsequently, their prevalence was under selective pressure, and another in which they had little influence on infection susceptibility and their prevalence in the Eurasian landmass was mainly influenced by genetic drift, as previously proposed. 14 One approach to assess whether infections exerted selective pressure on the TLR4 variants is to investigate these polymorphisms in populations of different ethnic origins that have been living in the same geographical location for a long period of time and under the same infectious pressure. One would expect that, under strong infectious pressure, the prevalence of ancient polymorphisms such as these TLR4 SNPs and haplotypes would become similar in the populations, irrespective of their ethnicity. The Middle East, and especially Iran, is an ideal target for such a study, considering its rich ethnic diversity and its key location on the routes of the 'out of Africa' human migration.
Materials and methods
Description of the Iranian subpopulations
List of the 15 Iranian subpopulations evaluated in this study
Sample collection
Blood samples were collected from unrelated healthy volunteers after obtaining informed consent. All selected participants were self-reported third-generation members of a specific ethnic group, as described above. Individuals of mixed ancestry or from mixed marriages were excluded. The evaluated samples were collected from different Iranian provinces, as shown in Figure 1A.
(A) Iranian distribution of TLR4 haplotypes among the 15 groups and the main locations of these groups in Iran. Approximate geographic sampling location of each group is indicated by black arrows. Circles indicate allele frequency (red, Asp299Gly; yellow, Thr399Ile; blue, Asp299Gly/Thr399Ile). (B) The most likely paths of migration during the ‘out of Africa’ migration of modern humans and the world distribution pattern of TLR4 haplotypes in Europe (EU), Africa (AF), East-Asia (EAS) and the Americas (AM), and the key location of Iran on the route.
Genotyping
DNA samples were extracted from whole blood using the salting out method. 16 PCR was performed as described by Van Der Graaf et al. 17 To screen for the TLR4 polymorphisms Asp299Gly (rs4986790) and Thr399Ile (rs4986791), the amplified sequences were digested with the restriction enzymes Nco-I and Hinf-I (New England BioLabs, Beverly, MA, USA) and separated on a 2% agarose gel stained with ethidium bromide.
Single-nucleotide polymorphisms and haplotype data analysis
Data analysis used the open-source statistical environment R 18 and several of its packages (available for free download at: www.r-project.org). The linkage disequilibrium between the two loci (D' and r) 19 was computed using the LD function in package genetics, 20 and the deviation from Hardy-Weinberg equilibrium (HWE) for each locus was tested using function HWE.test in the same package. Genotype and haplotype frequencies between populations were compared using Fisher's exact test with 10,000 permutations, as implemented by function fisher.test in package stats, 18 as well as G-tests 21 using function test.g in package hierfstat, 22 also with 10,000 permutations. We computed Wright's fixation index FST 23 between populations using calcFst in package polysat 24 and used these values as genetic distances between populations. To search for structure in the genetic data, we projected the FST distances in two dimensions using classical multidimensional scaling (function cmdscale in package stats) 18 and we also conducted hierarchical agglomerative clustering (hclust in the same package). The geographic distances were computed as great circle distances using function distance in package argosfilter. 25 The Mantel correlations 26 between genetic and geographic distances were computed with function mantel in package vegan 27 using 99,999 permutations. Geneland 28 – 30 uses a Bayesian approach to search for population structure in genetic and geographic data; we used this package in an attempt to identify patterns in our samples [with and without geographic structure, 5,000,000 Markov Chain Monte Carlo (MCMC) iterations].
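As an illustration of the two-locus statistics described above, the following is a minimal Python sketch. It is not the R code used in the study: the function names ld_stats and hwe_chi2 and all example counts are hypothetical. It computes D' and r from counts of the four 896/1196 haplotypes and applies a one-degree-of-freedom chi-square test for HWE, which may differ from the test performed by HWE.test.

```python
import math


def ld_stats(n_ac, n_gc, n_at, n_gt):
    """Return (D', r) for two biallelic loci from haplotype counts.

    Haplotypes are named by the allele at 896 (A/G) and at 1196 (C/T),
    so n_gt is the count of haplotypes carrying both variant alleles.
    """
    total = n_ac + n_gc + n_at + n_gt
    p_g = (n_gc + n_gt) / total  # frequency of 896G (299Gly)
    p_t = (n_at + n_gt) / total  # frequency of 1196T (399Ile)
    if not (0 < p_g < 1 and 0 < p_t < 1):
        raise ValueError("both loci must be polymorphic")
    d = n_gt / total - p_g * p_t  # raw disequilibrium coefficient D
    # D is scaled by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_g * (1 - p_t), (1 - p_g) * p_t)
    else:
        d_max = min(p_g * p_t, (1 - p_g) * (1 - p_t))
    d_prime = d / d_max  # signed D'
    r = d / math.sqrt(p_g * (1 - p_g) * p_t * (1 - p_t))
    return d_prime, r


def hwe_chi2(n_hom_ref, n_het, n_hom_alt):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(ld_stats(n_ac=90, n_gc=2, n_at=1, n_gt=7))
    print(hwe_chi2(n_hom_ref=81, n_het=18, n_hom_alt=1))
```

The sketch takes haplotype counts as given; in practice, haplotype phase for double heterozygotes has to be inferred before such counts are available.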
The molecular genetic variation within, and among, populations was tested using different implementations of Analysis of Molecular Variance (AMOVA) as given in the R packages: pegas (function amova), 31 ade4 (functions amova and randtest) 32 and vegan (function adonis).33,34
When necessary, the P-values were adjusted for multiple comparisons using Holm's correction, 35 as implemented by function p.adjust in package stats. 18 Statistical significance was accepted for P-values < 0.05.
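Holm's step-down correction can be sketched in a few lines of Python. This is an illustrative re-implementation under the usual definition of the procedure, not the study's R code; the function name holm_adjust and the example P-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P-values in the same order as the input."""
    m = len(p_values)
    # Indices of the P-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw P-values from three pairwise comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

With many pairwise comparisons, as between 15 populations, the smallest raw P-value is multiplied by the full number of tests, which is why individually nominal differences may not remain significant after correction.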
Results
TLR4 haplotype frequencies for the overall Iranian sample and for each of the studied Iranian subpopulations
TLR4 896A/G (Asp299Gly) and 1196C/T (Thr399Ile) haplotype frequencies in Iran and in the various ethnic groups
The overall Iranian distribution of 896A/G and 1196C/T diplotypes
Data are presented as number (%) of diplotypes in Iranian individuals.
Fisher's exact test for TLR4 diplotype differentiation between populations (uncorrected P-values; no test survives Holm's multiple testing correction).
To evaluate the significance of genetic differences, we also used G-tests [overall: 10,000 permutations, G = 69.24, P = 0.0024; for each pair of populations, see supplementary data (supplementary Table 1)]. As in the case of Fisher's exact test, no paired test survived Holm's multiple testing correction. These results were confirmed by AMOVA. The values of FST genetic distances between population pairs (supplementary Table 2) are graphically represented in Figure 2 (white = min, black = max). It can be seen that Lurs of Luristan and Arabs are the most different, but the differences are relatively small (min = 0.0, max = 0.039, mean = 0.0092, median = 0.0056) and not statistically significant. Multi-dimensional scaling (MDS), 37 a method for visually representing distance matrices, was used to plot the pair-wise FST distances between the 15 studied populations in order to identify any patterns and clusters in the genetic data (Figure 3).
Graphic representation of the FST distances (white, min; black, max).
MDS plot of pairwise FST distances in two dimensions. Populations are classified in terms of the language family/subfamily (for Qeshm, no such classification was possible).
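For readers unfamiliar with FST as a genetic distance, the following Python sketch shows one simple way a pairwise FST matrix can be built from haplotype frequencies. It uses an unweighted estimator, (HT − HS)/HT, and is an assumption-laden illustration: calcFst in polysat may use a different estimator or sample-size weighting, the function names are hypothetical, and the frequencies below are invented rather than taken from the study.

```python
def heterozygosity(freqs):
    """Expected heterozygosity (gene diversity) for a list of frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pairwise_fst(freqs_1, freqs_2):
    """Unweighted FST = (HT - HS) / HT between two populations."""
    # HS: mean within-population diversity.
    h_s = (heterozygosity(freqs_1) + heterozygosity(freqs_2)) / 2
    # HT: diversity of the pooled population (equal weights assumed).
    pooled = [(a + b) / 2 for a, b in zip(freqs_1, freqs_2)]
    h_t = heterozygosity(pooled)
    if h_t == 0:
        return 0.0  # both populations fixed for the same haplotype
    return (h_t - h_s) / h_t


def fst_matrix(populations):
    """Return (names, matrix) of pairwise FST for a dict name -> frequencies."""
    names = list(populations)
    matrix = [[pairwise_fst(populations[a], populations[b]) for b in names]
              for a in names]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical haplotype frequencies in the order (AC, GC, AT, GT).
    pops = {
        "pop_1": [0.90, 0.02, 0.01, 0.07],
        "pop_2": [0.85, 0.05, 0.03, 0.07],
        "pop_3": [0.95, 0.00, 0.02, 0.03],
    }
    names, matrix = fst_matrix(pops)
    for name, row in zip(names, matrix):
        print(name, [round(value, 4) for value in row])
```

A matrix of this kind is symmetric with a zero diagonal, which is the form expected by distance-based methods such as MDS and hierarchical clustering.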

We also executed 10 independent runs of Geneland using only the genetic information, with 1,000,000 MCMC generations, 38 a thinning of 100 and burn-in of 2000, and all converged to classifying all individuals into a single population. Using FST to create a hierarchical clustering of the populations resulted in the phenogram represented in Figure 4. Supporting the finding that the genetic structure of these loci does not reflect geography was the very low and non-significant Mantel correlation between geographic distances (great circle distances) and FST distances between all pairs of populations: r = −0.014, P = 0.49 (9999 permutations).
Hierarchical clustering of the populations based on genetic distances.
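The logic of the Mantel test reported above can be sketched as follows. This is a minimal Python illustration of a one-sided permutation Mantel test on the Pearson correlation, not the vegan implementation used in the study; the function names, the default number of permutations, the random seed and the toy matrices are all assumptions made for the example.

```python
import math
import random


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, P) for the correlation between two distance matrices.

    Rows and columns of the genetic matrix are permuted together, which
    keeps the dependence structure of a distance matrix intact.
    """
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        r_perm = pearson(upper_triangle(genetic, order), geo_vec)
        if r_perm >= r_obs - 1e-12:  # small tolerance for rounding error
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy example: five hypothetical sites on a line, with genetic distance
    # set equal to geographic distance (perfect isolation by distance).
    sites = [0, 1, 3, 7, 12]
    dist = [[abs(a - b) for b in sites] for a in sites]
    print(mantel(dist, dist))
```

In the toy example the two matrices are identical, so the correlation is maximal; a result such as the one reported here, with r close to zero and a large P-value, corresponds to the opposite case in which permuting population labels changes the correlation very little.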
TLR4 polymorphisms in different Iranian ethnic groups with language as a grouping factor
Moreover, to investigate the possible differences between populations speaking languages belonging to different language families, we amalgamated all sampled individuals into three 'language groups' (L-groups): Afro-Asiatic (107 individuals), Altaic (148) and Indo-European (420) (the individuals from Qeshm were excluded from the analysis because of their uncertain linguistic status). Comparing groups based on linguistic affiliation can, potentially, provide insights into processes occurring over several thousand years, including genetic drift and selection pressures. Both SNPs were in HWE and strong LD in all three L-groups, except Afro-Asiatic, where the LD was less pronounced (D' = 0.47, r = 0.25) but still highly significant (P = 0.0003).
Overall G-test values after multiple testing correction
Discussion
Owing to its Middle East location, Iran, the 18th largest country in the world, has a special geographic significance for the various human migrations on the Eurasian landmass, from that of Homo sapiens out of Africa, 39 to the Silk Road, 40 and to more recent periods in history. Iran is a diverse country with various geographical landforms, a wide range of climatic variation, as well as populations of different religions and ethno-linguistic backgrounds. Iran is home to a large number of different ethnic groups including Pars, Turk, Kurd, Arab and Baloch, 41 which have interacted during history with many other groups, such as Macedonians, Arabs, Turks and Mongols. 42 These characteristics are reflected in the 15 different ethno-religious groups sampled from different provinces across the country. In the present study, we took advantage of the ethnic diversity of populations that live in Iran and assessed the prevalence and distribution of TLR4 polymorphisms among these diverse ethnic groups. We hypothesized that if TLR4 polymorphisms strongly influenced susceptibility to infections, the resulting infectious pressure would produce a homogeneous pattern of the ancient TLR4 polymorphisms (Asp299Gly and Thr399Ile) in these ethnically different populations. In contrast, if TLR4 polymorphisms did not have a strong effect on susceptibility to infections, thus resulting in weak selection pressures or even genetic drift, the consequence would be a diverse pattern of TLR4 polymorphisms in the various populations.
The prevalence of the Asp299Gly and Thr399Ile TLR4 polymorphisms differed between Iranian subpopulations, as seen in Table 2 and Figure 1. Interestingly, the heterogeneity of TLR4 polymorphisms and haplotypes in the Iranian subpopulations studied was greater than on the African continent. In populations from both East and West Africa, a homogeneous pattern of TLR4 polymorphisms was seen, characterized by the presence of 5–15% individuals bearing the Asp299Gly SNP, and a much smaller group of individuals bearing Asp299Gly/Thr399Ile in linkage, whereas Thr399Ile is never present alone. It has been suggested that this homogeneous distribution of TLR4 polymorphisms may be caused by the protective effect against severe malaria.6,14 In contrast, it is less clear whether TLR4 polymorphisms also influence susceptibility to infections in the colder climates of Europe and Asia. The TLR4 299Gly/399Ile haplotype comprising both polymorphisms does not seem to modify the response of monocytes to endotoxin and susceptibility to infections, and it has been proposed to be the result of genetic drift in populations from Europe and Asia. 14 If TLR4 polymorphisms indeed do not modify susceptibility to infections in the colder climates of Eurasia, one would expect that the various ethnic populations studied here would maintain a high heterogeneity of these TLR4 polymorphisms. In line with this, the Iranian subpopulations displayed a heterogeneous pattern of TLR4 polymorphisms, comprising various percentages of Asp299Gly and Thr399Ile alone, or in combination. These differences are most probably a result of the specific geographic origin, natural borders and/or ethnic and religious barriers between the populations studied. In this respect, particular patterns can be observed for Baloch, Zoroastrian, Azeri, Jew, Arab, Kurd and Lurs of Luristan (see Figure 1A).
The diversity of TLR4 polymorphisms in Iranian populations is reminiscent of the mixed pattern of TLR4 haplotypes described among Israeli subpopulations. 14 The variation in the prevalence of TLR4 polymorphisms in Iranian subpopulations living in close geographic proximity suggests a weak, or absent, infectious pressure on TLR4 SNPs in the Middle East. Evidence for the absence of a strong infectious pressure among most of the Iranian subpopulations included in this study was also found in human leukocyte antigen (HLA) class II genes. 43 A few recently published studies investigated HLA class II allele and haplotype frequencies in order to find the genetic relationship between the major Iranian subpopulations, 43 based on the fact that HLA data can be used to elucidate the genetic history of human populations.44,45 In Parsee, Zoroastrian and Baloch subpopulations, genetic variation was found to be mainly confined to within-population differences, with little subdivision among the studied Iranian populations. 41 A closer genetic relationship was present between Iranian Arabs and Iranian Jews than between Iranian Arabs and Middle-Eastern Arabs or between Iranian Jews and other Jews. 46 When comparing Kurds and Azeris, another two major ethnic groups in Iran, considerable similarities in HLA class II allele and haplotype distributions were seen, except for DQB1*0503, which was observed with a higher frequency in Kurds than in Azeris. 47
The Iranian subpopulations assessed in the present study displayed an interesting intermediate mixed pattern of TLR4 haplotypes when compared with commonly found patterns in Africa (higher GC, no AT, less GT),6,14 Europe (very rare GC, lower AT, higher GT)14,48– 50 and Central/Eastern Asia14,51– 55 and the Americas (no GC, no AT, no GT) 14 (see Figure 1B). The driving forces influencing TLR4 haplotype frequencies in humans may be local infections for which TLR4 is important as a recognition receptor (e.g. Gram-negative bacteria, mycobacteria). As mentioned above, strong effects of the TLR4 SNPs on infection susceptibility would have resulted in a homogeneous haplotype pattern for the overall Iranian sample. However, this is not the case: all haplotypes were present in various percentages in the Iranian populations, suggesting the absence of strong selective forces. This is in line with Barreiro et al., 56 who showed a weak negative and/or balancing selection on extracellular TLRs, and implicitly on TLR4. As suggested before, a different situation may be present in sub-Saharan Africa, and additional studies involving ethnically different populations living in close proximity should also be performed in warm climates.
To investigate the possible differences between Iranian subpopulations belonging to different language families, the prevalence of TLR4 polymorphisms was analyzed in different ethnic groups with language as a grouping factor. We conclude that higher-level language classification explains a small part of the genetic variance in TLR4 and highlights Afro-Asiatic populations as more different than the other two language groups; however, this may be owing to the internal inconsistency of this group.
TLR4 SNPs have been associated with susceptibility to certain infectious diseases6,7,12,13 because of the receptor's role in recognizing the LPS of Gram-negative bacteria, 8 mannans from fungi, 7 and cell wall structures of the Plasmodium parasite 6 and mycobacteria. 5 Innate immune system responses to specific infectious pressure may be reflected by important differences in the prevalence of TLR polymorphisms in populations, but these differences depend on the functional consequences of certain polymorphisms. The TLR4 299Gly allele has been associated with an increased cytokine response, which seems to be associated with protection from severe malaria. 14 In contrast, the TLR4 haplotype containing both 299Gly and 399Ile does not seem to modify the function of the molecule. 14
In conclusion, in the present study we describe the prevalence of Asp299Gly and Thr399Ile TLR4 polymorphisms in 15 ethnically-different populations from Iran. In contrast to the homogeneity of these polymorphisms in other populations, such as those from the African continent, the Iranian subpopulations display a broad heterogeneity of TLR4 Asp299Gly and Thr399Ile. These findings suggest a weak, or absent, effect of TLR4 polymorphisms on infection susceptibility in the Middle East. However, further studies employing an increased number of markers that are not subject to selective pressure are needed to more thoroughly assess these conclusions.
Footnotes
Acknowledgements
M.I. was supported by the Sectoral Operational Programme Human Resources Development (SOP HRD), financed from the European Social Fund and by the Romanian Government under the contract number POSDRU/89/1.5/S/64109. B.F. was supported by a Rubicon Grant of the Netherlands Organization for Scientific Research (NWO). M.G.N. was supported by a Vici Grant of the Netherlands Organization for Scientific Research.
