Abstract
Moral Foundations Theory (MFT) predicts that moral behaviour reflects at least five foundational traits, each hypothesised to be heritable. Here, we report two independent twin studies (total n = 2020), using multivariate multi-group common pathway models to test the following three predictions from the MFT: (1) The moral foundations will show significant heritability; (2) The moral foundations will each be genetically distinct and (3) The clustering of moral concerns around individualising and binding domains will show significant heritability. Supporting predictions 1 and 3, Study 1 showed evidence for significant heritability of two broad moral factors corresponding to individualising and binding domains. In Study 2, we added the second dataset, testing replication of the Study 1 model in a joint approach. This further corroborated evidence for heritable influence, showed strong influences on the individualising and binding domains (h2 = 49% and 66%, respectively) and, partially supporting prediction 2, showed foundation-specific, heritable influences on Harm/Care, Fairness/Reciprocity and Purity/Sanctity foundations. A general morality factor was required, also showing substantial genetic effects (40%). These findings indicate that moral foundations have significant genetic bases. These influenced the individual foundations themselves as well as a general concern for the individual, for the group, and overall moral concern.
Moral Foundations Theory (MFT: Haidt & Graham, 2007) is perhaps the leading psychological model of moral judgement. The theory predicts that moral foundations are not purely learned but reflect adaptations to solve specialised problems of human collaboration, implemented in brain systems, and guided by genetic mechanisms. A key prediction of the theory is that variation in moral foundations will be in part heritable (Graham et al., 2009, p. 1031). Therefore, resolving the question of whether genes influence moral foundations is essential for the theory because the lack of heritability for moral foundations would falsify a key prediction. However, only two tests of this claim have been reported to date, reaching ambiguous conclusions. Using Australian twin data, Smith et al. (2017, p. 434) concluded that ‘regardless of sample or MFQ measure, we cannot find consistent evidence for genetic influences on moral foundations’. Kandler et al. (2019), analysing a German twin sample, reported substantial genetic influences on Harm, Fairness and Purity foundations, explaining 73, 51 and 28 percent of the variance, respectively. Here, in two studies, we test the heritability of the moral foundations using the data reported in the Smith et al. (2017) and Kandler et al. (2019) studies. Before doing this, we first background the MFT, lay out the rationale for our predictions and then develop an analytic strategy to test them.
Background to the moral foundations theory and the moral foundations questionnaire
Haidt and colleagues began their work on moral foundations by conducting an exhaustive survey of previous morality studies, concluding that much of moral behaviour is driven by moral intuitions rather than by effortful reasoning or purely learned moral norms (Graham et al., 2009). Based on this, they argued that at least five distinct mental systems were required to reflect moral judgement’s complexity: Harm/Care; Fairness/Reciprocity; Ingroup/Loyalty; Authority/Respect; and Purity/Sanctity. They instantiated this model in the moral foundations questionnaire (MFQ), a thirty-item measure consisting of 15 Relevance items and 15 Judgment items (Graham et al., 2011). The Relevance items measure whether participants agree a particular behaviour is relevant to them in making moral judgements (e.g. ‘Whether or not someone acted unfairly’). By contrast, Judgment items measure participants’ agreement with particular moral judgements (e.g. ‘Compassion for those who are suffering is the most crucial virtue’). All MFQ-30 items are measured on a 6-point Likert scale.
The five foundations of the MFT are organised into two domains termed ‘individualising’ (loading on Harm/Care and Fairness/Reciprocity) and ‘binding’, reflected in high scores on the Ingroup/Loyalty, Authority/Respect and Purity/Sanctity foundations (Graham et al., 2009). Binding and individualising domains have taken on a substantial value in their own right, based on data showing that these groupings efficiently capture variance along the liberal-conservative political dimension, with liberals endorsing the individualising foundations more strongly than they endorse the binding foundations, while conservatives tend to support the five foundations strongly and equally (Graham et al., 2009; Kivikangas et al., 2021).
The MFT has generated many hundreds of papers, ranging from confirmation of moderate associations with political orientation (Franks & Scherr, 2015) and with attitudes towards abortion, gambling, immigration and same-sex marriage (Koleva et al., 2012) to associations of binding and individualising domains with brain volumes (Lewis et al., 2012). Here we restrict ourselves to the question of support for a genetic basis for variance in moral foundations endorsement. In Study 1, we apply a multivariate approach to modelling the Australian data reported by Smith et al. (2017). In Study 2, we integrate these data with the larger German twin dataset collected and reported by Kandler et al. (2019), enabling a test of replication of the model generated in Study 1. A key feature of the present studies is the use of multivariate twin models, including a multi-group model of both datasets jointly, allowing us to combine information and thus raise power and test predictions deriving from the MFT.
Study 1
Smith et al. (2017) collected two waves of twin data using brief questionnaires as detailed below. All items used are reported on the Open Science Framework (OSF) page created for this paper. When Wave 1 was planned, the MFQ-30 was not widely available. As a result, the questionnaire used in Wave 1 consisted of 10 items (MFQ-10) from a prototype relevance-only version of the MFQ (Graham et al., 2009, Study 1). At Wave 2, a 20-item scale (MFQ-20) was used, constructed from items in the now widely available MFQ-30 (Graham et al., 2011), with 10 items chosen from each of the two MFQ-30 subscales. The items in the two waves were largely non-overlapping, with only four items included in both Wave 1 and Wave 2.
To estimate the heritability of the moral foundations, Smith et al. (2017) built and tested separate univariate genetic models for each of the five foundations and each of the binding and individualising domains, duplicating these for the male and female data and again for each wave of the data, excluding opposite-sex twins from all analyses. Across the models, many heritability estimates were non-trivial, with some squared path estimates exceeding .3. However, none of these estimates reached significance. Based on these findings, Smith et al. (2017) concluded that there is ‘little evidence that moral foundations are heritable’ (p. 424). They also conducted exploratory factor analyses of the psychometric structure, suggesting that the MFQ relevance-type items did not reliably measure the same construct across waves.
The present study: Combining data and theory-based multivariate models to improve power
As in any study, a range of analytic choices is available for the researchers to analyse these data. First, with two datasets, one can analyse these samples jointly in one model with the benefit of higher power to detect small effects or as separate, smaller samples, which allows modelling sample-specific variance. Likewise, separate male and female analyses could be conducted within each dataset. The more granular approach adopted in Smith et al. (2017) reduces the sample size available in each analysis, not only by restricting models to one wave and one sex but also by excluding opposite-sex twin pairs (27% of the sample). Here, we included both male and female responses in our models. Finally, in addition to combining all available data across waves and sex, we sought to increase power by modelling all five foundations jointly using multivariate models. These multivariate models also allow us to capture predicted model structure, such as the clustering of foundations into individualising and binding domains and can yield additional power over a series of individual univariate models (Schmitz et al., 1998).
Given that the five foundations are theoretically organised into binding and individualising domains, a genetic model that incorporates these domains is necessary to reflect the theory. Such a model allows additive genetic (A), common environmental (C) and unique environmental (E) influences to act via these latent factors, in addition to specific influences acting directly on the manifest scales. Multivariate models have been developed which include such higher-order factor structures and modes of inheritance (Neale & Maes, 1996). Most relevant here is the common pathway model (McArdle & Goldsmith, 1990).
Using this model, in Study 1, we tested three predictions: (1) genetic influences would be necessary to explain the variance in moral foundations scores; (2) each foundation would be genetically distinct (i.e. heritable) over and above the effects attributable to binding and individualising; and (3) the five moral foundations would be organised into broader moral domains of binding and individualising, and these domains would show significant heritable influences. We did not preregister these hypotheses since the data used in these studies were already collected.
The first of these predictions is derived straightforwardly from the postulate in moral foundation theory that moral foundations are evolved mechanisms with substantial genetic individual differences. Prediction 2 is derived from the claim in moral foundation theory that each of the domains is an adaptation to solve a particular problem in cooperation. If each domain requires a distinct biological adaptation, it can be predicted that they should also show distinct heritability. The third prediction regarding the heritability of the binding and individualising domains is more speculative. While the existence of these clusters among the domains is an empirical observation (Zakharin & Bates, 2021), the origin of the clustering is unknown. Models compatible with no heritability for the clusters are possible: for instance, top-down influences such as cultural learning. However, we speculate that the clusters may share input and output systems common to the domains within a cluster (for instance, a norm-detection system might feed into all three binding domains). Similarly, adaptations may evolve to coordinate and equalise the domains. For instance, very high levels of insistence on purity are likely incompatible with very low levels of hierarchy and loyalty. The complex adaptations involved would be expected to show heritable variation in either of these cases.
Method
Participants
All twins reported by Smith et al. (2017) were included. Participants were subjects from a larger sample of Australian twins (Wright & Martin, 2004). The dataset included 772 individuals. After removing participants who failed two attention checks, the final sample consisted of 766 individuals (447 females, 319 males; age M = 25.0, SD = 4.27). A total of 402 individuals participated in both waves, completing both 10-item and 20-item versions of MFQ.
Materials
Moral foundations were measured using abbreviated versions of the Moral Foundations Questionnaire (MFQ). In Wave 1, a 10-item MFQ (MFQ-10) with just two items per foundation was used. In Wave 2, a longer 20-item questionnaire (MFQ-20) with four items per foundation was used. MFQ-10 included only items from the relevance subscale, whereas MFQ-20 included 10 relevance and 10 judgment items (see also the Data and Measures section in Smith et al. (2017)). Each scale was scored as the mean of the responses to its items. To maximise power, we pooled data from Waves 1 and 2 into a single dataset using the following procedure. First, because MFQ-10 items were measured on a 5-point Likert scale, whereas MFQ-20 items were measured on a 6-point Likert scale, scores in each wave were standardised to a mean of zero and an SD of 1 prior to subsequent analysis. Next, for the 402 subjects who had completed both waves, composite scores were generated for each domain by taking the average of scores across the two waves. The two waves were weighted equally. For participants who took part in only one wave, their standardised score for that wave was entered into the analyses.
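The pooling procedure described above can be sketched in a few lines. This is a minimal illustration only, not the code used for the analyses: the participant identifiers and scores are hypothetical, and the use of the sample standard deviation (as R's scale() does by default) is an assumption.

```python
from statistics import mean, stdev

def standardise(scores):
    """Rescale one wave's scores (dict: id -> raw score) to mean 0, SD 1.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    m = mean(scores.values())
    s = stdev(scores.values())
    return {pid: (x - m) / s for pid, x in scores.items()}

def pool_waves(wave1, wave2):
    """Average the standardised scores of participants seen in both waves;
    keep the single standardised score of participants seen in one wave."""
    z1, z2 = standardise(wave1), standardise(wave2)
    pooled = {}
    for pid in sorted(set(z1) | set(z2)):
        available = [z[pid] for z in (z1, z2) if pid in z]
        pooled[pid] = mean(available)  # equal weighting of the two waves
    return pooled

# Hypothetical scale scores: Wave 1 on a 5-point scale, Wave 2 on a 6-point scale.
wave1 = {"t1": 2.0, "t2": 3.0, "t3": 4.0}
wave2 = {"t2": 3.5, "t3": 4.5, "t4": 5.5}
pooled = pool_waves(wave1, wave2)
print(pooled)
```

Standardising within wave before averaging removes the difference in response format between the 5-point and 6-point scales, so that neither wave dominates the composite.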
Twin models and methods
We briefly describe the ACE model and the common pathway model we use here for readers unfamiliar with twin modelling. The classical twin design relies on the naturally occurring phenomenon of twinning (Knopik et al., 2016). While monozygotic (MZ) twins are genetically identical, dizygotic (DZ) twins share on average 50% of their genes. Being raised in the same family, twin pairs share their family environment, with different families experiencing different family environments. These differences allow phenotypic variance to be decomposed into genetic (A), shared environment (C) and unique environment (E) components. MZ twins thus share both A and C, while DZ twins share C + half A. Differences between MZs thus are due only to non-shared environment effects (and measurement error). The effects of unique environment can therefore be estimated as 1 – rMZ. The influence of genes and shared environment can in turn be estimated by solving a system of two linear equations: rMZ = A + C and rDZ = 0.5 × A + C, where rMZ is the phenotypic trait correlation for MZ pairs and rDZ the correlation in DZ pairs. Solving both equations for A we get A = 2 × (rMZ – rDZ), allowing C to be resolved as 1 – (A + E). This simple algebraic formulation can be translated into a structural equation model (SEM) shown in Figure 1.
Figure 1. Decomposing phenotypic trait variance into A (additive genetic), C (shared environmental), and E (unique environmental) variance in the classical twin design. Figure from umx package (Bates, Maes, et al., 2019).
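The algebra above can be checked with a short worked example. The function below is only a sketch of these closed-form estimates, not the structural equation model fitted in this paper, and the twin correlations passed to it are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Point estimates of standardised A, C and E from twin correlations."""
    a = 2 * (r_mz - r_dz)  # A = 2 x (rMZ - rDZ)
    e = 1 - r_mz           # E = 1 - rMZ
    c = 1 - (a + e)        # C = 1 - (A + E), equivalently 2 x rDZ - rMZ
    return a, c, e

# Hypothetical correlations, chosen only for illustration.
a, c, e = falconer_ace(r_mz=0.50, r_dz=0.30)
print(f"A = {a:.2f}, C = {c:.2f}, E = {e:.2f}")
```

With these inputs the estimates are A = 0.40, C = 0.10 and E = 0.50. Nothing in the formulas bounds the estimates: when rMZ exceeds twice rDZ, C comes out negative, which is one reason to prefer fitting the model as an SEM over relying on the algebraic solution.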
Building on this ACE model, the common pathway model permits testing the hypothesis that sources of variance act on a phenotype through one or more common factors while retaining the possibility of additional effects specific to a single phenotype (Eaves et al., 1978; Neale & Maes, 1996). An example of this common pathway model structure is shown in Figure 2.
Figure 2. Diagram of a common pathway model. Each common factor (CF) can be affected by genes (A), shared environment (C) and unique environment (E). Manifest variables may be influenced by one or more common factors but also have their own specific (s) A, C and E influences. Figure from umx package (Bates, Maes, et al., 2019).
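To make the structure concrete, the sketch below computes the cross-twin covariances that a common pathway model implies for MZ and DZ pairs. It is an illustration under stated assumptions rather than the model fitted here: the common factors are assumed to be uncorrelated, and every loading and path value is hypothetical and unstandardised.

```python
def cross_twin_cov(loadings, common, specific, r_a):
    """Expected twin 1 x twin 2 covariance matrix of the manifest variables.

    loadings[i][k]: loading of variable i on common factor k
    common[k]:      (a, c, e) paths into common factor k
    specific[i]:    (a, c, e) paths specific to variable i
    r_a:            genetic correlation between twins (1.0 for MZ, 0.5 for DZ)
    E paths never contribute to covariance between twins.
    """
    n = len(loadings)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(
                loadings[i][k] * loadings[j][k] * (r_a * a ** 2 + c ** 2)
                for k, (a, c, _e) in enumerate(common)
            )
            if i == j:  # specific A and C add only to a variable's own covariance
                a_s, c_s, _e_s = specific[i]
                shared += r_a * a_s ** 2 + c_s ** 2
            cov[i][j] = shared
    return cov

# Hypothetical values: one common factor measured by two scales.
loadings = [[0.7], [0.5]]
common = [(0.8, 0.0, 0.6)]
specific = [(0.3, 0.0, 0.6), (0.2, 0.0, 0.8)]

mz = cross_twin_cov(loadings, common, specific, r_a=1.0)
dz = cross_twin_cov(loadings, common, specific, r_a=0.5)
print(mz)
print(dz)
```

Because shared environment is set to zero in this example, every DZ covariance is exactly half the corresponding MZ covariance; a non-zero C path would raise both by the same amount.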
It is important to note here that heritability estimates obtained from twin studies have (testable) assumptions. Importantly, it is assumed that parents’ treatment of twins is independent of their zygosity (the equal environment assumption). Existing evidence indicates that the equal environment assumption is not violated in ways that bias results significantly (Barnes et al., 2014; Derks et al., 2012). One interesting test of this assumption arises when parents incorrectly believe that their monozygotic twins are dizygotic (or vice versa). In this case, it is the actual rather than the perceived zygosity (which would guide the differences in parenting) that predicts the degree of phenotypic similarity between the twins (Kendler et al., 1993; Scarr & Carter-Saltzman, 1979). Additional assumptions include a lack of assortative mating. Not modelling this leads to lower estimates of heritability and inflated estimates of the shared environment due to the greater than 50% average genetic similarity of DZs induced by parental assortment among other effects (Keller & Coventry, 2005; Keller et al., 2010).
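The direction of the bias from unmodelled assortative mating can be illustrated numerically. The sketch below assumes a trait with known variance components and compares the classical estimates obtained when DZ twins share exactly half of their additive genetic variance with those obtained when assortment raises that proportion. The value of 0.6 is hypothetical and chosen only to make the effect visible.

```python
def twin_correlations(a2, c2, dz_genetic_r):
    """Twin correlations implied by true variance components."""
    r_mz = a2 + c2
    r_dz = dz_genetic_r * a2 + c2
    return r_mz, r_dz

def classical_estimates(r_mz, r_dz):
    """Estimates that wrongly assume a DZ genetic correlation of exactly 0.5."""
    a2_hat = 2 * (r_mz - r_dz)
    c2_hat = r_mz - a2_hat
    return a2_hat, c2_hat

true_a2, true_c2 = 0.50, 0.00  # hypothetical true values
for dz_genetic_r in (0.5, 0.6):
    a2_hat, c2_hat = classical_estimates(*twin_correlations(true_a2, true_c2, dz_genetic_r))
    print(f"DZ genetic r = {dz_genetic_r}: a2 = {a2_hat:.2f}, c2 = {c2_hat:.2f}")
```

With a DZ genetic correlation of 0.5 the true values are recovered; at 0.6 the heritability estimate falls to 0.40 and a spurious shared environment component of 0.10 appears.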
An additional factor in twin models is testing if the models apply equally well to both sexes (Neale et al., 2006). Such sex limitation modelling is similar to the concept of measurement invariance, but specialised for hypotheses testable in twin data. Sex limitation models test whether a behavioural trait is equally heritable in both sexes or if genetic influences are larger in one sex compared to the other (termed quantitative sex limitation) or influencing the phenotype in one sex but not at all in the other (qualitative sex limitation effects).
Quantitative sex limitation expands the classic univariate ACE model by including additional paths that permit the expected covariances of same-sex male and same-sex female twin pairs to differ. The qualitative model further expands the quantitative model when opposite-sex pairs are available, permitting expected covariances in this group also to differ from those of the same-sex DZ pair groups. These additional paths permit modelling of genetic effects present in one sex but not the other (Neale et al., 2006). In the absence of significant sex-specific influences, modelling males and females jointly is appropriate, thus increasing the power of the study to detect small effects (Neale et al., 2006).
Software used
All statistical analyses were completed in R (R Core Team, 2020) using the umx (Bates, Maes, et al., 2019) and OpenMx (Boker et al., 2011; Neale et al., 2016) packages. We used full information maximum likelihood (FIML) estimation to handle incompletely observed indicators. All model fits and comparisons between models were assessed using the difference in −2 × log-likelihood between nested models, which follows a χ2 distribution, and by the Akaike Information Criterion (AIC; Akaike, 1983), which penalises un-parsimonious models. Models with lower AIC indicate a better fit. Models were compared using the umxCompare() function, with the most complex model as the baseline against which successively simpler models were compared. When comparing models we also present Akaike weights, which are interpreted as the conditional probability that each model is the true model among the models being considered (Wagenmakers & Farrell, 2004).
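For readers unfamiliar with Akaike weights, the calculation described by Wagenmakers and Farrell (2004) can be sketched as follows. The AIC values used here are hypothetical and do not come from the models in this paper.

```python
from math import exp

def akaike_weights(aic_by_model):
    """Convert AIC values (dict: model name -> AIC) into Akaike weights."""
    best = min(aic_by_model.values())
    # Relative likelihood of each model given its AIC difference from the best model.
    relative = {name: exp(-0.5 * (aic - best)) for name, aic in aic_by_model.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

# Hypothetical AIC values for three competing models.
aics = {"ACE": 1030.0, "AE": 1004.0, "CE": 1012.0}
weights = akaike_weights(aics)
for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: {weight:.3f}")
```

The weights sum to one across the candidate set, so each is conditional on the set of models compared: adding or removing a candidate changes every weight.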
Power analyses
For the univariate ACE models, and assuming no shared environmental effects, we estimated power using the power.ACE.test() function in the umx package. Given our sample size of 193 MZ and 261 DZ twin pairs, the minimum heritability effect detectable at 80% power was estimated at a2 = .18 (corresponding to a path coefficient of .42). This effect size is comparable to those reported in other studies of similar variables.
Results
Test of sex limitation showing no evidence of sex-specific effects on the moral foundations (p > .999) in Study 1. Models are compared with the most complex Qualitative model.
Note. The ACE model assumes no sex-specific genetic or environmental effects. The quantitative model allows for modelling sex-specific effects that are larger in one sex than in another. The qualitative model expands the quantitative model by allowing situations in which heritability affects one sex but not the other.
Means, standard deviations and twin correlations for five moral foundations and two moral domains in Studies 1 and 2.
Note. rMZ = monozygotic twin correlation; rDZ = dizygotic twin correlation. Standard errors are in parentheses. In the Australian dataset, variables were measured on a Likert scale ranging from 0 to 5. In the German dataset, variables were measured on a Likert scale ranging from 1 to 6.
Testing heritability of the moral foundations
Having established that the data could reasonably be combined into a single analysis, we then tested our three predictions outlined in the introduction. We began by testing our first prediction – that genetic influences would be necessary to explain the variance in moral foundations scores. While our primary test of genetic influence would be conducted using common pathway models (below), we took the opportunity here to test the minimum number of genetic factors compatible with the data as an indication of its genetic complexity. To do this, we first constructed a five-factor Cholesky ACE model of the five foundations. A multivariate Cholesky decomposition (Neale & Maes, 1996) specifies one factor for each of the five foundations for each source of variance (A, C and E) allowing estimations of both direct and shared genetic and/or environmental influences.
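The structure of a Cholesky decomposition can be illustrated with a small sketch. Each source of variance has a lower-triangular matrix of paths, and the implied covariance for that source is the matrix multiplied by its own transpose. The three-variable path values below are hypothetical. The path counts for five variables are consistent with the degrees of freedom of the tests reported for this model: 15 when all paths for one source are dropped, and 4 when the second factor alone is removed.

```python
def paths_per_factor(n_vars):
    """Free paths from each latent factor in a lower-triangular (Cholesky) matrix."""
    return [n_vars - k for k in range(n_vars)]

def implied_covariance(lower):
    """Covariance implied by a lower-triangular path matrix: L times L-transpose."""
    n = len(lower)
    return [
        [sum(lower[i][k] * lower[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

paths = paths_per_factor(5)
print(paths, sum(paths))  # first factor loads on all 5 variables, the last on 1

# Hypothetical additive genetic paths for three variables.
genetic_paths = [
    [0.7, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.3, 0.2, 0.4],
]
genetic_cov = implied_covariance(genetic_paths)
print(genetic_cov)
```

Because the first factor loads on every variable and each later factor on one fewer, dropping factors from the right removes the most specific influences first.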
We compared this multivariate ACE model to one in which all heritable influences were set to zero (the CE model) and to one in which shared environmental influences were set to zero (the AE model). As MZ correlations often exceeded twice the DZ correlations, suggesting nonadditive genetic variance (Verweij et al., 2012), in addition to ACE models, we also tested dominance (ADE) models. We report these in the supplement (see Table S1). While either genetic (A) or shared environmental (C) effects could be dropped from the model without a significant loss of fit (AE model: χ2(15) = 2.83, p > .999; CE model: χ2(15) = 12.66, p = .628), the conditional probability of the AE model being the true model (Wagenmakers & Farrell, 2004) among the three models tested was .98, suggesting that genetic and non-shared influences provide the most parsimonious account of the moral foundations data. We sequentially dropped the latent A variables from the AE Cholesky model, beginning with the right-most variable. It was not possible to reduce the genetic influence to fewer than two additive genetic factors: the loss of fit moving from two factors to a single general heritable influence was significant (χ2(4) = 13.36, p = .01). Moving from this model to one with no genetic influence resulted in a significant loss of fit (p < .001). We test this evidence for genetic variance further in the common pathway model below, and in Study 2, but in the first instance our initial prediction was supported: genetic influences were necessary to explain the variance in moral foundation scores, and preliminary support was found for at least two genetic influences on the data. We next moved to test our predicted common pathway model.
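The p-values attached to these likelihood ratio tests can be recomputed from the reported χ2 statistics and degrees of freedom. The sketch below is a standard-library illustration and is not the routine used by umxCompare(). It evaluates the upper tail of the χ2 distribution with the recurrence for the regularised incomplete gamma function, starting from the closed forms for 1 and 2 degrees of freedom.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    half_x = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-half_x)         # closed form for df = 2
    else:
        a, q = 0.5, erfc(sqrt(half_x))   # closed form for df = 1
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += half_x ** a * exp(-half_x) / gamma(a + 1.0)
        a += 1.0
    return q

# Difference in -2 log-likelihood and df between nested models, as reported in the text.
p_two_to_one_factor = chi2_sf(13.36, 4)
print(f"{p_two_to_one_factor:.4f}")
```

For χ2(4) = 13.36 this gives a value that rounds to the reported p = .01. Any other statistic and degrees-of-freedom pair reported in the text can be passed to the same function.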
Common pathway model of binding and individualising domains
Having established the heritability of the moral foundations, we next tested our second prediction: that the five moral foundations show genetic effects specific to each foundation and that they are genetically organised under heritable higher-order moral domains of binding and individualising (prediction 3).
We began with a theoretically constrained common pathway model constructed with two common factors, one loading on only the individualising foundations and the other loading on only the binding foundations (and with specific A, C and E influences on each foundation). This model (Model 1) fit significantly worse than the baseline model (the saturated five-factor ACE Cholesky model: χ2(21) = 204.59, p < .001). This suggested that either some of the foundations load on both binding and individualising domains or that more than two factors are needed to explain the structure of the moral foundations. We, therefore, moved to test these additional paths and factors, adding a third factor loading on all foundations and testing if any of the foundations were influenced by more than one of the binding or individualising common factors.
We tested, for each foundation, whether fit improved when a path was added from the common factor (binding or individualising) on which that foundation did not initially load. The model improved when a path was added from individualising to the Purity/Sanctity foundation (Model 2; χ2(1) = 37.71, p < .001). In addition, adding a third common factor loading on all foundations improved the model (Model 3; χ2(7) = 144.12, p < .001). This model fit well, and the common factors corresponding to the binding and individualising domains showed strong heritable influences. The general factor was primarily influenced by the non-shared environment. This factor was not predicted by the MFT, but other analyses of large multi-study MFQ datasets support the need for such general effects in the MFQ (Zakharin & Bates, 2021).
Having found that a three-factor model was required to account for the data, we next moved to test if elements of this model could be simplified without a significant loss of fit. Suggestive of a lack of shared environment effects, the three shared environmental common paths were estimated near zero (highest β = .01). Moreover, the genetic path to the general common latent factor was also estimated near zero (estimated β = .02). All four paths were set to zero with no significant loss of fit (χ2(4) = 0.01, p > .999). Similarly, all five specific shared environmental paths for the five moral foundations were estimated near zero (highest β = .08). Dropping these, again, did not significantly reduce fit (χ2(5) = 0, p > .999). This model had two heritable factors corresponding to binding and individualising domains and a general factor influenced by the non-shared environment. The prediction that the binding and individualising domains would be significantly heritable (prediction 3 in the introduction) was thus also supported. However, the overall variance structure of the MFQ was more complex than expected and included a third factor that was purely environmental and affected all foundations.
Finally, we tested the prediction that the individual moral foundations would be genetically distinct over and above the effects of binding and individualising (prediction 2 in the introduction). We tested this by dropping the genetic influences specific to individual foundations from the three-factor common pathway model. The greatest of these specific genetic effects was estimated at β = .16. However, while such effects may be significant biologically, in these data they could be dropped without a significant loss of fit (χ2(5) = 2.20, p = .821). This final model is shown in Figure 3. Table 3 shows the comparisons between the baseline five-factor Cholesky ACE model and the three common pathway models we considered.
Figure 3. Reduced common pathway model in Study 1 showing only significant genetic and environmental influences on five moral foundations. Path values are standardised path coefficients (95% confidence intervals in square brackets).
Table 3. Three common pathway models tested in Study 1 and their fit relative to the 5-factor saturated baseline ACE model.
Note. Model 1 = 2-factor common pathway model, factors correspond to binding and individualising moral domains. Model 2 expands Model 1 by adding a path from individualising domain to the Purity/Sanctity foundation. Model 3 adds a general factor (loading on all five foundations) to Model 2. AIC = Akaike information criterion. Lower AIC values indicate a better fit. The best-fitting model is printed in bold.
Discussion
The multivariate analyses in Study 1 revealed a significant and strong genetic influence on moral foundations. The best-fitting model of moral foundations corresponded largely to our expectation, with heritable binding and individualising domains plus an unexpected general factor with significant unique environmental origin. Foundation-specific heritable influences did not reach significance. We next discuss these findings in more detail, linking them to our expectations and discussing the limitations of the study, especially our low power to detect foundation-level heritability, before moving to Study 2, where we focus on gaining increased power.
In terms of our three predictions, the first prediction – that the moral foundations would show significant heritability – was supported. Specifically, our initial 5-factor ACE model showed that at least two genetic influences are required. By contrast, shared environmental influences could be dropped without a significant loss of fit. These results imply that, in line with the ubiquitous results in behavioural genetics (Polderman et al., 2015; Turkheimer, 2000), individual differences in moral foundations scores show significant heritable influences. The prediction concerning the moral domains (prediction 3 in the introduction) was also supported, namely that in a multivariate model, binding and individualising domains would be needed for good model fit, and that these broader domains would show significant heritable influences, backing the theorised distinction between binding and individualising moral domains (Graham et al., 2009). Genetic influences explained 59% and 64% of the variance in binding and individualising domains, respectively. We also found support for a third general morality factor, influencing all five foundations positively. Speculatively, this may reflect a response bias such as acquiescence (Paulhus, 1991) or social desirability bias (Krumpal, 2011), or perhaps a more substantive factor.
The remaining prediction – that the foundations would prove to be genetically distinct, independent of binding and individualising domains (prediction 2 in the introduction) – was not supported. Though estimated at greater than zero, all specific genetic paths in this model could nevertheless be dropped without a significant loss of fit. This may suggest that differences between foundations within the binding and individualising domains are purely learned in origin, or else that they do rely on a genetic basis, but one that is entirely universal and present equally in all people. Equally probable from these data, however, is that genetic factors may underpin the individual foundations, but we lacked the power to detect these effects. The sample size, and more particularly the abbreviated measures with their reduced ability to detect facet-specific variance, mean that this possibility cannot be ruled out. We therefore conducted a second study, combining the data from Study 1 with a much larger sample that became available to us during the writing of the paper.
Study 2
In Study 2, we combined the data from Study 1 with data from a larger twin study collected independently by Kandler et al. (2019) and made available to us during the writing of this manuscript. This enabled a test of replication of the model built in Study 1 as well as additional power to test the prediction that individual foundations would also show heritable influences.
Kandler et al. (2019) administered 20 items from the MFQ-30 (Graham et al., 2011) to a sample of 822 German twins, including 142 MZ and 227 DZ complete pairs. For completeness, it should be noted that these twenty items differ by three items from the 20 chosen for Wave 2 by Smith et al. (2017). Kandler et al. (2019) conducted univariate assessments of the heritability of latent scores, modelling items as indicators of latent foundations rather than sum scores (Kandler & Zapko-Willmes, 2017). The Harm/Care and Fairness/Reciprocity foundations showed high and significant heritability. By contrast, Ingroup/Loyalty and Authority/Respect reflected mainly shared environment (and a large unique environment/measurement error component). The third binding foundation – Purity/Sanctity – was influenced by both genes and by the shared environment. Kandler et al. (2019) also conducted heritability analyses of latent binding and individualising domains, finding that these mediated most of the genetic variance in Harm/Care and Fairness/Reciprocity, and most of the shared environment effects apparent for Ingroup/Loyalty, Authority/Respect and Purity/Sanctity.
In the present study, we sought to use the large increase in power provided by the Kandler et al. (2019) dataset to (1) test the replicability of the structure and heritable influences on binding and individualising moral domains found in Study 1 and (2) test if the increased power afforded by the combined analysis of two data sets would provide evidence for the third prediction from Study 1, namely that each of the five foundations would be heritable in its own right, independent of genetic influences on binding and individualising domains. This was done by equating the path estimates in the models for each dataset to be the same across both models, thus generating a joint model fitted in both datasets simultaneously.
Modelling began from the three-factor common pathway model found in Study 1 (see Figure 3), with the paths for specific heritability on each foundation freed (to test prediction 2 in the full dataset). We also freed the genetic influence on the general factor, to test for heritable effects.
Method
Participants
In Study 2, we used the dataset described in Study 1 and, in addition, an independent dataset comprising 573 German twin pairs from the Study of Personality Architecture and Dynamics (SPeADy; Kandler et al., 2019), including 903 females and 351 males (age M = 38.06, SD = 20.16). A total of 217 MZ and 334 DZ complete twin pairs from this second dataset were available after removing participants who responded with ‘disagree’ or ‘strongly disagree’ to the attention check item ‘It is better to do good than to do bad’.
Materials
Measurements of moral foundations in the Study 1 dataset were as described above. In the SPeADy dataset, moral foundations were measured using the 20-item version of the Moral Foundations Questionnaire (Graham et al., 2011), with four items per foundation. For compatibility with Study 1, foundations were scored by averaging item responses on each scale.
Power analyses
For the univariate ACE models, and assuming no shared environmental effects, we estimated power using the power.ACE.test() function in the umx package. Given our sample size of 410 MZ and 595 DZ twin pairs, the minimum heritability effect detectable at 80% power was estimated at a2 = .12 (path coefficient = .35) – an effect size smaller than is typical in comparable phenotypes (Polderman et al., 2015).
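The relation between the variance share and the path coefficient quoted above can be checked directly: in a standardised model, a variance component is the square of its path coefficient. The short sketch below is illustrative only; the function names are hypothetical helpers, not part of umx.

```python
import math


def path_from_variance(variance_share: float) -> float:
    """Standardised path coefficient whose square equals the variance share."""
    if not 0.0 <= variance_share <= 1.0:
        raise ValueError("a variance proportion must lie in [0, 1]")
    return math.sqrt(variance_share)


def variance_from_path(path: float) -> float:
    """Proportion of variance explained by a standardised path coefficient."""
    return path * path


# Minimum detectable heritability reported in the power analysis above.
min_a2 = 0.12
print(round(path_from_variance(min_a2), 2))  # 0.35, matching the reported path
```

The same conversion applies in the other direction to any standardised path reported in the Results.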
Results
Test of sex limitation showing no evidence of sex-specific effects on the moral foundations (p > .999) in Study 2. Models are compared with the most complex Qualitative model.
Note. The ACE model assumes no sex-specific genetic or environmental effects. The Quantitative model allows sex-specific effects that are larger in one sex than in the other. The Qualitative model expands the Quantitative model by allowing situations in which heritability affects one sex but not the other.
The correlations for MZ and DZ twins for each of the five foundations and the two broad moral domains are shown in Table 2. MZ twin correlations were higher than DZ twin correlations, suggesting a genetic influence. To test this more formally, a five-factor Cholesky ACE model was constructed using just the German twin data. We then used this to test for statistical evidence of heritability for the five foundations. We compared the saturated five-factor ACE model to one in which all heritable influences were set to zero (the CE model) and to one in which shared environmental influences were set to zero (the AE model). While shared environmental effects could be dropped from the model without a significant loss of fit (AE model; χ2(15) = 6.36, p = .973), dropping genetic effects led to a significant decrease in model fit (CE model; χ2(15) = 49.13, p < .001). As MZ correlations often exceeded twice the DZ correlations, suggesting the presence of nonadditive genetic variance (Verweij et al., 2012), we also tested dominance (ADE) models. We report these in the supplement (see Table S2). As in Study 1, we computed Akaike weights, which estimate the conditional probability that each model is the true model among those compared (Wagenmakers & Farrell, 2004). The conditional probability of the AE model being the true model among the three models tested (ACE, AE, CE) was .99. Therefore, our first prediction, that genetic influences would be necessary to explain the variance in moral foundation scores, was supported in the SPeADy dataset.
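To make the Akaike-weight step concrete, the following sketch computes weights for the three models from AIC differences. It is an illustration under stated assumptions, not the authors' computation: it assumes AIC = −2 ln L + 2k, so that a nested model with 15 fewer parameters differs from the ACE model in AIC by the reported χ2 minus 2 × 15. The function name is hypothetical.

```python
import math


def akaike_weights(aic_by_model: dict) -> dict:
    """Akaike weights: exp(-delta/2) for each model, normalised to sum to 1."""
    best = min(aic_by_model.values())
    relative = {m: math.exp(-0.5 * (aic - best)) for m, aic in aic_by_model.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


# AIC values expressed relative to the saturated ACE model (set to 0).
# ASSUMPTION: each reduced model drops 15 parameters, so its AIC changes by
# (chi-square difference) - 2 * 15. The chi-square values are from the text.
dropped_parameters = 15
aic = {
    "ACE": 0.0,
    "AE": 6.36 - 2 * dropped_parameters,
    "CE": 49.13 - 2 * dropped_parameters,
}

for model, weight in akaike_weights(aic).items():
    print(f"{model}: {weight:.6f}")
```

Under these assumptions the AE model receives a weight above .99, in line with the value reported in the text.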
Testing replication of the Common pathway model of moral foundations
Having established the heritability of moral foundation scores in the new data, we next moved to building a well-fitting model of the joint data. To do this using the two datasets jointly, we estimated the same model simultaneously in both datasets, constraining the path estimates to be equal in both samples. This was implemented using a ‘supermodel’ – a model containing both the full Australian model and data, and a duplicate model containing the German data. This supermodel optimises a joint model fitted in both datasets simultaneously. The MFQ data in each sample were standardised (mean of zero, SD = 1) before being entered into the multi-group model.
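The within-sample standardisation described above can be sketched as follows. This is a minimal illustration with made-up scores; the variable names and data are hypothetical, and the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def standardise(scores: list) -> list:
    """Rescale scores to mean 0 and (sample) SD 1 within one dataset."""
    centre = mean(scores)
    spread = stdev(scores)  # ASSUMPTION: sample SD (n - 1 denominator)
    if spread == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(x - centre) / spread for x in scores]


# HYPOTHETICAL foundation scores; each sample is standardised separately
# before the two groups enter the joint (multi-group) model.
australian_scores = [3.2, 4.1, 2.8, 3.9, 4.5]
german_scores = [2.1, 2.9, 3.4, 1.8, 2.6]

z_australian = standardise(australian_scores)
z_german = standardise(german_scores)
print(round(mean(z_australian), 6), round(stdev(z_australian), 6))
```

Standardising within each sample removes differences in scale means and variances between the datasets, so that equality constraints on paths are not confounded with differences in raw metric.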
We began from the final model in Study 1 (see Figure 3), but freed the A and C influences on the general factor and on the individual foundations, as we hypothesised that these would be significant with more power. We also freed the C pathways on the binding and individualising domains to test whether these would be significant in the larger dataset. As in Study 1, a saturated 5-factor Cholesky ACE model was used as the baseline model for comparisons.
Two Common Pathway Models Tested in Study 2 (Joint Data) and Their Fit Comparative to the 5-Factor Saturated Baseline ACE Model.
Note. Model 1 = 2-factor common pathway model, factors correspond to binding and individualising moral domains. Model 2 = Model 1 with a general factor loading on all five foundations. AIC = Akaike information criteria. Low AIC values indicate a better fit. The best-fitting model is printed in bold.
Finally, we tested whether genetic influences specific to individual foundations could be dropped from the model without a significant loss of fit. Three of these influences (those on Fairness, Ingroup and Authority) were not significant and could be dropped (χ2(3) = 2.05, p = .562). The greatest of these was estimated at β = .23; such effects may prove significant in larger samples, but dropping them did not significantly reduce fit here. Our prediction that individual moral foundations would be genetically distinct was thus partially supported, with specific genetic influences retained for the Harm and Purity foundations (see Table 5 for the model comparisons). The final model of moral foundations is shown in Figure 4.
Path diagram for the final common pathway model of the genetic and environmental influences on five moral foundations in Study 2. Path values are standardised path coefficients. Values in brackets indicate 95% confidence intervals.
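The p-values attached to the model comparisons in this section follow from the upper tail of a chi-square distribution, with degrees of freedom equal to the number of parameters dropped. The sketch below reproduces them using only the Python standard library and the closed-form series for integer degrees of freedom; the function name is hypothetical, and in practice a statistics package would normally be used.

```python
import math


def chi2_sf(statistic: float, df: int) -> float:
    """Upper-tail probability P(X >= statistic) for chi-square with integer df."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and the statistic non-negative")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        series = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * series
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    series = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


# Likelihood-ratio statistics reported in the text.
print(round(chi2_sf(2.05, 3), 3))    # dropping three specific genetic paths
print(round(chi2_sf(6.36, 15), 3))   # AE versus ACE
print(chi2_sf(49.13, 15) < 0.001)    # CE versus ACE
```

A non-significant result means the simpler model fits about as well as the fuller one, which is the criterion used here for dropping paths.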
We also ran this supermodel with the parameters free to differ across the two datasets as a measurement invariance check. We then tested whether we could constrain the parameters to be equal across the two models. While invariance of the factors and the paths was supported (with the exceptions that the genetic influence on the general factor was not needed in the Australian dataset, and the path from the individualising domain to the Purity foundation was not required in the German dataset), the path values themselves could not be constrained to be equal in the two datasets without a significant loss of fit. These differences were typically not large in magnitude and may reflect the different items used in the two datasets. However, future research using unified measurement scales and larger twin populations would be useful to establish this.
Discussion
In Study 2, we jointly fitted the three-factor common pathway model in two independent twin samples. The results confirmed the findings from Study 1: the moral foundations were significantly heritable, and a three-factor common pathway model fitted the data well, with substantially heritable binding and individualising domains, as predicted by the MFT, and a general factor. In Study 2, with additional power, significant heritability was found for the general factor (explaining 40% of variance). This differs from Study 1, where this factor was entirely environmental. We also found that when the two datasets were analysed in the same model, thereby increasing the power to detect smaller effects, two of the foundations, Harm and Purity, were genetically distinct, independent of the individualising and binding domains. This also differs from the results of Study 1, where all five specific genetic paths could be dropped without a significant loss of fit.
General discussion
The present studies aimed to test the heritability of the moral foundations, addressing three predictions: (1) heritable influences are necessary to explain the variance in moral foundation scores; (2) each moral foundation has specific genetic effects, over and above those transmitted through the binding and individualising domains, that is, the foundations themselves are distinguishable at a genetic level; and (3) higher-order binding and individualising domains are themselves heritable.
In two independent datasets, these predictions were largely supported: individual differences in moral foundations showed significant heritable influences. Two common factors corresponding to binding and individualising moral domains (Graham et al., 2009) were required. They were genetically influenced, with heritable effects explaining 66% and 49% of the variance in these domains, respectively, in the combined dataset. A general factor of morality was also required and was heritable, with genetic influences explaining 40% of the variance. Our second prediction – that the foundations are genetically distinct from each other – was also partially supported in Study 2, with two of the five foundations evidencing significant specific genetic influences. Below we discuss these findings in more depth and consider the implications of the models for the MFT.
Two of the highly heritable common factors in our model clearly correspond to the binding (Ingroup/Loyalty, Authority/Respect and Sanctity/Purity) and individualising (Harm/Care and Fairness/Reciprocity) moral domains theorised by Graham et al. (2011), thereby supporting their model. At the same time, three out of five specific genetic effects were non-significant, which was unexpected. This may suggest that differences between individual foundations (for example, distinctions between Authority and Ingroup) are purely learned in origin. At this point, however, an equally plausible hypothesis is that we simply lacked the power to distinguish all the effects in the moral foundations. Given the sample size and, more particularly, the abbreviated measures used in both datasets, which have reduced ability to detect facet-specific variance, this possibility cannot be ruled out. Future studies investigating these issues using larger, extended and even longitudinal twin designs and a wide range of measures would be valuable. In particular, it will be of value to explore whether the five distinct foundations reflect, at a genetic level, different combinations of these two major domains. Another area of interest for future research would be an analysis of distinctions between individualising measures such as Harm/Care and Fairness/Reciprocity and motivations such as compassion in other evolutionary models (e.g. Lin & Bates, 2021; Sznycer et al., 2017).
Finally, our model suggested the presence of a general heritable factor affecting all five foundations roughly equally. The significant general-factor genetic effect impacting all five moral foundations suggests a need to include and explain this additional system organising or coordinating moral preferences across domains. One candidate for such a system is a moral decision-making system, applying, for instance, utilitarian versus deontological reasoning to a broad array of moral problems (Kahane et al., 2018). The lack of shared environmental effects on this factor was surprising, given the stress many models place on family-level effects, whether parenting or the shared environments, such as neighbourhoods or socioeconomic factors, in which siblings are embedded (Nettle et al., 2011). It is also possible that this factor simply reflects the correlation between binding and individualising moral domains, and that implementing this correlation would diminish or even dissolve the general factor. However, the correlated common pathway model is difficult to implement with the twin data. It should also be noted that the lack of reverse-scored items in the MFQ leaves open the possibility that some or even much of the general factor variance is explained by acquiescence (Paulhus, 1991) or halo effects (Nisbett & Wilson, 1977). It is also worth mentioning that new psychometric work on the moral foundations themselves supports both the general factor and additional structure within several of the five foundations: for instance, independent sanctity and purity foundations and a distinction between loyalty to country (patriotism) and loyalty to group and family (Zakharin & Bates, 2021). The nature of this general factor, therefore, warrants further investigation.
Thanks to the open science practices of both Smith et al. (2017) and Kandler et al. (2019), the data could be re-analysed, dramatically increasing power and allowing a test of replication of the model. Our results, however, differ from those obtained by Smith et al. (2017) and Kandler et al. (2019). In contrast to Smith et al. (2017), who found no evidence of heritability for moral foundations, we found significant heritability, though mainly not at the level of all five individual foundations, but at the more general level of binding and individualising domains. The result reflects the value of multivariate analysis and modelling the structure of measures. Similarly, differences in data handling (incorporating both waves of data, and testing sex limitation so that opposite-sex twin data could be included) all increased the effective sample size.
Kandler et al. (2019), in their univariate models, while reporting significant heritability for Purity/Sanctity, Harm/Care and Fairness/Reciprocity foundations, found no significant heritable influences for Ingroup/Loyalty and Authority/Respect. In contrast, in our multivariate common pathway model, we found evidence for significant heritability for all five foundations. This could be accounted for by the increased sample size and the added power of multivariate modelling. In our model, the genetic influences on Ingroup/Loyalty and Authority/Respect foundations flowed not from specific heritable influences on each of these, but from a single genetic influence on the ‘binding’ common factor which influences both Ingroup and Authority. Including this high-level domain provides additional power, and also suggests shared genetic origins for these traits.
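The way heritability reaches a foundation through a common factor can be illustrated numerically. In a standardised common pathway model with uncorrelated factors, the genetic variance a foundation receives from a factor is the squared loading multiplied by the factor's heritability, and any specific genetic path adds its own square. The sketch below uses the factor heritabilities reported in the text, but the loadings are hypothetical placeholders chosen for illustration, not the estimates shown in Figure 4.

```python
def foundation_heritability(loadings: dict, factor_h2: dict, specific_a: float = 0.0) -> float:
    """Total genetic variance share of one foundation in a common pathway model.

    ASSUMES standardised variables and mutually uncorrelated common factors.
    """
    via_factors = sum(loadings[f] ** 2 * factor_h2[f] for f in loadings)
    return via_factors + specific_a ** 2


# Factor heritabilities reported for the combined dataset.
factor_h2 = {"binding": 0.66, "individualising": 0.49, "general": 0.40}

# HYPOTHETICAL loadings for Ingroup/Loyalty (not the paper's estimates).
ingroup_loadings = {"binding": 0.6, "general": 0.4}

# No specific genetic path, as in the final model for this foundation.
h2_ingroup = foundation_heritability(ingroup_loadings, factor_h2, specific_a=0.0)
print(round(h2_ingroup, 4))
```

With these placeholder loadings, the foundation is heritable even though its specific genetic path is zero, which is the pattern described above for Ingroup/Loyalty and Authority/Respect.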
Kandler et al. (2019) also found that all three binding foundations were influenced by the shared environment. This was not the case in our analysis, and it is unclear why. In the Australian data, we found no evidence for shared environmental effects (see Study 1). In the joint modelling, the shared environmental influence on the binding common factor, while non-significant, was nevertheless non-trivial in magnitude, explaining 14 percent of the variance in this factor. It may be that random effects not present in both datasets led to a false positive in the univariate models of the German data. Still, it cannot be ruled out that small C effects are present, perhaps operating via the binding domain. More extensive studies are needed in future.
We should keep in mind the limitations of the study. The present study used a self-report measure of moral values, in short versions of the MFQ rather than the full 30-item instrument, and in western samples only. The results should be replicated using other task-based and behaviour-based measures of morality. Another limitation is the absence of reverse-coded items. If including reverse-coded items in the MFQ decreased the loadings on the general factor, this would point to an acquiescence bias. Finally, even though the study showed that genes influence moral domains, it did not indicate which genes are involved or what they do. Based on our findings, moral foundations may serve as promising targets for gene-hunting research. Larger samples and the full questionnaire would help further define and refine these findings.
Summary
Open science, allowing two datasets to be drawn together, an analysis strategy that maximised the data available for testing our hypotheses, and theoretically targeted causal models yielded clear support for the predicted heritability of the foundations, significant heritable effects of the binding and individualising domains, and a novel general factor. These positive findings suggest that further work would be rewarding: studies with larger samples to generate high power, using the complete 30-item questionnaire, and perhaps molecular genetic studies with much larger non-western samples.
Supplemental Material
Supplemental Material for Testing heritability of moral foundations: Common pathway models support strong heritability for the five moral foundations by Michael Zakharin, and Timothy C Bates in European Journal of Personality
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.