Abstract
Because literacy and numeracy are the focus of teaching in schools, whereas general cognitive ability (g, intelligence) is not, it would be reasonable to expect that literacy and numeracy are less heritable than g. Here, we directly compare heritabilities of multiple measures of literacy, numeracy, and g in a United Kingdom sample of 7,500 pairs of twins assessed longitudinally at ages 7, 9, and 12. We show that differences between children are significantly and substantially more heritable for literacy and numeracy than for g at ages 7 and 9, but not 12. We suggest that the reason for this counterintuitive result is that universal education in the early school years reduces environmental disparities so that individual differences that remain are to a greater extent due to genetic differences. In contrast, the heritability of g increases during development as individuals select and create their own environments correlated with their genetic propensities.
Although governments spend huge amounts of money on education—£50 billion annually in the United Kingdom (HM Treasury, 2012)—surprisingly little is known about the causes of individual differences in educational outcomes. Research has focused on group differences, and especially differences between countries and between schools within countries, rather than on individual differences, even though the range of individual differences within any of these groups far exceeds the average difference between groups (OECD, 2010). Because literacy (reading) and numeracy (mathematics) are the target of much early education, it would be reasonable to assume that they are less heritable than general cognitive ability (g, intelligence), which is not taught directly and is viewed as an aptitude inherent in individuals. Another reason for thinking that literacy and numeracy are less heritable than g is that literacy and numeracy are relatively recent human inventions, whereas the abstract reasoning and problem solving central to g seem to be key to human evolution.
Some results from the early literature on genetic research support the assumption that school achievement in reading and mathematics is less heritable than general cognitive ability (g) in childhood. Specifically, g is one of the most well-studied traits, and the evidence across studies indicates that it has a heritability of about .50 (Deary, Johnson, & Houlihan, 2009). Early research on school achievement, which has been studied much less than g, suggests lower heritability. The classic study of school achievement (more than 2,000 twin pairs) found heritabilities of about .40 for English and mathematics performance (Loehlin & Nichols, 1976). However, that study may have underestimated heritability because of restriction of range: The sample was restricted to the highest-achieving high-school twins in the United States, who were nominated by their schools to compete for the National Merit Scholarship Qualifying Test.
Other twin studies of school achievement have yielded a wide range of estimates of heritability, in part because they have been too small to provide reliable point estimates (Bartels, Rietveld, van Baal, & Boomsma, 2002; Petrill et al., 2010; Taylor, Roehrig, Hensler, Connor, & Schatschneider, 2010; Thompson, Detterman, & Plomin, 1991; Wainwright, Wright, Luciano, Geffen, & Martin, 2005). However, a recent study of more than 2,500 representative twin pairs in the United Kingdom found substantial heritabilities (~.65) for literacy and numeracy in the early school years and lower heritability for g (~.35; Kovas, Haworth, Dale, & Plomin, 2007). Similarly, a large study of 8-year-old twins in three countries found an average heritability of .77 for reading, as well as a lower heritability for vocabulary, which is often used as an index of g (Byrne et al., 2009); similar results were found in the U.S. sample alone at age 8 and again at age 10 (Olson et al., 2011). Another reason to suspect that the heritability of g might be lower than the heritability of literacy and numeracy in childhood is that the heritability of g increases in childhood and does not reach the widely reported level of .50 until later adolescence (Haworth et al., 2010).
For the first time, we explicitly compared the heritability of literacy, numeracy, and g in a large and representative sample of twins assessed longitudinally from primary school, at ages 7 and 9, to the beginning of secondary school, at age 12. On the basis of the evidence for the high heritability of literacy and numeracy in adequately powered recent studies, as well as the evidence for increasing heritability of g during childhood, we expected to find that literacy and numeracy would be at least as heritable as g in primary school.
Method
Participants
Twins in the Twins Early Development Study (TEDS) were recruited from birth records of twins born in England and Wales from 1994 through 1996 (Haworth, Davis, & Plomin, 2013). Their recruitment and representativeness have been described previously (Kovas, Haworth, et al., 2007). Children who had severe medical problems or whose mothers had severe medical problems during pregnancy were excluded from the analyses reported here. We also excluded children with uncertain or unknown zygosity, and those whose first language was other than English. The numbers of pairs of monozygotic (MZ) and same-sex dizygotic (DZ) twins, respectively, were 2,415 and 2,251 at age 7, 1,294 and 1,152 at age 9, and 1,942 and 2,192 at age 12, for a total of nearly 7,500 different twin pairs. (Fewer twins were available at age 9 than at the other ages because funds at that time permitted testing only twins born in 1994 and 1995.) Only same-sex twins were used in the present analyses to avoid the potential for inflation of genetic estimates that occurs when opposite-sex DZ twins are included with same-sex DZ twins.
Measures
Literacy, numeracy, and g were assessed longitudinally at ages 7, 9, and 12 using diverse measures for each trait. Psychometric properties have been reported previously for the measures used (ages 7 and 9: Kovas, Haworth, et al., 2007; age 12: Haworth et al., 2007). For example, the .70 correlation between yearlong teacher evaluation of reading performance at age 7 and the test of reading fluency at age 7 provided evidence for reliability and validity of both measures (Dale, Harlaar, & Plomin, 2005). The telephone-administered measure of g at age 7 correlated .62 with g assessed using a standard IQ test administered in person several weeks prior to the telephone test (Petrill, Rempell, Oliver, & Plomin, 2002). At age 12, the median correlation between Web-based tests and in-person administration of the same tests up to 3 months later was .81 (Haworth et al., 2007).
Literacy
Literacy at ages 7, 9, and 12 years was measured using three methods: teacher evaluations, telephone testing, and Web-based testing. Teachers assessed literacy of the twins at each age using yearlong criteria-based ratings of performance developed as part of the United Kingdom (U.K.) National Curriculum; three areas of performance were rated: reading, speaking and listening, and writing. Web testing at age 12 (Haworth et al., 2007) included the Woodcock-Johnson III Reading Fluency test, a test of fluency in reading simple sentences (Woodcock, McGrew, & Mather, 2001); the Peabody Individual Achievement Test, which assesses literal comprehension of sentences (Markwardt, 1997); and the GOAL Formative Assessment in Literacy for Key Stage 3, which is linked to U.K. National Curriculum goals and tests comprehension (e.g., grasping meaning, predicting consequences) and evaluation and analysis of written text (e.g., comparing and discriminating between ideas; GOAL plc, 2002). In addition, at ages 7 and 12, the Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner, & Rashotte, 1999), which involves reading words and nonwords, was administered by telephone. As scores on the two TOWRE subtests were highly correlated, we used a standardized, equally weighted composite scale.
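A composite of this kind can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names and the sample scores are hypothetical, and it assumes that "standardized, equally weighted" means z-scoring each subtest, averaging the two z-scores, and re-standardizing the average.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize scores to mean 0 and SD 1 (population SD)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def equal_weight_composite(subtest_a, subtest_b):
    """Average two z-scored subtests, then re-standardize the average."""
    za = zscores(subtest_a)
    zb = zscores(subtest_b)
    averaged = [(a + b) / 2 for a, b in zip(za, zb)]
    return zscores(averaged)

# Hypothetical raw scores for five children (illustrative only, not TEDS data).
words = [40, 52, 61, 47, 55]
nonwords = [18, 27, 33, 20, 29]
composite = equal_weight_composite(words, nonwords)
```

Re-standardizing after averaging keeps the composite on the same scale (mean 0, SD 1) regardless of how strongly the two subtests correlate.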
Numeracy
Numeracy was assessed at ages 7, 9, and 12 using teacher evaluations and Web-based testing. As with literacy, yearlong teacher ratings of performance were based on criteria developed by the U.K. National Curriculum (Qualifications and Curriculum Authority, 2003), and performance was rated in three areas: using and applying mathematics (computation and knowledge), numbers and algebra (understanding of numbers), and shapes, space, and measures (nonnumerical processes); a component about handling data was also included at age 12. Web-based testing at age 12 included tests closely linked to the U.K. National Curriculum goals based on tests of mathematics developed by the National Foundation for Educational Research (Kovas, Petrill, & Plomin, 2007; nferNelson, 2001): Computation and Knowledge, which tests the use and application of mathematics; Understanding Numbers, which covers word problems and algebra; and Non-Numerical Processes, which tests concepts related to shapes and measures.
General cognitive ability (g)
Verbal and nonverbal ability were assessed at ages 7, 9, and 12 using various formats. When the twins were 7 years old, two verbal and two nonverbal tests were administered by telephone (Petrill et al., 2002): the Vocabulary, Similarities, and Picture Completion tests from the U.K. version of the Wechsler Intelligence Scale for Children—Third Edition (WISC-III; Wechsler, 1992) and the Conceptual Grouping test from the McCarthy Scales of Children’s Abilities (McCarthy, 1972). When the twins were age 9, two verbal and two nonverbal tests were administered by parents using booklets sent to the homes. The verbal tests were the Vocabulary Multiple Choice and General Knowledge tests from the WISC-III as a Process Instrument (Kaplan, Fein, Kramer, Delis, & Morris, 1999). The nonverbal tests, which were adapted from the Cognitive Abilities Test 3 (Smith, Fernandes, & Strand, 2001), were Figure Classifications, which assesses inductive reasoning (the child is asked to identify which shape out of five continues a series), and Figure Analogies, which assesses both inductive and deductive reasoning (the child is asked to identify which shape out of five relates to another shape in the same way as shown in an example). When the twins were age 12, the Vocabulary Multiple Choice and General Knowledge tests from the WISC-III as a Process Instrument (Kaplan et al., 1999), the Picture Completion test from the Wechsler Individual Achievement Test (Wechsler, 1992), and a modified form of Raven’s Standard Progressive Matrices (Raven, Court, & Raven, 1996) were administered via the Web.
Analysis
The analyses reported in this article are based on the quantitative genetic model, which splits phenotypic variance into additive genetic (A), shared (or common) environmental (C), and nonshared (or unique) environmental (E) components (Plomin, DeFries, Knopik, & Neiderhiser, 2013). Within MZ twin pairs, both genetic and shared environmental effects are assumed to have a correlation of 1.0, whereas within DZ twin pairs, shared environmental effects have a correlation of 1.0 but additive genetic effects have a correlation of only .5. Nonshared environmental influences are assumed to be uncorrelated for members of a twin pair and thus contribute to differences within pairs. As is standard in twin analyses, we used residuals correcting for age because the age of twins is perfectly correlated across pairs, and this correlation would otherwise be misrepresented as shared environmental influence. Similarly, we also corrected residuals for sex because MZ twins are always of the same sex. Earlier TEDS studies indicated that ACE estimates differed little between males and females (Kovas, Haworth, et al., 2007), implying no significant gender-related differences in etiology, so we combined data from male and female twins in order to increase the power of our analyses.
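The assumptions in the preceding paragraph fix the twin correlations that any set of standardized A, C, and E components implies. The sketch below makes that mapping explicit; the function name and the example values are hypothetical and are not estimates from this study.

```python
def expected_twin_correlations(a2, c2, e2):
    """Return (r_MZ, r_DZ) implied by standardized ACE variance components.

    a2, c2, e2 are proportions of phenotypic variance and must sum to 1.
    """
    if abs((a2 + c2 + e2) - 1.0) > 1e-9:
        raise ValueError("standardized components must sum to 1")
    # MZ pairs: genetic and shared environmental effects both correlate 1.0.
    r_mz = 1.0 * a2 + 1.0 * c2
    # DZ pairs: additive genetic effects correlate .5, shared environment 1.0.
    r_dz = 0.5 * a2 + 1.0 * c2
    # E is uncorrelated within pairs, so it adds nothing to either correlation.
    return r_mz, r_dz

# Illustrative values only.
r_mz, r_dz = expected_twin_correlations(a2=0.6, c2=0.2, e2=0.2)
```

Because E does not enter either correlation, anything that makes MZ twins differ from each other is attributed to the nonshared environment (together with measurement error).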
We used standard ACE model-fitting analysis in the OpenMx package for R (Boker et al., 2011). By fitting the ACE model for MZ and DZ twins to the data, using an iterative process, we could assess the model’s goodness of fit and estimate the A, C, and E components with confidence intervals. We used a common-pathway model, which derives latent factors from the multiple tests in each domain using maximum-likelihood factor analysis (Rijsdijk & Sham, 2002). We conducted nine common-pathway analyses, one each for literacy (three measures), numeracy (three measures), and g (four measures) at age 7, age 9, and age 12. We also examined correlations within twin pairs (see Table S1 in the Supplemental Material available online). We did not conduct multivariate analyses across literacy, numeracy, and g or longitudinal analyses across ages 7, 9, and 12 because such analyses would address questions that were not central to our investigation and because they would greatly complicate the presentation of our focal results.
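For orientation, the covariance structure implied by a common-pathway model can be written out as below. This is the standard textbook form, not a specification taken from the analysis scripts. The notation is assumed here: λ is the vector of factor loadings, the subscript L marks components of the latent factor, and A_s, C_s, and E_s are diagonal matrices of measure-specific variances.

```latex
\begin{aligned}
\operatorname{Var}(L) &= a_L^2 + c_L^2 + e_L^2 = 1,\\
\Sigma_{\text{within}} &= \boldsymbol{\lambda}\boldsymbol{\lambda}^{\top} + A_s + C_s + E_s,\\
\Sigma_{\text{cross}}^{\text{MZ}} &= \boldsymbol{\lambda}\,\bigl(a_L^2 + c_L^2\bigr)\,\boldsymbol{\lambda}^{\top} + A_s + C_s,\\
\Sigma_{\text{cross}}^{\text{DZ}} &= \boldsymbol{\lambda}\,\bigl(\tfrac{1}{2}a_L^2 + c_L^2\bigr)\,\boldsymbol{\lambda}^{\top} + \tfrac{1}{2}A_s + C_s.
\end{aligned}
```

Under this form, the heritability of the latent factor is a_L^2, which is free of the measure-specific error absorbed by E_s.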
Results
The basic finding can be gleaned from the twin correlations (Table S1): high heritability for literacy and numeracy and more modest heritability for g. For example, the MZ and DZ correlations for the three measures of literacy at age 7 were, respectively, .82 and .52 (speaking and listening), .78 and .47 (reading), and .74 and .41 (writing). Doubling the difference between the MZ and DZ correlations as a rough index of heritability suggests heritabilities of .60, .62, and .66, respectively, for the three literacy measures. In contrast, the correlations for the four measures of g at age 7 suggest heritabilities of .22, .28, .08, and .14.
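The doubling rule can be reproduced directly from the age-7 literacy correlations quoted above. The sketch below is illustrative only: the function name is hypothetical, and the shared and nonshared environment formulas are the standard companions to the doubling rule rather than calculations reported in the text. These rough indices are not the model-fitting estimates.

```python
def falconer(r_mz, r_dz):
    """Rough ACE estimates from MZ and DZ twin correlations."""
    h2 = 2 * (r_mz - r_dz)   # heritability: double the MZ-DZ difference
    c2 = 2 * r_dz - r_mz     # shared environment: r_MZ minus h2
    e2 = 1 - r_mz            # nonshared environment (plus measurement error)
    return h2, c2, e2

# MZ and DZ correlations for literacy at age 7, as quoted in the text.
age7_literacy = {
    "speaking and listening": (0.82, 0.52),
    "reading": (0.78, 0.47),
    "writing": (0.74, 0.41),
}

for measure, (r_mz, r_dz) in age7_literacy.items():
    h2, c2, e2 = falconer(r_mz, r_dz)
    print(f"{measure}: h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```

The h2 values printed for the three measures match the .60, .62, and .66 given in the text.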
Figure 1 shows that heritabilities of literacy and numeracy as estimated from the common-pathway model fitting were substantial at all three ages: .68 on average. In contrast, the heritability of g was significantly lower at age 7 (.38) and age 9 (.41), as indicated by the nonoverlapping 95% confidence intervals in Figure 1. At age 12, the difference in heritability was no longer significant, as heritability increased for g and decreased for literacy and numeracy. Details of these results, including estimates of the contributions of the shared and nonshared environment, are presented in Figures 2, 3, and 4. These results based on our model-fitting analyses generally confirm the patterns apparent in the simple MZ and DZ twin correlations for measures of literacy, numeracy, and g at each age (see Table S1). The model-fitting results are presented in Table S2 in the Supplemental Material and indicate a good fit of the model.

Fig. 1. Heritabilities (with 95% confidence intervals) of literacy, numeracy, and general cognitive ability (g) at ages 7, 9, and 12 years.

Fig. 2. Results from the common-pathway models for literacy, numeracy, and general cognitive ability (g) at age 7. Parameter estimates are presented for additive genetic (A), shared (or common) environmental (C), and nonshared (or unique) environmental (E) components of the latent variables. The variance of each latent variable is 1.0, and all parameter estimates are standardized. Specific components are omitted. WISC-III = Wechsler Intelligence Scale for Children—Third Edition, United Kingdom. (See the text for details on the measures used.)

Fig. 3. Results from the common-pathway models for literacy, numeracy, and general cognitive ability (g) at age 9. Parameter estimates are presented for additive genetic (A), shared (or common) environmental (C), and nonshared (or unique) environmental (E) components. The variance of each latent variable is 1.0, and all parameter estimates are standardized. Specific components are omitted. CAT = Cognitive Abilities Test 3; WISC-III-PI = WISC-III [Wechsler Intelligence Scale for Children—Third Edition] as a Process Instrument. (See the text for details on the measures used.)

Fig. 4. Results from the common-pathway models for literacy, numeracy, and general cognitive ability (g) at age 12. Parameter estimates are presented for additive genetic (A), shared (or common) environmental (C), and nonshared (or unique) environmental (E) components. The variance of each latent variable is 1.0, and all parameter estimates are standardized. Specific components are omitted. PIAT = Peabody Individual Achievement Test; WIAT = Wechsler Individual Achievement Test; WISC-III-PI = WISC-III [Wechsler Intelligence Scale for Children—Third Edition] as a Process Instrument. (See the text for details on the measures used.)
Discussion
We conclude that about two thirds of the differences among children in their literacy and numeracy in the early school years can be explained by genetic differences, and that the heritability of g is significantly lower. It is unclear whether genetic or environmental factors are responsible for this difference in heritability across the three domains. There might be genetically driven neurocognitive processes—such as the use of decontextualized language and abstract symbol systems—that are brought to bear on literacy and numeracy skills, but not g, when formal schooling begins. However, we favor an environmental hypothesis: Universal education for basic literacy and numeracy skills in the early school years reduces environmental disparities so that individual differences in these taught skills are due to genetic differences to a greater extent than is the case for g. In other words, heritability of literacy and numeracy can be viewed as an index of educational equality. This is not true for g because g is not a taught skill.
Support for this hypothesis comes from cross-national comparisons showing that the heritability of early reading skill is greater in societies that teach reading regularly and consistently in kindergarten than in other societies (Samuelsson et al., 2008). Further indirect support can be seen in the contrast between the present finding of only modest shared environmental influence (.10–.20) for literacy and numeracy at ages 7, 9, and 12 (see Figures 2, 3, and 4) and the results for literacy and numeracy readiness in the same sample in their preschool years (Oliver, Dale, & Plomin, 2005), which indicated much greater shared environmental influence, a finding confirmed by other studies (Byrne et al., 2009). At ages 7 and 9, shared environmental influence was substantially greater for g (.48 at both ages) than for literacy and numeracy because, we argue, the effects of family environments on g are not mitigated by universal education, as are the effects of family environment on literacy and numeracy. By age 12, shared environmental influence on g had declined and was no longer significantly different from shared environmental influence on school achievement.
This decline in the influence of the shared environment on g in adolescence has been found consistently in other studies (e.g., Haworth et al., 2010). It has been proposed that shared environmental influences caused by differences in family environments begin to fade and heritability expands as children make their own way in the world beyond their family and increasingly select their own environments. These selected environments are correlated with children’s g-related genetic propensities (McCartney, Harris, & Bernieri, 1990), a process called genotype-environment correlation, which does not occur (or is less strong) in the case of domain-specific skills like literacy and numeracy (Plomin et al., 2013).
Regardless of the causes of the high heritability of literacy and numeracy, finding that two thirds of the total variance in these taught skills can be attributed to genetic differences between children highlights the need to incorporate genetics into educational policy. The field of education has been slow to accept the importance of genetics, as can be seen, for example, in research (Haworth & Plomin, 2011) and in textbooks for teachers (Plomin & Walker, 2003). Some of the reluctance to embrace genetics may be specific to the history and epistemology of education (Wooldridge, 1994). However, much of the reluctance involves general misconceptions about what it means to say that genetics is important (Haworth & Plomin, 2011).
The present findings suggest new ways of thinking about education. If our hypotheses are correct, teaching basic literacy and numeracy skills in the early school years largely erases environmental disparities, leaving genetics as the primary cause of individual differences in these skills between children. However, once children achieve basic literacy and numeracy skills, they can use these skills as tools for learning in general, which contributes to the active genotype-environment correlational processes responsible for the increasing heritability of g. Although basic literacy and numeracy skills require instruction (from the Latin instruere, “to build in”), a genetic way of thinking about education (educare, “to draw out”) is to foster genotype-environment correlation, giving children opportunities to select, modify, and create educational experiences in part on the basis of their genetic propensities, which include appetites as well as aptitudes. This view supports the trend toward adaptive learning systems tailored to each pupil (Tseng, Chu, Hwang, & Tsai, 2008).
Genetics will become increasingly useful in personalized learning as specific genes responsible for the high heritability of literacy and numeracy are identified. Even though many genes of very small effect are likely to be involved, identification of polygenic composites will make it possible to predict strengths and weaknesses and to create learning programs tailored to the individual child (Plomin, 2013). Finally, it is worth reflecting on the following uncomfortable truth: Success in achieving widely accepted educational goals for primary school (e.g., high educational equality, social mobility, maximized potential, and personalized learning) will increase the heritability of academic performance.
Footnotes
Acknowledgements
We thank the twins and their parents, who have contributed to the study since the twins were infants.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
The Twins Early Development Study (TEDS) is supported by the U.K. Medical Research Council (G0901245; and previously G0500079), with additional support from the U.S. National Institutes of Health (HD044454; HD059215). R. Plomin is supported by a Medical Research Council Research Professorship award (G19/2) and a European Research Council Advanced Investigator award (295366). Y. Kovas, I. Voronin, A. Kaydalov, and S. B. Malykh’s research is supported by a grant from the Government of the Russian Federation (Grant 11.G34.31.003).
