Abstract
Hart and Risley claimed the existence of an association between socioeconomic status (SES) and oral language competence, having found that children from lower SES backgrounds presented with less-developed vocabulary than children from higher SES backgrounds. The purpose of this study was to examine the accuracy and generalisability of Hart and Risley’s original finding by quantifying the association between SES and oral vocabulary test scores through systematic review and meta-analysis. Papers including data collected between 2012 and 2022 and examining the association between SES and oral vocabulary test scores through either a correlational or group comparison design were considered for inclusion. Following a database search for peer-reviewed articles, 3055 articles were subjected to title and abstract screening, and 209 were retrieved for full-text screening. In total, 17 relevant studies were identified through systematic review, with nine countries and seven languages represented. The meta-analysis of correlation coefficients for the association between SES and oral vocabulary included 10 studies, drawing upon the data of 9742 children, and yielded a combined positive effect size of approximately moderate magnitude (
Introduction
This study sought to quantify the association between socioeconomic status (SES) and the oral vocabulary of young children through meta-analysis. The idea that SES may be associated with oral language competence in young children was brought to prominence by Hart and Risley (1995) 30 years ago. The purpose of the current study was to examine whether there is empirical evidence for the existence of this association in societies internationally today. Hart and Risley’s (1995) findings have had an enormous impact on subsequent academic research and public initiatives, but their study and its interpretation have also been heavily criticised, and the field examining the role of oral language in educational inequity remains highly contested. Before describing the meta-analysis, this paper provides an overview of this contentious topic.
Educational Inequity and Oral Language
Inequity in education is a feature of modern societies. One aspect of inequity relates to the consistent finding across both developing countries and more economically developed countries that children from lower SES backgrounds tend to have poorer educational outcomes compared to children from higher SES backgrounds (UNESCO, 2017). For example, in the Programme for International Student Assessment (PISA) 2015, there was an association between SES and student performance in science, reading and maths across all OECD countries that took part in the study (OECD, 2018). Such educational disparities have significant consequences for adult outcomes, with educational attainment identified as the best protective factor against unemployment among young adults across OECD countries (OECD, 2024).
Educational inequity is understood to be a multilevel and multifactorial phenomenon arising from a confluence of factors operating within the various social contexts of a child or young person’s life (McAvinue, 2022). One factor which has been suggested as not only contributing to educational inequity but also as potentially actionable in the quest to support children from lower SES backgrounds to achieve their potential within the education system is that of oral language (Hoff, 2013). The idea that oral language may play a role in perpetuating inequity makes sense given the intimate relationship between oral language and literacy (Lerner & Johns, 2012; McCardle et al., 2001; NICHD Early Child Care Research Network, 2005; Tomblin, 2005) and the integral role of oral and written language in education (Kennedy et al., 2012; Shiel et al., 2012). There is also empirical evidence for oral language competence acting as a mediator of the relationship between SES and academic achievement (Durham et al., 2007; Lurie et al., 2021; Maguire et al., 2018; Von Stumm et al., 2020; Zhang et al., 2013). The proposition that oral language competence might have a causally mediating role in the poorer academic outcomes of students from lower SES backgrounds and an associated proposal for remediation were brought to prominence within academic and public spheres by Hart and Risley (1995).
The Hart and Risley Study
In 1995, Hart and Risley (henceforth, HR) published ‘Meaningful Differences in the Everyday Experiences of Young American Children’. This monograph documented their longitudinal study in which monthly observations were conducted, over a 2.5-year period, of the language interactions of 42 families living in Kansas City, United States. The families’ SES was classified according to occupation: 13 families were deemed as belonging to upper SES (professional), 10 to middle SES, 13 to lower SES (with middle and lower SES also described as ‘working class’), and 6 were described as being on welfare. The findings revealed significant differences in the vocabulary of 3-year-old children from families of differing SES. For example, children from families on welfare presented with a vocabulary size of approximately half of that of children from families of upper SES. HR located the source of this ‘vocabulary gap’ in a second identified ‘gap’ relating to significant differences in the quality and quantity of language directed at young children by their parents. For example, children from families on welfare were recorded as hearing half as many words as children from working-class families and less than a third of words recorded for children from professional families. As part of their analyses, HR calculated a projection of the cumulative difference in the number of words that would be heard by a child in a family on welfare, compared to a professional family, by the time they reached 4 years old. The projected figure, which was approximately 30 million, was encapsulated in the by now (in)famous phrase, ‘The 30 million word gap’. This phrase has come to represent the findings of HR and more broadly, the idea that poor oral language competence has a causally mediating role to play in the poorer academic achievement of children from lower SES backgrounds.
Impact
HR put out a call to action to remedy the perceived ‘word gap’, and this call has been acted upon. Over the past 30 years, HR’s conclusions have had an enormous impact on academic research and public consciousness. For example, in relation to the former, many researchers participate in the ‘Bridging the Word Gap Research Network’. This is an interdisciplinary network made up of more than 150 researchers, practitioners, policymakers and funders in the United States who have joined forces to push forward a coordinated national research agenda. Their aim is to reduce the ‘word gap’ through initiatives which increase the capacity of parents to enrich their young children’s language learning environments (Bridging the Word Gap Research Network, 2024; see also Fernald & Weisleder, 2015; Golinkoff et al., 2019; Greenwood et al., 2017; Walker & Carta, 2020). In relation to the latter, the ‘30 million word gap’ even caught the attention of the White House when, in 2014, it featured as part of a White House summit to support working families. President Obama subsequently released a video message focused on the importance of supporting young children to bridge the ‘word gap’ to improve their chances for later success in school and life (Shankar, 2014). Indeed, several States in the United States are home to ‘word gap’ initiatives, which aim to encourage parents to enhance the quantity and quality of their language interactions with their young children. Examples include Providence Talks (2025) in Providence, Rhode Island, Too Small to Fail in Oklahoma (Clinton Foundation, 2025), the Thirty Million Words initiative in Chicago (TMW Center, 2025) and Talk with Me Baby in Georgia (Emory University NHWSN, 2025). See Bergelson (2024) for a review of the efficacy of parenting interventions in supporting early language learning.
Criticism
It may appear surprising, given the impact of the HR findings, that the HR study and its conclusions have been criticised harshly on methodological, theoretical and moral grounds.
Methodologically, the HR sample has been criticised for its size, which is very small considering the impact of the findings (Kamenetz, 2018), for the conflation of race and SES and for the comparison of groups at extreme ends of the SES spectrum (i.e. professional versus welfare), which inflates estimates of differences associated with SES (Kuchirko, 2019). The HR data collection methods have been criticised for focusing only on child-directed speech and ignoring other forms of speech within the child’s environment and for attempts at experimental control which may have compromised ecological validity (Sperry et al., 2019). The HR data analyses have been criticised in particular for the creation of the ‘30 million word gap’. This was an extrapolation from the data, which has been described as having dubious methodological properties. However, it would nonetheless be taken literally by many and become a powerful catchphrase galvanising a wide range of research and community efforts to remedy this enormous ‘word gap’ (Purpura, 2019).
As valid as the methodological criticisms of the HR study appear, even more cogent criticisms have been articulated from theoretical and moral perspectives. From a theoretical perspective, scholars from the traditions of sociolinguistics and linguistic anthropology have taken issue with how the HR findings have been interpreted, arguing that efforts to close the ‘word gap’ have proceeded without full acknowledgement of the nature of language, language development, language socialisation or language variation (Avineri et al., 2015; Baugh, 2017; Sperry et al., 2019, 2020). Researchers and policymakers who are committed to ‘bridging the word gap’ have interpreted the HR findings along the following lines: Children from lower SES backgrounds present with differences or deficits in oral language, which have deleterious implications for their later academic achievement. These differences or deficits have been caused by the nature of parental talk to children, which can be remedied by encouraging parents to engage in more frequent and higher quality language interactions with their children. Such parental practices will increase the language competence of their children, which will have a positive impact on their academic achievement, and which will ultimately advance educational equity for children from lower SES backgrounds (Kuchirko, 2019). Sociolinguists have pointed out, however, that this interpretation is, first of all, based on a reductionist understanding of language as being simply made up of words (Blum, 2015) when language is a complex system made up of phonology, morphophonology, morphology, syntax and semantics, in addition to the lexicon (i.e. vocabulary) (Baugh, 2017). Furthermore, sociolinguists have demonstrated that for the expression of sophisticated thought, the number of words used is irrelevant (Blum, 2015; Labov, 1973) and indeed, the use of an excessive number of words in an utterance can obscure meaning (Krashen, 2012). 
Second, sociolinguists have pointed out that efforts to increase parent-child language interactions are based on an implicit assumption that the language socialisation patterns common to Western societies, which typically involve a high degree of one-to-one parent-child interaction, are optimal for child language development (Hirsh-Pasek et al., 2018). This assumption betrays a lack of awareness that there are many different forms of language socialisation in existence in different cultures throughout the world, with the Western approach being quite unusual, and yet children everywhere learn the languages of their societies (Blum, 2017). Indeed, a recent study space analysis of the research literature on child-directed speech concluded that even though it is widely assumed within the field of language development that child-directed speech facilitates language learning, there is a limited amount of empirical data which can robustly and directly support this claim (Kempe et al., 2024). Third, sociolinguists have pointed out that HR’s findings have been interpreted without considering the existence of language variation. Language variation refers to the tendency for language to evolve into different forms or varieties which are used by different groups (Hazen, 2008; Owens, 2016) and for different purposes (Bell, 1984; Coupland, 2007). Sociolinguists have demonstrated the linguistic equality of all language varieties, with all varieties demonstrated to be systematic, rule-governed, generative and creative (Figueroa, 2024). However, they have also discussed the prevalence within society of standard language ideologies which assume the superiority of the language variety spoken by the dominant classes (Siegel, 2006). In the HR study, the patterns of language use employed by the professional classes (e.g. use of more words) were taken as the benchmark against which the language patterns of other groups were compared, without considering that each group may have been communicating in different language varieties which were linguistically equal but just different.
From the moral perspective, research around the ‘word gap’ and efforts to remediate it have been described as constituting a social discourse that is underpinned by a deficit perspective. This deficit perspective locates the cause of educational disparities within the language practices and skills of lower SES communities, placing the onus on these communities to redress educational inequity themselves. It has been argued that this social discourse obscures the real causes of educational inequity and has the unintended consequence of contributing to the very educational inequity that it is attempting to mitigate (Abraham, 2020; Avineri et al., 2015; Baugh, 2017; D. C. Johnson et al., 2020; E. J. Johnson et al., 2017; Kuchirko, 2019; Wang et al., 2021). Some have gone so far as to describe this social discourse as a form of linguistic racism (Cushing, 2023; Figueroa, 2024).
Accuracy and Generalisability of the HR Findings
Setting aside, for the moment, the significant criticisms that have been made around the interpretation of the HR findings and the consequences of this interpretation, it is important to consider whether the findings themselves are an accurate representation of the association between SES and language and whether the identified associations generalise to other studies conducted within the United States and beyond (Purpura, 2019). The HR findings consisted primarily of two identified associations:
1. Children from lower SES families have significantly poorer oral language competence, as measured through vocabulary size.
2. Parents from lower SES families speak significantly less to their young children than parents from higher SES families.
The second of these claims has been examined in a number of empirical studies and meta-analyses. Sperry et al. (2019) took issue, in particular, with the fact that HR focused on child-directed speech and discounted other speech available within the child’s environment. They analysed language data from studies conducted in five American communities and found a weak association between social class and child-directed speech. They found no association, or even a reversal of the direction of the relationship, when more expansive definitions of the verbal environment were employed. Dailey and Bergelson (2022) quantified the association between SES and language input to young children in a meta-analysis of 19 studies involving 1991 participants speaking four languages in five countries (though mostly within the United States). When considering child-directed speech, a statistically significant association was found between SES and language input, which, although deemed to be of large magnitude (Hedges’
The focus of the current study is on the first of HR’s claims, that children from lower SES backgrounds have poorer oral language competence. This finding has not yet been examined through meta-analysis but has been examined in two narrative reviews. Hoff (2013) reviewed empirical evidence on the association between SES and children’s early language skills, examining research on vocabulary size, grammatical development, narrative skills, phonological awareness and speed of language processing. She concluded that ‘the effect of SES on children’s early language skills is large, pervasive and robust’ (p. 4). Pace et al. (2017) also reviewed empirical evidence documenting how children from low-income backgrounds consistently perform below their more advantaged peers on standardised measures of ability across language domains, including prelinguistic development, vocabulary development, grammatical development, and phonological development.
Current Study
The current study sought to quantify the association between SES and oral language competence through meta-analysis with a view to establishing whether HR’s original finding generalised to other samples and other countries beyond the United States and to obtain an estimate of the magnitude of the association internationally. Scores on objective vocabulary tests were chosen as the measure of oral language competence as vocabulary size was the original measure used by HR and vocabulary is the aspect of language that has been described as being most affected by SES (Pace et al., 2017). Studies which had collected their data during a recent time period, between 2012 and 2022, were included in the meta-analysis. The HR study was conducted more than 30 years ago. Given that the socioeconomic gradient of countries can change over time (Dow & Rehkopf, 2010), it is possible that older studies examining associations between SES and various outcomes may have little bearing on current situations. The objective of this meta-analysis was to obtain a relatively current estimate of the SES–vocabulary association that may have relevance for societies today and so, studies with data collected from 2012 were admitted. As studies approached this topic using both correlational and between-group designs, separate meta-analyses were conducted synthesising the identified correlations between SES and oral vocabulary test scores and synthesising the standardised mean difference in oral vocabulary test scores between higher and lower SES groups.
Method
This systematic review and meta-analysis is reported according to the PRISMA 2020 guidelines (Page et al., 2021).
Eligibility Criteria
The aim of the meta-analysis was to obtain a contemporary estimate of the association between SES and oral vocabulary in monolingual children and young people. The purpose of focusing on monolingual samples was to avoid confounding SES with multilingualism. The inclusion and exclusion criteria designed to achieve this aim are presented in Table 1.
Table 1. Inclusion and Exclusion Criteria.
Search Strategy
A database search was conducted on 31 July 2022. The databases searched and the search string used are presented in Table 2. Figure 1 presents a flowchart that summarises the results at each stage of the selection process. Records were stored and reviewed in EndNote. In total, the database search yielded 5333 records, of which 2278 were duplicates. Three thousand and fifty-five records were subjected to title and abstract screening, which resulted in the exclusion of 2846 records. Two hundred and nine papers were retrieved for full-text screening. As part of this process, it was necessary to contact 34 authors to obtain clarification of details which were not present in the paper. Queries were usually about the year of data collection, the language status of the sample children (i.e. monolingual or multilingual) and sometimes, statistical data. In each case, the corresponding author was contacted on two occasions over a period of months and if no reply was received, co-authors listed on the paper were contacted. In total, no reply was received from the authors of six papers, and these papers were excluded from the review. Of the 209 papers subjected to full-text review, 192 were excluded for reasons which are presented in Figure 1.
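The record counts reported for each stage of the selection process can be checked with simple arithmetic. The short Python sketch below uses only the figures given in this section; the variable names are illustrative and are not part of the original analysis.

```python
# Record counts as reported in the Search Strategy section.
records_identified = 5333
duplicates_removed = 2278
excluded_at_title_abstract = 2846
excluded_at_full_text = 192

# Records remaining after each stage of the selection process.
screened = records_identified - duplicates_removed
full_text_reviewed = screened - excluded_at_title_abstract
included = full_text_reviewed - excluded_at_full_text

print(screened, full_text_reviewed, included)  # 3055 209 17
```

The three results match the 3055 records screened by title and abstract, the 209 papers retrieved for full-text screening and the 17 studies included in the review.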
Table 2. Details of Search Strategy.

Figure 1. Flowchart, based on PRISMA 2020, illustrating the number of articles identified, excluded and included throughout the literature search process.
Extraction
For all included studies, details of the publication, sample, methods and results were extracted. Extracted variables are described in Table 3.
Table 3. Variables Extracted From Included Studies.
Tables 4 and 5 present a description of the correlational and group studies included in the meta-analyses. Ten studies met the inclusion criteria for the meta-analysis of the correlation between SES and oral vocabulary (Table 4). The 10 studies were conducted in seven different countries with six languages represented. One study each came from Italy (Italian), Turkey (Turkish), Chile (Spanish), the United States (English), Ireland (English) and The Netherlands (Dutch), and four studies came from China (Chinese/Cantonese/Mandarin). Collectively, the correlational studies included 9742 participants. All studies included samples of mixed gender, and all samples represented young children around the preschool or early school years (i.e. under 7 years). Seven studies met the inclusion criteria for the meta-analysis of the standardised mean difference between lower and higher SES groups (Table 5). These seven studies were conducted in six countries using four languages. One study each came from Iran (Persian), China (Cantonese), Ireland (English), New Zealand (English) and the United States (English), and two studies came from Chile (Spanish). Collectively, the group studies included 1491 participants. All samples were of mixed gender, and all included children aged 7 years and under. Although the inclusion criteria were originally open to studies including samples of children and young people aged 1 to 21 years, with a view to potentially examining differences in effect sizes for samples of different ages, no studies examining the performance of children or young people above the age of 7 years on objective vocabulary tests were identified.
Table 4. Details of Correlational Studies Included in Meta-Analysis.
Table 5. Details of Group Studies Included in Meta-Analysis.
Variables used to represent SES and measures used to assess oral vocabulary varied across the correlational and group studies. Where a study included multiple measures of SES, variables were chosen for extraction according to the following hierarchy: maternal education was prioritised, followed by parental or caregiver education (which is often maternal education by another name but can reflect paternal or other caregiver education), followed by a composite SES variable. Maternal education was prioritised for extraction here as it appears to be the most commonly used SES proxy within language development research (Dailey & Bergelson, 2022; Piot et al., 2022) and has been described as the component of SES that is most strongly related to child development outcomes (Pace et al., 2017). Among the correlational studies, SES was represented by maternal, parental or caregiver education in seven studies and three studies used a composite variable. Among the group studies, three studies defined lower and higher SES groups on the basis of a school disadvantaged status variable, two on the basis of an SES composite variable, one on the basis of income and one on the basis of maternal education. Studies providing either a receptive or expressive vocabulary measure were included in the meta-analysis as these are two aspects of the primary oral language system, relating to comprehending and producing words, respectively (Lerner & Johns, 2012), which tend to be highly correlated (Smith, 1997; Ukrainetz & Blomquist, 2002). Where studies included a measure of both expressive and receptive vocabulary, the data related to the expressive vocabulary test were extracted to align with HR’s measures. Among the correlational studies, the extracted correlation was based on expressive vocabulary in four studies and receptive vocabulary in six.
Among the group studies, extracted data related to an expressive vocabulary test in two studies, receptive vocabulary in four studies and a combined expressive and receptive test in one study. Studies used a variety of objective vocabulary tests but common among them was the Peabody Picture Vocabulary Test, which was employed by 7 of the 17 studies. In relation to the group studies, where the data from several groups were available for extraction, data related to the most disparate groups (i.e. highest versus lowest) were chosen, while taking into account sample size. Where a study had a longitudinal approach and samples were tested multiple times, the data for only one timepoint (i.e. the first timepoint after 2012) were extracted.
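The extraction rules described above amount to a small set of priority orderings. The Python sketch below illustrates them under stated assumptions: the labels, the function names and the example study are hypothetical, and treating 2012 itself as an eligible timepoint is an assumption based on the 2012 to 2022 inclusion window. It is not the procedure or software used by the authors.

```python
# Priority order for SES variables, following the hierarchy described in the text.
SES_PRIORITY = ["maternal_education", "caregiver_education", "composite"]
# Expressive vocabulary is preferred when both measures are reported.
VOCAB_PRIORITY = ["expressive", "receptive"]


def pick_by_priority(available, priority):
    """Return the first item in `priority` that the study reports, else None."""
    for option in priority:
        if option in available:
            return option
    return None


def first_timepoint_after(years, cutoff=2012):
    """Return the earliest data-collection year at or after `cutoff`, else None.

    Assumption: a timepoint in the cutoff year itself is treated as eligible.
    """
    eligible = [y for y in years if y >= cutoff]
    return min(eligible) if eligible else None


# Hypothetical longitudinal study reporting two SES variables and two vocabulary tests.
study = {
    "ses": {"composite", "caregiver_education"},
    "vocab": {"receptive", "expressive"},
    "years": [2010, 2013, 2015],
}

print(pick_by_priority(study["ses"], SES_PRIORITY))      # caregiver_education
print(pick_by_priority(study["vocab"], VOCAB_PRIORITY))  # expressive
print(first_timepoint_after(study["years"]))             # 2013
```

Studies reporting only a variable outside the hierarchy, such as income or school disadvantaged status, return None in this sketch; in the review such variables were extracted when they were the only SES measure available.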
Meta-Analysis
The meta-analysis was conducted in R Version 4.1.1, drawing upon the meta, metafor and dmetar packages and following guidance provided by Harrer et al. (2021). The effect size relating to the correlational studies was calculated using the metacor function, while the group studies were analysed using the metacont function. The meta-analyses were fitted using a random effects model, which assumes that there is a distribution of true effect sizes underlying the effect sizes found in individual studies. Between-study variability, tau-squared (τ²), was calculated using the Restricted Maximum Likelihood Estimator, which has been recommended for continuous data. A Knapp-Hartung adjustment, recommended for meta-analyses including a small number of studies, was included as a measure to control for the uncertainty in the estimate of between-study heterogeneity and to provide a more conservative estimate of statistical significance of the pooled effect size. For the correlational studies, a Fisher’s
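To illustrate the general logic of random-effects pooling of correlations, the following Python sketch pools correlation coefficients on the Fisher z scale. It is a simplified illustration and not the authors' R code: it substitutes the closed-form DerSimonian-Laird estimator of τ² for the Restricted Maximum Likelihood Estimator, omits the Knapp-Hartung adjustment, and uses hypothetical correlations and sample sizes. The function name `pool_correlations` is likewise an assumption.

```python
import math


def pool_correlations(rs, ns):
    """Random-effects pooling of correlations on the Fisher z scale.

    Uses the DerSimonian-Laird estimator of tau-squared (a stand-in for REML).
    Returns the pooled correlation and the tau-squared estimate.
    """
    zs = [math.atanh(r) for r in rs]       # Fisher's z transformation
    vs = [1.0 / (n - 3) for n in ns]       # sampling variance of each z
    w = [1.0 / v for v in vs]              # inverse-variance (fixed-effect) weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    df = len(rs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)          # between-study variance, floored at zero
    w_re = [1.0 / (v + tau2) for v in vs]  # random-effects weights
    z_pooled = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return math.tanh(z_pooled), tau2       # back-transform the pooled z to r


# Hypothetical correlations and sample sizes, for illustration only.
pooled_r, tau2 = pool_correlations([0.20, 0.35, 0.50], [120, 300, 80])
print(round(pooled_r, 3), round(tau2, 4))
```

Adding τ² to each study's sampling variance makes the weights more equal across studies, so large studies dominate the pooled estimate less than they would under a fixed-effect model.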
Results
Meta-Analysis of Correlation Coefficients
The correlation coefficients of the individual studies and the pooled estimate are presented in the Forest Plot in Figure 2. The weighted average for the correlation coefficients was approximately of moderate magnitude (Cohen, 1988), statistically significant and in a positive direction:

Figure 2. Forest plot depicting effect sizes included in the meta-analysis of correlations.
Meta-Analysis of Standardised Mean Differences
The standardised mean differences of the individual group studies and the pooled estimate are presented in the Forest Plot in Figure 3. The weighted average for the standardised mean difference was of large magnitude (Cohen, 1988) and statistically significant:

Figure 3. Forest plot depicting effect sizes included in the meta-analysis of SMDs.
Discussion
The focus of this systematic review and meta-analysis was on Hart and Risley’s (1995) prominent claim that children from lower SES families present with poorer oral language competence. The meta-analysis sought to provide a relatively current estimate of the association between SES and oral vocabulary, including studies with data collected between 2012 and 2022. Meta-analyses of both the correlational and group comparison studies confirmed the existence of a positive association between SES and oral vocabulary, with higher SES groups displaying higher scores on objective vocabulary tests. The pooled effect size for the group comparison studies was of strong magnitude (
The effect sizes for the SES–vocabulary association were in the same direction for all studies (i.e. favouring higher SES) but there was significant variability in relation to the magnitude of effect sizes reported across included studies. Variability may have had several sources. First, studies were drawn from different countries, each of which likely varies in terms of its degree of educational equity, reflected in part in each country’s socioeconomic gradient, or the strength of association between SES and educational or achievement variables (OECD, 2023). Second, studies varied in the measure of SES used. Among the correlational studies, seven used a measure of caregiver education and three, a composite variable. The measures of SES used among the group studies were more varied, with three using a school disadvantaged status variable, two using a composite variable, one using income and one using maternal education to divide groups into higher and lower SES. Studies also varied in the range of SES groups included in their samples. From Table 5, for example, it can be seen that the comparisons made in the group studies included comparisons between ‘low versus high’ and ‘low versus mid’ SES groups. The range of SES levels represented within samples likely influenced the effect sizes for the association between SES and vocabulary identified across studies. All studies employed an objective test of vocabulary but they varied as to whether they employed a receptive vocabulary test (ten studies), an expressive vocabulary test (six) or a combined measure (one). The individual tests used also varied, although various versions of the Peabody Picture Vocabulary Test were commonly employed (seven studies). Differences in samples employed are unlikely to have been a major source of variability as all studies employed samples of mixed gender and children aged under 7 years.
Interpreting the SES–Vocabulary Association
The current meta-analysis confirmed the existence of an association between SES and the oral vocabulary of young children, evident in nine countries in a recent decade, an association that was highlighted by HR in 1995. Given the significant misinterpretation that has surrounded the HR findings (Avineri et al., 2015; Baugh, 2017; Blum, 2017; Sperry et al., 2019, 2020) and the suggested damaging unintended consequences of this misinterpretation (Abraham, 2020; Cushing, 2023; Figueroa, 2024; D. C. Johnson et al., 2020; E. J. Johnson et al., 2017; Kuchirko, 2019; Wang et al., 2021), it is important to be precise about what this finding signifies. Heretofore, much of the literature examining the association between SES and oral language has failed to consider language variation when interpreting findings. A lack of acknowledgement of the existence of language varieties has led to a misinterpretation of the poor performance of lower SES groups on objective language tests as representing a language deficit that is described in general terms. However, an awareness of language varieties prompts the more specific interpretation that this poorer performance reflects poorer proficiency in the language variety that is assessed in objective language tests, which is the formal version of the standard variety (Champion et al., 2003; Finneran et al., 2020; Hendricks & Adlof, 2017; Mills, 2015; Moland & Oetting, 2021; Southwood, 2013).
Poorer proficiency in the formal version of the standard variety on the part of children from lower SES backgrounds has three potential sources: first, children from lower SES backgrounds may speak, as a first or home language, a minority language other than the dominant societal language (Heppt et al., 2015; Hoff, 2018; McCabe et al., 2013); second, children from lower SES backgrounds may, for their everyday communication, use a dialect that differs from the standard variety (Baugh, 2017; Blundon, 2016); and third, children from lower SES backgrounds in monolingual homes may receive less language input in the formal version of the standard variety from their parents, who, due to lower levels of educational attainment, lack proficiency in this language register themselves and tend to communicate in informal registers (Townsend et al., 2012). In the current meta-analysis, the first potential source of variance in performance was controlled by only admitting studies which had employed monolingual samples. To date, studies have not attempted to document the use of non-standard dialect within their samples and so, the identified associations in this meta-analysis may reflect a combination of the use of non-standard dialect and limited use of the formal version of the standard variety within children’s homes. It is important to recognise that children from lower SES backgrounds may present with lower proficiency in the formal version of the standard variety without having any deficit in the language skills required for everyday communication (D. C. Johnson et al., 2020). This is an important point to note as part of the misinterpretation which has surrounded HR’s findings has been a tendency to interpret lower performance on standardised tests as reflecting poor ‘oral language ability’ in a general sense, rather than as reflecting less familiarity with the language variety that is assessed by standardised tests.
Given that the formal version of the standard variety forms the basis of academic language, or the language of school and education (Uccelli et al., 2015), lesser proficiency in this variety nevertheless has implications for academic progress and should be considered a potential barrier to students from lower SES backgrounds achieving their potential within the education system.
Limitations
The broad topic of interest in this paper was the association between SES and oral language competence. However, it must be recognised that the meta-analysis quantified the association between SES and oral vocabulary test scores only. Vocabulary was chosen because the purpose of the meta-analysis was to examine the HR claim about the association between SES and vocabulary. Vocabulary is also the aspect of oral language that has been most studied within this field and that has been deemed most affected by SES (Hoff, 2013; Pace et al., 2017). However, vocabulary is only one aspect of human language, which is a complex system constituted by phonology, morphophonology, morphology, the lexicon, syntax and semantics (Baugh, 2017). Future empirical and/or meta-analytic studies could explore the association between SES and other aspects or measures of language such as phonological processing, grammatical development or narrative skills.
The purpose of a meta-analysis is to synthesise a body of empirical data provided by a number of separate studies. The ideal circumstances for a meta-analysis are that the effects to be synthesised come from studies that are homogeneous in methodology. When studies vary significantly in terms of their samples, variables or measures, the interpretation of the synthesised effects is obscured by variability among the studies (Boland et al., 2017). Analyses in this study revealed a substantial amount of between-study statistical heterogeneity for both the correlational and group studies. This may in part have been due to the variation in how the core study variables, namely SES and oral vocabulary, were operationalised and measured. As noted above, although all studies employed an objective language test, studies varied in terms of whether they employed a receptive, expressive or combined measure and, further, in terms of the individual tests used. Although objective tests of expressive and receptive vocabulary tend to correlate strongly (Smith, 1997; Ukrainetz & Blomquist, 2002), and this was borne out in one of the included studies which employed both a receptive and an expressive measure (see Lohndorf et al., 2018), it is also recognised that comprehension and production are separate aspects of language which can develop at different rates (Owens, 2016). Also significant may have been the variation in the measures used to represent SES. SES is a construct that is widely accepted as an important influence on psychological and life outcomes and is studied prolifically, but it is rarely explicitly defined in papers and is operationalised in a myriad of ways (Antonoplis, 2023). As noted above, while the majority of studies included in the current meta-analysis employed caregiver education as representing SES, some employed composite measures, some school-disadvantaged-status measures and one, income.
Encouragingly, those included studies which examined the association between vocabulary and several SES variables found moderate to strong correlations between individual SES variables (Cheng & Wu, 2017; C. Liu & Chung, 2022; D. Liu et al., 2016; Lurie et al., 2021) and correlations with vocabulary that were broadly similar in magnitude for individual SES variables (McAvinue, 2018). That being said, it is recognised that individual SES variables capture different dimensions of the overall SES construct (American Psychological Association, Task Force on Socioeconomic Status, 2007) and so, the use of different SES variables across the included studies may have contributed to the variability associated with the pooled effects. The field would do well to converge on an agreed set of SES indicators so that findings from future studies could be meaningfully combined and compared. A recent development in this area is the framework provided by Singh et al. (2025) for the standardised collection and reporting of demographic data for studies with young children, which includes guidance around measuring SES.
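To make the notion of between-study heterogeneity concrete, the following is a minimal sketch of how correlation coefficients can be pooled and how heterogeneity can be quantified. It assumes a Fisher z transformation and a DerSimonian-Laird random-effects estimator, which are common choices but are not confirmed here as the procedure used in this meta-analysis. The function name `pool_correlations` is hypothetical, and the correlations and sample sizes in the usage lines are invented for illustration; they are not data from the included studies.

```python
import math


def pool_correlations(rs, ns):
    """Pool correlations under a random-effects model (illustrative sketch).

    Assumptions (not taken from the paper): Fisher z transformation,
    DerSimonian-Laird estimate of between-study variance (tau^2),
    and a normal-approximation 95% confidence interval.
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching r and n")
    if any(n <= 3 for n in ns):
        raise ValueError("each sample size must exceed 3")

    # Fisher z transform; the sampling variance of z is 1 / (n - 3).
    zs = [math.atanh(r) for r in rs]
    variances = [1.0 / (n - 3) for n in ns]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Back-transform the pooled z and its interval to the r scale.
    return {
        "r": math.tanh(z_re),
        "ci": (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se)),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Invented example values, for illustration only.
result = pool_correlations([0.15, 0.30, 0.45], [120, 400, 250])
print(result)
```

In a sketch of this kind, a large I² value signals that much of the observed variation in effect sizes reflects genuine differences between studies, such as differences in how SES or vocabulary was measured, rather than sampling error alone.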
This meta-analysis focused on studies conducted during a recent decade (2012–2022) with a view to providing an estimate of the SES–vocabulary association that would be relevant to countries today. The findings are, therefore, limited to this period. Future meta-analytic studies could broaden the time range, including studies conducted since the 1995 HR publication, for example. Such a study could also examine changes in the SES–vocabulary association over time.
Conclusion and Future Directions
This systematic review and meta-analysis sought to quantify the association between SES and oral vocabulary in young children in recent times. Separate meta-analyses of correlational and group design studies generated statistically significant effect sizes representing the association between SES and oral vocabulary test scores, which were in favour of higher SES groups. Regarding future directions for the field, as noted above, future empirical studies could build on the current meta-analysis and its constituent studies by expanding the focus beyond vocabulary to other aspects of language and by agreeing upon a standardised method of representing SES so that different studies can be meaningfully compared. Another avenue for future research may be to extend research on the association between SES and performance on objective language tests to older children and young people, as no studies were identified for this meta-analysis that had included children above 7 years old.
Arguably, a more important suggestion for future work in this field is for researchers to take a considered approach to interpreting findings such as those presented in the current meta-analysis. Scholars from the traditions of sociolinguistics and linguistic anthropology have suggested that the HR findings have been misinterpreted in other academic and public arenas due to a lack of consideration of sociolinguistic and linguistic anthropological knowledge about language socialisation and language variation. Future efforts to study and address the SES–language association would benefit from multidisciplinary collaboration. To the extent that performance on objective vocabulary tests can be taken as representative of broader oral language competence, the current findings could be interpreted as indicating that lesser proficiency in the formal version of the standard language variety among young children from lower SES backgrounds is a feature of modern societies internationally. Given the reliance of education systems across the world on this language variety, lesser proficiency in it is likely to create difficulty for children from lower SES groups. Discussion of whether and how the identified association should be tackled is beyond the scope of this paper (although see Bergelson, 2024, for an example of such discussion). However, to avoid misinterpretation from a deficit perspective, this association should not be viewed as reflecting a difference or deficit on the part of individual children, families or communities but as a source of inequity within education systems, where the use of the formal version of the standard language variety poses a structural barrier to children from lower SES families achieving their potential.
Footnotes
Author Contributions
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
