Abstract
Aims and objectives:
Bilingualism has often been related to advantages in learning foreign languages (FLs) as compared to monolingualism. Previous studies on FL vocabulary learning showed inconclusive results such as a disadvantage, no advantage, or an advantage of bilingualism. The objectives of the present paper are to investigate the extent to which bilingualism affects vocabulary learning in English as an FL (EFL) and the extent to which the learner’s age and the nature of vocabulary tests moderate the potential bilingual advantage in learning EFL vocabulary.
Methodology:
We carried out a systematic review and a meta-analysis focusing on the difference in EFL vocabulary knowledge between monolingual and bilingual learners.
Data and analysis:
Thirty-seven effect sizes were extracted from 14 studies with 1,984 participants and were analysed by means of a random-effects meta-analysis. Q-statistics were used to analyse the moderator variables.
Findings/conclusions:
The results indicate a small bilingual advantage (g = .20) in EFL vocabulary knowledge. Neither the nature of vocabulary knowledge nor the learners’ age was found to moderate this bilingual advantage. We conclude that bilingualism is associated with enhanced EFL vocabulary knowledge as compared to monolingualism and that this bilingual advantage is consistent across receptive and productive vocabulary knowledge and across child and adolescent bilinguals.
Originality:
This study provides new insights into the bilingual advantage in FL vocabulary learning and contributes to our understanding of bilingual advantages in FL learning.
Implications:
Bilingualism fosters EFL vocabulary knowledge, regardless of the nature of vocabulary knowledge and the learner’s age. Future meta-analyses should include more language background variables to account for the large diversity of bilingual populations.
Introduction
Various studies have shown that bilingualism (i.e., the situation in which language users use in their everyday lives more than one language which was acquired during early childhood; Grosjean, 2008) can positively affect linguistic performance such as foreign language (FL) learning. These advantages in learning an FL have often been associated with enhanced metalinguistic awareness (i.e., ‘the ability to think about and reflect upon the nature and functions of language’, Pratt & Grieve, 1984, p. 2) in bilinguals (e.g., Rauch et al., 2012). When it comes to FL vocabulary learning, Keshavarz and Astaneh (2004) showed better performances on tests measuring vocabulary knowledge in English as an FL (EFL) in Turkish-Persian and Armenian-Persian bilinguals than in Persian monolinguals. However, the results are inconclusive: some studies reported disadvantages of bilingualism in learning FL vocabulary as compared to monolingualism (e.g., Catalán & Fontecha, 2019; Hopp et al., 2018), while others reported no differences between bilinguals and monolinguals in this respect (e.g., Van Gelderen et al., 2003). Thus, it is unclear whether or not bilinguals have an advantage over monolinguals with respect to FL vocabulary learning. The mixed results found in previous studies can be explained by the fact that the potential bilingual advantage is dependent upon many moderating variables that vary across studies such as the bilingual learner’s age, the tasks used, educational context, culture, and the typological distance of the two languages (Festman et al., 2022). Meta-analyses have been considered appropriate to account for these between-study differences because they adopt a systematic, objective, transparent, and statistical approach to compare studies that show mixed findings on one central research question and allow for generalizability of results (Plonsky et al., 2021).
Therefore, a meta-analysis can be appropriate to complement our understanding of the effects of factors moderating the bilingual advantage and to address the question of whether bilinguals have an advantage over monolinguals when it comes to FL vocabulary learning. To the best of our knowledge no meta-analysis has yet been done on this issue in bilingual research.
In this paper we report on a systematic review and meta-analysis investigating the extent to which there is a bilingual advantage in FL vocabulary knowledge. More specifically, we aim to investigate whether (consecutive) bilinguals perform better on tasks measuring the learner’s vocabulary size in EFL than monolinguals and to investigate the extent to which the nature of vocabulary tasks (receptive vs. productive) and the learner’s age (child vs. adolescent) moderate the potential bilingual advantage. In this study, we only focused on consecutive bilinguals (i.e., bilinguals who acquired one language from birth followed by a second language) and not on simultaneous bilinguals (i.e., bilinguals who acquired two languages from birth) since simultaneous bilinguals have been found to have enhanced metalinguistic awareness as compared to consecutive bilinguals due to more experience with two language systems (Kalashnikova & Mattock, 2014). As such, metalinguistic awareness may not affect FL learning to the same extent in simultaneous vs. consecutive bilinguals. Combining simultaneous and consecutive bilinguals would therefore introduce a confounding factor when interpreting EFL vocabulary scores in the meta-analysis. To minimize effects of the psycholinguistic and social status of FL’s on vocabulary learning, we limited our meta-analysis to EFL, as the psycholinguistic status of English has been found to be different from that of other FL’s (Van Tessel & Bril, 2021). Due to transfer effects, combining studies on vocabulary learning in EFL with other FL’s in our analysis would affect the outcomes of the meta-analysis. Furthermore, the social status of English in today’s society is different from that of other FL’s as English is used as a lingua franca. Therefore, the amount of input in EFL vocabulary is higher than input in other FL’s, which affects FL vocabulary learning.
Bilingualism and FL vocabulary learning
Klein (1995) suggested that bilinguals benefit from an enhanced lexical awareness as a subcomponent of metalinguistic awareness, when learning words in an FL because it allows them to better understand the form-meaning mapping of lexical items. This enhanced awareness is a result of the generally larger lexical knowledge bilinguals have when compared to monolinguals because they have access to lexical items from two languages (Palinkašević & Palov, 2014). In addition, potential bilingual advantages in FL vocabulary learning have also been related to enhanced phonological short-term memory and phonological discrimination (i.e., ‘the ability to recognize a distinction between, or distinguish between, contrasting sounds’, Smith et al., 2022, p. 81). Phonological short-term memory has been shown to play a major role in FL vocabulary learning by bilinguals (Majerus et al., 2008) as it is involved in creating long-term phonological representations of lexical items. Furthermore, Kaushanskaya and Marian (2009) suggested that knowledge of two previously acquired different phonological systems might enhance the bilinguals’ phonological discrimination abilities and consequently facilitates FL vocabulary learning. Against this background, one may hypothesize that bilingualism has a facilitating impact on FL vocabulary learning as compared to monolingualism. However, studies comparing bilinguals with monolingual peers revealed mixed results in this respect. Molnár (2010) investigated the receptive vocabulary knowledge in EFL among Hungarian monolingual adolescents, Romanian monolingual adolescents, and Hungarian–Romanian bilingual adolescents. The results revealed a bilingual advantage in receptive EFL vocabulary knowledge in the sense that the bilingual learners outperformed the monolingual learners on the receptive vocabulary task. 
Similar results were found in Azeri–Persian bilingual adolescents (Dibaj, 2011) and in Armenian–Persian bilingual adolescents (Kassaian & Esmae’li, 2011) who were compared to Persian monolingual adolescents and tested by means of a receptive vocabulary task of EFL in both studies. Further evidence comes from Catalonia (Sanz, 2000). In this study Catalan–Spanish bilinguals’ scores on the (receptive) vocabulary section of the CELT English proficiency test were compared to those of Spanish monolingual learners of EFL and were found to be higher than the scores obtained in the monolingual group. In addition to performances on receptive vocabulary tests, the bilingual advantage has also been observed on productive vocabulary tests. Keshavarz and Astaneh (2004) and Zare and Mobarakeh (2013) found the bilingual advantage in Turkish–Persian and Arabic–Persian bilingual adolescents, respectively. They were tested by means of a productive EFL vocabulary test and compared to Persian monolingual peers. The results showed that the bilingual learners outperformed the monolingual learners on productive vocabulary knowledge of EFL. In a similar vein, Keikhaie et al. (2015) showed that Baluchi–Persian bilingual learners of EFL outperformed Persian monolingual learners on a productive recognition vocabulary test. Similar results were found in Russian–Hebrew bilingual adolescents showing higher scores on a productive EFL vocabulary test than their Hebrew monolingual peers (Abu-Rabia & Sanitsky, 2010).
In contrast to studies showing a bilingual advantage in FL vocabulary learning, there are also studies that do not reveal any difference in FL vocabulary knowledge between monolinguals and bilinguals, or rather reveal advantages of monolingualism (over bilingualism). Ardeo (2003) and Agustín-Llach (2019) investigated the vocabulary knowledge of EFL in Spanish–Basque bilingual adolescents by means of a receptive vocabulary test and a productive lexical availability task respectively. They compared this bilingual group to Spanish monolingual learners of EFL and both studies found no difference in EFL vocabulary knowledge between bilinguals and monolinguals. In line with these results, Van Gelderen et al. (2003) assessed the receptive EFL vocabulary knowledge of Dutch monolingual learners and bilingual learners with different native languages (L1’s) and Dutch as their L2 by means of a multiple-choice test. When comparing the monolinguals’ and bilinguals’ scores, they did not find any difference in EFL vocabulary knowledge between these populations. Catalán and Fontecha (2019) used a productive lexical availability task to measure the EFL vocabulary knowledge of Spanish monolingual adolescents and bilingual adolescents with different L1’s and Spanish as their L2. The scores obtained revealed a monolingual advantage indicating that monolinguals outperformed the bilinguals on the vocabulary task.
Whereas most studies have focused on the potential bilingual advantage in FL vocabulary learning in adolescent learners, few studies have investigated whether this advantage may also be found in children. Salomé et al. (2022) compared the receptive vocabulary knowledge tested by means of a forced-choice recognition task, a go/no-go auditive task, and an orthographic judgement task in French monolingual and French-German bilingual 10- to 11-year-old children who learn EFL. They found that bilingual learners of EFL outperformed monolingual learners on all vocabulary measures. In a similar vein, Hopp et al. (2019) found a bilingual advantage when comparing German monolingual 8- to 11-year-old children and bilingual children with the same age range, different L1’s and German as their L2 on a productive EFL vocabulary task. When controlled for socio-economic factors, bilingual learners were found to outperform their monolingual peers. However, no difference between both populations was found on receptive vocabulary knowledge. In contrast to these results, Hopp et al. (2018) observed a monolingual instead of a bilingual advantage in their study assessing the receptive and productive vocabulary knowledge in EFL of German monolingual and bilingual 7- to 11-year-old children with different L1’s and German as their L2. The results revealed that monolingual learners of EFL outperformed bilingual learners on both receptive and productive vocabulary knowledge.
Moderating factors of bilingual advantage in FL vocabulary learning
In addition to effects of bilingualism on FL vocabulary learning, we focus on factors that moderate the potential bilingual advantage. Festman et al. (2022) highlighted the importance of including moderating factors in bilingual effects research to understand the variability in findings related to effects of bilingualism. These moderating factors can be categorized into sample and task characteristics. Sample characteristics relate to, for instance, the learner’s language proficiency, socio-economic status, biliteracy, or age. Task characteristics relate to the type of task or the typological distance of the languages used in the task. As mentioned in the introduction, we included the learner’s age and the nature of vocabulary tasks as moderating factors that may affect the potential bilingual advantage in the present meta-analysis. Language proficiency, the learner’s socio-economic status, and biliteracy were not taken into account as potential moderators because the studies included in the meta-analysis did not report data for these factors (i.e., for proficiency and socio-economic status) or did not show variation across studies (i.e., for biliteracy). The typological distance between the L1 and the L2 has been shown to affect L2 vocabulary learning (Jarvis, 2009) in the sense that L2 lexical items are learned more easily when they exhibit typological similarities with the L1 than when the L1 is typologically more distant from the L2. However, this linguistic distance effect does not seem to moderate bilingual advantages in FL vocabulary learning. For both closely related and more distant L1–FL combinations (e.g., Abu-Rabia & Sanitsky, 2010; Sanz, 2000 respectively), bilingual advantages have been reported for various FL vocabulary measures. More recently, Salomé et al. (2022) did not find an effect of the typological distance between the bilingual’s L2 and FL on FL vocabulary learning (except for written cognates). Overall, transfer from the L1 to the FL has been found to be greater than from the L2 to the FL, irrespective of the typological distance (Schepens et al., 2016). As such, typological distance was not included as a potential moderator in the meta-analysis.
Regarding the learner’s age, studies investigating the bilingual advantage in learning vocabulary in EFL included participants with a large age range. Whereas some studies have focused on young children enrolled in primary education (e.g., Hopp et al., 2018, 2019; Salomé et al., 2022), other studies have focused on adolescents enrolled in (pre-) university education (e.g., Agustín-Llach, 2019; Catalán & Fontecha, 2019; Keikhaie et al., 2015). Since bilingual adolescents have fully acquired their L1 and L2, learning an FL is challenging due to more accumulated knowledge of previously acquired languages as compared to bilingual children. As a consequence, increased linguistic transfer in bilingual adolescents may complicate FL learning (MacWhinney, 2008). However, for bilingual children FL vocabulary learning may be complicated by reduced cognitive resources. Moreover, children have been shown to avoid using new lexical items for one and the same concept (Clark, 2009). Therefore, bilingual children may alternatively be expected to have more difficulties in learning FL vocabulary than bilingual adolescents.
In addition to the learner’s age as a potential moderating factor, the nature of vocabulary tasks aimed at measuring the learners’ vocabulary size in EFL varied in previous studies. Several studies administered receptive EFL vocabulary tasks (e.g., Dibaj, 2011; Molnár, 2010; Salomé et al., 2022), while other studies used tasks measuring the learners’ productive EFL vocabulary size (e.g., Abu-Rabia & Sanitsky, 2010; Agustín-Llach, 2019; Keshavarz & Astaneh, 2004). There are also studies in which both types of tasks have been used (e.g., Hopp et al., 2019; Keikhaie et al., 2015; Zare & Mobarakeh, 2013). This variability in task type may moderate the bilingual advantage in EFL vocabulary learning as receptive and productive vocabulary knowledge in an FL have often been considered as two separate components of vocabulary knowledge (González-Fernández & Schmitt, 2020). It has typically been demonstrated that the learners’ receptive vocabulary knowledge in an FL is larger than their productive vocabulary knowledge (Zheng, 2009) and that the development of productive vocabulary knowledge is more challenging than that of receptive vocabulary knowledge (Zheng, 2012). Thus, it might be expected that the use of receptive vs. productive vocabulary tests moderates the bilingual advantage in EFL vocabulary learning in the sense that receptive vocabulary knowledge influences the bilingual advantage to a greater extent than productive vocabulary knowledge.
Research questions
Based on the previous literature and the objectives formulated, we addressed the following research questions:
To what extent does bilingualism affect vocabulary learning in EFL, when compared to monolingualism?
To what extent do the learner’s age and the nature of vocabulary tests moderate the potential bilingual advantage in learning EFL vocabulary?
Method and procedures
Literature search and inclusion/exclusion criteria
We first consulted the university library database Libsearch which contains a large collection of (electronic) books, scientific journals, and scientific articles from many databases and libraries worldwide such as Google Scholar, PubMed, and Wiley Online Library. The following keywords were used and combined during the main search: ‘bilingual advantage’, ‘vocabulary’, ‘receptive’, ‘expressive’, ‘productive’, ‘L2 OR second language’, ‘L3 OR third language’, ‘English’, ‘English as a Second Language (ESL)’, ‘EFL’, ‘SLA’, ‘TLA’, ‘monolingual OR monolingualism’, ‘bilingual OR bilingualism’, ‘additional language’. After this search process, we consulted Google Scholar by using the same keywords in order to find articles that Libsearch had missed. The reference lists of all selected articles were screened to check whether all articles related to our objectives were included in the meta-analysis. The search was conducted in June 2020 and updated in October 2023. It was limited to articles published in high-standard peer-reviewed journals or volumes to ensure high methodological quality. These journals and volumes were mostly considered as high-impact journals in this field of study (i.e., Q1 index), were sources of previous studies on bilingual advantages in FL learning and included Bilingual Research Journal, Australian Review of Applied Linguistics, Journal of Education and Learning, International Journal of Bilingual Education and Bilingualism, Applied Psycholinguistics, Toronto Working Papers in Linguistics, English Language Teaching, International Journal of Bilingualism, Learning and Instruction, GEMA Online Journal of Language Studies, Theory and Practice in Language Studies, Foreign Languages in Multilingual Classrooms, and Bilingualism: Language and Cognition. The methodological quality of each article was independently assessed by the first and second authors of this paper.
The articles were included in the meta-analysis when they met the following inclusion criteria: (a) the study was based on an experimental design (e.g., no research notes or case studies), (b) the article reported descriptive, quantitative data on vocabulary scores and participant characteristics, (c) the study tested and compared consecutive bilingual and monolingual EFL learners, (d) a receptive and/or productive EFL vocabulary test was used, and (e) the study focused on English learned in an educational setting. If studies did not report quantitative data, did not include both (consecutive) bilinguals and monolinguals, did not focus on EFL, or did not test EFL learners in an educational context, they were excluded from the meta-analysis. Abstracts, titles, and method sections were independently screened by the first and second authors of the paper, following the inclusion criteria. If studies were not accessible or did not report quantitative data, the authors were first contacted before exclusion from the meta-analysis. The search resulted in a total of 40 effect sizes from 15 independent studies in which 2,107 participants were included in total. A flow chart of the literature search process is shown in Figure 1.

Figure 1. Flow chart of literature search (k = number of articles).
Data coding
The studies selected for the meta-analysis were coded for sample, task, and study characteristics. Specifically, for sample characteristics, we coded the sample size (N), the participants’ age range and category (i.e., children or adolescents), and the participants’ L1 (and L2 in case of bilinguals). Regarding the participants’ age category, a cutoff point of 11 years was selected to distinguish children from adolescents (see Muñoz, 2011 for the same cutoff point for EFL learners). Samples in which the age range was under the cutoff point of 11 years were categorized as children (resulting in an age range of 7–11 years for children in the population), while samples with an age range above the cutoff point of 11 years were considered as adolescents (resulting in an age range of 11–25 years for adolescents in the population). The reason for coding the participants’ L1 and L2 was to check whether the monolingual and bilingual samples were equally matched across studies (i.e., based on the participants’ age and language) for a high level of comparability of monolinguals and bilinguals in the meta-analysis. For task characteristics we extracted the nature of the vocabulary task used in the study (i.e., receptive and/or productive as indicated in the study) and the vocabulary scores (mean and SD) obtained in the vocabulary tests. For study characteristics, the number of effect sizes per study for the meta-analytic procedures, the last name of the first author (and the second author when the article was written by two authors), and the year of publication were reported. All data were coded by the first and second authors of the paper. The interrater reliability for coding was κ = .96, indicating a high level of interrater agreement. In case of disagreement, both raters checked the articles and discussed until agreement was reached. See Table 1 for an overview of the data extracted from the selected articles.
Table 1. Sample, task, and study characteristics of the included studies.
Note. N = number of participants, CHIL = children, ADO = adolescents, k = number of effect sizes in the study.
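As an illustration of how an interrater agreement coefficient of the kind reported above can be obtained, the following minimal sketch computes Cohen's kappa for two raters. It assumes that the reported reliability coefficient is Cohen's kappa, which the text does not state explicitly; the function name and the example codes are hypothetical and are not data from the included studies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical age-category codes for ten samples (illustration only).
first_rater = ["CHIL", "CHIL", "ADO", "ADO", "ADO", "CHIL", "ADO", "ADO", "CHIL", "ADO"]
second_rater = ["CHIL", "CHIL", "ADO", "ADO", "ADO", "CHIL", "ADO", "CHIL", "CHIL", "ADO"]
kappa = cohens_kappa(first_rater, second_rater)
print(round(kappa, 2))
```

In this hypothetical example the raters disagree on one of ten samples, so kappa falls below the raw agreement rate because part of the agreement is expected by chance.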
Meta-analytic procedures
After data extraction we visualized the data by means of a forest plot to detect potential outlier studies. In addition to a forest plot, sensitivity analyses (i.e., a leave-one-out procedure during which each study was removed in turn to determine its individual impact on the overall effect size) were performed to determine whether the outlier studies detected in the forest plot were real outliers. Based on these analyses, we excluded one outlier study (i.e., Ardeo, 2003) as it heavily impacted the outcomes of the meta-analysis. Consequently, 37 effect sizes from 14 independent studies with a total of 1,984 participants were used for further analysis. By means of the software package IBM SPSS Statistics version 29.0 a Q test of homogeneity was run to test whether the variance in effect sizes was different from zero. This test revealed that our data significantly deviated from homogeneity (Q = 431.79; p < .001). A significant Q test indicates that systematic variation is present among the effect sizes in the sample and that moderating factors need to be included in the analysis (Shadish & Haddock, 1994). In addition, we used an I² test of heterogeneity to determine the extent to which the observed variance in effect size estimates reflects true differences between effects rather than sampling error. Based on Higgins et al.’s (2003) interpretation of heterogeneity in effect sizes, our data showed a high degree of heterogeneity (I² = 93%). This indicates that the studies included in the meta-analysis show a high level of between-study differences and that the observed variation cannot be attributed to sampling error alone. A publication bias between published and unpublished studies is a major problem to overcome in meta-analyses. Cooper (2016) pointed out that studies with significant results are more likely to be published than those with non-significant results, which leads to a potential overestimate of the experimental effect in meta-analyses.
Since we only included published studies, a publication bias may be a limitation of our meta-analysis. Widely used methods to address this problem are to visually inspect the dispersion of effect sizes by means of a funnel plot and to use an Egger’s regression-based test. In case of a publication bias in a sample of studies, the funnel plot is asymmetrical. However, relying on a funnel plot alone can be misleading since asymmetry does not always indicate a publication bias, but may also reflect the heterogeneity in effect sizes. Furthermore, a funnel plot can be difficult to interpret when a limited number of studies have been included in the meta-analysis (Walker et al., 2008). In addition to a funnel plot, we therefore used an Egger’s regression-based test to statistically check for publication bias (however, it is impossible to fully capture all unpublished studies in the statistical test).
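The heterogeneity statistics and the regression-based asymmetry test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: it computes Cochran's Q and I² from inverse-variance weights and the Egger intercept by ordinary least squares (standardized effect regressed on precision). The function names and the effect sizes are hypothetical, and p-values are omitted because they require chi-square and t distributions from a statistics library.

```python
import math


def q_and_i2(effects, std_errors):
    """Cochran's Q, its degrees of freedom, and I^2 (in percent)."""
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation beyond what sampling error predicts.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


def egger_intercept(effects, std_errors):
    """Egger's test: regress g/SE on 1/SE; return the intercept and its t statistic."""
    x = [1 / se for se in std_errors]                    # precision
    y = [g / se for g, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)                                           # needs at least 3 studies
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(residual_ss / (n - 2) * (1 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept


# Hypothetical per-study Hedges' g values and standard errors (illustration only).
demo_g = [0.10, 0.35, -0.05, 0.60, 0.25]
demo_se = [0.10, 0.15, 0.20, 0.30, 0.12]
print(q_and_i2(demo_g, demo_se))
print(egger_intercept(demo_g, demo_se))
```

An intercept close to zero suggests a symmetrical funnel plot, whereas an intercept that differs significantly from zero suggests small-study effects such as publication bias.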
We conducted a random-effects meta-analysis since random-effects meta-analyses have been found to be more appropriate for data with a high level of heterogeneity than fixed-effects meta-analyses (Israel & Richter, 2011). As the number of participants differed across studies, effect sizes are not comparable between the individual studies in the sample and need to be weighted before inclusion in the meta-analysis (Turner & Bernard, 2006). Following Turner and Bernard (2006), we conducted three meta-analytic steps to arrive at an overall weighted effect size of the difference in EFL vocabulary scores between monolinguals and bilinguals. The first step was to calculate the effect sizes of comparisons between monolinguals’ and bilinguals’ vocabulary scores made within each study of the sample. To this end, we used Cohen’s d as effect size. This effect size was extracted from the study or calculated based on the mean, the SD and the number of participants reported. As the number of participants largely differed across studies, we standardized the effect sizes found in each study by transforming Cohen’s d into Hedges’ g. Hedges’ g has been shown to be appropriate for both small and large sample sizes (Field, 2013), which increases the comparability of effect sizes across studies in our sample. However, the number of comparisons between EFL vocabulary scores in monolinguals vs. bilinguals differed across studies, so that the number of effect sizes also differed across studies. The unbalanced number of effect sizes across studies in our sample may violate the assumption of statistical independence (Hunter & Schmidt, 2004). This means that studies with multiple effect sizes report several outcomes from multiple comparisons that are based on one and the same group of monolinguals and bilinguals, while studies with one effect size only report one outcome based on these populations.
The unbalanced number of effect sizes across studies renders the confidence intervals (CIs) and standard errors (SEs) inaccurate (Hunter & Schmidt, 2004) and may therefore negatively affect the outcome of the meta-analysis. To overcome this problem, we averaged the Hedges’ g effect sizes in studies with multiple comparisons to create one single Hedges’ g and calculated the 95% CI and SE of the Hedges’ g effect size per study. As such, each study of the sample comprises one Hedges’ g effect size for the comparison of EFL vocabulary scores between monolingual and bilingual participants. As the accuracy of effect sizes differed across studies due to the sampling error related to the number of participants per study, we weighted the Hedges’ g for each study to come to an overall Hedges’ g effect size for the difference in EFL vocabulary scores between monolinguals and bilinguals in our sample and calculated the 95% CI and the SE of the overall weighted Hedges’ g. A positive overall Hedges’ g effect size indicates that bilinguals outperformed monolinguals on EFL vocabulary tests, while a negative value indicates that monolinguals outperformed bilinguals on these tests.
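The steps described above (computing Cohen's d, converting it to Hedges' g, collapsing multiple comparisons into one effect size per study, and weighting the per-study values) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact procedure run in SPSS: the between-study variance is estimated with the DerSimonian-Laird method, which the text does not specify, and the variance of a within-study average is taken as the mean of the individual variances, a simple conservative choice that is also an assumption. All function names are hypothetical.

```python
import math


def hedges_g(mean_bi, sd_bi, n_bi, mean_mono, sd_mono, n_mono):
    """Return (g, variance of g); a positive g means bilinguals scored higher."""
    df = n_bi + n_mono - 2
    pooled_sd = math.sqrt(((n_bi - 1) * sd_bi ** 2 + (n_mono - 1) * sd_mono ** 2) / df)
    d = (mean_bi - mean_mono) / pooled_sd                # Cohen's d
    var_d = (n_bi + n_mono) / (n_bi * n_mono) + d ** 2 / (2 * (n_bi + n_mono))
    j = 1 - 3 / (4 * df - 1)                             # small-sample correction factor
    return j * d, j ** 2 * var_d


def average_within_study(gs, variances):
    """One g per study: mean g, with the mean variance as an assumed conservative variance."""
    return sum(gs) / len(gs), sum(variances) / len(variances)


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects estimate, its SE, and the 95% CI."""
    w = [1 / v for v in variances]
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)


def leave_one_out(effects, variances):
    """Pooled estimate with each study removed in turn (sensitivity analysis)."""
    return [
        random_effects_pool(effects[:i] + effects[i + 1:], variances[:i] + variances[i + 1:])[0]
        for i in range(len(effects))
    ]
```

A study whose removal shifts the pooled estimate far more than the removal of any other study would be a candidate outlier in the sense of the leave-one-out procedure described earlier.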
For the potential moderating factors we followed the same procedures to calculate an overall weighted Hedges’ g for each level of the moderator variable. Based on differences in effect size between levels of these variables, we detected whether or not the nature of the vocabulary test used and/or the learner’s age moderate the overall difference in EFL vocabulary scores between monolinguals and bilinguals in our sample (see Kan & Windsor, 2010 for the same procedures). A significant difference in overall weighted Hedges’ g between the levels of the moderator variables indicates that these moderator variables explain a considerable part of the variance observed in the Hedges’ g effect sizes in our sample. This means that these moderator variables moderate the overall difference in EFL vocabulary scores between monolinguals and bilinguals. Some studies (i.e., Hopp et al., 2018, 2019; Keikhaie et al., 2015; Zare & Mobarakeh, 2013) used both receptive and productive vocabulary tasks. For each of these studies, we calculated a Hedges’ g effect size per type of vocabulary task and included these effect sizes in the calculations of the overall weighted Hedges’ g for each type of vocabulary task.
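The comparison between two levels of a moderator variable can be illustrated with a between-groups Q test. The sketch below is a minimal example assuming that each level has already been summarized by a pooled Hedges' g and its SE; the function name and the input values are hypothetical and are not taken from the results of this meta-analysis.

```python
import math


def compare_subgroups(g_a, se_a, g_b, se_b):
    """Q-between test (df = 1) for two pooled subgroup estimates."""
    w_a, w_b = 1 / se_a ** 2, 1 / se_b ** 2
    combined = (w_a * g_a + w_b * g_b) / (w_a + w_b)
    q_between = w_a * (g_a - combined) ** 2 + w_b * (g_b - combined) ** 2
    # Survival function of the chi-square distribution with 1 df:
    # P(X > q) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q_between / 2))
    return q_between, p_value


# Hypothetical pooled estimates for two levels of a moderator (illustration only).
q_b, p_b = compare_subgroups(0.25, 0.10, 0.15, 0.10)
print(round(q_b, 2), round(p_b, 2))
```

A non-significant p-value in such a test corresponds to the conclusion that the moderator does not explain the difference between the subgroup effect sizes.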
Results
Literature search
Table 1 gives an overview of the results of the data-coding process.
A qualitative analysis revealed that receptive vocabulary tests were used more often than productive vocabulary tests to investigate the bilingual advantage in EFL vocabulary learning (71% vs. 57% of the studies, respectively, with some studies using both types). A large variability was found in the type of receptive tests that were used, ranging from a multiple-choice test (Sanz, 2000; Van Gelderen et al., 2003; Zare & Mobarakeh, 2013) to a Vocabulary Levels test (Kassaian & Esmae’li, 2011; Molnár, 2010) and a Vocabulary Knowledge Scale test (Dibaj, 2011). Salomé et al. (2022) administered three receptive vocabulary tests, that is, a forced-choice recognition test, a go/no-go auditive test and an orthographic judgement test. The same variability of selected tests was found for productive vocabulary tests: a lexical availability test (Agustín-Llach, 2019; Catalán & Fontecha, 2019), a recognition test (Keikhaie et al., 2015), a Gardner picture test (Abu-Rabia & Sanitsky, 2010) and a controlled productive ability test (Keshavarz & Astaneh, 2004).
Publication bias
Figure 2 depicts the funnel plot created to assess a potential publication bias between published and unpublished studies. Based on a visual inspection of the dispersion of effect sizes, one could argue that there is a bias in our sample. However, the Egger’s regression-based test was not significant (p = .18), indicating that there was no publication bias detected in the effect sizes.

Figure 2. Funnel plot of unweighted effect sizes and standard errors.
Bilingual advantage in EFL vocabulary learning
As described in the method section, Cohen’s d and standardized Hedges’ g effect sizes (with 95% CI and SE) were computed per study. To correct for the accuracy of effect sizes due to sampling error, weighted Hedges’ g effect sizes were computed to arrive at an overall Hedges’ g effect size of the difference in EFL vocabulary scores between monolinguals and bilinguals (see Table 2). The overall Hedges’ g effect size was found to be .20 (SE = .05; 95% CI [.15, .33]; p = .02), indicating that EFL vocabulary scores in bilinguals were overall .20 SD higher than those in monolinguals and that the 95% CI for this estimate ranges from .15 SD to .33 SD in our sample.
Table 2. Effect sizes of the difference in EFL vocabulary scores between monolinguals and bilinguals per study and for the overall sample.
Note. Hedges’ g* = overall weighted effect size of difference in EFL vocabulary scores between monolinguals and bilinguals, SE = standard error of Hedges’ g, 95% CI = 95% confidence interval of Hedges’ g.
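The computation of a per-study Hedges’ g and of the weighted overall effect size can be illustrated with a short sketch. It rests on several assumptions that are not stated in the paper: the usual small-sample correction factor J = 1 − 3/(4df − 1), inverse-variance weights, and the DerSimonian-Laird estimator of the between-study variance for the random-effects model. The function names and the input values in the usage example are hypothetical and do not correspond to any of the included studies.

```python
import math


def hedges_g(mean_bi, mean_mono, sd_bi, sd_mono, n_bi, n_mono):
    """Return (g, variance of g) for bilingual minus monolingual scores."""
    df = n_bi + n_mono - 2
    pooled_sd = math.sqrt(
        ((n_bi - 1) * sd_bi ** 2 + (n_mono - 1) * sd_mono ** 2) / df
    )
    d = (mean_bi - mean_mono) / pooled_sd      # Cohen's d
    j = 1.0 - 3.0 / (4.0 * df - 1.0)           # small-sample correction
    var_d = (n_bi + n_mono) / (n_bi * n_mono) + d ** 2 / (2.0 * (n_bi + n_mono))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances):
    """Inverse-variance pooling with a DerSimonian-Laird tau^2 (assumed)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed_mean = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed_mean) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance to each study.
    w_star = [1.0 / (v + tau2) for v in variances]
    g_overall = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "g": g_overall,
        "se": se,
        "ci": (g_overall - 1.96 * se, g_overall + 1.96 * se),
        "tau2": tau2,
        "q": q,
    }


if __name__ == "__main__":
    # Hypothetical group statistics, for illustration only.
    studies = [
        hedges_g(12.0, 10.0, 4.0, 4.0, 20, 20),
        hedges_g(31.0, 30.0, 6.0, 5.0, 45, 50),
        hedges_g(18.5, 17.0, 5.0, 5.5, 30, 28),
    ]
    result = pool_random_effects([g for g, _ in studies], [v for _, v in studies])
    print(result)
```

Because g is expressed in pooled standard deviation units, the overall estimate and its confidence interval are read on that same scale.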
To explain the variability of effect sizes found in our sample we conducted moderator analyses by computing an overall Hedges’ g for each moderator variable (i.e., Age and Nature of Test). Table 3 displays the overall Hedges’ g, the SE, the 95% CI, the Q-statistics for homogeneity per level of each moderator variable, and the p-value of the difference between Hedges’ g values when comparing the levels of the moderator variables.
Table 3. Results of the moderator analyses per level of age category and nature of vocabulary test.
Note. k = number of effect sizes, N = number of studies, Hedges’ g* = overall weighted effect size, SE = standard error, Q = Q-statistics for homogeneity.
*p = .001; **p < .001.
The results showed that the overall Hedges’ g effect size of the difference in EFL vocabulary scores between monolinguals and bilinguals did not significantly differ between children and adolescents, indicating that Age did not moderate the overall effect size. Similarly, Nature of Test was not found to moderate the overall effect size of the difference in EFL vocabulary scores between monolinguals and bilinguals since the Hedges’ g for receptive vocabulary tests did not significantly differ from the Hedges’ g for productive vocabulary tests. The Q-statistics for homogeneity revealed that there was systematic variation among the effect sizes observed in each level of the moderator variables. This indicates that the variation observed in the effect sizes of the difference in EFL vocabulary scores between monolinguals and bilinguals cannot be attributed to sampling error alone.
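The logic of the moderator analysis can be sketched as follows. The sketch assumes a fixed-effect partition of the Q-statistic, in which the total Q is split into a within-level part (homogeneity per level) and a between-level part (difference between levels); the paper does not specify which weighting scheme was used for this step. Since both moderators have two levels, the between-level Q has one degree of freedom, for which the chi-square p-value can be computed with the complementary error function. All function names and input values are hypothetical.

```python
import math


def q_statistic(effects, variances):
    """Return (Q, weighted mean) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - mean) ** 2 for w, g in zip(weights, effects))
    return q, mean


def q_between(groups):
    """Partition Q for a categorical moderator.

    groups maps each level (e.g. "children") to (effects, variances).
    Returns (Q_between, df_between, Q_within per level).
    """
    all_effects = [g for effects, _ in groups.values() for g in effects]
    all_variances = [v for _, variances in groups.values() for v in variances]
    q_total, _ = q_statistic(all_effects, all_variances)
    q_within = {
        level: q_statistic(effects, variances)[0]
        for level, (effects, variances) in groups.items()
    }
    return q_total - sum(q_within.values()), len(groups) - 1, q_within


def chi2_sf_df1(x):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


if __name__ == "__main__":
    # Hypothetical effect sizes and variances, for illustration only.
    example = {
        "children": ([0.10, 0.35, 0.22], [0.02, 0.03, 0.015]),
        "adolescents": ([0.05, 0.40, 0.18], [0.025, 0.02, 0.03]),
    }
    qb, df, qw = q_between(example)
    print("Q_between:", qb, "df:", df, "p:", chi2_sf_df1(qb))
    print("Q_within per level:", qw)
```

In this scheme, a non-significant between-level Q combined with significant within-level Q values corresponds to the pattern reported above: the levels of the moderator do not differ from each other, while heterogeneity remains within each level.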
Discussion and conclusions
The results revealed a small difference (i.e., .20 standard deviations) in vocabulary knowledge between monolingual and bilingual EFL learners, indicating that bilinguals outperformed monolinguals. Based on Cohen’s (1988) interpretation of effect sizes, the results showed a small, positive effect of bilingualism on EFL vocabulary knowledge. This finding is in line with previous studies showing a bilingual advantage in FL vocabulary learning (e.g., Keikhaie et al., 2015; Keshavarz & Astaneh, 2004). We assume that this bilingual advantage may be related to enhanced lexical awareness, phonological short-term memory, and/or phonological discrimination in bilinguals as compared to monolinguals. This assumption has also been made by Salomé et al. (2022). They reported a bilingual advantage in EFL vocabulary knowledge based on a forced-choice recognition task. In this task, participants need to link the spoken word to its concept, which relies on lexical retrieval and, as such, on memory skills. The authors, therefore, suggested that bilinguals have enhanced memory skills for FL word learning. In contrast, the results did not reveal any significant effects of the moderator variables included in the meta-analysis. This means that neither the nature of vocabulary tests used in studies (receptive vs. productive) nor the EFL learner’s age (children vs. adolescents) affects the bilingual advantage in EFL vocabulary knowledge. As such, the bilingual advantage in FL vocabulary knowledge can be considered as consistent across receptive vs. productive vocabulary knowledge and child vs. adolescent FL learners. 
As the Q-statistics reported in Table 3 revealed systematic variation among the effect sizes in each level of the moderator variables, and as such, a lack of homogeneity, moderator variables other than the nature of vocabulary tests and the EFL learner’s age might influence this bilingual advantage (e.g., typological distance between two languages, amount of input received in EFL, and/or cultural context [cf. Festman et al., 2022]).
Although the receptive vs. productive nature of the vocabulary tests used in the studies did not moderate the bilingual advantage in learning EFL vocabulary, it is possible that differences in task type moderate the effect of bilingualism on EFL vocabulary knowledge. In this respect, the qualitative analysis of vocabulary tests revealed that a wide range of receptive tests were used in previous studies. Whereas a multiple-choice test, for instance, aims to test to what extent a language learner can determine the synonym or the paraphrase that best fits a stimulus sentence, the Vocabulary Knowledge Scale test is based on ratings of how well a language learner knows the words presented (i.e., know it well, have seen it, or don’t know it). Although these tests measure receptive vocabulary knowledge, the performances are based on different cognitive processes. Since bilinguals might have enhanced phonological discrimination abilities (Kaushanskaya & Marian, 2009) and more overall lexical knowledge (Palinkašević & Palov, 2014) due to the knowledge of two linguistic systems, they may recall words more easily in a Vocabulary Knowledge Scale test than in a multiple-choice test in which performances are also related to lexical comprehension of synonyms or paraphrases. The same holds for productive vocabulary tests, as lexical availability tests rely on lexical associations that come to mind when presented with a stimulus. Gardner picture tests, however, measure the learner’s ability to name pictures. The former task is claimed to measure the learner’s vocabulary depth (i.e., how well the stimulus is known by the learner), while the latter is claimed to measure the learner’s vocabulary size (i.e., how many words are known by the learner). Vocabulary depth and vocabulary size are, however, considered as two separate aspects of vocabulary knowledge (Schmitt, 2014). 
Taking this variation in vocabulary tests into account, one may hypothesize that the extent to which bilingualism has a positive impact on FL vocabulary knowledge is affected by the type of test used across studies. However, the number of studies included in the meta-analysis was too small to test such variables.
Another limitation of the present meta-analysis is that the included studies matched the monolingual and bilingual population based on only two background variables (i.e., language and age). This way of matching at a group level has often been used in a large body of studies on bilingual advantages, but it ignores the large variation in background factors at an individual level. We attempted to control for some background factors such as consecutive or simultaneous bilingualism and the FL learned by considering them as inclusion criteria. As the number of studies included was limited, we were unable to control for more background factors to further improve the comparability of monolingual and bilingual EFL learners across studies. This may have affected the extent to which bilingual EFL learners outperformed monolingual EFL learners in our study. Because of such between-study differences the meta-analysis showed a high level of heterogeneity, indicating that the comparability of findings across studies may be limited. However, this is inevitable in meta-analyses that focus on bilingual populations since bilinguals largely differ when it comes to background factors, language proficiency, or educational background. We thus call for more attention to language background moderator variables when comparing bilinguals and monolinguals in studies investigating potential bilingual advantages. Focusing on language background variables will ensure that monolingual and bilingual language learners are matched based on a more homogeneous language background profile and as such, that future meta-analyses better account for the large diversity of bilingual populations.
Footnotes
Acknowledgements
We would like to thank the anonymous reviewers for their valuable and constructive feedback on our paper, and Roel van Steensel for his useful comments on an earlier version of the paper.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical approval and informed consent statements
Since we analysed data from previous studies (and did not test participants) in our meta-analysis, we did not need ethical approval for the study.
