Abstract
Aim:
This study investigates the effects of degree of multilingualism on cognitive functions in adulthood, with a focus on episodic memory recall and including measures of verbal fluency as well as global cognition.
Design:
We studied a large population-based cohort cross-sectionally, and we also assessed changes over time through longitudinal measurements at four time points over a 15-year period. Participants were drawn from the Betula prospective cohort study in Umeå, Sweden. The participants included in this study at baseline (n = 894, mean age = 51.44, 59.4% females) were divided according to number of languages into bilinguals (n = 395), trilinguals (n = 284), quadrilinguals (n = 169), and pentalinguals (n = 46).
Data and analysis:
We analysed performance on tasks of episodic memory recall, verbal fluency (letter and category) and global cognition (Mini-Mental State Examination, MMSE) both cross-sectionally and longitudinally. The control background variables were baseline age, gender, years of education, general fluid ability Gf (Wechsler Block Design Test), and socioeconomic status. We employed a linear mixed modelling approach with entropy balancing weights to assess effects of degree of multilingualism on cognitive functions.
Findings and conclusions:
Using bilinguals as the reference group, our results indicated that all the other multilingual groups outperformed bilinguals on episodic memory recall at baseline. The rate of change over time did not differ for trilinguals and pentalinguals compared to bilinguals. While quadrilinguals declined more over time than bilinguals, they still scored significantly higher than bilinguals at the last test wave. For letter fluency, similarly, all language groups scored higher than bilinguals at baseline, and none of the groups differed from bilinguals in rate of change over time. With regard to category fluency, quadrilinguals scored higher than bilinguals at baseline, but trilinguals and pentalinguals did not differ from bilinguals, and none of the groups differed in change over time compared to bilinguals. Finally, for global cognition (MMSE), trilinguals and quadrilinguals scored significantly higher than bilinguals at baseline, with no differences in change over time for any of the groups relative to bilinguals. Our study contributes to the understanding of multilingual cognition and sheds light on an under-researched cognitive domain known to decline in normal ageing, namely episodic memory recall.
Significance:
Our study emphasizes the importance of researching less explored aspects of multilingualism on cognition, in particular on episodic memory recall, to improve our understanding of factors that could potentially mitigate cognitive decline in later adulthood.
Over the past decades, a wealth of research has reported an advantage for bilinguals over monolinguals in cognitive performance, particularly in nonverbal tasks involving executive functions (e.g., Bialystok et al., 2004; Costa et al., 2008; Vega-Mendoza et al., 2015; for reviews, see, for example, Bialystok, 2017; Bialystok et al., 2012). In addition, while some studies have also reported an advantage in verbal fluency tasks, in particular letter fluency, which is purported to pose demands on executive function processes (e.g., Ljungberg et al., 2013; Luo et al., 2010; Marsh et al., 2019, but see Paap et al., 2017), it has generally been shown that bilinguals perform worse than monolinguals on linguistic tasks requiring lexical access, such as picture naming tasks (e.g., Gollan et al., 2005), in both the bilingual’s first and second language (Ivanova & Costa, 2008).
In spite of the numerous reports of enhanced executive functioning performance in bilinguals, a growing body of literature has also begun to contest the reported bilingual advantages based on methodological and statistical grounds (e.g., Paap & Sawi, 2014), as well as on biases such as publication biases (de Bruin et al., 2015) and confirmation biases (Paap, 2014). In this section, we try to present an even-handed review and assessment of the relevant literature before outlining the aims of this study.
The enhanced executive function abilities associated with bilingualism (the so-called ‘bilingual advantage’) are thought to arise from bilinguals recruiting domain-general executive control mechanisms to a larger extent than monolinguals when continuously switching between different languages and inhibiting unwanted information. The basis for this lies in proposals that bilinguals’ languages are active in parallel and that the mechanisms employed to inhibit the non-target language require domain-general executive control (Green, 1998; Kroll et al., 2008). The abovementioned behavioural effects have also found support in neuroimaging studies. For example, it has been proposed that behavioural bilingual advantages are related to activity in brain networks associated with the prefrontal cortex, anterior cingulate cortex and caudate, and that increased white matter integrity and grey matter density in frontal lobe areas are positively related to bilingualism (Abutalebi et al., 2012; Gold et al., 2013; Olulade et al., 2016).
Research on the effects of bilingualism on executive control, however, has yielded conflicting evidence (see e.g., Sörman et al., 2019 for null findings in a sample of older adults). This heterogeneity of findings has been attributed to a number of factors, such as sample sizes lacking statistical power (Paap et al., 2015). Other discussions in this regard have pointed to factors confounded in bilingual studies, such as differences in socioeconomic status, intelligence, culture, and immigration status between monolinguals and bilinguals (for discussion see Bak, 2016). Moreover, it has been suggested that factors such as lack of standardization in operationalising bilingual and monolingual groups, interactional contexts, type of tasks (verbal and nonverbal) and removal of outliers have been overlooked in studies focusing on null effects of bilingualism on executive functions (Grundy, 2020).
Another problem within this area of research is the lack of a unified theoretical account of executive functions (henceforth EF), the components that might be involved, and how they are measured.
Taking for example working memory (WM), oftentimes considered a domain within EF, in the context of bilingualism research the capacity of WM has been conceptualized as ‘not storage space but rather the extent to which resources are available to control attention to maintain information relevant for a current task’ (Engle & Kane, 2004; Engle, 2002 cited in Bialystok, 2017, p. 249). As such, WM being involved in domain-general cognitive control mechanisms has thus been hypothesized to be potentially enhanced in bilinguals. However, the impact of bilingualism on WM has yielded different findings. On the one hand, several studies have shown that bilinguals outperform monolinguals (e.g., Sörman et al., 2017) while others have not found support for bilingual superior performance (e.g., Lukasik et al., 2018). These conflicting findings have given rise to multiple meta-analytic reviews.
Grundy and Timmer (2017) conducted a meta-analysis on the effects of bilingualism on WM and reported a small to medium effect size in favour of better WM performance in bilinguals than monolinguals. The largest effects were found in children, and also when the tasks were performed in the first language among the bilingual populations (see also Adesope et al., 2010). A more recent meta-analysis by Monnier et al. (2022) showed instead a smaller effect size, still favouring bilingual better performance on WM but also showing that for a verbal WM task, this effect was moderated by the language used in the WM task: a larger bilingual advantage when the verbal WM task was performed in the L2 compared to the L1 (cf. Grundy & Timmer, 2017). On the other hand, more recent meta-analyses evaluating evidence on bilingualism on WM have found only small effect sizes favouring the bilingual advantage (Von Bastian et al., 2017). Finally, in a meta-analysis conducted by Lehtonen et al. (2018), the authors assessed bilingual versus monolingual performance across six EF domains, including WM. While the authors initially found small effect sizes in favour of the bilingual advantage on the domains of inhibition, shifting and WM, none of these remained after correcting for biases.
Similarly, for other sub-components of EF, a number of meta-analyses have shown mixed results. An important consideration that could yield different conclusions is the methodology employed in the meta-analytic review (e.g., from discrepancies in search terms to the measures taken from the included studies, see Paap, Anders-Jefferson, et al., 2020 for a discussion) or whether the meta-analyses have attempted to correct for biases. 1 For example, Paap, Mason, et al. (2020) provide a detailed description of how, in those more recent meta-analyses that used the precision-effect test (PET) and the precision-effect estimate with standard error (PEESE) methods to correct for publication biases, the observed effects yield weak or null evidence for the bilingual advantage on EF.
Another moderating factor under discussion has been the age of participants in relation to the bilingual advantage (or lack thereof). In particular, the purportedly enhanced executive control in bilinguals has been discussed in relation to the age of the participants under study, again with contrasting proposals. On the one hand, differences in executive functions and cognitive control favouring bilinguals over monolinguals have been reported in children (e.g., Engel de Abreu et al., 2012; Martin-Rhee & Bialystok, 2008) and older adults (e.g., Bialystok et al., 2004, 2008) and it has been proposed that younger adults may sometimes not show any advantages (for a review see Bialystok, 2017) due to their already high cognitive fitness (Bialystok, Poarch, et al., 2014), namely, a ceiling effect. On the other hand, other studies have failed to find a bilingual advantage in children (Duñabeitia et al., 2014; Gathercole et al., 2014; for a meta-analysis see Lowe et al., 2021). Similarly, other studies have failed to find bilingual superior performance on tasks of EF in older adults (e.g., Gathercole et al., 2014). With regard to young adults, the aforementioned ceiling effect hypothesis has also been contested. In a direct test of the ceiling effect in young adults, Paap (2019) tested a group of eight young adults, who showed a reduction in reaction time on a flanker task (for congruent and incongruent trials, and for the flanker effect itself) as a function of practice across a number of sessions, and the author concluded that this group of young adults was thus not performing at ceiling. Age has also been included as a moderator in the most recent meta-analyses on the impact of bilingualism on EF, and the majority of them seem to point to a small or no observable advantage on EF for bilingualism, even in older populations (Lehtonen et al., 2018; Paap et al., 2019; but see also Donnelly et al., 2019; Von Bastian et al., 2017).
The impact of bilingualism on cognitive functions in adults has not been restricted to the study of EF domains, including that of WM. The scope of the effects of bilingualism on other cognitive domains proposed to be mediated by executive functions has also been explored, although less often. For example, less is known about the impact of bilingualism on memory, in particular episodic memory, a function thought to also recruit prefrontal networks (Shallice et al., 1994; see also Grant et al., 2014). If executive functions are enhanced in bilingual individuals, and executive functions are necessary for episodic memory performance, it would then be expected that bi/multilingualism would lead to better episodic memory performance (see Grant et al., 2014). The effect of multilingualism on episodic memory in aging populations, and its changes over time, is surprisingly under-researched.
A study in this area by Schroeder and Marian (2012) examined the relationship between executive functions and episodic memory in bilingual and monolingual older adults aged between 73 and 88 years. They used a task of episodic memory picture recall and a task measuring inhibitory control (the Simon task). The authors reported superior performance for bilinguals over monolinguals in both episodic memory recall and the Simon task, as indexed by a smaller Simon effect. In addition, earlier acquisition as well as years of bilingualism were associated with better episodic recall in bilinguals. Crucially, the authors also reported that for the bilingual group, performance on episodic recall was positively correlated with Simon effect accuracy, suggesting an association between episodic memory and inhibitory control (although see e.g., Paap, Anders-Jefferson, et al., 2020 who argue against using the Simon task as a measure of domain-general inhibitory control).
Similarly, Ljungberg et al. (2013) studied the effects of bilingualism on episodic recall and verbal fluency but longitudinally, in a sample from the same population study as in this study. 2 In their study, performance on episodic memory recall, as well as letter fluency and category fluency were compared between monolingual and bilingual adults. The authors reported that bilinguals outperformed monolinguals on episodic memory recall and letter fluency, and that this pattern remained over time. For category fluency no group differences were observed. The dissociation between better bilingual performance on letter but not category fluency has also been reported in previous studies in children (Kormi-Nouri et al., 2012) but studies in adults have yielded mixed results (Gollan et al., 2002; Obler et al., 1986; Portocarrero et al., 2007; Rosselli et al., 2002). In other studies, the finding of bilingual superior performance on letter but not category fluency has been explained in terms of letter fluency drawing more heavily on executive functions than category fluency (Friesen et al., 2015; Luo et al., 2010; Shao et al., 2014; see also Bialystok et al., 2008). Conversely, studying young adults, Paap et al. (2017) did not find differential effects of mono/bilingualism for letter and category fluency performance (i.e., no language group by type of verbal fluency task interaction), failing to support the proposal that bilingualism benefits would be reflected on letter fluency.
Studies on the relationship between bi/multilingualism and episodic memory are scarce, yet Grant et al. (2014) have put forward a proposal that can explain the previously reported bilingual advantage on episodic recall in older adults. Based on previous neuroimaging studies, the authors reviewed the extent to which brain regions involved in increased brain and cognitive reserve in bilingualism overlap with those subserving memory and language, which in turn are linked to executive control and to episodic memory retrieval advantages in bilinguals. The authors propose not only enhancement of prefrontal function, but also better preservation of posterior regions, as well as increased connectivity between prefrontal and posterior regions (temporal and parietal areas) as the underlying basis of bilingual brain and cognitive reserve. Importantly, their proposal also points to the importance of these regions for successful memory functioning, which would also explain a memory advantage in aging individuals (Grant et al., 2014).
In cohort studies on older populations, bilingualism has been linked to better cognitive outcomes in later life even when controlling for childhood intelligence (Bak et al., 2014). Similarly, bilingualism, and in particular lifelong bilingualism, has been associated with delayed onset of cognitive symptoms of dementia (e.g., Alladi et al., 2013; Bialystok et al., 2007; Craik et al., 2010; Woumans et al., 2015; for a review and discussion see Vega-Mendoza et al., 2019) and these bilingual positive effects have been reported to be even larger in illiterates (Alladi et al., 2013). It has also been proposed that bilingualism may contribute to a delay in symptoms of mild cognitive impairment (Bialystok, Craik, et al., 2014; Ossher et al., 2013). In other clinical populations, bilingual adults have also been shown to exhibit better cognitive recovery after stroke compared to monolinguals (Alladi et al., 2016). Much as occupational status and education have been associated with building cognitive reserve (Valenzuela & Sachdev, 2006), bilingualism has been proposed as a contributor to building cognitive reserve and thus offsetting the symptoms of dementia and cognitive decline (Bialystok, 2021; Schweizer et al., 2012). Data from neuroimaging studies have aimed to shed light on the neural bases of this hypothesis (Duncan et al., 2018; Perani et al., 2017). For example, Duncan et al. (2018) studied patients with mild cognitive impairment or Alzheimer disease who were classified either as monolinguals or multilinguals. In their study, multilingualism was related to increased cortical thickness and tissue density in brain areas related to language and cognitive control. Multilinguals also showed a correlation between cortical thickness in language and cognitive control regions and performance on episodic memory tasks.
The authors also replicated the results in native-born Canadian participants suffering from mild cognitive impairment and were able to rule out immigration as a potential confound.
Nevertheless, the bilingual delay in symptoms of dementia and cognitive decline has also been debated. For example, it has been noted that immigration status may be a confounding factor in these types of studies (Chertkow et al., 2010). However, it is important to note that the study by Alladi et al. (2013) was carried out in Hyderabad, India, where bilingualism is not confounded with immigration status. Other studies have not found a beneficial effect of bilingualism in delaying symptoms of dementia (e.g., Crane et al., 2009; Lawton et al., 2015; Ljungberg et al., 2016; Sanders et al., 2012; Zahodne et al., 2014). A point of discussion in this regard has been the type of design, whereby some argue that positive effects of bilingualism in dementia tend to be observed in retrospective studies, whereas those studies reporting null findings tend to be prospective (Fuller-Thomson, 2015). Among the factors discussed in the literature to account for such conflicting findings, one is the operationalization of bilingualism. Specifically, retrospective studies reporting an advantage of bilingualism in delaying effects of dementia tended to use more stringent definitions of bilingualism, whereas prospective studies used more liberal criteria (for a review see e.g., Perani & Abutalebi, 2015; for a discussion see Antoniou, 2019).
A factor that remains unclear in the literature is the role of knowledge of additional languages beyond bilingualism. If the experience of handling more than one language leads to enhanced domain-general executive functions or more efficient brain networks, one possibility is that more languages would result in an additive effect of bi/multilingualism. Schroeder and Marian (2017) conducted a review to examine the effects of bilingualism and trilingualism on different aspects of cognition in different age groups based on the supply-demand mismatch theoretical framework. This framework suggests that when the level of cognitive resources required by a certain activity is higher than the available cognitive supply, the system increases its supply, leading to an improvement in the cognitive function. In the case of bilingualism, Schroeder and Marian propose that the bilingual experience places higher demands on cognitive aspects such as memory and executive processes (e.g., inhibition), leading to more efficient processes. They explored the following possibilities: from the supply-demand mismatch framework, trilingualism should impose more demands than bilingualism and therefore produce larger cognitive gains. The authors also assessed the possibility that trilingualism (relative to bilingualism) may not lead to increased cognitive gains, for example because a large difference between demand and supply results in the task becoming too demanding and thus no cognitive gains may be observed, or because trilingualism, relative to bilingualism, does not pose increased demands or the supply is too close to ceiling level. In line with these possibilities, their findings suggested differential effects of bilingualism and trilingualism on cognition by age group. In older adults, the reviewed findings generally pointed to better performance on overall cognitive reserve measures in trilinguals compared to bilinguals.
In contrast, no differences between bilingualism and trilingualism were found on measures of inhibitory control in groups of children and younger adults, while in memory generalization tasks in children and toddlers trilinguals did not exhibit the same advantage as did bilinguals in the studies reviewed.
Consistent with these findings, there are studies suggesting that the positive effects of bilingualism on cognition in later life increase with knowledge of more languages (Kavé et al., 2008; Perquin et al., 2013). For example, the results from Kavé et al. (2008), a large study on an elderly Israeli Jewish population (N = 814), showed that even when controlling for a large number of known confounding variables (age, gender, age at immigration, place of birth, and education), multilingualism proved to be a strong predictor of older adults’ cognitive state, and the more languages spoken, the better the cognitive state. In addition, the multilingual advantages also extended to a non-educated group. Another study supporting protective effects of multilingualism on cognitive decline was that of Perquin et al. (2013). In their study, participants were bilinguals and multilinguals. Multilingualism was operationalized in two ways: first, as practising more than two languages, and second, as using 3, 4, or more than 4 languages. Of note, similarly to the study by Kavé et al. (2008) and as in this study, there were no monolinguals in the sample. They classed participants as either presenting ‘cognitive impairment no dementia’ (CIND) or ‘free of cognitive impairment’ (CIND-Free), and results showed that, using bilinguals as a reference group, participants who practised more than two languages overall had a lower risk of CIND. Similarly, compared to bilinguals, those who practised three languages and four languages showed a similar protective pattern, while individuals who practised four or more than four languages had the same probability of presenting cognitive impairment as trilinguals. These results supported the notion that relative to bilingualism, active multilingualism slowed down the rate of cognitive decline in older adults, a finding that was explained in terms of enhanced cognitive reserve and brain plasticity.
Other studies have not found additional benefits of knowing more than two languages. In the study of Alladi et al. (2013), bilinguals exhibited a delayed onset of symptoms of dementia compared to monolinguals, but knowledge of more than two languages did not confer an additional benefit. Conversely, Chertkow et al. (2010) reported a delay in age of diagnosis of Alzheimer disease in bilinguals and multilinguals relative to monolinguals in an immigrant subgroup, but a protective effect only in individuals who spoke more than two languages in their overall sample.
While the studies reviewed here focus on aging populations, of note, a study by Grogan et al. (2012) in younger adults provides support for the hypothesis that an increasing number of languages may lead to greater cognitive benefits. Grogan et al. (2012) showed that multilinguals who spoke two or more languages in addition to their native one had increased grey matter density in the right posterior supramarginal gyrus compared to bilinguals (those who spoke one language in addition to their native one).
Conversely, in populations of younger adults and tasks of executive control, Humphrey and Valian (2012) did not find differences in performance between monolinguals, bilinguals, and trilinguals in two non-verbal tasks aimed to tap into executive functions (i.e., Simon and flanker tasks). Similarly, Paap et al. (2014) reported either no differences between trilinguals, bilinguals, and monolinguals, or even monolingual superior performance in a series of non-verbal inhibitory control tasks in young adults. Taken together, the evidence so far regarding the relationship between number of languages and cognitive outcomes in multilingual populations of young adults, older adults, and clinical populations remains unclear.
As we have seen, there is ample evidence for and against the hypothesis that bi/multilingualism may lead to better cognitive outcomes and/or delay effects of neuropathology. The study of memory functions in adulthood and its relation to bi/multilingualism is relevant because memory functions are reported to decline with normal ageing (e.g., Craik, 1994), including episodic memory (Rönnlund et al., 2005). Therefore, our study tests the hypothesis that multilingualism could be a factor contributing to cognitive reserve and predicting better cognitive outcomes in later life.
Also, immigration status and education are among the most frequently reported confounding factors (see also Bak, 2016), and much of the literature examining the effects of bilingualism on cognition has focused on comparisons between monolinguals and bilinguals. In this study, instead, we focus on the bilingual experience, in particular, degree of multilingualism as indexed by number of languages within a multilingual sample whose members did not differ in immigration status. Our study seeks to address the impact of degree of multilingualism on cognition, in particular on episodic memory recall, and its changes over time, while controlling for multiple previously reported confounding factors (age, gender, years of education, general fluid ability Gf, and socioeconomic status). If bi/multilingualism confers enhanced executive control and better executive control is associated with better episodic memory performance (Schroeder & Marian, 2012; see also Grant et al., 2014), we hypothesize that a higher degree of multilingualism (i.e., more languages spoken by multilinguals, relative to bilinguals) would lead to better cognitive performance over time. A similar prediction can be made based on previous findings: if the experience of handling more languages entails more cognitive demands, this in turn may generate larger cognitive gains (Schroeder & Marian, 2017). Our hypotheses are also based on the findings by Ljungberg et al. (2013), which focused on monolingual and bilingual comparisons. Here, we extend those findings and explore whether multilingualism, relative to bilingualism, would lead to better cognitive outcomes on the episodic memory recall and letter fluency tasks in which bilinguals outperformed monolinguals in the previous study (a finding that was thought to be due to such tasks placing greater demands on executive control networks). We therefore explore whether degree of multilingualism provides a larger benefit over bilinguals, the reference group.
Conversely, based on the previous findings of Ljungberg et al. (2013), we do not expect differences in performance on the category fluency task among the bi-/multilingual language groups. Finally, we included a measure of global cognition (MMSE) to explore whether multilingualism would provide more protection against global cognitive decline, with bilinguals as a reference group as in previous ageing studies (Kavé et al., 2008; Perquin et al., 2013).
Method
Participants
The participant sample was drawn from the Betula prospective cohort study (Nilsson et al., 1997, 2004), a project on memory, ageing, and health that started in Umeå, Sweden. In this project, participants were selected by stratified randomized sampling (age, sex). To date, data have been collected over six test waves. The first wave occurred between 1988 and 1990 (T1), and since then, new data have been collected at 5-year intervals. At each test wave, participants are tested in two sessions, approximately 1 week apart. The first session focuses on health assessment, the second on cognitive ability. So far, six samples have been included in the Betula study: S1 (T1–T6), S2 (T2–T3), S3 (T2–T6), S4 (T3), S5 (T4), and S6 (T5) (Figure 1). Of these samples, only Samples 1 and 3 have participated in more than two test occasions, making longitudinal analyses possible. Furthermore, these two samples have also answered questions included in the Betula study about language skills, and are thus of primary interest for this study. Age range at inclusion was 35–80 years for Sample 1 and 40–85 years for Sample 3. All participants in the Betula study are screened for dementia, sensory impairments, and intellectual disability. Furthermore, having Swedish as native tongue is an inclusion criterion for participation. The Betula prospective cohort study has been approved by the regional Medical Ethical Committee at Umeå University.

Figure 1. The design of the Betula study showing the samples (S1 and S3) and the test waves (T) included in this study (figure adapted from Ljungberg et al., 2013).
For this study, a total sample of 894 participants from Samples 1 and 3 who reported knowledge of languages in addition to Swedish and had complete data on target variables were included (S1 n = 432; S3 n = 462). Baseline was defined as the first occasion of participation in the study (i.e., T1 for Sample 1 and T2 for Sample 3; see Figure 1). Mean age for the total sample was 51.44 (SD = 11.90) years. The study sample included 59.4% females, and 48.3% of the participants belonged to Sample 1.
As part of a language history questionnaire, participants reported if they spoke other languages in addition to Swedish (see Supplemental Material). Participants who reported knowledge of two languages (including Swedish) were categorized as Bilinguals (n = 395), three languages as Trilinguals (n = 284), four languages as Quadrilinguals (n = 169), and five languages as Pentalinguals (n = 46). Most participants (97%) in this study reported knowledge of English, 51% of German, 23% of French, 6% of Spanish, 4% of Finnish, 1% of Italian, and 3% of other languages (e.g., Greek, Norwegian, Russian). Overall, 94% considered English to be their second language. With regard to language use, approximately 91% of the bilingual participants indicated that they spent up to 2 hours a day speaking their second language; the corresponding figures were 88% for trilinguals, 83% for quadrilinguals, and 80% for pentalinguals. Unfortunately, no information was available about how frequently participants used any language beyond their second language. Most participants (80%) started to learn their second language at primary school within the formal and mandatory education system in Sweden (at the age of 9). Participants indicated that they mainly used their second language at home, at work, or when travelling.
Measures
Episodic memory recall
Results from four tasks were used as indicators of episodic recall ability (see, for example, Ljungberg et al., 2013; Nilsson et al., 1997, 2004). The first was a subject-performed task (SPT). In this task, participants enacted 16 actions (e.g., lift the book) over a duration of 8 seconds per action, and directly afterwards performed free oral recall of these verb–noun actions. The maximum time for recall was 2 minutes. The second task was a verbal task (VT), involving free oral recall of 16 verb–noun sentences that were studied by the participant while they were also read aloud by the experimenter. As in the SPT, 8 seconds were allowed for encoding each item, and a maximum of 2 minutes for recall. For both SPT and VT, the number of correctly recalled sentences (both verb and noun) was used as a measure of performance. The third measure was based on performance in a category-cued recall (CCR) task. In this task, participants were told to recall the nouns included in the SPT and VT tasks. Eight semantic categories (e.g., reading material) presented on paper served as cues. Four categories related to nouns in the SPT task, and four to nouns in the VT task. Time for recall was 3 minutes, and the number of correctly recalled nouns served as the measure of performance in this task. The last measure used as an indicator of episodic recall ability was the number of recalled words in a free recall task. In this test, a list of 12 nouns was read aloud by the experimenter at a pace of 2 seconds per word, and directly afterwards the participant was asked to recall as many words as possible in any order, also at a pace of 2 seconds per word. Time for recall was 45 seconds, and a metronome was used to indicate pace at both encoding and recall. A unit-weighted (z) episodic recall composite score was computed based on performance in each test.
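As an illustration of the unit-weighted composite described above, the following minimal Python sketch standardizes each task across participants and averages the resulting z-scores with equal weights. The function names, the toy scores, the use of the sample standard deviation, and the choice of averaging rather than summing are all assumptions made for illustration; the text does not specify these details.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize raw scores to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def unit_weighted_composite(scores_by_task):
    """Average the z-scores of all tasks with equal (unit) weights.

    scores_by_task maps a task name to a list of raw scores, with
    participants in the same order for every task.
    """
    z_by_task = [zscores(scores) for scores in scores_by_task.values()]
    return [mean(per_person) for per_person in zip(*z_by_task)]

# Hypothetical raw scores for four participants (not data from the study).
toy_scores = {
    "SPT": [8, 10, 12, 6],
    "VT": [5, 7, 9, 3],
    "CCR": [10, 12, 14, 8],
    "free_recall": [6, 7, 8, 5],
}
composite = unit_weighted_composite(toy_scores)
```

Because every task is standardized before averaging, tasks with different maximum scores (16, 16, and 12 items, for instance) contribute equally to the composite.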
Letter fluency
Participants were asked to verbally generate as many words as possible with the initial letter A (excluding proper names, for example, Anna, Anders, Andersson) within 1 minute (Nilsson et al., 1997, 2004). The number of generated words was used as the dependent measure.
Category fluency
Participants were asked to verbally generate as many occupations as possible with the initial letter B within 1 minute (Nilsson et al., 1997, 2004). The number of generated occupations was used as the dependent measure.
Global cognitive status (MMSE)
The MMSE (Folstein et al., 1975), with a maximum score of 30 points, covers several dimensions of cognitive status: attention and calculation, recall, language, registration, orientation, and the ability to follow simple commands. This questionnaire is commonly used in clinical settings to screen for cognitive impairment.
Covariates
Additional factors included in the analyses were age, gender, and years of education. Performance in the WAIS-R Block Design Test was also included since it is considered to be a reliable measure of general fluid ability, Gf (Wechsler, 1981). Furthermore, two measures of socioeconomic status (SES) were used. The first, Occupational SES, was based on a Swedish socioeconomic classification and ranged from ‘Unskilled employees in goods or service production’, ‘Skilled employees in goods or service production’, ‘Assistant non-manual employees–lower level and higher level’, ‘Intermediate non-manual employees’, ‘Upper-level executives/Professionals and other higher non-manual employees’ to ‘Self-employed professionals/Self-employed other than professionals/Farmers’. The second SES indicator, living condition, was calculated based on the number of rooms in the household, excluding the kitchen.
Statistical analyses
A frequent criticism of multilingualism studies using observational data is that the language groups often differ systematically on critical characteristics. To address this, this study used entropy balancing (EB) weights (Hainmueller, 2012) to make the language groups as similar as possible on the observed covariates and thereby reduce selection bias. A weighted sample was obtained such that the language groups were exactly balanced (in terms of standardized mean differences) on the following covariates: baseline age, gender, years of education, Block Design (as an indicator of Gf), SES – Occupational status, and SES – Living conditions. The weighted sample is a synthetic sample in which every language group has the same covariate characteristics as the original full sample.
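To make the idea of entropy balancing concrete, the sketch below reweights one hypothetical group so that its weighted mean on a single covariate matches a target mean. It is a deliberately simplified illustration, not the procedure used in the study: it handles one covariate and one group, uses bisection rather than a multivariate optimizer, and the education values are invented. Only the target of 12.23 years (the full-sample mean reported in the Results) comes from the text. The weights take the exponential-tilting form that stays as close as possible to uniform weights in the entropy sense while satisfying the mean constraint.

```python
import math


def tilted_weights(deviations, lam):
    """Normalized weights proportional to exp(lam * deviation)."""
    exponents = [lam * d for d in deviations]
    shift = max(exponents)  # subtract the maximum to avoid overflow in exp
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def entropy_balance_1d(values, target, bound=30.0, iterations=200):
    """Find weights whose weighted mean of `values` equals `target`.

    The weighted mean increases monotonically with the tilting parameter,
    so bisection on [-bound, bound] locates the solution (assumed to lie
    inside that interval).
    """
    if not min(values) < target < max(values):
        raise ValueError("target must lie strictly inside the observed range")
    deviations = [v - target for v in values]
    lo, hi = -bound, bound
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        weights = tilted_weights(deviations, mid)
        imbalance = sum(w * d for w, d in zip(weights, deviations))
        if imbalance > 0:
            hi = mid  # weighted mean too high: tilt less towards large values
        else:
            lo = mid  # weighted mean too low: tilt more towards large values
    return tilted_weights(deviations, (lo + hi) / 2.0)


# Hypothetical years of education in one language group (not study data).
education = [9, 9, 11, 12, 12, 14, 16]
TARGET_MEAN = 12.23  # full-sample mean years of education reported in the Results

weights = entropy_balance_1d(education, TARGET_MEAN)
weighted_mean = sum(w * v for w, v in zip(weights, education))
print("weights:", [round(w, 3) for w in weights])
print("weighted mean:", round(weighted_mean, 4))
```

In this toy group the unweighted mean is below the target, so participants with more years of education receive larger weights. Balancing several covariates across several groups at once, as in the study, requires solving for one tilting parameter per covariate, which is what dedicated software does.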
In a second step, we applied linear mixed modelling on the weighted dataset to examine the association between the language groups and cognitive functioning. The models included time, modelled as a linear trend, the language groups, modelled as dummy variables, as well as their interactions with time as fixed effects. In the weighted mixed models, Bilinguals constituted the reference group. An individual intercept was further included as a random effect. The entropy weights were included as model weights to account for covariate imbalance between the groups (since exact balance was achieved for all covariates, none of the covariates used to calculate the weights were included as fixed effects in the model). Robust 95% bootstrap confidence intervals were computed for the parameters in the models. The weights were obtained using the WeightIt package and the mixed models were estimated using the lmer function from the lme4 package in R version 4.1.2.
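The model described in this paragraph can be written out as follows. This is a sketch reconstructed from the verbal description rather than an equation given by the authors; the symbols are chosen here for illustration, and time is assumed to be coded as test wave, with baseline as 0.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij}
       + \sum_{g \in \{\mathrm{tri},\,\mathrm{quad},\,\mathrm{penta}\}}
         \left( \gamma_g D_{gi} + \delta_g D_{gi}\, t_{ij} \right)
       + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the score of participant $i$ at test wave $j$, $t_{ij}$ is time, and $D_{gi}$ equals 1 if participant $i$ belongs to language group $g$ and 0 otherwise. With bilinguals as the reference group, $\beta_0$ is the bilingual baseline mean and $\beta_1$ the bilingual change per test wave; $\gamma_g$ is the baseline difference between group $g$ and bilinguals, $\delta_g$ is the difference in rate of change, and $u_i$ is the participant-specific random intercept. The entropy balancing weights enter as observation weights in the estimation.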
Results
The cohort at baseline included 894 participants with a mean age of 51.44 (SD = 11.90) years. The mean for years of education was 12.23 (SD = 3.62 years) and for block design, used as an indicator of Gf, mean performance (raw score) was 30.60 (SD = 9.24). For the two SES indicators, number of rooms and occupational status, the means were 4.51 (SD = 1.41) and 3.60 (SD = 1.39), respectively. The total sample included 59.4% females. Within the sample, 44.2% of participants were bilinguals, 31.8% trilinguals, 18.9% quadrilinguals, and 5.1% pentalinguals. While the EB weighted participant characteristics of each language group are the same as those of the full sample (as described above), unweighted participant characteristics, as a function of language group, are provided in Table 1. For completeness, Table 2 shows a summary of the mean weighted scores by group for each cognitive test at each test wave among the participants available at that wave.
Table 1. Characteristics of the language groups (unweighted).
Note. SES = socioeconomic status; MMSE = Minimental State Examination.
Table 2. Average weighted scores for the cognitive tasks at each time point by language group.
Note. MMSE = Minimental State Examination.
We performed EB weighted linear mixed model analyses to each of the four cognitive measures to examine the association between the language group and cognitive functioning at baseline and over time. For Episodic recall, bilinguals had a mean baseline score of −0.16 (95% CI [−0.24, −0.08]) and decreased by 0.06 standardized score points on average for each test wave (95% CI [−0.09, −0.03]). The other language groups scored significantly better on Episodic recall compared to bilinguals at baseline, that is, trilinguals 0.23 standardized score points higher (95% CI [0.11, 0.36]); quadrilinguals 0.62 standardized score points higher (95% CI [0.46, 0.78]); pentalinguals 0.58 standardized score points higher (95% CI [0.30, 0.88]). Quadrilinguals decreased more over time compared to bilinguals, −0.06 (95% CI [−0.12, −0.01]). Although quadrilinguals declined more, they still scored significantly higher compared to bilinguals at the last test wave, 0.43 standardized score points higher (95% CI [0.25, 0.61]). The rate of change for trilinguals and pentalinguals did not significantly differ from bilinguals, 0.002 (95% CI [−0.037, 0.045]) and −0.06 (95% CI [−0.17, 0.06]), respectively.
For MMSE, bilinguals had a mean baseline score of 28.1 (95% CI [27.9, 28.3]) and decreased by 0.19 score points on average for each test wave (95% CI [−0.29, −0.09]). Trilinguals and quadrilinguals scored significantly higher on MMSE compared to bilinguals at baseline, while the difference for pentalinguals did not reach significance: trilinguals scored 0.37 score points higher (95% CI [0.10, 0.65]), quadrilinguals 0.42 score points higher (95% CI [0.07, 0.78]), and pentalinguals 0.37 score points higher (95% CI [−0.30, 1.01]). No significant differences were found in change over time compared to bilinguals: trilinguals −0.14 (95% CI [−0.27, 0.00]), quadrilinguals 0.12 (95% CI [−0.06, 0.30]), and pentalinguals 0.09 (95% CI [−0.28, 0.46]). Of note, trilinguals scored at a similar level to bilinguals at the last test wave, that is, 0.05 score points lower (95% CI [−0.32, 0.23]).
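The last-wave group differences reported above follow from the baseline difference and the difference in rate of change. The short sketch below reproduces that arithmetic from the published coefficients, assuming that the four test waves are coded 0 to 3 so that the last wave lies three steps after baseline; the function name is introduced here for illustration only.

```python
LAST_WAVE = 3  # assumption: four test waves coded 0, 1, 2, 3


def group_difference(baseline_diff, slope_diff, wave):
    """Difference from bilinguals at a given wave under a linear time trend."""
    return baseline_diff + slope_diff * wave


# Episodic recall, quadrilinguals vs. bilinguals: +0.62 at baseline, -0.06 per wave.
quad_episodic_last = group_difference(0.62, -0.06, LAST_WAVE)

# MMSE, trilinguals vs. bilinguals: +0.37 at baseline, -0.14 per wave.
tri_mmse_last = group_difference(0.37, -0.14, LAST_WAVE)

print(f"quadrilinguals, episodic recall, last wave: {quad_episodic_last:+.2f}")
print(f"trilinguals, MMSE, last wave: {tri_mmse_last:+.2f}")
```

Under this coding, the quadrilingual advantage in episodic recall at the last wave comes out at about 0.44, close to the reported 0.43 (the small gap is presumably due to rounding of the published coefficients), and the trilingual difference on MMSE comes out at about −0.05, matching the reported 0.05 score points lower.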
For letter fluency, bilinguals had a mean baseline score of 12.0 (95% CI [11.6, 12.4]) and a stable performance over time, that is, change over time was −0.01 score points (95% CI [−0.16, 0.17]). The other language groups scored significantly higher at baseline on letter fluency compared to bilinguals. That is, trilinguals 1.06 score points higher (95% CI [0.41, 1.69]); quadrilinguals 2.59 score points higher (95% CI [1.79, 3.46]); pentalinguals 2.38 score points higher (95% CI [0.87, 3.90]). No significant differences were found in change over time compared to bilinguals, that is, trilinguals 0.11 (95% CI [−0.15, 0.36]), quadrilinguals −0.24 (95% CI [−0.62, 0.10]), and pentalinguals 0.14 (95% CI [−0.50, 0.82]).
For category fluency, bilinguals had a mean baseline score of 5.01 (95% CI [4.88, 5.31]) and a non-significant decrease over time by −0.07 score points (95% CI [−0.17, 0.03]). Quadrilinguals scored 0.56 score points higher (95% CI [0.15, 1.01]) compared to bilinguals at baseline. The two other language groups did not significantly differ from bilinguals, that is, trilinguals 0.04 (95% CI [−0.27, 0.36]); pentalinguals 0.29 (95% CI [−0.50, 1.08]). No significant differences were found in change over time compared to bilinguals, that is, trilinguals 0.06 (95% CI [−0.09, 0.21]), quadrilinguals −0.12 (95% CI [−0.36, 0.11]), and pentalinguals 0.02 (95% CI [−0.42, 0.45]). Figure 2 (panels a–d) shows the estimates from the weighted mixed models over time for each of the cognitive tasks.

Figure 2. Longitudinal estimates from the weighted mixed models for each cognitive task. Episodic memory recall (panel a), Minimental State Examination (MMSE, panel b), Letter fluency (panel c), and Category fluency (panel d).
Discussion
The aim of this study was to investigate the impact of degree of multilingualism on cognitive performance in adulthood, in particular on an under-researched cognitive domain, namely episodic memory recall. We also studied performance on verbal fluency (letter and category) and global cognition, while accounting for a number of well-known confounding factors, namely age, gender, block design (as an indicator of Gf), years of education, and SES. We also examined rate of change of cognitive performance longitudinally over 15 years in relation to number of languages.
Our results from entropy balancing weighted linear mixed models showed that, with bilinguals as the reference group, all other language groups (i.e., trilinguals, quadrilinguals, and pentalinguals) performed better than bilinguals on episodic memory recall at baseline, and that the rate of change did not differ for trilinguals and pentalinguals relative to bilinguals. Although quadrilinguals declined more over time, they still outperformed bilinguals at the last test wave. With regard to letter fluency, similarly, all language groups scored better than bilinguals at baseline and none of the groups differed from bilinguals in rate of change over time. For category fluency, only quadrilinguals scored higher than bilinguals at baseline and none of the groups differed from bilinguals in change over time. Finally, regarding global cognition (MMSE), trilinguals and quadrilinguals scored higher than bilinguals at baseline and no differences in change over time were found for any of the groups compared to bilinguals.
In line with our predictions, based on the findings by Ljungberg et al. (2013), our results showed better performance for multilinguals (relative to bilinguals) on episodic memory recall and letter fluency at baseline and no differences in rate of change over time relative to bilinguals. Our findings for episodic memory recall are similar to the findings of studies reporting superior performance on episodic memory recall in bilinguals (compared to monolinguals) in Schroeder and Marian (2012) and Ljungberg et al. (2013), whereby the latter focused on a sub-sample of participants as in this study. Our study, however, adds to those previous findings with regard to number of languages in a multilingual sample rather than a monolingual/bilingual comparison: our results suggest superior episodic memory performance for multilingualism relative to bilingualism. A possible explanation for this is the proposal that bi- and multilingualism may lead to enhanced cognitive control. In turn, cognitive control has been linked to episodic memory recall (Grant et al., 2014) and the experience of handling more than two languages may result in a larger advantage than handling two.
From a different, non-mutually exclusive perspective, these results can also be explained through the supply-demand mismatch framework (Schroeder & Marian, 2017). More specifically, an increasing number of languages incurs higher cognitive demands and therefore produces larger cognitive gains. Our study provides insights with regard to the boundaries of such multilingual effects in an understudied cognitive domain, namely that of episodic memory recall.
Similarly, for letter fluency all multilingual groups performed better than bilinguals at baseline and the groups did not differ in rate of change over time compared to bilinguals. With regard to category fluency, only quadrilinguals performed better than bilinguals at baseline and, again, no differences in change over time were found for any of the groups compared to bilinguals. These findings are partially in line with Ljungberg et al. (2013), who found superior performance for bilinguals over monolinguals in letter but not in category fluency, and these results could in principle be explained by the type of cognitive demands that letter fluency places on executive functioning (EF; Friesen et al., 2015). Generating words beginning with a certain letter (e.g., F) is generally hypothesized to pose higher demands on executive control ability than category fluency. Unlike category fluency, for letter fluency, semantically related associates need to be inhibited (Luo et al., 2010; Shao et al., 2014). Such differential demands by type of verbal fluency task are supported by neuroimaging studies showing the recruitment of different brain networks associated with letter and semantic fluency, in particular, the former activating more the inferior frontal cortex and temporoparietal cortex, and the latter activating more the left temporal cortex (Gourovitch et al., 2000). However, we also note that other authors contest that letter fluency is an index of EF (Paap et al., 2017, see also Paap, Anders-Jefferson, et al., 2020; Paap et al., 2019); we are therefore cautious about interpreting this result solely on the assumption that letter fluency is an index of EF. In addition, the way category fluency was measured in this study differs from the traditional approach, insofar as participants were asked to generate items within a certain category (occupations) with the additional constraint that they begin with a certain letter (B). This, however, is how the task was devised as part of the Betula study, and therefore we are not able to make direct comparisons or robust claims using category fluency results from other studies.
With regard to global cognition as measured through the MMSE, trilinguals and quadrilinguals performed better than bilinguals at baseline with no differences in change over time for any of the groups relative to bilinguals. Of note, while the difference between pentalinguals and bilinguals in terms of score points was similar to the difference in score points between quadrilinguals and bilinguals, the former did not reach statistical significance. This could be because pentalinguals were the smallest group, leaving this comparison underpowered. Our findings of superior performance on global cognition for trilinguals and quadrilinguals relative to bilinguals at baseline are, however, in line with literature reporting better global cognitive performance in multilingual healthy older adults (relative to bilinguals) associated with increasing number of languages (Kavé et al., 2008). These results are also in line with those of Perquin et al. (2013), showing that multilingualism (relative to bilingualism) provided more protection against cognitive decline.
Our study has limitations. In our study, we used self-reports to categorize degree of multilingualism and we were not able to gather and analyse information related to languages beyond the second language, including that related to proficiency, use or age of acquisition. Given the nature of the data in this cohort, this was unfortunately not available for the present analyses. Therefore, we cannot be certain about the contribution of such aspects of the linguistic experience on performance on the cognitive tasks analysed here.
An initial observation was that the groups differed in baseline characteristics, such as years of education, block design, and SES; specifically, trilinguals, quadrilinguals, and pentalinguals scored higher than their bilingual counterparts. However, we were able to address the issue of imbalanced data across the language groups by employing entropy balancing (EB) weights (Hainmueller, 2012) to achieve balance across the groups in the relevant covariates, namely baseline age, gender, years of education, block design, and SES. At the same time, we acknowledge that studies examining the effect of bilingualism on cognition face strong challenges in ruling out reverse causality and that, because participants cannot be randomly allocated to conditions, the data remain observational in nature.
One of the strengths of our study is its contribution to an underexplored cognitive domain in the context of multilingualism in adults, namely episodic memory recall. Much of the literature on effects of bilingualism and multilingualism in healthy older adults has focused on global cognitive functioning or on executive control abilities, whereas our study sheds light on memory (see Grant et al., 2014) and, in particular, episodic memory recall. In addition, our study benefits from a uniquely large, homogeneous, and representative sample of the Swedish population. An advantage of this is that our participants were similar in linguistic and cultural background and shared the same native language (Swedish). Also, studying a multilingual sample, rather than comparing monolinguals and bilinguals, reduces other potential confounds that have been argued to modulate differences between monolingual and bilingual groups, such as immigration status (for a discussion see Bak, 2016). Moreover, we had access to and accounted for a number of other important variables such as general fluid ability Gf, age, gender, SES, and education. Finally, our study also benefitted from both baseline and longitudinal measurements in a large sample, allowing us to explore cognitive trajectories over a period of 15 years.
Our study also emphasizes the importance of researching less explored aspects of multilingualism on cognition, in particular on episodic memory, to aid our understanding of factors that can potentially protect against cognitive decline in adulthood.
Supplemental Material
Supplemental material, sj-pdf-1-ijb-10.1177_13670069221139155 for A longitudinal study of episodic memory recall in multilinguals by Mariana Vega-Mendoza, Daniel Eriksson Sörman, Maria Josefsson and Jessica K. Ljungberg in International Journal of Bilingualism
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: MVM, DES, and JKL were funded by the Knut and Alice Wallenberg Foundation grant (KAW 2014.0205). This article is based on data collected in the Betula prospective cohort study, Umeå University, Sweden. The Betula Project is supported by the Knut and Alice Wallenberg Foundation, the Swedish Research Council (K2010-61X-21446-01), the Bank of Sweden Tercentenary Foundation (grant number 1988-0082:17 and J2001-0682); the Swedish Council for Planning and Coordination of Research (grant numbers D1988-0092, D1989-0115, D1990-0074, D1991-0258, D1992-0143, D1997-0756, D1997-1841, D1999-0739, and B1999-474); the Swedish Council for Research in the Humanities and Social Sciences (grant number F377/1988-2000); the Swedish Council for Social Research (grant numbers 1988-1990: 88-0082 and 311/1991-2000); and the Swedish Research Council (grant numbers 345-2003-3883 and 315-2004-6977).
Supplemental material
Supplemental material for this article is available online.
