Abstract
Purpose:
Previous research on bilingual vocabulary has focused largely on words for imaginable objects and actions (e.g., ‘apple’, ‘write’), but has not considered abstract words. We looked for a disproportion across two languages (a cross-language gap) in bilingual children’s knowledge of concrete verbs (e.g., ‘jump’, ‘drink’) and metacognitive verbs (e.g., ‘think’, ‘know’). We also investigated the effects of language exposure and age on bilinguals’ knowledge of both concrete and metacognitive verbs.
Methodology:
Thirty-nine Polish–English bilingual children aged 4;0–7;7 living in the United Kingdom performed vocabulary tasks in both languages: subtasks from the Cross-linguistic Lexical Tasks (CLTs), measuring comprehension and production of concrete verbs, and the metacognitive vocabulary task (METVOC), measuring comprehension of metacognitive verbs. Information on children’s cumulative exposure (CE) to each language was collected via parental reports. The amount of metacognitive talk children received in Polish was obtained from parental oral semi-structured narratives.
Data and Analysis:
Mixed-effects regression models were fitted separately for each task.
Findings:
A cross-language gap was observed for comprehension of concrete verbs, but metacognitive verbs did not show a cross-language gap. In production of concrete verbs, the English scores showed a steeper increase over age than the Polish scores. CE affected only production of concrete verbs. Correlational analyses showed children’s knowledge of metacognitive verbs in Polish (but not in English) was related to parental metacognitive talk in Polish.
Originality:
To date, few studies on bilingual children have focused on words beyond those referring to imaginable objects and actions, and no study has explored both concrete and metacognitive vocabulary knowledge in bilinguals.
Implications:
A cross-language gap was observed for bilinguals’ concrete verbs, but metacognitive verbs showed a carry-over effect across languages. The two categories of verbs were also related to different types of linguistic input. While CE affected production of concrete verbs, parental metacognitive talk supported children’s knowledge of metacognitive verbs, but only in the language it was provided in.
Introduction
Bilinguals rarely know words in the two languages equally well as shown by research on bilingual children’s vocabulary size (e.g., Abbot-Smith et al., 2018; Budde-Spengler et al., 2021; Rinker et al., 2017). This vocabulary gap is often a source of worry to parents and practitioners, because vocabulary size is a predictor of future language and academic performance (Fernald et al., 2013; Lee, 2011). Crucially, research on bilingual lexical development has focused on concrete words that refer to observable objects and actions (e.g., ‘apple’, ‘run’), and are easily presented to children on picture boards (Clark, 1993; Kauschke et al., 2007). But apart from concrete words, children also acquire abstract metacognitive verbs that refer to mind and cognition (e.g., ‘think’, ‘guess’, ‘remember’), the use of which is intrinsic to human communication, and often serves as a proxy of children’s mentalizing abilities (Brown et al., 1996). Still, research on metacognitive verbs in bilinguals is scarce, even though bilinguals’ general linguistic and cognitive development has been comprehensively studied (e.g., Bialystok et al., 2010; Gunnerud et al., 2020). We postulate that researching metacognitive terms would broaden our understanding of vocabulary knowledge in bilinguals and give us insight into how differences between concrete and metacognitive vocabulary influence their acquisition across bilinguals’ two languages.
The first aim of the present study is to investigate whether Polish–English bilingual children show a cross-language gap in their knowledge of concrete and metacognitive verbs. Previous studies have confirmed such a disproportion in concrete vocabulary knowledge across the two languages of bilingual children (e.g., Abbot-Smith et al., 2018; Hoff et al., 2012; Rinker et al., 2017), but a cross-language gap in the area of metacognitive vocabulary is yet largely unexplored. Another aim of the study is to measure how the knowledge of concrete and metacognitive verbs in each language is affected by the child’s age and language exposure. Age is linked with cognitive development and a growing interest in mental states (e.g., Hughes & Dunn, 1997), which enhances the acquisition of metacognitive vocabulary. In bilinguals, age is closely linked to changes in patterns of language exposure, which influence the rates of vocabulary growth (Jia et al., 2014; Pham & Kohnert, 2014). Language exposure is directly linked to concrete vocabulary development in bilingual children (Elin Thordardottir, 2011; Hoff et al., 2012). Its impact on metacognitive vocabulary in bilinguals has not been investigated so far, but exposure to references to mental states has been identified as crucial for metacognitive vocabulary acquisition (e.g., Tompkins et al., 2021).
It is important to note that the majority of studied terms referring to cognition are verbs. This follows from the common naturalistic use: we use metacognitive terms as verbs with the person experiencing the mental state as the subject (Hall & Nagy, 1979, p. 8). Metacognitive verbs are also easier (than metacognitive nouns) to employ in child-targeted tasks, for example, picturing somebody explaining something is easier than creating a context in which explanation will come to the child’s mind. For this reason, the task we used, metacognitive vocabulary task (METVOC) created by Astington and Pelletier (2004), focuses on verbs only. Since our metacognitive terms were all verbs, we decided to narrow our set of concrete words to verbs only, to ensure consistency in our analyses. Therefore, we used subtasks of the Cross-linguistic Lexical Tasks (CLTs) (Haman et al., 2015) referring to verbs in Polish and English (for more information, see the ‘Tasks’ section).
Cross-language gap in concrete vocabulary in bilingual children
Bilinguals rarely show an equal knowledge of concrete words in the two languages (Abbot-Smith et al., 2018; Budde-Spengler et al., 2021; Hoff et al., 2012; Rinker et al., 2017). In a study on a sample similar to the one tested here, that is, Polish–English bilingual children aged 4;5–5;9, Abbot-Smith et al. (2018) showed an L1 (Polish, home language) predominance over L2 (English, majority language) in verb production, verb comprehension, noun production, and noun comprehension. Conversely, Jia et al. (2014) found a predominance of L2 (American English) in bilingual children aged from 5 to 18. Few cross-language vocabulary studies distinguish between particular word classes, but Klassert et al. (2014), who studied noun and verb production in Russian–German bilinguals aged 5–6, found L2 German predominance only for nouns, while verb production was similar in Russian and German. The predominance of one language over the other is often due to the differences in the quantity and quality of input in each language (De Houwer, 2007; Hoff et al., 2012, 2014), which translate directly to the rates at which children acquire their languages (Hoff et al., 2012; Lauro et al., 2020).
Cross-language gap in metacognitive vocabulary in bilingual children
Few studies have investigated metacognitive vocabulary in bilingual children. They often included references to emotions, desires, and cognition, and were focused on the children’s production in a narrative. Otwinowska et al. (2020) found that Polish–English bilingual children (aged 3–7 years) produced a similar amount of references to all internal states in both languages. In a subsequent study (using a sample that partially overlapped with Otwinowska et al., 2020), Mieszkowska (2018) found that the amount of metacognitive terms (nouns and verbs) was similar in the bilinguals’ two languages. In a study with children of similar age (5;4–6;6), but English–Hebrew bilinguals, Altman et al. (2016) calculated the frequency of use of metacognitive terms in children’s storytelling (the number of terms divided by content words) and found that the frequency (or ratio) of metacognitive verbs was higher in Hebrew (L2) than in English (L1). Yet, the L2 stories were shorter, which could have inflated the ratios in the L2.
Notably, as Cummins (1979) suggested, some knowledge in the two languages of a bilingual may be interdependent, sharing a common underlying cognitive foundation. Evidence for such language interdependence comes from research on reading (Bialystok et al., 2005; Oller & Eilers, 2002) and narrative abilities (Gagarina, 2016; Otwinowska et al., 2020; Uccelli & Páez, 2007). Results for concrete vocabulary are mixed, with some studies finding a positive correlation between L1 and L2 vocabulary scores (e.g., Leseman, 2000; Scheele et al., 2010; Umbel & Oller, 1994), and others finding no correlation (e.g., Cobo-Lewis et al., 2002). Generally, evidence for language interdependence in bilinguals is found for cognitive and metalinguistic skills (see Sierens et al., 2019 for a review). The knowledge of metacognitive terms taps into the child’s cognition and not only their linguistic abilities, and therefore may be at least partially language interdependent. So far, interdependence has not been tested in bilinguals’ metacognitive vocabulary.
Factors that impact the acquisition of concrete vocabulary
Concrete verbs, which are the focus of the present study, imply referential ambiguity: when hearing a new verb, it may be difficult to know exactly which aspect of action/experience is being referred to (the so-called ‘packaging problem’, e.g., Tomasello, 1995). For this reason, verbs are relatively less suited to learning by observation (Gleitman et al., 2005; Tomasello, 1995). Moreover, verbs are not all alike: some are more imaginable and their form is relatively simple (e.g., ‘jump’), while others include inseparable prefixes or particles added to the root (e.g., ‘uncover’, ‘cover up’) (see Behrens, 1998, for a review). The verbs investigated here are of simplex form, and are relatively easily imaginable. Importantly, these different kinds of verbs show different behaviours in language use in terms of their syntactical connections and prosody (Behrens, 1998). Children’s age, being integral with growing cognitive maturity and accumulated exposure to language, is a strong predictor of vocabulary development. As children grow and accumulate language experience, they start to notice similarities in the linguistic input and use them to build categories. As they contrast their categories with more linguistic input, they start to form schemas, rule-like phenomena that are not tied to particular linguistic examples (Behrens, 2009, 2021).
However, none of this would be possible without language exposure which is directly linked to vocabulary development, also in bilinguals (e.g., Elin Thordardottir, 2011; Hoff & Ribot, 2017; Unsworth, 2016). Elin Thordardottir (2011) found that children with similar relative exposure to English and French in Canada had similar vocabulary size in both languages, whereas children with unequal patterns of exposure showed dominance in the language to which they have been exposed more. In a longitudinal study of Spanish–American bilinguals aged 2.5–5, Hoff and Ribot (2017) found that the effect of English exposure was quadratic, not linear, that is, larger relative exposure to English at home (i.e., at 75% or higher) was related to larger gains in children’s English growth, than English exposure that accounted for 50% or 25% of at-home language exposure. In turn, Spanish exposure showed a linear effect, suggesting that the effect on vocabulary scores was constant across the full range of exposure.
In bilinguals, patterns of language exposure, and changes in those patterns, are closely related to the child’s age. Emergent bilinguals, who are the focus of the present study, tend to be dominant in their L1 in their early years due to an increased exposure at home (Haman et al., 2017). When they start formal schooling, their L2 acquisition is strongly supported and exposure to the language of the community and school rises (Welsh & Hoff, 2020). Pham and Kohnert (2014) longitudinally studied vocabulary knowledge in Vietnamese–English bilingual children (aged from 6 to 11 at study entry). The children’s performance in both languages increased with age, but the gains for English were relatively greater. Jia et al. (2014) found that English proficiency (majority language) overtook home language proficiency from the youngest bilingual age group (5- to 7-year-olds) and reached monolingual norms already in 8-year-olds, while home language vocabulary remained at the level of elementary school even in the oldest group (17- to 18-year-olds).
Factors that impact the acquisition of metacognitive vocabulary
Most research on metacognitive vocabulary focuses on verbs, and metacognitive verbs share certain acquisition challenges with other (concrete) verbs. Verbs generally pose greater referential ambiguity than nouns (e.g., Tomasello, 1995). Metacognitive verbs additionally refer to abstract, non-observable experiences, which further increases the difficulty. Because of these characteristics, metacognitive verbs are relatively difficult to acquire by observation alone.
To infer the meanings of metacognitive terms, children may use, among others, information on known words in a sentence, and even growing syntactic knowledge. As children gain more linguistic experience, they begin to categorize and extract schemas (Behrens, 2009). Metacognitive verbs may offer some syntactic clues that cue mental states. More specifically, metacognitive verbs take clausal complements (e.g., ‘The man thought
An important drive towards the understanding of metacognitive terms is the children’s growing interest in the mental states of self and others. Papafragou et al. (2007) suggest that metacognitive or mental verbs rise to relevance in the context where an incorrect mental state (a false belief) is observed. Thus, children’s developing theory of mind (ToM) draws attention to metacognitive vocabulary that describes these mentalizing experiences. Hughes and Dunn (1997) and Brown et al. (1996) show that at pre-school age, children start to take particular interest in mental states and refer to mental states in the context of pretend play (e.g., ‘Do you think Captain Hook could be policeman?’, ‘Pretend we’re pirates’, Hughes and Dunn, 1997, p. 1030). This interest might fuel further acquisition of metacognitive terms.
But further acquisition would not be possible without exposure to such terms in the child’s immediate environment. References to mental states are generally not frequent in parental input (Adrián et al., 2007; Jakubowska et al., 2018). Jakubowska et al. (2018) found that mental utterances (i.e., sentences containing a reference to a mental state) constituted on average 21% of all utterances in parental stories told to 4-year-olds, of which 14% were simple utterances (e.g., ‘They didn’t know that someone was coming’), and only 7% were mental clarifications (e.g., ‘He thought that he would climb very fast and he would eat that bird’). However, these relatively infrequent maternal clarifications, especially when addressing the child’s mental states, serve as a scaffolding for children’s understanding and use of mental state vocabulary (Tompkins et al., 2021). Adrián et al. (2007) found that mothers’ use of metacognitive terms during picture-book reading, though it constituted only up to 1.2% of overall mothers’ talk, predicted the 3- to 6-year-olds’ understanding of mental states. In children younger than those studied here, Taumoepeau and Ruffman (2008) showed that mothers’ reference to others’ thoughts predicted the children’s use of metacognitive terms at 33 months. Lecce et al. (2021) found that teachers’ readiness to refer to mental states predicted pupils’ (8–12 years) ToM even when child- (i.e., age, verbal ability, number of siblings, and socioeconomic status [SES]) and teacher-related variables (i.e., ToM, verbal ability, and years of experience) were controlled. Notably, this mentalistic input is yet another aspect of linguistic experience accumulated by children, which allows them to notice similarities in the linguistic input and use them to build categories (Behrens, 2009, 2021).
Together, this implies that children’s acquisition of metacognitive terms benefits from the interaction of their developing social abilities and the mentalizing input around them, which allows them to build mental state representations.
As children get older, their tentative meanings of metacognitive terms are further deepened (e.g., narrowed) with more mentalistic input and their cognitive development. In Moore et al.’s (1989) study, children aged 3–8 were asked to determine the location of a hidden object based on two puppets giving conflicting statements with the use of the words ‘know’, ‘think’, and ‘guess’ (e.g., ‘I guess it’s in the blue’ vs ‘I know it’s in the red box’). The results showed that by the age of 4, some of the children could differentiate ‘know’ from ‘think’, and ‘know’ from ‘guess’. They were also aware of the fact that ‘know’ indicates a higher reliability than ‘think’ and ‘guess’. This understanding was evident in all children by the age of 5. However, the distinction between ‘think’ and ‘guess’ was not well understood even by the age of 8 (the oldest group studied in the experiment), suggesting that the process of fully understanding the terms’ meanings is still ongoing at that age.
Thus, based on the previous research, we presuppose that in order for children to acquire metacognitive verbs, that is, infer their meanings, children make use of their growing linguistic experience (growing vocabulary and syntactical knowledge), and the mentalistic input they receive from their immediate environment, which allow them, together with their growing interest in the mental states of self and others, to establish and revise the inferred meanings of metacognitive terms.
Current study
The study had two overall aims: to investigate whether Polish–English bilinguals show a cross-language gap in their knowledge of concrete and metacognitive verbs, and to measure how the knowledge of concrete and metacognitive verbs in each language is affected by language exposure and the child’s age. For each aim, two research questions were asked.
With respect to the first research aim, the two questions set out here were:
RQ1: Do bilinguals show a cross-language gap in comprehension and production of concrete verbs?
RQ2: Do bilinguals show a cross-language gap in comprehension of metacognitive verbs?
Previous research has repeatedly found that, when the vocabulary score indicated general (concrete) word knowledge (i.e., including nouns and verbs), bilinguals of pre- and early-school age rarely show equal vocabulary size in their two languages (e.g., Abbot-Smith et al., 2018; Rinker et al., 2017; Uccelli & Páez, 2007). Based on the volume of previous research, we expected to find a cross-language gap in bilinguals’ knowledge of concrete verbs (but cf. Klassert et al., 2014). With regard to comprehension of metacognitive verbs, no such cross-language comparison has been performed before. The available evidence is scarce, focused on production, and shows conflicting results: Altman et al. (2016) found that English–Hebrew bilingual children aged 5;4–6;6 used more metacognitive verbs in Hebrew than in English. However, previous analyses of Polish–English bilingual children of similar age showed that bilinguals produced similar amounts of metacognitive terms (nouns and verbs) in their two languages (Mieszkowska, 2018); therefore, similarity was also expected in the comprehension of metacognitive verbs.
Turning to the second aim of the study, the research questions asked were as follows:
RQ3: What is the effect of age and language exposure on comprehension of concrete verbs?
RQ4: What is the effect of age, language exposure, and the parental use of metacognitive verbs on comprehension of metacognitive verbs?
With regard to concrete vocabulary, the role of language exposure and age has been thoroughly investigated in bilingual children. Language exposure is directly linked to vocabulary development in bilinguals (e.g., Elin Thordardottir, 2011; Pham & Tipton, 2018; Unsworth, 2016). The impact of general language exposure on metacognitive verbs in bilinguals’ two languages has not been investigated so far, but children’s linguistic experience may help them to categorize and extract schemas (Behrens, 2009) and use some syntactic clues that cue mental states (syntactic bootstrapping, Gleitman et al., 2005, but see also Behrens, 1998; de Villiers, 2007). Also, as bilingual children grow and start formal schooling, their L2 performance may gain on the L1, or even outpace the L1 (e.g., Pham & Kohnert, 2014). With regard to metacognitive vocabulary, we expected to find an age-associated increase in the comprehension of metacognitive verbs, as reported in previous studies (e.g., Moore et al., 1989). We also expected that the more the children’s parents tend to refer to metacognitive states, the better the child’s understanding of metacognitive verbs, in line with Adrián et al. (2007) and Taumoepeau and Ruffman (2008).
In addition, by examining an interaction of age and language, we investigated whether vocabulary performance in the two languages is similarly affected by the child’s age. Previous studies, for example, Pham and Kohnert (2014) and Klassert et al. (2014), found that the age-associated increase in vocabulary knowledge was different for the bilinguals’ L1 and L2. Therefore, we included a similar interaction in our analyses, to see whether this pattern would be repeated in our data.
Participants
The participants were Polish–English bilingual children aged 4;0–7;7 (M = 5;7, SD = 12 months) living in the United Kingdom. Their parents were Polish, so the children were exposed to Polish from birth; Polish was their first language and the language used at home. The children had lived in the United Kingdom for at least 2 years; 29 were born in the United Kingdom or Scotland, and 10 were born in Poland. Their first exposure to English was, on average, around the age of 1;2 (1 year; 2 months) (SD = 14.52 months, min. = 0;1, max. = 3;9). The cumulative exposure (CE) to each language was calculated from data gathered with a parental questionnaire on the child’s language development (see the ‘Tasks’ section). Forty-five children entered the study; 39 completed the testing and were included in the final sample. All of the children were typically developing and obtained age-appropriate scores on a non-verbal intelligence test, Raven’s Coloured Progressive Matrices (Polish adaptation: Jaworowska & Szustrowa, 2003).
Apart from the children, data from 33 parents were analysed (six of them had two children in the sample). The parents’ task was to tell children a story (in Polish) based on a particular picture story. Thirty-one of them were mothers (94%) and two were fathers (6%). All of them reported being the main caretaker and being typically involved in story-telling and story-reading with their children.
Tasks and procedure
Tasks
The testing battery included a parental questionnaire and four tasks performed by children: a standardized and normed non-verbal intelligence test, an experimental vocabulary task used in previous research, a task adapted specifically for this study to measure comprehension of metacognitive verbs, and a narrative task. Parents were given a narrative task to perform. All the tools are presented below together with measures derived from these tasks.
Questionnaire on child’s linguistic development
A parental questionnaire, the Polish adaptation of the Questionnaire for Parents of Bilingual Children (PABIQ) for pre- and early-school bilingual children, entitled ‘Kwestionariusz Rozwoju Językowego’, was developed within COST Action IS0804 (English version: Tuller, 2015; Polish version: Kuś et al., 2012). It provided information about the child’s language development and the child’s CE to Polish and English. The index of CE was based on the total time spent in Poland and in the United Kingdom (in the child’s lifetime), and the amount and the quality of exposure to language the child received in each of these countries. CE was calculated as follows. For Polish: (time spent in Poland) × 1 + (time spent in the UK) × (exposure to Polish while in the UK). For English: (time spent in Poland) × 0 + (time spent in the UK) × (exposure to English while in the UK). More details on CE can be found in the work of Haman et al. (2017). Since CE took into account the time the child had spent in the United Kingdom/Poland, the index correlated with the child’s age (i.e., the older the bilingual child, the more time he or she had spent in the United Kingdom in his or her lifetime).
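The CE index described above can be illustrated with a minimal sketch. The function name, the parameter names, and the example child are hypothetical. Treating "exposure to the language while in the UK" as a proportion between 0 and 1 is an assumption, and the weight applied to time spent in Poland is passed in explicitly rather than fixed in the code.

```python
def cumulative_exposure(time_in_poland, time_in_uk, share_in_uk, weight_in_poland):
    """Cumulative exposure (CE) to one language (illustrative sketch).

    time_in_poland, time_in_uk: time spent in each country, in the same unit.
    share_in_uk: exposure to the language while in the UK, assumed here to be
        a proportion between 0 and 1.
    weight_in_poland: weight applied to time spent in Poland for this language.
    """
    if min(time_in_poland, time_in_uk) < 0:
        raise ValueError("time spent in a country cannot be negative")
    if not 0 <= share_in_uk <= 1:
        raise ValueError("share_in_uk must be a proportion between 0 and 1")
    return time_in_poland * weight_in_poland + time_in_uk * share_in_uk


# Hypothetical child: 12 months in Poland, then 48 months in the UK,
# hearing Polish 75% and English 25% of the time while in the UK.
# The weights 1 (Polish) and 0 (English) for time in Poland are assumptions.
ce_polish = cumulative_exposure(12, 48, 0.75, weight_in_poland=1)
ce_english = cumulative_exposure(12, 48, 0.25, weight_in_poland=0)
print(ce_polish, ce_english)  # 48.0 12.0
```

Because both terms grow with the time a child has lived in either country, an index built this way will tend to rise with age, which is why CE and age are checked for collinearity in the analyses.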
Tests of vocabulary comprehension and production
LITMUS CLTs (Haman et al., 2015) in Polish and in British English were used to measure vocabulary comprehension and production in children. The CLT language versions are not translations of one another, and thus they do not include the same items across languages, but they are comparable on age of acquisition (AoA) and word form complexity of the items (Haman et al., 2015). In the comprehension part of the CLT, the child is presented with a board of four pictures: one picture depicts the target word (noun or verb), and three are distractors of similar AoA and similar complexity. The child is asked to point to the picture that appropriately depicts the target word they have heard. In the production task, the child is presented with one picture at a time (depicting an object or action) and asked to name the picture with one word. A correct answer in the production part is the target word, its close synonym, or a dialectal variant. For the present purposes, only the scores for verb comprehension and verb production were used, as the task assessing metacognitive vocabulary (see below) included only verbs.
Metacognitive vocabulary test
METVOC (Astington & Pelletier, 2004) and its Polish adaptation (Mieszkowska et al., 2016) were used to measure the child’s comprehension of metacognitive verbs. The test targets the degrees of certainty (e.g., ‘know’ vs ‘guess’), or the variation in knowledge (e.g., ‘remember’ implying prior knowledge vs ‘guess’ implying absence of knowledge). The experimenter reads the child 14 stories illustrated by pictures. At the end of each story, the child is asked to select one of two verbs to describe the character’s state of mind, for example, ‘Does John know it’s raining or does John remember it’s raining?’. Note that the CLTs and METVOC (the tasks used to measure concrete and metacognitive vocabulary) differ in the complexity of their procedures: in CLT word comprehension, the child responds to a short question (e.g., ‘where is someone jumping?’), while in METVOC, the child is presented with a series of complex pictures each depicting a scene/situation and is told a short story, then asked a relatively more complex question (e.g., ‘Does John know it’s raining or does John remember it’s raining?’). Thus, METVOC requires the child to store relatively more information in their working memory and then use that information to respond to the question. As such, METVOC’s procedure might make it more demanding for the child.
The original English-language version of the test (Astington & Pelletier, 2004) involves 12 test verbs: ‘learn’, ‘teach’, ‘guess’, ‘figure out’, ‘understand’, ‘know’, ‘explain’, ‘remember’, ‘predict’, ‘forget’, ‘wonder’, and ‘deny’. In the Polish adaptation (Mieszkowska et al., 2016), the items were chosen on the basis of a process of translation and back-translation and their spoken-language frequency in the Polish National Corpus (Przepiórkowski et al., 2012). We also added two more verbs, ‘pretend’ and ‘dream’, which frequently appear in children’s spontaneous production (Bretherton & Beeghly, 1982) and during playtime with peers (Hughes & Dunn, 1997).
In the adaptation process, the illustrations were made by a professional artist, and we used a past tense common for story-telling in Polish. We counter-balanced the place of the target verbs: if in Version A the target is placed in the first half of the sentence (e.g., ‘Does John
Narrative task
We used the LITMUS Multilingual Assessment Instrument for Narratives (MAIN, English version: Gagarina et al., 2012; Polish version: Kiebzak-Mandera et al., 2012), designed to assess narrative skills in bilingual children. However, in the present study, we used MAIN with adults, that is, parents of our bilingual children, who were asked to tell their child (in Polish) one story from the MAIN. They were asked to tell the story in the same manner that they usually tell a story to their child. The stories told by parents were short and lasted approximately 2 to 3 minutes. The stories were coded for metacognitive verbs (types and tokens).
General procedure
All children were tested individually in a quiet room in their homes, except for one child who was tested at school (in both languages). The testing was performed in two sessions, each devoted to one language. Fifteen children were tested first in English, then in Polish, and 24 children were tested first in Polish, then in English. The average time between the sessions was 19 days, with a maximum break of 2 months. These circumstances were forced by the availability of the children and the parents. The order of the tasks within each session was counter-balanced across the participants. The tasks in Polish were administered by a native speaker of Polish, while the tasks in English were administered by a native speaker or a highly proficient user of English. The English testing included vocabulary tests, METVOC, and a narrative task. The Polish testing additionally included a non-verbal intelligence test and the parental story. The parents who had two children in the study told their stories twice, that is, one story to each of their children.
Analysis
Our general aim was to explore the knowledge of concrete and metacognitive words in relation to both languages of the bilinguals and children’s characteristics: age, CE to language and parental use of metacognitive words. In the introductory analyses, we searched for potential cross-language relationships by exploring correlations between the vocabulary tests in Polish and English. Next, to make sure that our predictors were appropriate for this sample, we explored the correlations between vocabulary tests and children’s age, CE, and parental use of metacognitive words.
In the main analyses, we explored the impact of language, age and CE on children’s vocabulary knowledge. We ran three separate mixed-effects regression models (Baayen et al., 2008): one for METVOC, one for CLT verb comprehension, and one for CLT verb production. Mixed models include fixed effects, that is, the variability explained by the independent variables, and random effects, that is, the variability due to individual subjects’ performance and individual item differences. Our fixed effects included age, language, and CE; in addition, we added a two-way interaction (language × age) to investigate whether the impact of age on vocabulary knowledge is different in Polish and English. We could not include the parental use of metacognitive words in the models, as it was obtained only in one language (Polish), and all our models investigated the effect of language and the interaction with language. Our random effects were the effects of participant and item. First, we assumed that the gap between the scores in Polish and English tasks is different for each participant (random slope for language over participants). Second, we assumed that some children would generally perform better than others (random intercept for each participant), and some items in a given task would be easier than others (random intercept for each item). Next, we assumed that while item accuracy should increase with age and CE, individual items may behave differently as age and CE increase (random slope for age over items and random slope for CE over items). Finally, we assumed that the trajectory (over age) of individual items might differ across languages (random slope for the interaction of language and age over items) and that individual participants might perform differently over age and across languages (random slope for the interaction of language and age over participants).
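The full random-effects structure listed above can be written out as a single equation. This is a sketch in illustrative notation, not the authors’ own: it assumes a binary accuracy outcome $y_{pi}$ for participant $p$ on item $i$ and a logit link. $L_i$ is the language of item $i$, $A_p$ is the participant’s age, and $C_{pi}$ is the participant’s CE to the language of item $i$.

```latex
\begin{align*}
\operatorname{logit}\Pr(y_{pi} = 1) ={}& \beta_0 + \beta_1 L_i + \beta_2 A_p + \beta_3 C_{pi} + \beta_4 L_i A_p \\
  &+ u_{0p} + u_{1p} L_i + u_{2p} L_i A_p \\
  &+ w_{0i} + w_{1i} A_p + w_{2i} C_{pi} + w_{3i} L_i A_p, \\
(u_{0p}, u_{1p}, u_{2p}) \sim{}& \mathcal{N}(\mathbf{0}, \Sigma_u), \qquad
(w_{0i}, w_{1i}, w_{2i}, w_{3i}) \sim \mathcal{N}(\mathbf{0}, \Sigma_w).
\end{align*}
```

The first row holds the fixed effects, the second the by-participant random intercept and slopes, and the third the by-item random intercept and slopes. A structure this rich often fails to converge on a sample of this size, which is why a simplification procedure is needed.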
We used the lme4 package (Bates et al., 2015) in R (R Core Team, 2017). We fitted generalized mixed-effects models with a binomial error distribution, with response accuracy as a binary dependent variable (0, 1). Age was centred at the mean, and CE was log transformed, scaled and centred at 0. For language, English was coded as 0 and Polish as 1. When convergence problems arose while running the models, the random structure was simplified according to the procedure outlined by Barr et al. (2013). The p-values were derived using analysis of variance (ANOVA) model comparisons. In CLT, each verb in each language was treated as a separate item, giving a total of 64 items per model (e.g., in CLT comprehension: 32 Polish verbs + 32 English verbs). In METVOC, even though the items in each language version were the same, we still treated the Polish and English equivalents of an item as two separate items, since their language-specific forms could have influenced accuracy on the word. Thus, we arrived at a total of 28 items (14 Polish verbs + 14 English verbs) in the METVOC model.
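The predictor coding described above can be illustrated with a minimal Python sketch (the analyses themselves were run in R). The function names, the use of the natural logarithm, the sample standard deviation, and the example values are all assumptions made for illustration; none of them come from the study's scripts or data.

```python
import math
from statistics import mean, stdev

# Language coding described in the text: English = 0, Polish = 1.
LANGUAGE_CODES = {"English": 0, "Polish": 1}


def centre_at_mean(values):
    """Subtract the sample mean so that 0 corresponds to the average value."""
    m = mean(values)
    return [v - m for v in values]


def log_scale_centre(values):
    """Log-transform, then centre at 0 and scale to unit standard deviation.

    Assumption: natural log and sample SD (n - 1); the text specifies neither.
    """
    logs = [math.log(v) for v in values]
    m, s = mean(logs), stdev(logs)
    return [(x - m) / s for x in logs]


# Hypothetical values for four children (not data from the study).
ages_in_months = [48, 60, 72, 91]
ce_values = [120.0, 180.0, 240.0, 330.0]

age_centred = centre_at_mean(ages_in_months)
ce_transformed = log_scale_centre(ce_values)
language = [LANGUAGE_CODES[name] for name in ("English", "Polish")]
```

After these transformations, a coefficient for age is interpreted relative to a child of average age, and a coefficient for CE is expressed per standard deviation of log exposure.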
Knowing that the CE is linked to age, and thus the two variables might show collinearity, we tested each model for multicollinearity using variance inflation factor (VIF). A VIF of 1 indicates that there is no correlation between a given predictor variable and any other predictor variables in the model. A VIF between 1 and 5 indicates moderate but acceptable correlation between a given predictor variable and other predictor variables in the model. A value greater than 5 indicates potentially severe correlation between a given predictor variable and other predictor variables in the model. In this case, the coefficient estimates and p-values in the regression output are likely unreliable. The VIF values are given with each model (see the ‘Main analyses’ section).
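To make the thresholds above concrete, the following Python sketch uses the standard definition of the VIF for a predictor, 1 / (1 − R²), where R² comes from regressing that predictor on the remaining predictors. How the VIF values were obtained for the mixed models is not stated here, so this is an illustration of the general definition rather than of the procedure used; the function names are hypothetical, and treating a value of exactly 5 as acceptable is an assumption.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """VIF of a predictor, given the R^2 from regressing it on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif: float) -> float:
    """Invert the definition: share of a predictor's variance explained by the others."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


def describe_vif(vif: float) -> str:
    """Label a VIF using the thresholds given in the text (exactly 5 counted as moderate)."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    if vif == 1.0:
        return "no correlation"
    if vif <= 5.0:
        return "moderate but acceptable"
    return "potentially severe"


# The largest VIF reported for any predictor in the models is 4.18 (age, CLT verbs production).
label = describe_vif(4.18)
shared_variance = r_squared_from_vif(4.18)
```

Under this definition, a VIF of 4.18 would mean that roughly three quarters of the variance in that predictor is shared with the other predictors, which still falls below the threshold of 5.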
Results
In the introductory analyses, we used correlations to check whether the potential predictors (age, CE, and parental use of metacognitive verbs) were in fact related to the scores on the vocabulary tests. In the main analyses, we fitted a series of linear mixed-effects regression models estimating the effects of language, age, and CE on children’s vocabulary knowledge. Below, we first present the introductory correlational analyses and then the results of the mixed-effects models.
Introductory analyses
All of the data files and scripts for the analyses are available online at Open Science Framework, https://osf.io/he6zq/. First, we compared the CE bilinguals received in each language. On average, children had accumulated nearly twice as much exposure to Polish as to English (Polish: M = 328.40, SD = 66.58; English: M = 177.09, SD = 82.09), t(67) = 8.59, p < .001, Cohen’s d = 2.02 (a large effect). Since the parental use of metacognitive words was investigated only in Polish (parents’ native language), we could not compare it across the two languages.
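As a worked check of the effect size, the reported Cohen’s d can be reproduced from the means and standard deviations above. The sketch assumes the common pooled-SD formula for two sets of scores of equal size; the exact formula used in the original analysis is not stated, and the function name is hypothetical.

```python
import math


def cohens_d_pooled(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Cohen's d using the root mean square of the two SDs (equal-n pooled SD)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Reported CE values: Polish M = 328.40, SD = 66.58; English M = 177.09, SD = 82.09.
d = cohens_d_pooled(328.40, 66.58, 177.09, 82.09)
print(round(d, 2))  # prints 2.02
```

The mean difference of 151.31 divided by a pooled SD of about 74.74 gives d ≈ 2.02, matching the reported value.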
Table 1 presents descriptive statistics for METVOC and CLT verbs scores in bilinguals’ two languages. For both metacognitive and concrete verbs, the mean Polish scores were higher than the mean English scores.
Table 1. Descriptive statistics for METVOC and CLT raw scores in bilinguals’ two languages: means (Ms) and standard deviations (SDs).
Next, we explored cross-language correlations in task performance. We found a moderate positive correlation for METVOC scores in Polish and English, r(37) = .53, p < .001. However, the cross-language correlations for CLT (raw score) were non-significant, CLT verbs comprehension: r(37) = .27, p = .09; CLT verbs production: r(37) = .24, p = .14.
Next, we considered the relationships between vocabulary knowledge and our potential predictors (children’s characteristics). Table 2 shows the values of all correlations performed. We found positive correlations between the children’s age and each of the vocabulary tasks, though the age–task correlations were stronger for the English tasks than for the Polish tasks. CE in Polish correlated positively with all Polish vocabulary scores, and with most English vocabulary scores, apart from CLT verbs production, where the relation was non-significant. CE in English correlated positively with all English vocabulary scores, and with Polish METVOC scores (but not with Polish CLT scores). We found a weak positive correlation between the number of distinct metacognitive verbs (types) used by the parent (in Polish) and the child’s score on METVOC in Polish, r(37) = .35, p = .03, but not in English, r(37) = .11, p = .51. Parental use of metacognitive verbs did not correlate significantly with the CLT measures.
Table 2. Pearson correlation coefficients (r) between vocabulary tasks and children’s characteristics.
*p < .05, **p < .01, ***p < .001.
Main analyses
Our first pair of research questions focussed on the potential cross-language gap in the area of concrete and metacognitive verbs. Our second pair of research questions focussed on how the knowledge of concrete and metacognitive verbs in each language is affected by particular child characteristics: age, CE, and – in the case of metacognitive verbs – the parental use of metacognitive verbs. We built a series of linear mixed-effects regression models (one for each task: CLT verb comprehension, CLT verb production, METVOC), and included the fixed effects of language, age and CE as well as a two-way interaction (language × age).
Metacognitive vocabulary task
The model for METVOC converged with the full random structure. The results of the analysis of accuracy in METVOC are presented in Table 3. The analysis showed a significant positive main effect of age: the older children performed better than the younger children. The interaction between language and age was non-significant (see Figure 1).
Table 3. Mixed-model: accuracy results for METVOC.

Figure 1. Mixed-models results: probability of a correct answer in each task by language and age.
We tested the model for multicollinearity using VIF. All the VIF values were relatively low: language, VIF = 1.20, CE, VIF = 1.39, age, VIF = 3.14, thus indicating moderate but acceptable collinearity of each of those variables with other predictor variables.
CLT verbs comprehension
The model for CLT verbs comprehension converged with the full random structure except for a random slope for an interaction between age and language by participant. The results are presented in Table 4. We found a main effect of language, showing that Polish CLT verbs comprehension scores were significantly higher than English CLT verbs comprehension scores. We also found a significant positive main effect of age, associating the increase in age with higher CLT verbs comprehension scores. The age × language interaction was not significant, hence, the increase in scores associated with age was similar for both Polish and English (see Figure 1).
Table 4. Mixed-model: accuracy results for CLT verbs comprehension (raw score).
We tested the model for multicollinearity using VIF. All the VIF values were relatively low: language, VIF = 1.46, CE, VIF = 1.62, age, VIF = 1.91, thus indicating moderate but acceptable collinearity of each of those variables with other predictor variables.
CLT verbs production
The model for CLT verbs production converged with the full random structure except for the random slope for age by item. The results are presented in Table 5. We found significant positive main effects of age and CE. Generally, higher CLT verbs production scores were associated with higher ages of the children and higher CE values. The significant interaction between age and language showed that the increase in CLT verbs production accuracy associated with age was steeper for English than for Polish (see Figure 1).
Table 5. Mixed-model: accuracy results for CLT verbs production (raw score).
We tested the model for multicollinearity using VIF. All the VIF values were relatively low: language, VIF = 1.55, CE, VIF = 1.98, age, VIF = 4.18, thus indicating moderate but acceptable collinearity of each of those variables with other predictor variables.
Summary of the results
With regard to our first research aim, the mixed-effects regression showed no cross-language gap in the area of metacognitive vocabulary, but did show a cross-language gap in concrete verbs comprehension: our bilinguals knew more words in Polish than in English. In the area of concrete verbs production, our bilinguals showed a steeper increase in their knowledge of English than of Polish (a significant age × language interaction in the mixed-effects regression). With regard to our second aim, that is, exploring the effect of age, CE, and parental use of metacognitive verbs on the knowledge of concrete and metacognitive vocabulary in bilinguals, the mixed-effects regression showed that the knowledge of both concrete and metacognitive vocabulary increased with age. The effect of CE was visible only in concrete verbs production, and correlational analyses showed that the parental use of metacognitive verbs (in Polish) was linked to the child’s comprehension of the verbs (METVOC) in Polish, but not in English.
Discussion
So far, research on bilingual children’s lexicon has focussed on concrete, that is, easily imaginable words, and repeatedly found a cross-language (L1-L2) gap in bilingual vocabulary (Abbot-Smith et al., 2018; Budde-Spengler et al., 2021; Rinker et al., 2017; Uccelli & Páez, 2007). However, little was known about the cross-language gap in metacognitive vocabulary knowledge in bilingual children (cf., Altman et al., 2016; Mieszkowska, 2018). To fill in this gap, we investigated bilingual children’s knowledge of concrete and metacognitive verbs in their L1 and L2. Our focus was on Polish–English bilinguals aged 4;0–7;7 living in the United Kingdom, who spoke Polish at home (with both parents), and English at school. We tested bilinguals’ comprehension of metacognitive verbs (via METVOC task), and their comprehension and production of concrete verbs (via CLT subtasks), and investigated the effects of language, age and language exposure.
One of our most important findings relates to the bilinguals’ knowledge of metacognitive verbs. First, we found a positive moderate cross-language correlation between METVOC scores in Polish and in English. Second, we found no effect of language in the task, as analysed by the mixed-effects regression. We therefore conclude that in the area of comprehension of metacognitive verbs, Polish–English bilingual children at early school age show a balanced performance in the two languages. Our results corroborate previous findings (Mieszkowska, 2018; Otwinowska et al., 2020), and extend these results to comprehension of metacognitive verbs in bilinguals’ two languages.
In the light of a positive cross-language correlation between METVOC scores in Polish and in English, we propose that the knowledge of metacognitive verbs is at least partially language interdependent. The idea that skills acquired in the L1 provide a scaffolding for skills in the L2 was put forward by Cummins (1979), and since then, evidence for such interdependence has been found for cognitive and metalinguistic skills (see Sierens et al., 2019 for a review). The knowledge of metacognitive verbs taps into the child’s cognition rather than purely linguistic abilities. The cross-language correlation between children’s comprehension of metacognitive verbs in Polish and English suggests that knowledge of metacognitive verbs may profit from a carry-over effect between the languages. Similar findings, but for comprehension of cognitive concepts related to colour, shape, quantity, space, and relations (e.g., ‘all’, ‘in’ or ‘equal’), were reported by Sierens et al. (2019). With regard to concrete vocabulary interdependence, results are mixed (e.g., Cobo-Lewis et al., 2002; Leseman, 2000; Scheele et al., 2010; Umbel & Oller, 1994). In our sample, we found no significant correlation between CLT scores in Polish and English, in either verb production or verb comprehension, suggesting no language interdependence. A possible explanation is that concrete words tap purely into linguistic abilities, and the interdependence is possible only for more cognitive- or metalinguistic-oriented skills. Another explanation may be that some concrete words (e.g., those connected with home life, such as ‘to iron’) may be slightly more frequent in bilinguals’ input in one language than in the other, which leads bilingual children to develop domain-specific vocabulary (e.g., ‘home’ vocabulary) in one language, but not the other (Bialystok et al., 2010). If this specific (e.g., home) vocabulary is not needed in the other language, there is little opportunity for developing it in that language (Bialystok et al., 2010; Pham & Kohnert, 2014). Metacognitive verbs, on the other hand, might be generally less frequent than concrete verbs, but could be present in many different contexts, for example, at home and at school, thus encouraging a carry-over effect.
In contrast to metacognitive verbs, the comprehension of concrete verbs showed a significant effect of language (i.e., Polish scores were higher than the English scores). Such a cross-language gap in concrete vocabulary has been reported for bilingual children at an early school age, and thus, it was anticipated in this study. The direction of the effect, that is, a better performance in Polish than in English, was also previously established by Abbot-Smith et al. (2018), who studied Polish–English children living in the United Kingdom, aged 4;5–5;9. Using the same tool (i.e., CLT), they examined the comprehension and production of both nouns and verbs. They found that children performed better in Polish than in English in both modalities (comprehension and production) across nouns and verbs.
We did not find an effect of language in the production of concrete verbs. Instead, we found an interaction of age and language, whereby the increase associated with age was steeper for English than for Polish. Our results partly corroborate findings from Klassert et al. (2014). They found that verb production was similar across bilinguals’ Russian and German and that the growth of verb production was similar in the two languages. We also found no effect of language in verb production in Polish and English, but we found a steeper growth for verb production in L2. This result may reflect the changes in the exposure to the two languages over time: as Polish–English bilinguals start formal schooling, their exposure to English becomes more intensive and the L2 (majority language) is better supported through school efforts. Previous research has already shown such a steep increase for the majority (L2) language performance. Pham and Kohnert (2014) found greater gains for L2 (concrete) vocabulary and interestingly, the biggest differences between the L1 and L2 growth were found for expressive vocabulary. The growth rate for L2 was nearly 4 times greater than the growth rate for L1 (Pham & Kohnert, 2014, p. 779). Similar findings were reported by Jia et al. (2014). This entails that the school-based exposure to the majority language provides bilinguals with substantial input in the language, as well as output production, allowing them to compensate for the previous gap, or even shift the language dominance. However, it should be noted that our models showed a main effect of age for all tasks, meaning that there is a general age-related increase in performance in both languages even if some areas of vocabulary may show a steeper increase in the L2, forecasting the language dominance shift.
As for the effect of CE on the children’s knowledge of metacognitive and concrete words, we expected to observe CE effects on the children’s scores for concrete verbs. This expectation was partially confirmed: we found a clear effect of CE on the production of concrete verbs, but not in any other task (concrete verbs comprehension or metacognitive verbs comprehension). Possibly, production requires more language exposure to develop than comprehension does, as reported previously (Elin Thordardottir, 2011; Haman et al., 2017).
Finally, the knowledge of metacognitive words was related to the parental use of metacognitive words rather than to general language exposure, as measured by CE. This may suggest that general linguistic experience (as indicated by CE) is less important for the acquisition of metacognitive verbs than mentalistic input. This is in agreement with previous findings showing that metacognitive words are best learned through interaction rich in references to mental states (Adrián et al., 2007; Taumoepeau & Ruffman, 2008; Tompkins et al., 2021). Interestingly, we found that parental input rich in metacognitive verbs might affect children’s comprehension of metacognitive verbs in the language used with parents, but not in the other language, although children’s comprehension scores for metacognitive verbs in Polish and English were positively correlated.
At the beginning of the paper, we proposed that researching bilinguals’ metacognitive vocabulary would broaden our general understanding of the development of their lexical knowledge. We asked whether, for metacognitive verbs, a disproportion in knowledge across bilinguals’ two languages would be similar to the gap previously reported for concrete verbs. Crucially, we indeed found a gap across bilingual children’s languages in the comprehension of concrete verbs (and language-specific age trajectories in their production), but not in the comprehension of metacognitive verbs. Although metacognitive verbs are more abstract and less frequent in parental input (Adrián et al., 2007; Jakubowska et al., 2018), their relative difficulty seemed to be neutralized.
We believe that this result alone poses new questions for the field. Why does metacognitive vocabulary develop symmetrically across languages? Is it a result of conceptual transfer across the child’s languages? Or is it because mentalistic vocabulary is less domain-specific than concrete vocabulary, that is, it occurs in many contexts, both at home and at school? Future research might explore the possibility that metacognitive verbs, though generally less frequent, might be more common across different types of discourse at home and at school. Thus, children possibly receive some mentalistic input both at home and at school, in both of their languages. This is in contrast to concrete words, where particular domains of life encourage use of domain-specific vocabulary (Bialystok et al., 2010). Therefore, metacognitive vocabulary might be more predisposed to show carry-over effects across the child’s languages than concrete vocabulary.
Limitations
This study has several limitations stemming from the procedures used to elicit the data. First, our two tasks measuring concrete and metacognitive verbs comprehension were not fully comparable, as their procedures differed (see the ‘Tasks’ section). However, the differences in the tasks reflect the difference in the nature of the words. It should also be noted that the present study is not intended to contrast the two categories of words (i.e., concrete and metacognitive) directly, precisely because the tasks differ in their design and the complexity of their procedures. Thus, on the basis of our results, we cannot say whether children know their concrete or their metacognitive verbs better. Rather, the study aims to investigate whether the cross-language difference is found for both vocabulary areas, the concrete and the metacognitive verbs, and whether the factors that affect one area have a similar impact on the other. In this way, we wanted to use the tasks available to spark a wider research interest in the relation between concrete and metacognitive verbs, which have rarely been studied together. Second, although for concrete verbs we measured both comprehension and production, for metacognitive verbs our measurements were confined to comprehension. This was mostly a methodological issue: while eliciting concrete words in a picture-naming task was relatively easy, eliciting metacognitive words would be much more demanding. Even designing a picture-naming task for metacognitive words might not be feasible due to the potential ambiguity of illustrations.
Also, since the parents of bilingual children in our study used Polish at home, we decided that eliciting narratives told to children in English would violate the ecological validity of the testing and we gathered parental narratives only in Polish. We found a relationship between parental use of metacognitive verbs and children’s comprehension of those verbs in Polish (but not in English). Due to the positive correlation between METVOC results in both languages, we suggest a carry-over effect for bilingual children’s metacognitive words. However, we did not measure the metacognitive input in English (which could be available to children e.g., from teachers), and thus, the interpretation of the correlation is somewhat speculative.
Finally, as the data presented here were cross-sectional, the trajectories of vocabulary growth in both languages of the bilingual should be treated with some reservations. Similar data collected longitudinally could support the developmental patterns in vocabulary and provide more adequate information on the changes in the exposure to both languages.
Conclusion
This is one of the first studies to consider the child’s lexicon as composed of both concrete and metacognitive verbs. We conclude that: first, while the knowledge of concrete verbs in bilinguals may show a cross-language gap, the knowledge of metacognitive verbs may be more balanced, possibly due to the carry-over effect between the languages. Second, bilinguals’ vocabulary, both metacognitive and concrete, improves with age in the two languages. The production of concrete verbs improves with the child’s age mostly in the majority language, which is probably connected to the quantity and quality of L2 exposure. Third, while for concrete verbs CE in a given language is most influential, for metacognitive verbs, it is the use of such words by the child’s parents and caretakers that matters most. Thus, parents and practitioners working with bilingual children should be encouraged to provide rich input in children’s two languages, including the references to mental states. This is crucial for the acquisition and development of words referring to mental processes, emotions and desires which are intrinsic to human communication. Knowledge of such metacognitive words expands both our ability to engage in reflective and independent thinking, and to consciously reflect on the nature of language.
Acknowledgements
The authors thank the study participants and the research assistants who were engaged in the testing and data coding. The CLTs and MAIN were developed within the COST Action IS0804. CLTs’ development was financially supported by BI-SLI-PL project ‘Cognitive and language development of Polish bilingual children at the school entrance age–risks and opportunities’ (809/N-COST/2010/0). The authors would also like to thank Dr Jakub Szewczyk for his invaluable statistical advice.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research presented in this paper was supported by the National Science Centre in Poland under research project (2015/17/N/HS2/03215) titled ‘The influence of parental storytelling on bilingual children’s use and understanding of metacognitive terms in both of their languages’ granted to Karolina Muszyńska.
