Abstract
Purpose:
This study explores Russian lexical skills among Russian-Kazakh bilingual children, using them as a lens to examine patterns of language dominance and post-Soviet language dynamics in multilingual Kazakhstan. The study addressed three key questions: (1) How do Russian-Kazakh bilingual children compare to monolingual Russian-speaking peers in lexical abilities in Russian, particularly on noun and verb naming? (2) What factors predict lexical abilities in both groups? (3) What are the characteristics of cross-linguistic influence from Kazakh (if any) on Russian in bilingual children’s non-target responses?
Methodology:
The study involved 43 Russian-Kazakh bilingual children aged 5–8 from urban areas in Northern Kazakhstan, who were compared to 22 age-matched monolingual Russian-speaking children. Data on demographics and language use were collected, and lexical abilities were assessed through noun and verb naming tasks, with quantitative and qualitative analyses used to evaluate performance and bilinguals’ strategies for addressing lexical gaps.
Findings:
Russian-Kazakh bilinguals exhibited lower proficiency than monolinguals but demonstrated creative strategies to manage gaps in their vocabulary. Age and word characteristics affected lexical performance for both groups, with older children performing better in both noun and verb tasks. Minimal cross-linguistic influence from Kazakh on Russian was observed, primarily in noun production, highlighting the dominance of Russian in bilinguals’ environments.
Originality:
The study sheds light on how Kazakhstan’s post-colonial linguistic landscape shapes language acquisition in children, showing that Russian retains a dominant role while coexisting with Kazakh in a multilingual context. It highlights the evolving role of Russian as a transnational lingua franca in Kazakhstan’s diverse society.
Implications:
This study investigates Russian lexical proficiency among Russian-Kazakh bilingual children in Northern Kazakhstan and explores how varying language exposure affects lexical outcomes. The findings highlight differences in word retrieval accuracy between the two groups, suggesting the role of exposure patterns and the sociolinguistic environment in shaping bilingual vocabulary development. These results contribute to ongoing discussions about language use in Kazakhstan, particularly on the functional role of Russian in early education, without making claims about language policy effects.
Keywords
Introduction
The linguistic situation of Kazakhstan offers a compelling context for studying bilingual development, shaped by the coexistence of Kazakh, Russian, and, increasingly, English, within a complex sociopolitical history (Altynbekova, 2005; Suleimenova, 2009, 2020; Suleimenova & Smagulova, 2005; Zharkynbekova & Chernyavskaya, 2022; Zhuravleva & Agmanova, 2021). Since gaining independence in 1991, Kazakhstan has promoted Kazakh as the state language to strengthen national identity. Yet, Russian remains deeply embedded in daily life, education, and interethnic communication, fostering social cohesion in the country’s multiethnic society (Suleimenova et al., 2021). Moreover, English has gained prominence through the state-backed “Trinity of Languages” initiative, reflecting Kazakhstan’s aspirations for global engagement and competitiveness (State Program for the Implementation of Language Policy in the Republic of Kazakhstan for 2020–2025). Unlike many nations where one or two dominant languages prevail, Kazakhstan’s linguistic environment is distinctly multifaceted, redefining conventional notions of mono-, bi-, and multilingualism.
In this multilingual context, investigating how children navigate and develop their linguistic resources offers important insights into the relationship between individual language acquisition and broader societal dynamics. The present study examines Russian-Kazakh bilingual children in Northern Kazakhstan, focusing on their Russian vocabulary skills and how these may reflect the shifting sociolinguistic realities of the region. It seeks to understand how these bilingual children perform in specific lexical tasks relative to monolingual Russian-speaking peers, what factors shape their language outcomes, and whether traces of Kazakh influence emerge in their Russian lexical knowledge.
Sociolinguistic dynamics in Kazakhstan
While multilingualism is widespread globally, Kazakhstan’s language policies continue to recognize the presence of monolingual speakers. However, purely monolingual contexts are increasingly rare. In the south, many individuals primarily use Kazakh, whereas in northern urban areas, Russian often dominates. Yet, the legacy of historical Russification and contemporary language initiatives means that most Kazakh speakers maintain at least functional Russian skills, while Russian speakers are increasingly exposed to Kazakh through education, public signage, and media, blurring the boundaries between linguistic communities. Russian-Kazakh bilingualism is particularly prevalent in northern and eastern regions, where Russian frequently serves as the dominant medium of communication in education, media, and everyday life for both ethnic Russians and many ethnic Kazakhs (Zharkynbekova & Chernyavskaya, 2022). Across the country, subordinative bilingualism—characterized by fluid code-switching and the blending of linguistic systems—remains the prevailing pattern, fostering a unified linguistic worldview and leading to diverse proficiency levels and functional hierarchies between languages. These complex dynamics have been extensively examined by Kazakh scholars from multiple perspectives (Abdulkhalikova, 2014, 2017; Aldabergenova, 1999; Alishariyeva, 2014; Rakhimzhanov, 2013; Shayakhmet, 2016; Zharylkapova & Bekbenbetova, 2017).
Kazakhstan thus exemplifies the coexistence of traditional language ideologies with flexible multilingual practices, where code-switching, dual-language use, and shifting preferences constitute what Pavlenko (2008) describes as a multilingual repertoire—a linguistic polyphony reflecting the country’s diversity and adaptability. The interplay between Kazakh and Russian, each carrying distinct symbolic and functional roles, mirrors broader multilingual phenomena in post-Soviet and global contexts, where heritage preservation intersects with modern communicative needs (Blommaert, 2010; Fishman, 1991). Within this complex framework, Kazakhstan serves as a compelling case study for exploring language shifts (if any) in post-colonial contexts.
Kazakhstan’s linguistic environment has grown even more complex with the early introduction of English in the national curriculum from the third grade, fostering a trilingual milieu. Children from ethnic minority backgrounds often navigate this rich linguistic ecology, speaking native languages at home, learning in Kazakh or Russian at school, and acquiring English as an additional language. These multilayered interactions shape a linguistic landscape characterized by flexibility and fluidity (Pavlenko, 2008).
Bilingual vocabulary development: social and experiential factors
Understanding lexical development in Russian-Kazakh bilingual children requires first examining broader research on bilingual lexical skills and their influencing factors. Lexical growth in bilinguals arises from a complex interplay of developmental, social, and experiential factors. Although age is a strong predictor of vocabulary size in monolinguals, its role in bilingual development is more nuanced (Czapka et al., 2021). Bilingual proficiency reflects not only age, but also cumulative exposure, patterns of language dominance, and broader sociocultural dynamics (Bylund et al., 2021; Kim & Kim, 2022). An earlier age of onset of bilingualism (AoB) generally leads to higher proficiency in the second language (L2), but this advantage depends on sustained input and the contexts in which each language is used. Moreover, AoB interacts with variables such as frequency of language use, types of interlocutors, and specific situational contexts, which can either strengthen or diminish its effects (Kim & Kim, 2022).
Family dynamics play a significant role in shaping bilingual input. For instance, older siblings often introduce the dominant societal language into the home, thereby shifting the balance of language exposure for younger siblings (Bridges & Hoff, 2014). Sibling interactions can even provide richer L2 input than parental speech, as shown by Duncan and Paradis (2020). However, early exposure does not always predict stable language dominance throughout life. Adult bilinguals frequently undergo shifts in their primary language use, emphasizing the need to evaluate language input and proficiency across different life stages rather than relying on static measures (Macbeth et al., 2022).
Beyond the home, broader environmental factors—including community language practices, media consumption, and the number of social interlocutors—strongly influence bilingual vocabulary development (Gollan et al., 2015; Lin & Siyanova-Chanturia, 2015; Paradis, 2023). Media engagement and extracurricular activities, in particular, contribute significantly to vocabulary growth.
Socioeconomic status (SES) is another crucial influence on bilingual lexical outcomes. Children from lower SES backgrounds often exhibit weaker receptive and productive vocabulary skills due to differences in the quality and quantity of linguistic input (De Cat, 2021). Similarly, parental education positively correlates with children’s proficiency in both languages (O’Toole et al., 2017).
Nevertheless, the dominance of one language often limits exposure to the other, resulting in reduced vocabulary size in the less-used language and contributing to bilinguals frequently lagging behind monolingual peers in measures of vocabulary breadth (Bialystok et al., 2010; Bialystok & Luk, 2012; Paradis, 2023; Sun et al., 2020). To compensate for these gaps, bilinguals employ strategies such as code-switching, code-mixing, and borrowing, creatively integrating elements from both languages to maintain communication (Fridman & Meir, 2023). This linguistic creativity extends beyond strategic language use to encompass innovative stylistic and structural expressions. Studies link bilingualism with enhanced divergent thinking and verbal creativity, highlighting cognitive and expressive advantages (Kharkhurin, 2007; Lasagabaster, 2000; Ricciardelli, 1992). Kachru (1985) and Bhatia and Ritchie (2008) further argue that bilinguals are not merely code-switchers but creative agents who actively reshape linguistic norms, especially in dynamic multilingual settings.
Given these complexities, researchers have increasingly questioned the validity of using monolingual norms as benchmarks for assessing bilingual children. Studies consistently show that while bilingual children often have smaller vocabularies in each individual language, their total conceptual vocabulary can equal or surpass that of monolinguals, especially when factors like maternal education are accounted for (De Houwer et al., 2014). Umbel et al. (1992), in research with 8-year-old Spanish-English bilinguals, highlighted the distributed nature of bilingual lexical knowledge, underscoring the importance of evaluating vocabulary across both languages rather than imposing monolingual standards. Thordardottir (2011), studying French-English bilinguals, further demonstrated that vocabulary size closely aligns with the quantity and quality of language input, supporting continuous models of exposure rather than rigid bilingual classifications.
Collectively, these insights underscore that bilingual lexical skills cannot be explained by isolated factors but reflect the dynamic interplay of age, input quality, social networks, and broader cultural forces. Assessment frameworks must therefore move beyond monolingual standards to capture the realities of dual-language acquisition, and the unique cognitive and linguistic resources bilinguals possess.
Previous evidence on Russian-Kazakh and Kazakh-Russian bilingualism
Within Kazakhstan, scholarly interest in bilingual lexical skills has grown, though empirical research remains relatively limited. Kazakhstani researchers have begun to document how bilingualism shapes children’s vocabulary acquisition and cognitive development, including executive functions and metalinguistic awareness (Abaeva, 2019; Aldaberdikyzy, 2013; Suleimenova et al., 2024; Zhakupova et al., 2024a). In Kazakhstan, preschool education typically serves children aged 3 to 6 and includes kindergarten or early childhood programs. Primary school begins at age 6 or 7 and continues through age 10, covering grades 1 to 4.
Studies reveal that Kazakh-Russian bilingual children exhibit unique patterns of lexical skills, often characterized by cross-linguistic influence. For example, Shayakhmet (2015) found that bilingual children define words differently than monolingual peers, reflecting distinct strategies. Aldaberdikyzy (2013) observed that the most frequent type of lexical interference among Kazakh-Russian bilingual preschoolers arises through analogy, where children extend familiar morphological patterns to new forms. Erkebayeva (2014) highlighted interlingual polysemy between Kazakh and Russian, noting that words with identical dictionary meanings can diverge significantly in contextual use, complicating translation and comprehension. Furthermore, Amanov and Gagarina (2022) observed that Turkic-Russian bilingual children produced narrative macrostructures similar to monolinguals but differed in narrative complexity and organization. Akhmet et al. (2023) found that bilingualism affects perception in Kazakh-Russian bilingual children, particularly in acquiring Russian vocabulary. Their findings show that among 7- to 8-year-old students receiving instruction in Russian, 90% correctly identified қошақан (lamb), compared to just 33% of those taught in Kazakh. In a grammar task, 86% of students taught in Kazakh selected an incorrect plural form influenced by Russian syntax. These findings illustrate how the language of schooling shapes both lexical skills and grammatical accuracy in early bilingual development. Zhakupova et al. (2024b) reported that while bilingual children may struggle with explaining word meanings based on formal-semantic features, they often demonstrate greater linguistic creativity and flexibility than monolinguals. For example, bilingual children interpreted černosliv (prune) as “slivatʹ židkostʹ” (“to pour out liquid”), creatively segmenting the word by sound. One child associated šalun (mischievous child) with “kak kloun” (“like a clown”), and another with the Kazakh word šal (“old man”).
Despite these advances, Kazakhstani scholars emphasize ongoing challenges for bilingual children, including managing increased linguistic material and navigating bidirectional cross-linguistic influence between Kazakh and Russian. To support vocabulary development, researchers advocate for integrating contextual tasks and interdisciplinary connections in education, as well as fostering supportive language environments at home and in the community. However, modern challenges—such as declining reading habits, increased screen time, and reduced focus on text-based learning—pose obstacles to children’s active vocabulary growth and cognitive development. Consequently, there is growing interest in pedagogical strategies that combine traditional and digital resources to enhance vocabulary acquisition in bilingual settings.
The current study: sociolinguistic background and research questions
Against this backdrop, the present study investigates Russian lexical skills among Russian-Kazakh bilingual children in Northern Kazakhstan—a region where Russian continues to dominate urban communication, media, and education, despite active state efforts to promote Kazakh. National census data from 2021 further reveal that 64.9% of ethnic Kazakhs report fluency in Russian, while 24.2% of ethnic Russians report fluency in Kazakh. Participants in this study were drawn from Northern Kazakhstan, specifically the Akmola region and the city of Kokshetau. The region has a diverse population of approximately 782,995 people, including 436,984 Kazakhs, 210,769 Russians, 40,563 Ukrainians, 30,121 Germans, 13,707 Tatars, and smaller numbers of other ethnic groups. Although Russians constitute only about 15% of Kazakhstan’s overall population, their presence significantly influences language practices and the public use of Russian in urban centers such as Kokshetau, Shchuchinsk, and Stepnogorsk, where Russian predominates in education, media, and daily communication. In contrast, Kazakh monolingualism is more common in rural areas, especially among locally born Kazakh families. Among Kazakh repatriates (oralman) returning from countries such as Russia, Mongolia, and China, linguistic backgrounds are often heterogeneous, with many individuals arriving with greater proficiency in the official languages of their countries of origin and acquiring Kazakh primarily through schooling or integration programs (Fierman, 2005; Suleimenova, 2017). Meanwhile, in many multiethnic villages, Russian continues to serve as the principal medium of interethnic communication, especially in schools, local administration, and commerce.
The bilingual children in this study reflect diverse linguistic profiles: those for whom Kazakh is the first and stronger language, with Russian as a second, weaker language; those for whom Russian is the first and stronger language, with Kazakh as a second, weaker language; and those who consider Kazakh mainly a school subject rather than a language of daily communication. Instead of rigidly dividing bilingual participants into discrete groups, this study adopts a bilingual continuum model, consistent with contemporary approaches to bilingualism research (DeLuca et al., 2019; Kremin & Byers-Heinlein, 2021; Rothman et al., 2023).
The study addressed three research questions:
(1) How do Russian-Kazakh-speaking bilingual children compare to their monolingual Russian-speaking peers in terms of lexical expressive abilities, particularly in noun and verb naming tasks?
(2) What factors predict lexical expressive abilities in both monolingual and bilingual children? We consider lexical characteristics (word stability, frequency, and age of acquisition) as well as exposure-related factors such as chronological age. For bilinguals, we also evaluate the contribution of language exposure variables, such as language use at home, the age of onset of Kazakh, and the educational setting.
(3) What are the characteristics of cross-linguistic influence from Kazakh on Russian lexical skills, as seen in the non-target responses of bilingual children?
Methodology
Participants
Bilingual participants were recruited from public preschools and early primary schools in Kokshetau, a mid-sized city in Northern Kazakhstan. All bilingual participants in this study attended schools where Russian is the primary language of instruction, while Kazakh is taught approximately 12 academic hours per week as part of the standard curriculum (Order of the Minister of Education and Science of the Republic of Kazakhstan No. 500 dated November 8, 2012). Schools were selected based on demographic data indicating a mix of ethnic Kazakh and Russian-speaking families. All participating children came from urban, middle-class backgrounds to minimize variation due to socioeconomic status. The bilingual group comprised children who had been exposed to both Kazakh and Russian from an early age, typically with Russian exposure beginning before age 2. Within the bilingual group, Kazakh exposure varied: some children used Kazakh regularly at home, while others encountered it mainly at school. All bilingual participants were functionally proficient in Russian.
Initial participant selection was conducted in collaboration with school staff, who distributed consent forms to families. Inclusion criteria required children to be between 5 and 8 years of age and to have no diagnosed speech, cognitive, or hearing impairments. Parents completed a short version of the BIPAQ language background questionnaire (Abutbul-Oz & Armon-Lotem, 2022), including items on home language use, age of first exposure to each language, current proficiency, and language use patterns across contexts. To address the heterogeneity within the bilingual group, we adopted the theoretical framework of bilingualism as a continuum (Kremin & Byers-Heinlein, 2021; Rothman et al., 2023), which acknowledges variation in language experience, proficiency, and exposure patterns rather than relying solely on categorical groupings. We also collected detailed background information for each child, including AoB, frequency of use across domains (home, school, community), and self-/parent-reported proficiency. These continuous measures were considered in the interpretation of results, allowing us to reflect the nuanced and dynamic nature of bilingual development in our analysis.
Table 1 presents the demographics and key linguistic factors for Russian-Kazakh bilingual children (BiRuKz, n = 43) compared to Russian-speaking monolingual children (MonoRu, n = 22). The groups are similar in age, with no significant difference, and the gender distribution is fairly balanced, although the bilingual group includes more males. On average, bilingual children began speaking Russian before the age of 2 (M = 1.8, SD = 0.83, range: 1.5–4), while the onset of Kazakh shows a wider range (M = 3.4, SD = 2.3, range: 1.6–8), highlighting variability in the timing of acquisition. Parents rated their children’s Russian proficiency as high (M = 3.7, SD = 0.6, range: 1–4) and their Kazakh proficiency as notably lower (M = 2.09, SD = 0.71, range: 1–4), suggesting that Russian is likely the dominant language. Further supporting this dominance, parents report that 90% of bilingual children use Russian at home, with only 10% using both Kazakh and Russian, reflecting some degree of bilingual exposure. Both monolingual and bilingual children come from mid-to-high socioeconomic backgrounds, as measured by parental education levels.
Table 1. Participants’ background (M and SD).
Monolingual child controls for this study were drawn from Fridman and Meir (2023). This control group comprised 22 monolingual Russian-speaking children (12 females and 10 males) with a mean age of 6 years (SD = 1). Participants were recruited from Russia, Ukraine, Kazakhstan, and Belarus to ensure representation of Russian-speaking populations beyond the Russian Federation. Fourteen children were tested via Zoom, while eight participated in face-to-face sessions. All children were identified as monolingual by their families and exhibited native-level proficiency in Russian. They were reported to use only Russian both at home and in educational settings.
Russian noun and verb lexical task and procedure
The vocabulary task was a picture-naming production task designed to assess lexical access in Russian and previously used with monolingual and bilingual children and adults (see Fridman & Meir, 2023). The stimuli were selected from the “Verb and Action: Stimuli Database” and the “Noun and Object: Stimuli Database” developed by Akinina et al. (2015), which include lexical items normed for subjective age of acquisition (sAoA), frequency, and naming consistency. The indices were derived from 1,202 completed surveys (869 females, 332 males, and one participant of unidentified sex), with respondents’ ages ranging from 16 to 76 years (M = 27.46, SD = 10.52). Educational attainment varied from secondary school to doctoral degrees, with 85.19% of participants reporting at least some higher education. Normative data for each word were based on responses from 100 participants per item. Only items with high naming consistency and early sAoA were retained, ensuring that they fell within the expected lexical range of young children. Although some items may appear simple, they were deliberately chosen to match the developmental and cognitive abilities of the target population. The resulting scale was thus developmentally appropriate, sensitive to lexical differences across groups, and grounded in both psycholinguistic norms and empirical piloting.
Participants completed the Russian Noun and Verb Lexical Task (Fridman & Meir, 2023), which involved viewing 102 black-and-white images displayed on a computer screen. These images were sourced from the “Verb and Action: Stimuli Database” and the “Noun and Object: Stimuli Database” (Akinina et al., 2015). Of the 102 images, 51 depicted objects (nouns), which participants were asked to identify using the prompt, “What is this?” The remaining 51 images depicted actions (verbs), and participants were asked to describe the action using prompts such as “What is happening here?” or “What is X doing?” (see Figure 1). All instructions and prompts were delivered in Russian. The task was untimed, and the researcher proceeded to the next image once the participant provided a response.

Figure 1. Noun and verb images from the lexical task: (a) noun naming (target: kastrjula “pot”) and (b) verb naming (target: ona raskatyvaet “she is rolling”).
The databases by Akinina et al. (2015) include a stability metric, representing the percentage of monolinguals (out of 100) who produced the target response. These databases also include supplementary data for each stimulus, such as word frequency and sAoA. For nouns, only stimuli with a stability score above 90 (average of 97) were used, while for verbs, stimuli with a stability score above 80 (average of 92) were selected. The stimulus set contained an approximately equal distribution of words from different frequency bands (low, mid, high) to ensure a diverse sample. There were no significant differences between nouns and verbs in terms of sAoA, t(100) = 0.57, p = .57, or word frequency, t(100) = 0.94, p = .35. For more detailed information on the stimuli, see Fridman and Meir (2023).
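The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the record layout, the placeholder items, and their stability values are hypothetical and are not taken from the Akinina et al. (2015) databases. Only the two cutoffs (above 90 for nouns, above 80 for verbs) come from the text.

```python
from dataclasses import dataclass

# Cutoffs stated in the text; everything else in this sketch is illustrative.
STABILITY_CUTOFF = {"noun": 90, "verb": 80}


@dataclass
class Stimulus:
    word: str         # transliterated target word (placeholder names below)
    pos: str          # "noun" or "verb"
    stability: float  # % of norming participants who produced the target


def select_stimuli(candidates):
    """Keep items whose naming stability exceeds the cutoff for their word class."""
    return [s for s in candidates if s.stability > STABILITY_CUTOFF[s.pos]]


# Hypothetical candidates; the stability values are invented for the example.
candidates = [
    Stimulus("noun_a", "noun", 97.0),
    Stimulus("noun_b", "noun", 88.0),  # below 90: excluded
    Stimulus("verb_a", "verb", 85.0),
    Stimulus("verb_b", "verb", 80.0),  # not strictly above 80: excluded
]

selected_words = [s.word for s in select_stimuli(candidates)]
print(selected_words)
```

The comparison is strict because the text specifies scores "above" each cutoff; whether boundary values were in fact excluded is an assumption of this sketch.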
To minimize sociocultural cues during testing, the experimenter was a native speaker of Russian with no prior relationship to the participants. All instructions were standardized and delivered in Russian, and the testing environment was kept consistent across sessions. Although complete control of all background variables (such as object familiarity or implicit social hierarchies) is not feasible in field settings, care was taken to ensure uniform exposure and minimal bias in how tasks were administered. We did not manipulate interlocutor variables such as experimenter bilingualism or code-switching to preserve ecological validity and maintain focus on lexical recall in Russian as a test language.
Data coding
We employed a multilayered coding system, informed by frameworks established by Altman et al. (2018), Foygel and Dell (2000), and Ramsay et al. (1999). A total of 6,528 responses were elicited across both noun and verb tasks. First, responses were classified binarily as target (1) or non-target (0). Noun diminutives (e.g., svecha “candle”—svechka “candle.DIM”) and plurals (e.g., glaz “eye”—glaza “eyes”) were counted as targets. Similarly, responses with incorrect inflection (e.g., glaz “eye”—glazy “eyes.FEM”) were treated as targets. For multiple responses, if one response was correct, the item was coded as a target. In cases where a participant provided both an “I don’t know” response and a word, the produced word was coded.
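The first, binary layer of this coding scheme can be sketched as follows. The function name, the set-based representation of accepted forms, and the transliterated marker "ne znaju" for "I don't know" are assumptions made for illustration; the scoring rules themselves (diminutives, plurals, and mis-inflected forms count as targets, and one correct response among several is enough) follow the description above.

```python
DONT_KNOW = "ne znaju"  # assumed transliteration of an "I don't know" reply


def code_response(responses, accepted_forms):
    """Return 1 (target) if any produced word is an accepted form, else 0."""
    produced = [r.strip().lower() for r in responses]
    # An "I don't know" reply is ignored when the child also produced a word.
    words = [r for r in produced if r != DONT_KNOW]
    return int(any(w in accepted_forms for w in words))


# Accepted forms built from the examples in the text.
candle_forms = {"svecha", "svechka"}      # base form and diminutive
eye_forms = {"glaz", "glaza", "glazy"}    # base, plural, mis-inflected plural

scores = [
    code_response(["svechka"], candle_forms),         # diminutive
    code_response(["ne znaju", "glazy"], eye_forms),  # word given after "don't know"
    code_response(["lampa", "svecha"], candle_forms), # one of several is correct
    code_response(["lampa"], candle_forms),           # wrong word
    code_response(["ne znaju"], candle_forms),        # no word produced
]
print(scores)
```

Non-target responses (score 0) would then be passed to the second coding layer, which this sketch does not cover.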
Non-target responses were further categorized based on whether the incorrect part of speech was used (e.g., a non-noun in the noun task or a non-verb in the verb task). These non-target responses were then subdivided into specific types and subtypes (see Table 2). “Unrelated” and “Unknown” responses were also treated as independent categories, offering insights into participants’ linguistic competence. For more details on the coding schema, refer to Fridman and Meir (2023).
Table 2. Coding of non-target responses.
The children were explicitly instructed to name the pictures in Russian, and all instructions were provided in Russian. Responses in Kazakh were treated as code-switches and were not scored as correct. This decision aligns with the study’s focus on Russian lexical retrieval and the role of Russian in bilingual children’s productive vocabulary. While code-switching is typically analyzed as a discourse-level feature, research has shown that it can also serve a lexical retrieval function in bilingual speakers. In cases where access to low-frequency or later-acquired words in the home language is limited, bilinguals often substitute these items with equivalents from their dominant language of schooling. This process reflects a lexical-level compensatory strategy rather than spontaneous conversational code-switching (Polinsky & Kagan, 2007; Fridman & Meir, 2023). For example, Fridman and Meir (2023) found that in a word production task, heritage speakers of Hebrew in the U.S. systematically drew on their dominant societal language (English) through direct borrowing and calquing, even in monologic elicitation settings. Such findings align with earlier accounts describing fossilization of high-frequency lexical items and the strategic use of second-language vocabulary to fill lexical gaps (Polinsky, 2006; Sullivan et al., 2018). Thus, even in word-level tasks like picture naming, lexical code-switching emerges as a meaningful indicator of bilingual lexical organization, particularly when vocabulary access is asymmetric across languages. This has important pedagogical implications: rather than marking a deficit, such responses may reflect an adaptive mechanism for maintaining communicative efficiency. Language educators and clinicians should recognize code-switching in these contexts as strategic resource use rather than as error or confusion.
Results
Target accuracy scores
The target accuracy scores for Russian-Kazakh bilingual children and monolingual Russian-speaking controls on the noun and verb tasks are presented in Figure 2, with separate panels for nouns and verbs. In noun naming, bilingual children scored lower on average and showed a wider range of scores (M = 0.69, SD = 0.44, range: 0.33–0.92) than their monolingual peers (M = 0.77, SD = 0.40, range: 0.56–0.96). Verb naming also varied widely, and bilingual children again scored lower overall (M = 0.52, SD = 0.14, range: 0.29–0.78) than the monolingual group (M = 0.64, SD = 0.14, range: 0.43–0.88).

Figure 2. Noun and verb target production scores per group (MonoRu vs. BiRuKz).
Given the binary nature of the scores (1 = target, 0 = non-target), we fitted binomial mixed-effects logistic regression models separately for nouns and verbs. Both models included fixed effects for age, group (with monolingual children, MonoRu, as the reference level), sAoA, stimulus stability, and word frequency, along with interaction terms between group and the other fixed variables. Random intercepts were included for participants and stimuli.
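For readers who prefer a formula, the model structure described above can be written roughly as follows. The notation is introduced here for illustration only; the source does not give an explicit equation, and the set of interaction terms retained in each final model is the one reported in Table 3.

```latex
\[
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{Age}_i
  + \beta_2\,\mathrm{Group}_i
  + \beta_3\,\mathrm{sAoA}_j
  + \beta_4\,\mathrm{Stability}_j
  + \beta_5\,\mathrm{Freq}_j
  + \sum_{k} \gamma_k\,\bigl(\mathrm{Group}_i \times x_{k,ij}\bigr)
  + u_i + w_j,
\]
\[
u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad w_j \sim \mathcal{N}(0, \sigma_w^2).
\]
```

Here $y_{ij}$ is the binary score of participant $i$ on stimulus $j$, $\mathrm{Group}_i$ is 0 for MonoRu and 1 for BiRuKz, $x_{k,ij}$ ranges over the other fixed predictors, and $u_i$ and $w_j$ are the random intercepts for participants and stimuli.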
Starting with the model for nouns (Table 3, Panel A), age had a significant positive effect (OR = 1.41, p < .001), indicating that older children are more likely to correctly identify nouns. The interaction between age and group did not improve the model fit, suggesting age affects both monolingual and bilingual children similarly. However, bilingual children (BiRuKz) showed a significantly lower likelihood of producing correct noun responses compared to monolinguals (OR = 0.35, p < .001). sAoA negatively impacted noun target scores (OR = 0.06, p < .001), suggesting that later acquisition of words reduces the likelihood of correct responses. Stimulus stability had a positive effect (OR = 1.25, p = .005), whereas word frequency did not significantly affect noun scores (p = .22). However, the interaction between group and frequency approached significance (p = .05), suggesting frequency may have a different effect on bilinguals. Random effects for participants and stimuli were included, with variance components of 3.29 for residuals, 0.79 for participants, and 1.68 for stimuli. The intraclass correlation coefficient (ICC) was 0.43, indicating that 43% of the variance was due to differences between participants and stimuli. The model’s marginal R² was 0.410, meaning the fixed effects explained 41% of the variance, while the conditional R² was 0.662, meaning the full model, including random effects, explained 66.2% of the variance in noun target scores.
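The variance figures reported for the noun model can be cross-checked with a short calculation. The sketch below assumes the usual latent-scale treatment of a logit model, in which the residual variance is fixed at π²/3 (about 3.29, matching the reported value), and a variance-partitioning definition of marginal and conditional R². The source does not state which software or formulas were used, so the function names and the decomposition are assumptions.

```python
import math

# Latent-scale residual variance for a logit link (about 3.29).
LOGIT_RESIDUAL_VAR = math.pi ** 2 / 3


def icc(random_vars, residual_var=LOGIT_RESIDUAL_VAR):
    """Share of non-fixed variance attributable to the random intercepts."""
    random_total = sum(random_vars)
    return random_total / (random_total + residual_var)


def conditional_r2(marginal_r2, random_vars, residual_var=LOGIT_RESIDUAL_VAR):
    """Recover conditional R2 from marginal R2 and the variance components."""
    random_total = sum(random_vars)
    unexplained = random_total + residual_var
    # Fixed-effect variance implied by the reported marginal R2.
    fixed_var = marginal_r2 * unexplained / (1 - marginal_r2)
    return (fixed_var + random_total) / (fixed_var + unexplained)


# Noun model: participant and stimulus variances as reported in the text.
noun_random = [0.79, 1.68]
noun_icc = icc(noun_random)
noun_cond_r2 = conditional_r2(0.410, noun_random)

print(round(noun_icc, 2), round(noun_cond_r2, 2))
```

Under these assumptions the ICC rounds to the reported 0.43, and the recovered conditional R² agrees with the reported 0.662 to within rounding.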
Table 3. Final models for noun and verb target responses.
For verbs (Table 3, Panel B), age similarly had a significant positive effect (OR = 1.40, p < .01), indicating that older children were more likely to produce the target verb. Bilingual children (BiRuKz) were less likely than monolinguals to produce target verb responses (OR = 0.42, p = .035), highlighting a disadvantage in verb retrieval. sAoA negatively impacted performance (OR = 0.19, p < .01), suggesting that later acquisition reduces the odds of a target verb response. Stimulus stability positively affected verb performance (OR = 1.08, p = .009), and frequency also had a small but significant positive effect (OR = 1.00 when rounded to two decimals, p = .036), indicating that more frequent verbs were easier to produce. The interaction between group and frequency was not significant (OR = 1.04, p = .831), suggesting that frequency does not affect bilingual children differently. No significant interaction was found between group and sAoA. Variance components were 3.29 for residuals, 0.23 for participants, and 1.21 for stimuli. The marginal R² was 0.249, indicating that the fixed effects explained 24.9% of the variance, while the conditional R² was 0.478, meaning that the full model, including random effects, explained 47.8% of the variance in verb target scores.
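The marginal and conditional R² values can be checked against the variance components. The sketch below assumes the variance-partitioning definitions commonly attributed to Nakagawa and Schielzeth, under which the conditional R² equals the marginal R² plus the random-effect share of the remaining variance; the text does not name the method used, so this is a consistency check under that assumption, and the function name is hypothetical.

```python
def conditional_r2(marginal_r2, random_variances, residual_variance):
    """Conditional R² implied by a marginal R² and the variance components.

    Assumes R²m = Vf / Vtotal and R²c = (Vf + Vr) / Vtotal, so that
    R²c = R²m + (1 - R²m) * Vr / (Vr + Ve).
    """
    between = sum(random_variances)
    random_share = between / (between + residual_variance)
    return marginal_r2 + (1.0 - marginal_r2) * random_share


# Values reported for the verb model: participants 0.23, stimuli 1.21,
# residual 3.29, marginal R² 0.249.
verb_r2c = conditional_r2(0.249, [0.23, 1.21], 3.29)
print(round(verb_r2c, 3))  # 0.478
```

Under the same assumption, the noun model's components (0.79, 1.68, 3.29, marginal R² 0.410) imply a conditional R² of about 0.663, within rounding of the reported 0.662.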
In comparing the models for nouns and verbs, several key differences and similarities emerge. Both models demonstrate that age has a significant positive effect on target accuracy, indicating that older children perform better on both noun and verb tasks. However, bilingual children consistently show lower accuracy compared to monolinguals for both nouns and verbs. In addition, sAoA negatively impacts performance in both lexical categories, suggesting that later acquisition reduces accuracy, but the effect is more substantial for nouns (OR = 0.06) than for verbs (OR = 0.19). Stimulus stability positively influences target accuracy for both nouns and verbs, though the effect is slightly stronger for nouns (OR = 1.25) than for verbs (OR = 1.08). Word frequency, while not significant for nouns, plays a small but significant role in verb retrieval, highlighting frequency’s particular relevance to verb processing. The interaction between group and frequency approached significance for nouns, suggesting a potential differential effect of frequency for bilinguals in noun tasks, but no similar pattern was observed for verbs. Overall, the results highlight the nuanced challenges bilingual children face in both noun and verb retrieval, with differences in frequency effects and the impact of acquisition timing contributing to varying performance across lexical categories.
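To make the group odds ratios more tangible, an odds ratio can be applied to a baseline probability of a target response. This is an illustrative sketch only: the baseline of 0.80 is a hypothetical value, not a model estimate, and odds ratios from mixed models are conditional on the other predictors and the random effects, so the outputs are not predicted accuracies for the sample.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability obtained after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


HYPOTHETICAL_BASELINE = 0.80  # assumed monolingual probability, for illustration only

# Group odds ratios reported for BiRuKz relative to MonoRu.
noun_probability = apply_odds_ratio(HYPOTHETICAL_BASELINE, 0.35)
verb_probability = apply_odds_ratio(HYPOTHETICAL_BASELINE, 0.42)
print(round(noun_probability, 2), round(verb_probability, 2))
```

With this hypothetical baseline, an odds ratio of 0.35 corresponds to a probability of roughly 0.58 and an odds ratio of 0.42 to roughly 0.63, which shows that similar odds ratios translate into sizeable but not drastic differences in accuracy.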
Exposure measures and target noun and verb production in bilingual children
The correlation matrix (see Figure 3) illustrates the relationships between several linguistic and demographic variables in bilingual children, with values ranging from −1 (perfect negative correlation, dark red) to +1 (perfect positive correlation, dark blue). A strong positive correlation was observed between accuracy in naming nouns (AccNouns) and verbs (AccVerbs) (r = 0.67, p < .0001): children who scored higher in naming nouns also tended to score higher in naming verbs. Moderate positive correlations between age and accuracy in naming nouns (r = 0.34, p < .0001) and verbs (r = 0.52, p < .0001) show that older children performed better on these tasks.

Figure 3. Correlation matrix for noun and verb production and exposure measures in bilingual children.
In addition to examining correlations between noun and verb naming accuracy, the correlation matrix also provides insights into other experiential variables. Notably, there was a significant negative correlation between L1 and L2 proficiency (r = –0.58, p < .0001), indicating that children with stronger Russian skills tended to have weaker proficiency in Kazakh. Given that most bilinguals in our sample are L1-Russian speakers, this asymmetry suggests unbalanced bilingualism, with Russian as the dominant language. Furthermore, Kazakh home use was negatively correlated with parents’ preference for speaking Russian (r = –0.48, p = .01), suggesting that dominant Russian use at home may limit Kazakh exposure, thereby reinforcing Russian proficiency. Conversely, there was a significant positive correlation between the use of Kazakh at home and proficiency in Kazakh (RatingKZ) (r = 0.69, p = .02). These findings provide valuable insights into how language proficiency, home language use, and demographic factors interact in bilingual environments.
Non-target noun and verb production in bilinguals
Before turning to the analyses of non-target responses, we first present the item-level findings. In the BiRuKz group, the three nouns with the highest accuracy were “leg,” “glasses,” and “fork,” for which all 43 participants produced the target response (100%). Conversely, the nouns with the lowest accuracy in the BiRuKz group were “muzzle,” “udder,” and “holster,” with target production rates of 16.28%, 2.33%, and 0%, respectively. For verbs, the highest accuracy was observed for “to wash” (100%), followed by “to sing” (97.67%) and “to draw” (95.35%). In contrast, the verbs “to clink glasses” (6.98%), “to embroider” (9.30%), and “to erupt” (13.95%) had the lowest target response rates.
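The item-level percentages are consistent with whole-number counts out of the 43 bilingual participants. The sketch below makes that arithmetic explicit; the counts are not reported in the text and are inferred here purely for illustration.

```python
N_BILINGUAL = 43  # bilingual sample size stated in the study

# Hypothetical counts of target responses: NOT reported in the text,
# inferred only to check that the percentages fit a sample of 43.
inferred_counts = {
    "muzzle": 7,
    "udder": 1,
    "holster": 0,
    "to sing": 42,
    "to draw": 41,
    "to clink glasses": 3,
    "to embroider": 4,
    "to erupt": 6,
}

# Percentages as reported in the item-level findings.
reported_percentages = {
    "muzzle": 16.28,
    "udder": 2.33,
    "holster": 0.0,
    "to sing": 97.67,
    "to draw": 95.35,
    "to clink glasses": 6.98,
    "to embroider": 9.30,
    "to erupt": 13.95,
}


def percent_correct(count, n=N_BILINGUAL):
    """Target-response rate as a percentage rounded to two decimals."""
    return round(100 * count / n, 2)


for item, count in inferred_counts.items():
    print(item, count, percent_correct(count), reported_percentages[item])
```

For example, a rate of 2.33% for “udder” corresponds to a single child out of 43 producing the target word.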
Instances of code-switching were rare, with only two instances where participants produced the target word in Kazakh: inelik for “dragonfly” and alma for “apples.” Due to their rarity, these instances of code-switching were excluded from further statistical analyses. Subsequently, we examined the non-target noun and verb responses using Poisson regressions to explore patterns beyond correct responses.
Figure 4 compares the types of noun non-target responses (Innovation, Phonological, Semantic, Unknown, and Unrelated) between the MonoRu and BiRuKz groups. The MonoRu group produced a higher proportion of Semantic responses, indicating a stronger tendency to offer semantically related alternatives. Both groups exhibited low proportions of Phonological and Innovation responses, with no notable differences between them. The BiRuKz group showed a descriptively higher proportion of Unknown responses than the MonoRu group, suggesting that bilingual children more often struggled to provide a response. In addition, the BiRuKz group produced slightly more Unrelated responses than the MonoRu group. Overall, the MonoRu group displayed a stronger tendency toward Semantic responses, whereas the BiRuKz group was more prone to producing Unknown and Unrelated responses.

Figure 4. Non-target noun production per group (MonoRu vs. BiRuKz).
Table 4 presents the results of Poisson regressions for the different types of noun non-target responses, with the monolingual group (MonoRu) as the reference level, represented by the intercept, and the bilingual group (BiRuKz) as the predictor. The models did not include code-switching (CS) responses, as the MonoRu group had no CS instances, making a comparison between groups unfeasible. For Semantic responses, the group effect was not significant (p = .21). For Phonological responses, the p-value for the group effect was close to 1.00, indicating no meaningful distinction between the groups. The group effect was likewise not significant for Innovation responses (p = .90) or for Unrelated responses (p close to .99). Finally, for Unknown responses, the MonoRu group produced numerically fewer responses, but the difference from the BiRuKz group was not significant (p = .78). Overall, there were no statistically significant differences between the MonoRu and BiRuKz groups across the different types of noun non-target responses.
Table 4. Poisson regressions for noun non-target response types comparing MonoRu vs. BiRuKz.
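For readers less familiar with this kind of model, a Poisson regression with an intercept and a single binary group predictor has a closed-form solution: the estimated rate ratio is the ratio of the two group means, and the Wald standard error of its logarithm depends only on the total counts in each group. The sketch below illustrates this with hypothetical toy counts; it does not use the study's data and is not the authors' analysis code.

```python
import math


def poisson_group_effect(reference_counts, comparison_counts):
    """Rate ratio, Wald z, and two-sided p for a Poisson model: count ~ group.

    Requires a non-zero total count in each group.
    """
    ref_total = sum(reference_counts)
    cmp_total = sum(comparison_counts)
    ref_mean = ref_total / len(reference_counts)
    cmp_mean = cmp_total / len(comparison_counts)
    log_rate_ratio = math.log(cmp_mean / ref_mean)
    # Wald standard error of the log rate ratio for a two-group Poisson model.
    standard_error = math.sqrt(1.0 / ref_total + 1.0 / cmp_total)
    z = log_rate_ratio / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(log_rate_ratio), z, p_value


# Hypothetical per-child counts of one non-target response type (toy data).
toy_mono = [2, 1, 3, 2]
toy_bi = [3, 2, 4, 3, 3, 3]

rate_ratio, z, p = poisson_group_effect(toy_mono, toy_bi)
print(round(rate_ratio, 2), round(z, 2), round(p, 2))
```

In the toy data, the comparison group produces 1.5 times as many responses per child as the reference group, yet the difference is not significant because the total counts are small. Small cell counts of this kind can leave numerically visible group differences without statistical support.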
The bilingual participants demonstrated creativity and proficiency in Russian word formation, often employing innovative strategies to address lexical gaps. For example, they used obodok (“rim or band”) instead of “wreath,” and kost’ (“bone”) for “skull.” In one case, a participant described a humming top as igrushka, kotoraja krutitsja (“a toy that spins”), and referred to a fan as Nu, kotoroe vozdukh delat’, ja zabyla kak eto nazyvaetsja (“Well, the thing that makes air, I forgot what it’s called”). Similarly, creative responses included karman dlja loshadi (“pocket for a horse”) for “holster,” and sobaka s tugim vorotnikom (“dog with a tight collar”) for “muzzle.” Phonological substitutions were also common, such as vener for “veer” (fan) and vysazhivat’ (“to plant”) for vysyzhivat’ (“to hatch”). In addition, associative replacements like greet jajtsa (“warms eggs”) for “to hatch” and stuchat’ stakanami (“knock glasses together”) for “to clink glasses” illustrate conceptually related but incorrect responses.
Innovative constructions included sagvozdka (“nail-pin”) for “paperclip,” zakrapka (“fixing-dot”) for “pin,” and svin’ja-monetka (“coin-pig”) for “piggy bank,” showing the creative strategies bilinguals use to fill lexical gaps. In summary, the BiRuKz group appears to produce more Semantic, Innovation, and Unknown responses compared to the MonoRu group, though these differences did not reach statistical significance.
Figure 5 presents the distribution of verb non-target response types across the MonoRu and BiRuKz groups. There were no instances of CS in either group for verbs, so this response type was excluded from the analysis. The Poisson regression analysis of verb non-target response types (see Table 5) reveals significant group differences in two categories: Semantic and Unknown responses. The BiRuKz group produced significantly fewer Semantic responses (p < .01) and significantly more Unknown responses (p < .005) than the MonoRu group. No significant differences were found between the groups for Phonological (p = .44), Innovation (p = .58), or Unrelated responses (p = .12), indicating that these types of non-target responses were distributed similarly across both groups. Overall, the results suggest that while Semantic and Unknown responses differ between monolingual and bilingual children, other non-target response types show no meaningful variation between the two groups.

Figure 5. Non-target verb production per group (MonoRu vs. BiRuKz).
Table 5. Poisson regressions for verb non-target response types (MonoRu vs. BiRuKz).
Discussion
The present study investigated the lexical abilities of Russian-Kazakh bilingual children in comparison to their monolingual Russian-speaking peers, focusing on noun and verb production. Kazakhstan presents a linguistically complex environment shaped by its post-Soviet history, multilingual population, and evolving language policies. While such dynamics are not unique globally, the specific coexistence of Russian and Kazakh in official, educational, and informal domains makes it a particularly informative case for studying bilingual development in post-Soviet contexts. During the Soviet era, Russian was established as the dominant language in Kazakhstan, deeply embedding itself in the country’s social, educational, and governmental structures. Even after gaining independence in 1991, the legacy of Russification persists, with Russian remaining the primary language of communication in many urban areas, despite state policies aimed at revitalizing Kazakh (Altynbekova, 2005; Suleimenova, 2009; Zharkynbekova & Chernyavskaya, 2022). This study focuses on the lexical skills of bilingual children, acknowledging that their performance in one language is shaped by experiences across both linguistic systems. The findings offer insights into how variation in language exposure and use among Russian-Kazakh bilingual children may reflect broader sociolinguistic trends in Kazakhstan.
The results demonstrate that Russian-Kazakh bilingual children performed lower in target accuracy for both nouns and verbs compared to their monolingual peers. These findings align with broader research on bilingualism, which frequently indicates that bilinguals face more challenges in lexical retrieval due to exposure to multiple linguistic systems (Bialystok et al., 2010; Fridman & Meir, 2023). In particular, the bilingual children in our study showed higher rates of non-target responses such as unknown and unrelated words, suggesting that they experience greater difficulty in expressive vocabulary tasks. This pattern is consistent with studies that highlight the increased cognitive load bilinguals face when navigating two languages, particularly in environments where one language is societally dominant (Paradis, 2023).
Similar to previous studies, bilingual children demonstrated a wide range of scores in target noun naming (M = 0.69, SD = 0.44, range: 0.33–0.92) and in verb naming (M = 0.52, SD = 0.14, range: 0.29–0.78), highlighting individual differences among bilingual children. Russian-Kazakh bilingual children performed better than bilinguals raised in immigrant environments where Russian is less dominant; cf. Russian-Hebrew child bilinguals for nouns (M = 0.60, SD = 0.49, range: 0.15–0.98) and for verbs (M = 0.43, SD = 0.49, range: 0.15–0.80), and Russian-English bilinguals for nouns (M = 0.52, SD = 0.49, range: 0.20–0.96) and for verbs (M = 0.40, SD = 0.49, range: 0–0.80), as reported by Fridman and Meir (2023). The results highlight the particular status of Russian in Kazakhstan, where children’s lexical proficiency in Russian remains high. This reflects the fact that Russian continues to be highly supported in educational settings across the country. In the current study, all participating children were attending Russian-medium schools with regular formal classes in Kazakh. Russian school exposure likely contributes to maintaining and strengthening their proficiency in Russian despite the multilingual environment. In the section on limitations and directions for future research, we also propose additional lines of inquiry to further explore how different educational contexts and language policies shape Russian language development among bilingual children in Kazakhstan and other post-colonial settings.
Our results for noun and verb production provide important insights into the factors influencing lexical performance in bilingual children, particularly within the unique post-colonial environment of Kazakhstan. This study extends previous research conducted in migrant contexts by examining Russian language development among bilingual children who reside in their heritage region rather than in diaspora settings. Age was found to be a significant predictor of accuracy for both nouns and verbs, with older children performing better across both groups. This is noteworthy, as the effect of age is robust in monolingual typical language development, yet it is not always observed in bilingual development (e.g., Paradis, 2023). sAoA negatively impacted accuracy, particularly for nouns, reflecting the difficulties associated with retrieving words acquired later in life. Stimulus stability had a positive effect on both noun and verb retrieval, indicating that words used more consistently across speakers were easier for children to recall. While word frequency did not significantly affect noun accuracy, it played a small but significant role in verb retrieval, highlighting the importance of frequently encountered verbs in language processing. In our group of Russian-Kazakh bilingual children, the factors influencing lexical skills were identical to those observed in their monolingual peers, suggesting the unique status of Russian in Kazakhstan as a dominant language that shapes lexical acquisition similarly across both bilingual and monolingual children.
For bilingual children, our goal was to examine the associations between language exposure variables and lexical skills. Previous research on bilingualism consistently demonstrates strong links between language exposure and lexical abilities (for a review, see Paradis, 2023). The results of the study also provide a deeper understanding of language-use trends among bilingual children in Kazakhstan. Despite Kazakh’s status as the state language, the vast majority of bilingual children in our sample primarily used Russian, especially in home settings, where 90% reported communicating in Russian. While Kazakhstan’s sociolinguistic landscape is regionally diverse, this pattern reflects the continued dominance of Russian in urban and northern areas such as Kokshetau and supports findings from prior research highlighting the persistent role of Russian in everyday communication despite state-sponsored efforts to elevate Kazakh (Altynbekova, 2005; Suleimenova, 2009; Zharkynbekova & Chernyavskaya, 2022). The correlation matrix reveals important relationships between linguistic and demographic factors influencing bilingual children’s language abilities, which reflect broader trends observed in previous studies on Russian-Kazakh bilinguals. A strong positive correlation between accuracy in naming nouns and verbs suggests that children who excel in one lexical category tend to perform well in the other. However, a significant negative correlation between Russian and Kazakh proficiency was observed in our sample, suggesting a possible trade-off in proficiency levels across the two languages among these children. While this pattern warrants further investigation, it may reflect the influence of dominant language use on the development of the other language. 
This aligns with previous research, which has shown that the dominance of Russian in Kazakhstan can hinder the development of Kazakh, particularly when Russian is predominantly used at home and in educational settings (Suleimenova & Smagulova, 2005; Zharkynbekova & Chernyavskaya, 2022; Zhuravleva & Agmanova, 2021). Kazakh home use correlates negatively with parents’ preference for speaking Russian, further supporting the idea that home language practices are key in shaping children’s proficiency in both languages. In addition, age is moderately correlated with improved performance in both noun and verb tasks, highlighting the developmental aspect of language acquisition. The positive correlation between school language and Russian proficiency suggests that language use in educational settings plays a crucial role in reinforcing Russian language development, consistent with research that has shown the importance of school language environments in bilingual proficiency (Suleimenova, 2020). These findings underscore the complex interplay between home language environment, school language exposure, and demographic factors in shaping the linguistic proficiency of Russian-Kazakh bilingual children.
The study found minimal cross-linguistic influence from Kazakh on Russian, with CS occurring only in noun production. Research on bilingual development consistently shows that nouns are more frequently code-switched than verbs in children’s speech. Poplack (1980, 1982), analyzing natural bilingual conversations, reported that nouns are the most commonly switched content words, while verbs rarely appear in isolation and are typically part of larger clause switches. This pattern is further supported by heritage bilingual studies; for instance, Klassert et al. (2014) observed that nouns are more “fragile” than verbs (that is, more susceptible to variation or loss), whereas verbs tend to remain relatively stable in bilingual children’s speech. Prior studies have highlighted that in Kazakhstan’s bilingual context, Russian remains dominant due to its historical prevalence and institutional support, which continues to influence bilingual children’s language use (Suleimenova & Smagulova, 2005; Zharkynbekova & Chernyavskaya, 2022; Zhuravleva & Agmanova, 2021). As seen in our findings, nouns, due to their concrete nature and frequent use, might be more prone to cross-linguistic influence (Paradis, 2011; Jarvis & Pavlenko, 2008), which may explain the occasional use of Kazakh nouns in Russian-speaking contexts by bilingual children. However, this CS was rare, likely due to the stronger exposure to Russian at home and in school, which reduces the reliance on Kazakh during lexical retrieval. The dominance of Russian in daily interactions shapes the children’s linguistic landscape, echoing the broader patterns seen in Kazakhstan, where Russian often overshadows Kazakh despite state efforts to promote the latter (Suleimenova, 2009; Zharkynbekova & Chernyavskaya, 2022).
In terms of non-target responses, our results align with earlier findings on bilingualism in Kazakhstan, which suggest that bilingual children often employ creative strategies to navigate lexical gaps, especially in contexts where they are more proficient in one language than the other (Aldaberdikyzy, 2013; Erkebayeva, 2014). Interestingly, however, the Poisson regression analysis revealed no significant differences between monolingual and bilingual children in most non-target response categories, such as Phonological, Innovation, and Unrelated responses, reflecting similar linguistic behavior across both groups. Bilingual children tended to produce more Unknown and Innovation responses in verb tasks, suggesting that they rely on creative strategies such as semantic approximations and lexical inventions when encountering retrieval difficulties. This mirrors findings from other studies, where Russian-Kazakh bilinguals have been shown to demonstrate linguistic creativity, particularly when balancing two languages with different degrees of exposure and proficiency (Aldaberdikyzy, 2013). For example, in our study, innovative constructions such as sagvozdka (“nail-pin”) for “paperclip” and svin’ja-monetka (“coin-pig”) for “piggy bank” reflect the bilingual children’s ability to creatively compensate for lexical gaps. These findings underscore that while bilinguals in Kazakhstan demonstrate variability across their two languages, this reflects dynamic adaptation to uneven language input rather than inherent difficulty, aligning with research emphasizing bilinguals’ flexibility and resourcefulness in lexical skills (Bylund et al., 2021; De Houwer et al., 2014; Paradis, 2023; Thordardottir, 2011).
Limitations and future research
While this study made key contributions to understanding the lexical abilities of Russian-Kazakh bilingual children, some limitations should be considered. The relatively small sample size limits the generalizability of the findings. The linguistic diversity and variation in bilingual experiences across Kazakhstan were not fully captured, as the study focused on 43 bilingual children from urban areas in Northern Kazakhstan, where Russian is dominant. Future studies should include larger and more diverse samples, incorporating children from various regions and socioeconomic backgrounds, to provide a more comprehensive picture of bilingual language development.
In addition, the study’s focus on noun and verb naming tasks offers valuable insights into lexical retrieval but does not encompass the full range of language abilities. These tasks, while useful for measuring expressive vocabulary, do not capture more complex linguistic skills such as syntax, pragmatics, and comprehension. Future research should incorporate a wider variety of tasks, including narrative production, comprehension, and more functional language use, to gain a deeper understanding of how bilingual children navigate their linguistic environment.
Another important limitation is that the study primarily focused on Russian language skills, without evaluating objective Kazakh proficiency to the same extent. To fully understand bilingual language development, both languages must be assessed comprehensively across multiple domains. Future research should examine proficiency in both Kazakh and Russian, exploring how each language influences the other and assessing the balance of language use in different contexts. This approach will provide a clearer picture of how bilingual children manage and develop their skills in both languages within Kazakhstan’s unique multilingual environment. Another possible direction is to compare Russian-Kazakh bilinguals to bilingual Russian-speaking children in other post-colonial contexts, such as Russian-Estonian, Russian-Ukrainian, Russian-Latvian, or Russian-Georgian bilinguals. Exploring these diverse contexts could shed light on how different sociopolitical histories, language policies, and community attitudes influence the linguistic development and identity of Russian-speaking populations outside the Russian Federation.
Finally, while lexical items were selected based on frequency norms derived from Russian corpora, we recognize that these norms may not fully reflect word familiarity or usage within the Kazakhstani context. For example, bi-valent words such as vilka (“fork”), which have less common equivalents in everyday Kazakh, may not carry equal lexical weight across both languages. Although the dominance of Russian in public discourse and media—particularly in northern regions like Akmola—supports the relevance of Russian frequency data, regional variation in language exposure could influence children’s lexical access. Future studies should consider incorporating Kazakhstan-specific corpora or frequency data to better align stimulus selection with local linguistic realities.
Conclusion and pedagogical implications
This study examined the lexical abilities of Russian-Kazakh bilingual children from Northern Kazakhstan, revealing that while bilinguals showed lower accuracy in noun and verb production compared to their monolingual peers, they exhibited comparable linguistic creativity and flexibility. Bilingual children effectively used adaptive strategies such as word substitution, borrowing, and inventive constructions to compensate for lexical gaps, demonstrating their cognitive flexibility in managing two languages. The relatively high level of Russian proficiency among bilinguals, though lower than that of monolinguals, reflects the unique linguistic environment of Kazakhstan, where Russian remains dominant in social and educational contexts despite the state’s promotion of Kazakh.
Kazakhstan’s post-colonial sociolinguistic context plays a key role in shaping bilingual development, with Russian continuing to serve as a powerful force in daily communication. The interaction between Russian and Kazakh creates both challenges and opportunities for bilingual children, as they navigate two languages with different social functions. This study highlights the importance of understanding regional language dynamics, emphasizing that while Russian input strengthens lexical abilities, Kazakh still influences children’s linguistic identity. The findings underscore the need for education policies that support balanced bilingualism in Kazakhstan and similar post-Soviet contexts, where language policies and real-world usage diverge.
Footnotes
Funding
This research was supported by the Ministry of Science and Higher Education of the Republic of Kazakhstan under grant No. IRN AP26197041, Linguistic and Metalanguage Reflection in Bilingual and Multilingual Children: Sociolinguistic, Psycholinguistic, and Lexicographic Perspectives (2025–2027), implemented at Sh. Ualikhanov Kokshetau University (Kokshetau, Kazakhstan). In Israel, the study received support from the Israel Science Foundation (ISF) under grant No. 552/21, Towards Understanding Heritage Language Development: The Case of Child and Adult Heritage Russian in Israel and the USA, awarded to Natalia Meir.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
