Abstract
This paper presents the results of a retrospective study that investigates the cognitive effects of learning a foreign language in late adulthood. The learner group, consisting of 21 L1 Chinese speakers who had been learning to read Arabic for 2 years and 4 months, was compared to a matched group on a series of cognitive tasks that tap into working memory, processing speed, reasoning, conflict monitoring, and attention. The results showed that the learning group’s performance was significantly better in attention (measured by the Posner cueing attention task). Their working memory capacities (measured by the digit span tests) were also better, but the difference only reached marginal significance. The findings suggest that language learning may lead to improvement in attention abilities, which is in line with converging evidence in the field of bilingualism showing that executive attention may underlie the mechanism by which bilingual experience alters the brain and the cognitive system.
Introduction
The world is aging. In 2020, there were 727 million people aged 65 or over. This number is expected to more than double in the next three decades, reaching 1.5 billion in 2050 (United Nations Department of Economic and Social Affairs, Population Division, 2020). Aging means the accumulation of worldly wisdom and experience. However, it is also accompanied by declining physical and cognitive abilities, which adversely affect the life quality of older adults and have repercussions on society’s healthcare costs. Most of these declining cognitive abilities are associated with executive control processes, including attention (Glisky, 2007; Veríssimo et al., 2022), processing speed (Hartshorne & Germine, 2015; Spreng & Turner, 2019), episodic memory (Nyberg et al., 2012; Salthouse, 2003), and working memory (Bopp & Verhaeghen, 2005; Cansino et al., 2013), to name a few. On the other hand, the aging brain preserves a large part of its plasticity and remains receptive to new skills training (for a review, see Lustig et al., 2009; Park & Bischof, 2013), including new languages (e.g., Lövdén et al., 2013; Raz & Lindenberger, 2013). Because of this plasticity, different countermeasures have been proposed to battle against the adverse effects of aging on cognitive abilities. It has been shown that cognitive stimulation (e.g., video game training) has protective effects on the aging brain, maintaining or even enhancing the cognitive functioning of older adults (Anguera et al., 2013; Boyke et al., 2008; Willis & Nesselroade, 1990; Wilson et al., 2002).
If cognitive abilities can be improved through training, the question arises of whether language learning, in and of itself a cognitively stimulating activity, can also slow down the speed with which cognitive abilities decline. There are two reasons to assume that this might be the case. First, an extensive brain network is engaged in learning a new language (Rodríguez-Fornells et al., 2009). Notably, the brain areas involved in this learning process largely overlap with the aging-related declining brain network (Raz, 2000). Indeed, positive brain changes responding to language learning have been observed among healthy older adults and those at risk of neural dysfunction in previous research (Valenzuela et al., 2003). Second, the Bilingual Advantage Hypothesis (BAH) posits that bilingual experience may have some protective effects on the aging brain, for example, by delaying the onset of some neurodegenerative illnesses such as Alzheimer’s Disease (Alladi et al., 2013; Bialystok et al., 2007; Woumans et al., 2015, but see Lawton et al., 2015; Zahodne et al., 2014 for lack of such evidence). However, the body of research on the cognitive effects of learning a language in late adulthood is small and still in its infancy, and the existing studies have produced somewhat mixed results.
A few studies have shown that learning a foreign language in late adulthood can improve cognitive functioning. Bak et al. (2016) examined the impact on attention functions of a 1-week intensive language course of Scottish Gaelic. The participants (n = 33) covered a wide age range, from 18 to 78. After 1 week with an average of 14 hr of learning, the learning group improved their performance in the attentional switching task compared to the control group. In this task, they listened to a sequence of high, middle, and low tones and were instructed to count the middle tones, upwards if preceded by a high, and downwards if preceded by a low tone. Crucially, the improvement was observed across participants of all ages in the learning group and retained 9 months later for those who had practised the learned language 5 hr and more per week. Pfenninger and Polz (2018) also observed positive effects of learning an additional language at a late age on inhibition and interference resolution abilities (measured by the Stroop test). The improvement was observed among all participants (n = 12), both bilinguals and monolinguals. It is worth mentioning that these participants also reported increased attention and concentration abilities after the language training, even though the performance difference before and after the training was not significant.
Another study that has also found a positive effect of late language learning on cognitive abilities is Wong et al. (2019), a randomized control study with a larger sample size. Two hundred thirty-five participants (aged 60–85) in Hong Kong participated, and 153 completed the study in its entirety, among whom 53 were assigned to English language learning, 51 to gaming, and 49 to music appreciation. The training lasted for 6 months (5 hr per week), amounting to 130 training hours. A minimum of 76 hr of training was required for the final analysis. The participants were assessed with the Alzheimer’s Disease Assessment Scale Cognitive Subscale (ADAS-Cog, the primary outcome measure). They were also tested on their working memory, attention, and cognitive inhibition abilities before and after the training. Foreign language learning and gaming improved ADAS-Cog scores (hence better cognitive performance), and this improvement was retained at a 3-month follow-up. Foreign language learning also led to a significant positive change in working memory capacities (measured by the backward digit span test). In contrast, gaming improved participants’ attention skills (measured by the ANT test).
However, the positive correlation between language learning at an old age and improved cognitive abilities has not always been found. Ramos et al. (2017) measured task-switching abilities before and after language training. Twenty-six Spanish monolinguals attended a Basque course for 8 months in small groups of 10 participants at the maximum per class for 5.5 hr of training per week. Before they embarked on their learning, their task-switching abilities were measured. Participants responded to stimuli either by “color” or “shape” in the task. In some trials, the task remained the same as in the previous trial. These were repeat trials. Switch trials were those trials in which the task differed from the one in the previous trial. Participants’ switching ability was measured by the switching cost, that is, the difference between switch and repeat trials. After the language training, they were tested with the same task again. The results showed that language learning did not significantly impact participants’ overall performance in this task or their switching ability. Ware et al. (2017) also failed to observe any significant change in the scores of the Montreal Cognitive Assessment among their French older participants (mean age = 75) after a 4-month English language training (2 hr per week and 16 weeks in total). The small sample size (n = 14) may be partially responsible for the lack of significance in the results. However, the authors suggested that language training may have played a role in maintaining participants’ cognitive abilities.
Another piece of evidence against any effect on cognitive abilities ensuing from language training is Berggren et al. (2020). One hundred sixty Swedish-speaking participants were assigned either to Italian language learning or relaxation training with a random control design in this study. After 11 weeks of learning for 5 hr per week, the language learning group did not demonstrate any substantial improvement in cognitive functioning in any of the sampled abilities, including spatial intelligence (measured by Raven’s matrices and WASI-II Matrix Task), verbal intelligence (measured by Analogies, Syllogisms, and Verbal Inference), working memory (measured by the Numerical updating task and the n-back task), and long-term associative memory and item memory (measured by three types of stimuli, i.e., word-word, face-name, picture-picture). The authors suggested that language learning, or skill training in general, was not likely to affect general cognitive abilities.
Overall, this line of research is still in its infancy, and the mixed findings can be attributed to many reasons. For example, learner-related differences and contextual factors may modulate learning outcomes (e.g., Gao, 2019; Tseng & Gao, 2021). Note that all these studies, except for Wong et al. (2019), have been conducted in a European context. One consequence of this concentration is that the languages earlier acquired by learners and the to-be-learned foreign languages can be typologically similar (e.g., English and French). Antoniou and Wright (2017) suggest that language typology may affect the emergence of learning-related cognitive advantages. On the one hand, cognitive improvement may ensue from learning typologically similar languages. Such languages cause more interference for each other, and the executive control system is thus engaged to resolve the interference. On the other hand, it is possible that only learning typologically different languages can lead to noticeable cognitive improvement, which may arise from the increased processing complexity and the extra effort required to learn a distant foreign language. However, the concentration of the current scholarship in the European context and on typologically similar language pairs (for a review, see Pfenninger & Singleton, 2019) has prevented any attempt at adjudicating between these two possibilities. Therefore, more empirical studies on the cognitive effects of late language learning from a non-European perspective are urgently needed. Investigating the impact of learning a typologically distant language adds to the body of empirical research on late language learning; more importantly, it can shed light on the source of any cognitive improvement that language learning may bring about.
This paper is a pilot study for a larger randomized control project under preparation. It presents the results of a retrospective study that examines the cognitive effects of language learning among a group of elderly learners in north-west China who have been learning Arabic for more than 2 years in order to read the Quran after their retirement (more details of the participants are given in the Method section). Chinese and Arabic belong to two different language families, with Chinese from the Sino-Tibetan family and Arabic from the Semitic branch of the Afro-Asiatic family. These two languages differ considerably at many linguistic levels, including phonology, orthography, and syntax. For example, the Chinese script is written left-to-right and features an individual character or ideogram for every syllable (Sun, 2006). On the other hand, the Arabic script is written right-to-left and its alphabet consists of 18 shapes that express 28 phonetic sounds with the help of diacritical marks (Ryding, 2014). This study aims to examine whether these Chinese language learners display any superior cognitive performance in the target cognitive areas after a prolonged period of learning a typologically distant language, Arabic.
Method
Participants
The participants were from a small city in north-west China. They were recruited from a self-organized Arabic learning group. There were 37 learners (23 women and 14 men) divided into two classes. Most of them (n = 34) started learning to read in Arabic after their retirement, and the other three were still employed at the time of data collection. They gathered twice a week (every Saturday and Sunday morning) for an hour and a half each time. The classes were taught by two teachers who could speak, read, and write fluently in Arabic. At the time of testing, the classes had been running for about 2 years and 4 months. The teachers emphasized receptive skills, especially reading, because most learners had joined the group so that they could read the Quran. There was no homework assigned, but the learners all joined a group chat where they could ask questions and share learning experiences outside class hours. The mean score of learners’ self-reported proficiency in reading Arabic was 3.56 out of 10.
All learners had filled in a background questionnaire prior to the testing. This questionnaire collected information on the participants’ demographics (i.e., education, previous occupation, housing condition, and parents’ education), language background (i.e., language use, self-reported proficiency in Putonghua, and self-reported Arabic reading proficiency), lifestyle (i.e., whether they play any musical instruments and games, whether they exercise regularly and participate in any social activities), and prior migration history across city or country borders. All these factors have been found to relate to successful aging (see Hertzog et al., 2009 for an extensive review). Learners who fell more than two standard deviations from the group mean on these factors were removed. The three working participants were also removed. There remained 21 learners (12 women, 9 men). A screening cognitive test adopted from the Chinese Longitudinal Healthy Longevity Survey (CLHLS, Center for Healthy Ageing and Development Studies, 2020) was also administered. This test from the CLHLS was a modified version of the Mini-Mental Status Examination (MMSE). It maintained the original items (n = 30) in the MMSE but rephrased some of them to make them more locally meaningful. For example, subjects were instructed to repeat “zhong gua de gua, zhong dou de dou” (you reap what you sow) instead of “No ifs, ands, or buts” in the original version. Participants’ mean score on this test was 28.24 out of 30, exceeding the cut-off point of 24 (Kochhann et al., 2010) and thus showing no sign of dementia.
For a retrospective study, it was necessary to recruit another group of participants who matched the learners on these factors but had never learned a foreign language before. This way, differences between the two groups in cognitive tests are more likely to be attributed to language learning. Two hundred ninety-four additional participants from a similar age range (60–70) filled in the same questionnaire. The following screening procedure was adopted. Starting with one factor, the answers provided by these participants were compared to those by the language learners. Deviating answers from the matching group were removed until they were comparable to the learners. The remaining participants were further compared to the learners on a different factor. After applying this procedure to all factors, 29 matching participants remained (16 women, 13 men). Table 1 summarizes the information across the two groups. Overall, by controlling the factors that can impact cognitive abilities, the main difference between the two groups was whether they had been learning a foreign language. However, it should be noted that the two groups may still differ in some aspects leading to language learning. One such aspect is their language learning motivation. Learning Arabic to read the Quran after retirement in self-organized groups naturally places the learning group higher on the motivation scale. This is an inherent limitation of retrospective studies that often lack random assignment.
Table 1. Demographic and Other Lifestyle Information of the Participants Across the Language Learning and Non-Learning Groups.
Education: 0 = no schooling; 1 = primary school; 2 = secondary school; 3 = high school; 4 = university; 5 = post-graduate and above
Previous occupation: 0 = unemployed (including housewife); 1 = farm laborer, menial service worker; 2 = unskilled worker; 3 = machine operator, semiskilled worker; 4 = small business owner, skilled manual worker; 5 = clerical and sales worker, small business owner; 6 = technician, semi-professional; 7 = manager, minor professional; 8 = administrator, professional, proprietor.
Housing: 1 = rental; 2 = home owner (flat); 3 = private house; 4 = owner of multiple properties
Migration history: 0 = never migrated; 1 = migrated across city boundaries once; 2 = migrated across city boundaries twice; 3 = migrated across city boundaries three times; 4 = migrated across city boundaries more than three times
Musical training: 0 = never; 1 = less than 6 months; 2 = less than a year; 3 = between 1 to 3 years (included); 4 = more than 3 years
Exercising, Gaming, Socializing: 0 = Never; 1 = once or twice every few months; 2 = once or twice every month; 3 = once or twice every week; 4 = every day
Main language of use: 1 point for the local dialect use in each domain: work, family, friends, relatives; if both the dialect and Putonghua are considered the main language of use in a particular domain, 0.5 point for the dialect
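The sequential screening of the matching group described above can be illustrated with a minimal sketch. The text does not state the exact criterion for "comparable" answers, so the sketch assumes, purely for illustration, that a candidate is kept when their coded answer on a factor lies within the range observed among the learners. The function name `screen_candidates`, the factor names, and the toy records are all hypothetical.

```python
def screen_candidates(learners, candidates, factors):
    """Filter candidates one factor at a time.

    Assumption (not stated in the paper): a candidate is 'comparable' on a
    factor when their coded answer falls inside the learners' observed range.
    """
    remaining = list(candidates)
    for factor in factors:
        learner_values = [person[factor] for person in learners]
        low, high = min(learner_values), max(learner_values)
        # Drop deviating candidates before moving on to the next factor.
        remaining = [c for c in remaining if low <= c[factor] <= high]
    return remaining


# Toy records using the coding schemes listed in the table notes.
learners = [
    {"education": 2, "exercising": 3},
    {"education": 4, "exercising": 4},
]
candidates = [
    {"id": 1, "education": 3, "exercising": 3},
    {"id": 2, "education": 0, "exercising": 4},  # education outside learner range
    {"id": 3, "education": 4, "exercising": 1},  # exercising outside learner range
]
matched = screen_candidates(learners, candidates, ["education", "exercising"])
```

A range-based rule is only one possible operationalization; a criterion based on group means or distributions would change which candidates survive each step.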
The study received approval from the research ethics committee of the author’s university (20-04-72).
The Battery of Measurement Tasks
On the one hand, some major areas that can receive gains after cognitive training include processing speed (Ball et al., 2002), working memory (Ball et al., 2002; Basak et al., 2008; Dahlin et al., 2008), and reasoning (Park et al., 2014; Willis et al., 2006). On the other hand, the bilingual advantage literature suggests that bilinguals can outperform comparable monolinguals when processing distracting information or monitoring conflict. For example, bilinguals have been shown to perform better in the flanker task (Costa et al., 2008; Woumans et al., 2015; Yang & Yang, 2016) and in the antisaccade task (Bialystok et al., 2006; Goral et al., 2015). The few existing studies on language training effects among the elderly also show that subjects’ abilities in working memory (Wong et al., 2019) and attention switching (Bak et al., 2016) have significantly increased after the training. Therefore, the present study has selected a battery of cognitive tasks that tap into these five major areas that respond to cognitive training or bilingualism, that is, working memory, processing speed, reasoning, conflict monitoring, and attention.
The digit span tests (both forward and backward) of the Wechsler Adult Intelligence Scale (WAIS-IV; Wechsler, 2008) were used to assess working memory capacities (Nijmeijer et al., 2021), which are responsible for temporary storage and manipulation of information for complex cognitive tasks. Raven’s colored progressive matrices were used to measure reasoning abilities (Keijzer & Schmid, 2017; Park et al., 2014). This version is typically used with children from 5 through 11 years, elderly persons, and mentally and physically impaired persons. This test was initially developed to assess individuals’ ability to make “meaning out of confusion” or the ability to go “beyond the given to perceive that which is not immediately obvious” (J. Raven et al., 1991, p. G1). Processing speed was assessed using the speeded digit-comparison tasks with three, six, and nine items (Park et al., 2014; Salthouse & Babcock, 1991). A modified version of the flanker test (Sullivan et al., 2016) was used to assess conflict monitoring. A task-switching test was added to further assess executive control functions. This task has produced some controversial results regarding the bilingual advantage (Paap & Greenberg, 2013; Prior & MacWhinney, 2010), and failed to produce benefits of late language learning (Ramos et al., 2017). For attention monitoring, the Posner cueing attention test was used. Bialystok (2017) suggests that the model proposed by Posner and colleagues (Posner & Petersen, 1990) may have the potential to better conceptualize attention.
Material and Procedure
Digit Span Tests
On the forward digit span test, the experimenter verbally presented increasingly longer strings of digits at a rate of one per second. Subjects were asked to repeat the numbers in the same order. On the backward span test, they repeated the numbers in the reversed order. The number strings were pseudo-randomly created with the condition that the same number did not appear twice consecutively and adjacent numbers were not in the counting sequence. If a participant failed twice to produce a string, the test was discontinued. The longest sequence of digits a subject could produce was recorded as their digit span. A longer span supposedly represents better working memory capacities.
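The two constraints on the digit strings can be made concrete with a short sketch. It is not the authors' generation procedure, which the text does not describe. It assumes digits 1 to 9, and it reads "not in the counting sequence" as excluding both ascending and descending neighbours (e.g., 3-4 and 4-3). The function name `make_digit_string` and the seed are hypothetical.

```python
import random


def make_digit_string(length, rng):
    """Draw digits 1-9 with no immediate repeats and no counting steps.

    Assumption: a 'counting sequence' means neighbouring digits that differ
    by exactly 1 in either direction.
    """
    digits = [rng.randint(1, 9)]
    while len(digits) < length:
        candidate = rng.randint(1, 9)
        # A difference of 0 is an immediate repeat; a difference of 1 is a
        # counting step. Both are rejected and a new digit is drawn.
        if abs(candidate - digits[-1]) > 1:
            digits.append(candidate)
    return digits


rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
strings = {n: make_digit_string(n, rng) for n in range(3, 10)}
```

Rejection sampling keeps the sketch short; because at least six of the nine digits are always admissible after any given digit, the loop terminates quickly.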
Raven’s Colored Progressive Matrices (CPM)
On this paper-and-pencil test, subjects were asked to determine the missing element in a pattern. The missing element was placed below the pattern with five similarly shaped pieces. Subjects had to determine which piece completed the pattern. Questions became harder as the test progressed. The test comprises three sets of 12 matrices (Sections A, Ab, and B) and has no time limit. Previous studies show that the scores on Section Ab contribute little to individuals’ intellectual capacity at the ages of 60 to 69 (Levinson, 1959). Following Smits et al. (1997) to limit the subjects’ burden, this section was removed, and the total number of questions was 24. The total number of patterns completed correctly was the total score for the subject on this test.
The Speeded Digit-Comparison Task
Subjects were presented with pairs of Arabic numerals and asked to identify them as “same” or “different” as rapidly as possible. Pairs requiring a “different” response were constructed by altering one of the digits in one pair member. There were three blocks of numbers: one block of three-digit numbers, one of six-digit numbers, and one of nine-digit numbers. Each block consisted of 32 trials. The time interval between trials was 625 ms (Salthouse & Babcock, 1991). If subjects failed to respond within 625 ms, the next trial started. The number of trials identified correctly measured their processing speed. Prior to the test, the subjects were given oral instructions. They had a series of practice trials that could be repeated as many times as requested by the participants to ensure a correct understanding of the task. The test was run with E-prime 3.0.
The Modified Flanker Test
The modified flanker test was designed following the procedures specified in Ivanova et al. (2016). A central arrow was flanked by two arrows on each side, which pointed in the same direction as the central arrow (congruent), or in the opposite direction (incongruent), or appeared without arrowheads (neutral). There were 48 trials in each condition. Following Ivanova et al. (2016), in each condition, 16 trials appeared on the center, left, or right sides of the screen, respectively, creating double incongruence for some trials (e.g., a right-pointing central arrow flanked by left-pointing arrows presented on the left side of the screen). All types of trials were presented in random order at least once before they were allowed to repeat.
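One way to realize the ordering constraint just described is block randomization, sketched below. This is an illustrative reading rather than the E-prime implementation used in the study. It assumes that a "trial type" is a combination of condition and screen position (nine types), and it ignores the direction of the central arrow. The function name `build_trial_list` is hypothetical.

```python
import itertools
import random

CONDITIONS = ["congruent", "incongruent", "neutral"]
POSITIONS = ["center", "left", "right"]
REPEATS_PER_TYPE = 16  # 16 trials per condition at each screen position


def build_trial_list(rng):
    """Return 144 (condition, position) trials in shuffled blocks of nine.

    Each block contains every trial type exactly once, so no type repeats
    before all types have been shown.
    """
    trial_types = list(itertools.product(CONDITIONS, POSITIONS))
    trials = []
    for _ in range(REPEATS_PER_TYPE):
        block = trial_types[:]
        rng.shuffle(block)
        trials.extend(block)
    return trials


trials = build_trial_list(random.Random(7))
```

With 16 blocks of nine types, each condition contributes 48 trials, matching the counts given in the text.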
Before a trial started, a fixation point appeared for 800 ms, followed by a location cue for 100 ms, and then the stimulus appeared on the screen for 1700 ms or until a response was made. Subjects pressed the left or the right arrow key on the keyboard to register their responses. After an inter-trial interval of 1700 ms, the fixation point appeared again, and a new trial started. Prior to the test, the subjects were given oral instructions. They were allowed the same practice routine described in Ivanova et al. (2016), that is, six neutral trials, six congruent trials, six incongruent trials, and ended with nine trials with equal numbers of the different trial types presented in random order. The response latencies and accuracy were registered by E-prime 3.0.
Task Switching
The task was designed following Ramos et al. (2017). Stimuli were squares (3 cm each side) or circles (3 cm radius) in red or blue presented in the center of the screen. A cue was also presented 3 cm below each target figure. The cue was the Chinese word for “shape” or “color”. Subjects were asked to name either the shape or the color of the stimulus aloud, depending on the cue. Before each trial, a central fixation point appeared for 350 ms. It was then replaced by the simultaneous presentation of the stimulus and the cue, which stayed on the screen until a response was made or for a maximum of 5000 ms. There were 80 trials, 24 switch trials, 24 repetition trials, and 48 filler trials. Filler trials were identical to repetition trials, but they could occur after a switch trial, whereas repetition trials could not. The subjects were given oral instructions. They had 13 practice trials that could be repeated as many times as requested to ensure a correct understanding. The experiment was run with E-prime 3.0. The speech onset of the vocal responses was recorded with Chronos, connected to E-prime. The responses were also audio-recorded so that the noted errors could be checked to ensure data accuracy.
Posner Cueing Attention Test
In this paradigm, subjects’ attention is directed by a cue to a location in space that may contain a response target (valid cued condition) or not (invalid cued condition) prior to the appearance of the target (Posner, 1980). The classical result is that performance in the valid cued trials is always better. This facilitation effect is attributed to a “spotlight” of attention (Posner & Cohen, 1984). The cue directs the spotlight to the location where the attention is engaged, and a target appearing in this location can be more efficiently detected. If the target instead appears in an uncued location, responses are slowed because attention has to be disengaged first. A larger facilitation effect is interpreted as a slower disengagement from the invalid cue (Posner & Cohen, 1984).
Stimuli were black line drawings on a white background. The initial fixation display consisted of a fixation point located in a central circle flanked to the left and right by two empty squares (2.9 cm on each side). The outer edges of the squares were located 8.9 cm from the outer edge of the central circle. The cue consisted of a cross which appeared in one of the two outer squares. A black filled circle (1.7 cm radius) served as the target stimulus for both cue conditions and was presented in the center of one of the two outer squares. The experiment consisted of 100 trials, with the invalid cued trials ranging from 31% to 20%. Every trial began with the fixation display for 1000 ms, followed by a cue (a cross) display for 300 ms, after which the target (a black circle) appeared in either the left or right outer square.
The choice of 300 ms for the cue duration was based on the finding that the facilitation effect of valid cue was larger for the old people at 300 ms but only present for young people at 100 ms (Langley et al., 2012). At 300 ms, a larger facilitation effect means less ability to disengage from the once-relevant information. The cue and the target remained on the screen for 3000 ms or until a response was made. Participants pressed the left or right arrow keys on the keyboard to indicate the target’s location. They were asked to rest their fingers on the two keys for quick response, but accuracy was also emphasized. The subjects were given verbal instructions and a drawn representation of stimulus events. They were told that the location of the cue was random. The test was run with E-prime 3.0.
Results
Digit Span Tests
For the forward digit span, the average number recalled for the learning group was 5.79 (std. = 0.96, range = 4–9), and that for the non-learning group was 5.29 (std. = 1.29, range = 4–9). The difference was not significant (t = 1.60, p = .12, d = 0.44). For the backward digit span, the average number recalled for the learning group was 5.10 (std. = 1.00, range = 3–7), and that for the non-learning group was 4.38 (std. = 1.28, range = 3–8). The difference was marginally significant (t = 2.04, p = .06, d = 0.63).
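For readers who wish to see how an effect size of this kind can be derived from summary statistics, the following is a minimal sketch. The paper does not state which formulas were used, so the pooled-standard-deviation version of Cohen's d and Student's t are assumptions here, as are the function names. The group sizes (21 and 29) come from the Method section. Because the inputs are rounded means and standard deviations, the computed values need not coincide exactly with the reported statistics.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled SD (assumed variant)."""
    return (m1 - m2) / pooled_sd(n1, sd1, n2, sd2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic assuming equal variances."""
    standard_error = pooled_sd(n1, sd1, n2, sd2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error


# Backward digit span: learning group vs. non-learning group (rounded inputs).
d_backward = cohens_d(5.10, 1.00, 21, 4.38, 1.28, 29)
t_backward = student_t(5.10, 1.00, 21, 4.38, 1.28, 29)
```

Under these assumptions, t and d are linked by t = d / sqrt(1/n1 + 1/n2), which makes it easy to check one value against the other.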
Raven’s CPM
The means and standard deviations in Sections A and B for the two groups are shown in Table 2. The difference was not significant in either section (t = 0.58, p = .24, d = 0.06 for Section A; t = 0.72, p = .26, d = 0.15 for Section B) or in the summed scores (t = 0.94, p = .15, d = 0.05).
Table 2. Subjects’ Means, Standard Deviations, and Range for Raven’s CPM.
Digit Comparison
The numbers of comparisons correctly identified in the different conditions are shown in Table 3. A 3 × 2 analysis of variance (ANOVA) was conducted with digit (three vs. six vs. nine) as the within-subject variable and language learning (learning vs. non-learning) as the between-subject variable. There was a main effect of digit (F(1,48) = 28.82, p < .001) with a large effect size (η2 = 0.28). Separate t-tests showed that the three-digit comparison was significantly different from both the six-digit comparison (t = 7.72, p < .001) and the nine-digit comparison (t = 7.221, p < .01), but the difference between the six-digit and the nine-digit comparison was not significant (t = 0.174, p = .861). The main effect of language learning was not significant (F(1,48) = 0.27, p = .60, η2 = 0.001). The interaction was not significant (F = 1.01, p = .37, η2 = 0.009).
Table 3. The Means, Standard Deviations, and Range for the Correct Number of Comparisons Identified by the Language Learning Group and the Non-Learning Group.
Modified Flanker Test
For RT analysis, responses associated with errors and responses below 250 ms or above three standard deviations from the mean (per participant) were removed (2.08%). The mean RTs and error rates are shown in Table 4. A 3 × 2 ANOVA was conducted with condition (congruent vs. incongruent vs. neutral) as the within-subject variable and language learning (learning vs. non-learning) as the between-subject variable for both RT and error rate analyses. For the RT analysis, the main effects of condition and language learning were significant (for condition, F(1,48) = 5.18, p < .01, with a medium effect size, η2 = 0.06; for language learning, F(1,48) = 17.56, p < .01, with a large effect size, η2 = 0.11), but the interaction was not significant (F = 0.01, p = .99, η2 < 0.01). Separate t-test results showed that response latencies were significantly longer in the incongruent condition compared to the congruent condition (t = 10.993, p < .001) and the neutral condition (t = 14.691, p < .001). Response latencies were also significantly longer in the congruent condition compared to the neutral condition (t = 2.334, p = .023). In addition, response latencies of the learning group were longer than those of the non-learning group (t = 2.026, p = .031). For the error rate analysis, the main effects and the interaction were not significant (Fs < 2.081, η2 < 0.01 for all).
Table 4. Subjects’ Response Times (RTs) and Error Rates (ERs) in Different Conditions in the Modified Flanker Test.
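The RT trimming rule applied in the flanker analysis (and reused for the later tasks) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: it assumes that the per-participant mean and standard deviation are computed over correct responses before the 250 ms floor is applied, and that "above three standard deviations from the mean" refers to the upper tail only. The function name `trim_rts` and the sample data are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(trials, floor_ms=250, sd_cutoff=3):
    """Return the RTs kept for analysis for a single participant.

    trials: list of (rt_ms, is_correct) tuples.
    Error trials are dropped first; remaining RTs are kept when they are at
    least `floor_ms` and no more than `sd_cutoff` SDs above the mean.
    """
    correct = [rt for rt, is_correct in trials if is_correct]
    if len(correct) < 2:
        # Too few data points to estimate an SD; apply the floor only.
        return [rt for rt in correct if rt >= floor_ms]
    upper = mean(correct) + sd_cutoff * stdev(correct)
    return [rt for rt in correct if floor_ms <= rt <= upper]


# Twenty ordinary responses, one very slow, one anticipatory, one error.
sample = [(500 + 10 * i, True) for i in range(20)]
sample += [(3000, True), (200, True), (520, False)]
kept = trim_rts(sample)
```

In this toy data set, the 3000 ms response exceeds the upper bound, the 200 ms response falls below the floor, and the error trial is dropped, leaving the twenty ordinary responses.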
Task Switching
For RT analysis, responses associated with errors and responses below 250 ms or above three standard deviations from the mean (per participant) were removed (1.65%). Errors included wrong task execution, hesitations, and false starts. The mean RTs and error rates are shown in Table 5. A 2 × 2 ANOVA was conducted with condition (repetition vs. switch) as the within-subject variable and language learning (learning vs. non-learning) as the between-subject variable for RT and error rates. The main effects and interaction were not significant in either analysis (Fs < 3.28, η2 < 0.01 for all). The only effect that approached significance was that of condition in the error analysis (F(1,48) = 3.28, p = .07, η2 < 0.01). Both groups of subjects made slightly more errors on switch trials.
Table 5. Subjects’ Response Times (RTs) and Error Rates (ERs) in Different Conditions in Task Switching.
Posner Cuing Attention
Subjects’ responses below 250 ms and more than three standard deviations above the mean (per participant) were removed from the response latency (RT) analysis (1.81% for valid cue trials and 1.68% for invalid cue trials). For the accuracy analysis, correct responses were analyzed. Table 6 shows the descriptive results in the different conditions. In the valid condition, the difference between the two groups was not significant in either response latencies (t = 0.73, p = .47, d = 0.21) or accuracy (t = −0.89, p = .38, d = 0.26). In the invalid condition, the difference was not significant in response latencies (t = 0.79, p = .43, d = 0.23), but was significant in accuracy (t = 2.28, p = .03, d = 2.42): the learning group made fewer mistakes when the cues were invalid. In terms of the facilitation effect (the difference between the valid and the invalid condition), the difference between the language learning group and the non-learning group was not significant in response latencies (t = 0.16, p = .87, d = 0.04), but was significant in accuracy (t = 2.46, p = .012, d = 0.66). The facilitation effect for the learning group was smaller, but this was due to their higher accuracy in the invalid condition.
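The facilitation effect and the effect size d reported above can be illustrated with a short sketch. It assumes that the facilitation effect is computed per participant as valid minus invalid, and that d is Cohen's d with a pooled standard deviation. The text does not state which variant of d was used, so both choices, the function names, and the invented accuracy values are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def facilitation(valid, invalid):
    """Per-participant facilitation effect: valid minus invalid score."""
    return [v - i for v, i in zip(valid, invalid)]


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    return (mean(a) - mean(b)) / pooled_sd


# Invented accuracy scores (percent correct), for illustration only.
learner_effect = facilitation(valid=[98, 96, 97], invalid=[95, 94, 96])
control_effect = facilitation(valid=[97, 98, 96], invalid=[92, 92, 92])

# A negative d here means the learners show the smaller facilitation effect.
print(cohens_d(learner_effect, control_effect))
```

Because the facilitation effect is a difference score, a smaller value can arise either from worse performance on valid trials or from better performance on invalid trials, which is why the condition-level comparisons are reported alongside it.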
Subjects’ Response Times (RTs) and Accuracy Rates (ARs) in Different Conditions in the Posner Cuing Attention Task.
Discussion
The present paper is a retrospective study investigating the cognitive effects that may ensue from studying a foreign language in old age. The body of literature in this regard is small, and the results of the existing studies are mixed. A few studies found positive cognitive effects (e.g., Bak et al., 2016; Wong et al., 2019), but the positive correlation was not always attested (e.g., Ramos et al., 2017; Ware et al., 2017). The present study adds to the empirical evidence on such relations from a Chinese perspective to shed further light on whether language learning can produce positive cognitive effects. A group of elderly Chinese learners, who had been learning to read Arabic for more than 2 years, were compared to a group of matched non-learners on their performance on several tasks that supposedly tap into five cognitive areas: working memory, processing speed, reasoning, conflict monitoring, and attention. These are also the areas previously shown to respond to cognitive training or bilingual experience. The results show that the two groups differed significantly in the Posner cueing attention task and the modified flanker task. Their performance difference in working memory (tested by the digit span tests) was also marginally significant. In the following, these differences are discussed.
The learning group could recall more numbers in the backward digit span test, although this advantage was only marginally significant. This finding echoes the results of Wong et al. (2019), which used the same test and showed that foreign language training led to improvement in working memory. The typological difference between the learners’ native language (Chinese) and the target language (Arabic) should be considered to better understand this performance difference. The two languages come from different language families, Chinese from the Sino-Tibetan family and Arabic from the Semitic family. They have distinct writing systems. The Chinese script is written left-to-right and features an individual character or ideogram for every syllable, each character representing a word or idea. An alphabet based on Roman letters (Pinyin) was developed to facilitate the phonetic transcription of Chinese characters (Sun, 2006). On the other hand, the Arabic script is written from right to left, and its alphabet consists of 18 shapes that express 28 phonetic sounds with the help of diacritical marks. The shapes of the letters vary depending on their position within a word. Another important feature of the Arabic script is that it does not represent short vowels, which must be memorized. This feature leads to some words being written the same but pronounced differently (Ryding, 2014). The minimal overlap between the two languages poses a great challenge for the learners, who must memorize the letters, the changes in their shapes, and how the diacritical marks can alter the pronunciation of the same letter. They cannot rely on the existing Chinese writing system to help them with this challenge, which may have placed a high demand on their working memory. The challenges these learners face with the Arabic orthography system may be reinforced by the emphasis on reading in their classes.
Constant exercises engaging working memory may have led to better performance in tasks that rely heavily on such capacities. If this explanation is on the right track, it supports the processing complexity effect proposed by Antoniou and Wright (2017), which suggests that learning typologically different languages is more cognitively challenging and thus can lead to greater cognitive improvement. Recent evidence also shows that bilinguals speaking typologically unrelated languages may have more efficient executive functions, particularly attention-switching abilities (Perovic et al., 2023). This focus on attention abilities leads to the next finding in the present study, that is, language learners’ better performance in the Posner cueing attention task.
The classical result of the Posner cueing attention task is a facilitation effect. This effect is often associated with slow responses in the invalid condition, as subjects need to disengage attention from the invalid cue before moving their attention spotlight to the target (Posner & Cohen, 1984). Typically, older people demonstrate a greater facilitation effect (Greenwood & Parasuraman, 1994). This is because they are particularly slow in the invalid cue condition and thus benefit more from a valid cue. In the present study, the language learning group displayed a smaller facilitation effect, which was due to significantly fewer mistakes in the invalid condition. Therefore, it can be assumed that these subjects were quicker than the non-learners to disengage their attention from previously relevant information. This finding is in line with the results of some previous studies on lifelong bilingualism, which show that older bilinguals, compared to older monolinguals, display a smaller magnitude of residual target activation (Blumenfeld et al., 2016) and resolve competitor inhibition more quickly (Blumenfeld & Marian, 2011; Mishra et al., 2012). The better performance of the language learning group in the attention task also echoes the improvement in attentional switching among the subjects in Bak et al. (2016), who had received only one week of language training, and the perceived attention improvement reported by the participants in Pfenninger and Polz (2018).
It seems that attention is particularly sensitive to language training. Bialystok (2017) reviewed the existing studies on the relation between bilingualism and cognition and proposed that executive attention may explain the mechanism by which bilingual experience modifies the cognitive system, and hence the source of the bilingual advantage. Though this hypothesis was proposed to account for the impact of lifelong bilingualism, it may also extend to language learning in late adulthood. Further studies are needed to investigate how attention abilities are affected by language learning in late adulthood.
On the modified flanker task, the learning group was slower than the non-learning group in all conditions, contrary to what would be expected based on the hypothesis that language training can improve their cognitive abilities, in this case, conflict monitoring. A similar finding was reported in Sullivan et al. (2016), in which the bilinguals performed significantly slower than the monolinguals on the conflict blocks in the same task. Pfenninger and Polz (2018) also found that monolinguals performed better than bilinguals on the Stroop test at both data collection times (i.e., before and after the training), even though both groups improved significantly after the language training. The present finding adds to the empirical evidence that bilinguals and language learners are not always quicker responders in conflict situations. However, it remains unclear whether language learning is responsible for the slow responses in the current results. It also raises further questions about whether language learning may improve learners’ performance in some tasks at the expense of others.
One may suggest that the advantages observed in the language learning group ensue from engagement in social activities. Classroom-based language learning indeed provides enhanced opportunities for social interaction, and there may be stronger social bonding among the learners, especially when they have a shared goal in language learning, that is, to read the Quran. Previous literature has documented positive effects of social engagement on successful aging (Hertzog et al., 2009; however, see Park et al., 2014 for limited evidence of cognitive benefits of engagement in social activities). Future studies that adopt a randomized controlled design can help tease apart the impact of the social and cognitive aspects of language learning.
The limitations of the present study are apparent, and its retrospective nature is the most significant. No baseline performance data are available for either group before the training. Therefore, the results should be treated with caution, as it is possible that the language learners started their learning with these superior abilities. That said, language learning may still have contributed to maintaining these abilities, which could otherwise have deteriorated in old age (Ware et al., 2017). Retrospective studies also have less control over how the learning occurs, including the intensity of training and the teaching method, which could influence any cognitive effect of language learning. Another limitation of retrospective studies is that there is little control over some psychological factors (e.g., the aforementioned learning motivation), as participants are not randomly assigned. In the present study, the learning of Arabic was self-organized. Therefore, these learners may have had exceptionally high motivation and thus invested extra effort in their learning, which may not represent general learning situations or the general learning population. Randomized controlled studies of language learning are therefore urgently needed to address these issues (Nijmeijer et al., 2021). In addition to its retrospective nature, the present study also suffers from the small size of both the language learning and the control groups. This may have reduced the statistical power of the study to detect some genuine relationships, as suggested by the small-to-medium effect size but non-significant p-value for the group difference on the forward digit span test (t = 1.60, p = .12, d = .44).
Despite the limitations, the findings of the present study add valuable empirical evidence to the small body of literature on the cognitive effects of language learning from a Chinese perspective. The superior performance of language learners on the tasks that measure attention abilities and working memory capacities lends support to the effectiveness of language learning as an intervention strategy to slow down the detrimental effects of aging on cognition. The preliminary conclusion drawn from the present study calls for more research, which in its totality will contribute significantly to cost-effective measures and treatments that can battle against age-related cognitive decline. On the theoretical side, the findings suggest that language learning may lead to improvement in attention abilities, which is in line with the converging evidence in the field of bilingualism showing that executive attention may underlie the mechanism of how bilingual experience can alter the brain and the cognitive system (Bialystok, 2017). In addition, the present study found positive effects with two typologically distant languages (Chinese and Arabic), echoing the findings of Wong et al. (2019), which also investigated two typologically different languages (Cantonese and English) and found similar positive effects of language learning. This finding supports the processing complexity effect proposed by Antoniou and Wright (2017), according to which the effects of language learning may be more likely to emerge when learning a typologically different language.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper was funded by the HSS start-up fund awarded by the author’s affiliation (project code: HSS_RSF_App_2020-Delta) and Jiangsu Province Innovation & Entrepreneurship Doctor-Talent Program.
Ethics Statement
This project was approved by the University Ethics Committee at Xi’an Jiaotong-Liverpool University (committee approval code: 20-04-72).
