Abstract
This study investigates the effects of three word-focused exercise conditions on vocabulary learning. The exercises were developed based on the involvement load hypothesis. This study also explores how individual differences (e.g. second-language English proficiency level and working memory) affect vocabulary learning outcomes. A total of 180 Chinese students were equally and randomly assigned to 3 exercise conditions (reading comprehension plus marginal glosses, reading plus gap-fill and reading plus sentence writing). The Vocabulary Knowledge Scale was adapted to measure pre- and post-test vocabulary gains. An n-back task was developed to assess learners’ working memory capacity. Results showed that the sentence-writing group yielded the best performance in vocabulary learning, followed by the gap-fill group and finally the reading-comprehension group. General linear model results revealed that learners’ English proficiency level and working memory significantly predicted their vocabulary gains. This study expands on prior research by exploring learner-related factors in vocabulary learning. Relevant implications are discussed based on the findings.
Introduction
The acquisition of new words in a foreign language depends on how well learners can process the target words (Laufer and Rozovski-Roitblat, 2011). Word-focused exercises are an effective way to maximize vocabulary learning outcomes (Laufer, 2003). Drawing upon the involvement load hypothesis (ILH) (Laufer and Hulstijn, 2001), researchers have called for practitioners to develop tasks, activities or exercises that trigger a higher involvement load to enhance vocabulary learning. One reason, as proposed by Laufer (2003), is that elaborate processing causes learners to attend more to new words, thus increasing learners’ likelihood of acquiring new words. Studies based on the ILH have supported the effects of different word-focused exercise conditions on vocabulary learning (Teng and Zhang, 2021a; Nassaji and Hu, 2012), indicating that tasks with a higher involvement load resulted in superior vocabulary learning; findings have thus confirmed that the depth of information processing for target words determines vocabulary learning outcomes.
The ILH includes motivational factors as well as cognitive factors. Among the three components of the ILH – need, search and evaluation – evaluation appears to contribute most to learning, followed by need; search does not contribute to vocabulary learning (Yanagisawa and Webb, 2021). It seems that learners’ allocation of attention or resources to establishing a form–meaning link is essential to vocabulary learning. In addition, some researchers have argued that factors such as time on task, learners’ level of proficiency and frequency of exposure to the target words may affect the assumptions of the hypothesis (Hazrat and Read, 2021). The factors affecting the predictive ability of the hypothesis might not be limited to these three. Despite increasing attention to the ILH, its assumptions and the factors affecting its predictive ability need more investigation. It is reasonable to assume that working memory (WM; i.e. one's manipulation and allocation of attention and memory for tasks) and English proficiency level (i.e. one's ability to use language at a level of accuracy that facilitates readiness for relevant tasks) may predict vocabulary learning performance. This study examines how learner-related factors – that is, WM and English proficiency level – possibly influence vocabulary learning outcomes from word-focused exercises.
Literature Review
The ILH and Vocabulary Learning
The ILH is grounded in levels of processing theory (Craik and Lockhart, 1972). According to this hypothesis (Laufer and Hulstijn, 2001), the mental effort that a learner is willing to devote to processing words determines the effective learning of new words. Mental effort refers in this case to learners’ task involvement. Involvement is a motivational-cognitive construct containing three dimensions: need, search and evaluation. Laufer and Hulstijn (2001) defined need as the motivational dimension of involvement; it can be categorized as either moderate (an involvement index of 1) or strong (an involvement index of 2). Moderate need applies when the learning requirement is task-imposed, whereas strong need applies when learners have an internal drive to fulfil task requirements. Search and evaluation constitute the cognitive dimensions of involvement. Search refers to the effort that learners expend to identify the meaning or form of new words by consulting available resources (index = 1 if search is present and 0 otherwise). Evaluation also includes moderate and strong evaluation. An index of 1 indicates moderate evaluation (e.g. when learners are required to examine differences between words in a given context, such as in fill-in-the-blank tasks); an index of 2 is assigned for strong evaluation (e.g. when learners are required to evaluate the suitability of new words in a given context, such as during sentence writing). A task with a higher involvement load may promote better vocabulary learning performance.
Hulstijn and Laufer (2001) conducted an empirical study in two countries, Israel and the Netherlands. Their research revolved around three conditions: (a) reading with marginal glosses (involvement index = 1); (b) reading with gap-fill exercises (involvement index = 2); and (c) writing a composition using the listed words (involvement index = 3). The main difference between these conditions entailed evaluation: evaluation was absent in Condition 1, moderate in Condition 2 and strong in Condition 3. The composition writing condition yielded the best vocabulary learning performance. However, different from the experiment in Israel, the experiment in the Netherlands provided only partial support for the authors’ hypothesis; for example, students in the gap-fill group did not perform significantly better in vocabulary learning compared to the reading-with-marginal-glosses group. Folse (2006) assigned 154 ESL students across 3 types of written exercise conditions: one fill-in-the-blank exercise, three fill-in-the-blank exercises and one original-sentence-writing exercise. The main effect of exercise type was statistically significant, and the effect size for pairwise comparisons involving Condition 2 (fill-in-the-blank exercise) was quite large. Folse's (2006) findings challenged the idea that learners retain new words more effectively by writing original sentences than by completing fill-in-the-blank exercises. Kim (2011) considered learners’ English proficiency in multiple vocabulary-focused conditions. The study consisted of a pair of experiments involving English-as-a-second-language (ESL) learners at two proficiency levels. Experiment 1 tested the ILH with three tasks featuring varying levels of involvement load: reading with comprehension questions including graphic organizers (involvement index = 1); reading with comprehension questions and gap-fill activity (involvement index = 2); and composition writing (involvement index = 3). 
The findings of Experiment 1 supported the ILH. Experiment 2 included two conditions presumed to represent the same task-induced involvement. The results showed that different conditions with the same involvement load resulted in a similar amount of vocabulary learning. No interaction effect emerged between task-induced involvement and second-language (L2) proficiency in learners’ acquisition of new words. However, the less proficient learners in her study were above the beginning level, so future studies are needed to investigate and compare a wider range of proficiency levels. In a recent meta-analysis (Yanagisawa and Webb, 2021), the ILH was found to significantly predict vocabulary learning. In particular, the ILH explained 15.0% and 5.1% of the variance in effect sizes on immediate and delayed vocabulary post-tests, respectively.
The above-mentioned studies compared the effectiveness of distinct word-focused exercise conditions for vocabulary learning. Few studies have considered learner factors (e.g. English proficiency and WM) in ILH-based word-focused exercises. Kim (2011) suggested that vocabulary learning outcomes were consistent among learners across multiple proficiency levels. Yang et al. (2017) identified WM as a significant predictor of gain scores achieved in the comprehension-only and gap-fill groups on an immediate post-test. Other scholars (Teng and Zhang, 2021a) have discovered that learners’ metacognitive strategies in taking control of their vocabulary learning can predict vocabulary learning outcomes. These results affirm the role of learners’ self-regulated capacity in vocabulary learning outcomes in word-focused exercise conditions (Qin and Teng, 2017). However, prior studies did not shed light on how WM and L2 proficiency level jointly predict vocabulary learning outcomes in word-focused exercises.
English Proficiency Level and Vocabulary Learning
English proficiency level is essential to vocabulary learning. Tekmen and Daloǧlu (2006) examined vocabulary acquisition among three groups of Turkish learners with different English proficiency levels. Advanced learners gained a significantly greater number of words compared with the intermediate and upper-intermediate groups. However, the authors did not compare different types of word-focused exercises; they only explored vocabulary learning from reading comprehension. Findings suggested that reading can help learners deepen their knowledge of a word's various meanings and contexts. Additionally, learners’ proficiency level apparently influences their vocabulary learning outcomes. Kim’s (2011) study, which involved two experiments, led to different conclusions. Experiment 1 explored whether participants’ initial learning and retention of target words were in line with the hypothesized task-induced involvement. Experiment 2 investigated participants’ vocabulary learning performance across different English proficiency level groups. Learners’ English proficiency level did not significantly predict vocabulary learning and retention through different types of word-focused exercises. Hill and Laufer (2003) argued that task involvement load, rather than the amount of time spent on a task, predicts learning. They also claimed that the impact of task involvement load is similar for learners with different proficiency levels. Webb and Kagimoto (2009) explored the effects of receptive and productive vocabulary tasks on learning collocation and meaning. A total of 145 Japanese students learning English as a foreign language (EFL) were allocated to a receptive learning condition (encountering target words in three glossed sentences) and a productive learning condition (a cloze exercise). 
Both conditions led to significant gains in vocabulary knowledge; however, the productive learning condition was more effective for higher-level learners while the receptive learning condition was more effective for lower-level learners. These findings further highlight the role of learners’ proficiency level in vocabulary learning.
The above studies reflect persistent uncertainty about the impact of learners’ proficiency level on vocabulary acquisition. In an attempt to reveal a more complete picture of vocabulary learning outcomes through word-focused exercises, it is essential to investigate whether vocabulary learning under conditions with different involvement indices correlates with one's existing English proficiency level.
WM and Vocabulary Learning
WM refers to the constrained cognitive capacity on which individuals depend to store and manipulate information for a limited period while allowing the execution of cognitive tasks (Baddeley, 2003). WM is inherently dynamic and complex; it is a multifaceted system that links storage and processing components, such that both phonological and executive WM affect individuals’ vocabulary learning outcomes (Teng and Zhang, 2021b). Following Baddeley and Hitch’s (1974) three-part WM model, WM should not be regarded as a single, unitary construct but as a system comprising multiple components: the central executive, the phonological loop and the visuospatial sketchpad. The central executive controls the flow of information from and to the phonological loop. The phonological loop and visuospatial sketchpad are the two slave systems of the central executive. The phonological loop is responsible for temporarily storing and holding speech-like information while the visuospatial sketchpad handles visual and spatial information. Theoretical conceptualizations of WM are mostly based on the three-part WM model.
Psychologists have developed an array of span tasks to tap into the multiple facets of WM. Wen et al. (2021) summarized tasks to measure WM, such as simple storage-focused WM span tasks and complex memory span tasks. Simple storage-focused WM span tasks include digit span, letter span and non-word repetition span tasks. Complex memory span tasks consist of reading span tasks, operation span tasks, task-switching and n-back tasks. In terms of WM assessment, the phonological component and the executive component have received the most focus (Gathercole and Baddeley, 1993). Tasks for assessing phonological working memory (PWM) generally include digit span (Gary and Macken, 2015) and word/non-word span tasks (Gathercole, 2006). Tasks for assessing executive working memory (EWM) commonly include reading span tasks (Daneman and Carpenter, 1980) and operation span tasks (Conway et al., 2005). Researchers have also advocated for using complex memory span tasks (e.g. n-back) to explore WM capacity (Owen et al., 2005).
Researchers have considered the acquisition of second-language (L2) vocabulary from a WM perspective. For example, Cheung (1996) adopted non-word span to measure the phonological memory of a group of Hong Kong 7th graders and discovered that this type of memory underlies L2 vocabulary acquisition. Scholars have also separately measured the role of phonological short-term memory or EWM on L2 vocabulary learning. In one instance, Martin and Ellis (2012) explored phonological short-term memory based on non-word repetition, non-word recognition and listening span with a sample of 50 native English speakers. Vocabulary learning tasks were completed during two one-hour sessions in a computer lab. Results showed that phonological short-term memory predicted participants’ ability to learn singular forms of vocabulary. However, Engel and Gathercole (2012) assembled a sample of first-, second- and third-language young learners who completed memory tests including complex span and verbal short-term storage. Linguistic tests covered vocabulary, grammar and literacy. Phonological short-term memory was found to be uniquely linked to first-language (L1) and L2 vocabulary. However, EWM is a weak or unstable predictor of L2 vocabulary size. Yang et al. (2017) explored the role of WM in vocabulary learning in exercises featuring different involvement loads. Participants were 85 first-year English majors in China. WM, which was based on a reading span task, was a significant predictor of gain scores in the comprehension-only group and the gap-fill group but not in the sentence-writing group. WM did not influence delayed vocabulary test scores.
Taken together, the preceding vocabulary-related studies frame WM as a significant predictor of vocabulary learning outcomes. PWM and EWM may play distinct roles in vocabulary learning. Yet scholars’ understanding of WM is somewhat limited, particularly regarding the vocabulary acquired from word-focused exercises.
Rationale for the Present Study
It is necessary to explore vocabulary learning outcomes from word-focused exercises while considering participants’ English proficiency level and WM. Individuals’ phonological short-term memory or EWM interacts with foreign language proficiency (Van den Noort et al., 2006). Some researchers have claimed that people with a good WM can remember new words or non-words better than those with an inferior WM regardless of L2 proficiency (Bosman and Janssen, 2017). Others have reported that WM may be important for low-level learners but less important for those with higher levels of English proficiency (Van den Noort et al., 2006). The results of studies on WM and English proficiency level may be applicable to EFL learners’ vocabulary learning from word-focused exercises, but this remains to be determined. The current study aims to investigate the predictive effects of WM and L2 proficiency level on vocabulary learning from word-focused exercises. The author hypothesizes that different levels of WM capacity and L2 proficiency, which are critical to retrieving, storing and manipulating information for language learning, may predict vocabulary learning outcomes. The following research questions guide this effort:
Do three word-focused exercise conditions, namely (a) reading with marginal glosses; (b) reading plus gap-fill; and (c) reading plus sentence writing, have differential effects on vocabulary learning (i.e. receptive and productive vocabulary knowledge)?
Do learner-related factors (i.e. English proficiency level and WM) predict individuals’ vocabulary learning gain scores in different word-learning conditions?
Methods
Participants
This study focused on non-English major students at a university in southern China. Students from 5 classes were invited to participate; of the 245 students invited, 190 agreed to take part. Ten were later excluded because they were unavailable during the learning sessions. The final sample therefore comprised 180 students (95 men and 85 women) who were randomly and equally assigned to 3 learning conditions (see ‘Word-Focused Exercises’ section for details). In this between-group design, each group consisted of learners with different levels of WM and English proficiency. All participants fulfilled the intervention requirements. Additionally, all were between 18 and 20 years old, were second-year students and spoke Chinese as their first language. Most had been EFL learners for at least 10 years.
Word-Focused Exercises
The word-focused learning conditions in this study were developed based on Laufer and Hulstijn (2001). An English teacher who had experience teaching the participants helped design the exercises and comprehension questions. The first group focused on reading comprehension with marginal glosses. Participants were required to answer 10 multiple-choice comprehension questions after reading a text (see ‘Reading Materials and Target Words’ section). These questions were created to assess a range of abilities needed to read and understand the text. The questions were not directly focused on understanding of the target words; the condition involved a moderate need, as the reading requirement was imposed by the task rather than learners’ motivation (index = 1). Search was absent because the meanings of unfamiliar words were provided as marginal glosses, and dictionaries were not allowed for meaning searches (index = 0). Evaluation was also absent; the context provided a straightforward and literal interpretation without requiring learners to determine contextual meaning (index = 0). The second group engaged in gap-fill exercises after reading a gapped text. Participants were provided with a list of 40 words, including 30 target words and 10 distracter words. Each word was accompanied by its meaning. Participants were asked to choose appropriate words to fill in gaps in the text. Distracter words were added under the assumption that offering the learners additional options (i.e. to choose the meaning that fitted the context) would make the decision process more demanding and learning-focused. This condition involved a moderate need (index = 1). Search was absent (index = 0). Evaluation was moderate, as learners’ decisions involved selecting the meaning that suited the context (index = 1). The third group completed a sentence-writing task.
After reading a text, participants were provided with a list of meanings of the target words and then used each word to produce a sentence on their own. This condition included a moderate need (index = 1). Search was absent (index = 0). Evaluation was strong given that participants were asked to produce original sentences; during this process, the learners were required to evaluate new words with other appropriate collocations in their generated context (index = 2). Table 1 lists the involvement load indices for each condition. The sum index of the three components represents the overall involvement load in each condition.
Task-induced involvement load index.
As shown in Table 1, group 3 had the highest level of involvement load with an index of 3, followed by group 2 (index = 2) and group 1 (index = 1).
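The index arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `InvolvementLoad` and `CONDITIONS` are hypothetical, and the component values are those stated in the text for the three conditions (moderate need, no search, and absent, moderate or strong evaluation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvolvementLoad:
    """Component indices following Laufer and Hulstijn (2001)."""
    need: int        # 0 = absent, 1 = moderate, 2 = strong
    search: int      # 0 = absent, 1 = present
    evaluation: int  # 0 = absent, 1 = moderate, 2 = strong

    def total(self) -> int:
        # The overall involvement load is the sum of the three components.
        return self.need + self.search + self.evaluation


# Hypothetical labels for the three conditions described in this section.
CONDITIONS = {
    "reading with marginal glosses": InvolvementLoad(need=1, search=0, evaluation=0),
    "reading plus gap-fill": InvolvementLoad(need=1, search=0, evaluation=1),
    "reading plus sentence writing": InvolvementLoad(need=1, search=0, evaluation=2),
}

for name, load in CONDITIONS.items():
    print(f"{name}: involvement load = {load.total()}")
```

Because need and search are held constant across the conditions, the totals differ only through the evaluation component.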
Reading Material and Target Words
The reading material was a roughly 2000-word passage from BBC News. Participants in groups 1, 2 and 3 were exposed to the same material. The text was entered into a vocabulary profile analyser, which indicated that 98% of the words fell within the 2000-frequency level. This text was then re-edited by inserting 30 difficult words, which served as target words (Table 2). The 30 target words accounted for about 1.5% of the total words and were therefore unlikely to hinder adequate reading comprehension, given that the learners knew at least 98% of the words (Hu and Nation, 2000). In addition, a set of 30 words was considered sufficient to capture learners’ vocabulary learning gains. A pre-test confirmed that participants had no prior knowledge of the target words.
A list of 30 target words.
Measures
English Proficiency Level
English proficiency level was measured based on the College English Test (CET)-4, a standardized English test for university non-English majors. The CET-4 is a large-scale, national, high-stakes English test developed by English-language teaching professionals, experts and education authorities in China. Data collection was convenient because all participants had already taken the test. CET-4 measures students’ overall English ability in listening, reading, writing and translation. The maximum test score was 710 points.
WM
An n-back test was used to evaluate participants’ EWM. This tool measures individuals’ ability to decide whether each stimulus in a sequence matches the one that appeared n items ago. Participants completed the n-back task through E-prime software in a lab. All participants used the index and middle fingers of their dominant hand to press one of two buttons denoting ‘target’ and ‘non-target’ on a button box. In the 1-back condition, the target was any letter identical to the letter immediately preceding it (i.e. the letter presented one trial back). In the 2-back condition, the target was any letter identical to that presented two trials back. In the 3-back condition, the target was any letter identical to that shown three trials back (see Figure 1).

Illustration of the n-back task.
Stimuli consisted of the 26 letters of the English alphabet randomly presented in a fixed central location on a computer screen (Miller et al., 2009). Stimuli were displayed for 500 ms with a 2000-ms interstimulus interval. Any input received within the stipulated time was valid. Participants completed six trial blocks (two blocks for each of the above three conditions). Each block contained 30 trials. The first three trials in each block were not arranged as targets; the remaining trials were targets. The condition order was randomized across blocks. Participants were given a 20-s break between blocks. Participants completed 20 training trials per condition to ensure that they understood the task and that their performance had stabilized. The software automatically provided participants’ reaction times and accuracy for each trial; these two elements formed the scoring basis for the n-back task. Reaction times were indicated in ms with accuracy scores as percentage correct. The Cronbach's alpha value (.81) showed that the test was reliable.
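To make the target rule concrete, the following minimal sketch marks targets in a letter sequence and scores accuracy as percentage correct. It is not the E-Prime script used in the study. The function names, the response labels and the treatment of a missing response (`None`) as incorrect are assumptions made for illustration.

```python
def mark_targets(letters, n):
    """Return one boolean per trial: True if the letter matches the one n trials back."""
    return [i >= n and letters[i] == letters[i - n] for i in range(len(letters))]


def accuracy(letters, n, responses):
    """Percentage of trials answered correctly.

    responses holds 'target', 'non-target' or None (no key press in time);
    None is counted as incorrect here, which is an assumption.
    """
    if len(letters) != len(responses):
        raise ValueError("one response slot is needed per trial")
    expected = mark_targets(letters, n)
    correct = sum(
        resp is not None and (resp == "target") == is_target
        for resp, is_target in zip(responses, expected)
    )
    return 100.0 * correct / len(letters)


# Illustrative 2-back sequence (invented, not study data).
letters = list("ABACDCDD")
responses = ["non-target", "non-target", "target", "non-target",
             "non-target", "target", "non-target", "target"]
print(mark_targets(letters, 2))
print(accuracy(letters, 2, responses))  # 6 of 8 trials correct -> 75.0
```

The first n trials of a sequence can never be targets under this rule, because no letter exists n positions back.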
Vocabulary Knowledge Test
The Vocabulary Knowledge Scale (VKS) developed by Wesche and Paribakht (1996) was adapted to measure vocabulary learning gains. The VKS has been used to examine vocabulary knowledge growth and was employed as a pre- and post-test in this study to determine whether participants learned a word receptively and/or productively. A sample item is shown in Table 3.
Vocabulary test in the present study.
As indicated in Table 3, each target word in the test was accompanied by four options. Participants were required to mark the appropriate column for each word. The use of dictionaries was not allowed during testing. Learners received 0 points for choosing A (‘I’ve never seen this word/phrase before’). They received one point for choosing B (‘I’ve seen this word/phrase before, but I don’t know what it means’), C (‘I know what this word/phrase means. Please write down the word meanings in Chinese or English’) or D (‘I know what this word/phrase means and I can use it in a sentence. Please write down the sentence’). When choosing D, learners received two additional points for producing a correct sentence but could earn one point for a partially correct sentence. When choosing C, learners received one more point for a correct provision of meaning in either Chinese or English. The maximum possible test score was 90 points.
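The scoring rules above can be expressed as a short function. This is a hedged sketch rather than the author's scoring instrument: the names `score_item` and `score_test` and the `quality` labels are hypothetical. It also assumes that a learner who chooses C or D but writes nothing acceptable keeps the single base point.

```python
def score_item(choice, quality="none"):
    """Score one target word.

    choice:  'A', 'B', 'C' or 'D' (the column the learner marked).
    quality: 'none', 'partial' or 'correct' for the written meaning or sentence.
    """
    if choice not in ("A", "B", "C", "D"):
        raise ValueError(f"unknown option: {choice!r}")
    if choice == "A":
        return 0
    score = 1  # B, C and D each earn one base point
    if choice == "C" and quality == "correct":
        score += 1  # correct meaning in Chinese or English
    elif choice == "D":
        # correct sentence: +2; partially correct sentence: +1
        score += {"correct": 2, "partial": 1}.get(quality, 0)
    return score


def score_test(responses):
    """Sum item scores over (choice, quality) pairs, one pair per target word."""
    return sum(score_item(choice, quality) for choice, quality in responses)


# With 30 target words, a perfect paper reaches the stated maximum of 90.
print(score_test([("D", "correct")] * 30))  # 90
```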
Test instructions were provided in Chinese to promote participants’ understanding of the requirements. Participants were informed that they would need to complete several exercises but were not informed of the vocabulary test section; the details were not disclosed in order to prevent learners from deliberately memorizing target words during the comprehension phase. This test was reliable (Cronbach's alpha = .88).
Procedure
The research process is depicted in Figure 2. Participants’ CET-4 scores were collected before the study began. Participants completed a pre-test during the first week of the study. After a four-week interval, participants in each condition completed the required exercises. Learners in the first, second and third group were allotted 30, 40 and 50 min for learning completion, respectively. They then took an immediate post-test, which lasted 30 min. The next day, participants spent 5 min becoming familiarized with the WM test and were given 10 min to complete it.

Timeline of the research process.
Data Analysis
All data were analysed in SPSS 24.0. The first research question concerned the effectiveness of word-focused exercises on vocabulary learning; analysis of variance (ANOVA) was adopted for data processing. The second research question considered the predictive power of the independent variables on dependent variables. The general linear model, an extension of linear multiple regression for a single dependent variable (Field, 2013), was applied to address this question. The general linear model goes a step beyond multivariate regression and allows for linear transformations or linear combinations of dependent variables.
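The analyses were run in SPSS, so the following is only a sketch of what a one-way between-groups ANOVA computes, written in plain Python. The function name `one_way_anova` is hypothetical and the scores are invented for illustration; they are not study data. Obtaining a p value would additionally require the F distribution, which is omitted here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three small groups.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova(toy_groups))  # (3.0, 2, 6)
```

A larger F indicates that differences between group means are large relative to the spread of scores within groups.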
Results
The descriptive results are presented in Table 4. In terms of the vocabulary post-test, participants in the sentence-writing group achieved the best performance (M = 45.87), followed by the gap-fill group (M = 38.38) and the comprehension-only group (M = 30.62). Average CET-4 scores were 512.68 for the reading-comprehension group, 531.72 for the reading-plus-gap-fill group and 513.12 for the reading-plus-sentence-writing group. Accuracy percentages on the WM test for the reading-comprehension, reading-plus-gap-fill and reading-plus-sentence-writing groups were 59.53, 61.32 and 66.57, respectively.
Descriptive results.
Note: The mean response times (in ms, with standard deviations in parentheses) on the WM test for the comprehension-only, gap-fill and sentence-writing groups were 635.02 (135.02), 751.02 (175.32) and 689.25 (146.56), respectively. The mean scores of WM were based on accuracy percentage.
The homogeneity of variance was next evaluated using Levene's test. In this case, an F-test was performed to assess the null hypothesis that the variance was equal across groups. The p value was greater than .05, indicating that this assumption was not violated. Parametric analysis was thus appropriate. Table 5 lists the results of the analysis of variance.
Results of the analysis of variance.
Based on Table 5, significant differences emerged between the three groups in terms of English proficiency level [F(2, 179) = 5.042, p < .05, partial η² = .05], WM [F(2, 179) = 6.452, p < .05, partial η² = .06] and the vocabulary post-test [F(2, 179) = 20.990, p < .001, partial η² = .39]. The ANOVA results did not show a significant interaction effect between group effect and English proficiency level, F(4, 179) = .218, p = .926, partial η² = .024. The ANOVA results also did not show a significant interaction effect between group effect and working memory, F(6, 179) = 1.352, p = .260, partial η² = .184. Post-hoc multiple comparisons showed that participants in the sentence-writing group outperformed participants in the gap-fill group (p < .001) as well as in the comprehension-only group (p < .001) on vocabulary learning performance.
The next step was to understand the associations of WM and English proficiency level with vocabulary learning performance. Correlation results based on the Pearson correlation coefficient between variables were reported accordingly. A significant correlation was observed between participants’ English proficiency level and WM (r = .613) and between their English proficiency level and vocabulary learning post-test (r = .693). In addition, WM and the vocabulary learning post-test were significantly correlated (r = .546).
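For readers unfamiliar with the statistic, the Pearson coefficient reported above can be computed as in the sketch below. The function name `pearson_r` is hypothetical and the example values are invented, not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-variation of the two variables around their means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Variation of each variable around its own mean.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented scores: a perfectly linear relation gives r = 1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

Values near 1 or -1 indicate a strong linear association, whereas values near 0 indicate little linear association.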
The next step was to compare the predictive power of each variable on vocabulary learning performance in each condition (Table 6).
The predictive power of each variable on vocabulary learning in each task.
The standardized regression coefficients in Table 6 indicate that participants’ English proficiency level significantly predicted their gain scores in the immediate vocabulary post-test in each condition (p < .001). Additionally, WM was a significant predictor of gain scores in the immediate vocabulary post-test in the gap-fill group (p < .001) and the sentence-writing group (p < .05) but not in the reading group (p > .05).
Discussion
The present study provides support for the effectiveness of word-focused exercises on vocabulary learning. Findings show that a higher involvement load in a learning condition facilitates the learning of unfamiliar words. The general linear regression model results also reveal the predictive effects of group conditions on vocabulary learning. Furthermore, vocabulary learning outcomes can be contingent on individuals’ WM and English proficiency level.
Word-Focused Exercises and Vocabulary Learning
Vocabulary knowledge gains can occur through word-focused exercises. Participants in the sentence-writing group achieved the best vocabulary learning performance, followed by the gap-fill and comprehension-only groups. These results lend support to the ILH proposed by Laufer and Hulstijn (2001). In line with Kim (2011), the input-and-output exercises as represented in the reading-plus-sentence-writing condition enabled participants to reuse and reprocess target items for writing. Such a technique is particularly conducive to word learning, substantiating the assertion that the output-based condition overrides input-based effects. However, in Keating’s (2008) experiment, sentence writing was not significantly better than the gap-fill task. These conflicting findings may have arisen from the different designs of the sentence-writing condition. Participants assigned to the sentence-writing condition in Keating’s (2008) study did not have a text at their disposal. Writing based on reviewing a list of L2 words may pose higher cognitive demands, thereby hindering more effective word learning. The reading stage in the current study may have facilitated participants’ word retrieval or information processing. Evaluation thus appears to be the most influential factor in vocabulary learning performance, as concluded in a meta-analysis (Yanagisawa and Webb, 2021). Differing degrees of evaluation might induce varying extents of involvement when processing a word. Having a text at participants’ disposal may also have helped them map L2 lexical forms to ‘meanings or concepts already in the existing semantic or conceptual system’ (Jiang, 2002: 618). The word-focused exercises then reactivated their recall of contextual clues surrounding the target words encountered during reading, leading to greater manipulation of vocabulary knowledge. As argued by Schmitt (2010), ‘the more manipulation involved with the item, the greater the chances it will be remembered’ (26).
The main difference between the three learning conditions in the present study lay in the evaluation component. Findings support previous research underscoring evaluation as a key element in designing tasks based on the ILH (Teng and Zhang, 2021a). The sentence-writing condition involved strong evaluation, which may have helped participants to notice and comprehend the form–meaning links of target words. Indeed, evaluation in the sentence-writing condition required additional syntagmatic decisions related to the semantic appropriateness of a word and its context. While participants attempted to generate original content, they may have encountered a scenario requiring them to assemble the target items from the text into a mental space before producing a sentence. This process could lead to double practice: pre-task planning and practising new words during writing. The efficacy of the sentence-writing condition might be due to multiple retrievals of information or contextual clues surrounding the target word. Participants in this condition needed to decide if the meaning made sense, if the word could be used in a better way and if the syntax of the created sentence was correct. Including multiple target word retrievals in a condition with a strong evaluation component seemingly promotes EFL vocabulary learning more effectively than multiple retrievals of target words in a condition with moderate evaluation. Such findings counter those of Folse (2006), who argued that having multiple target word retrievals in a gap-fill exercise was more useful for L2 vocabulary learning than the purported deeper processing that writing original sentences may offer. However, the gap-fill condition in Folse’s (2006) study contained three different exercises, which may have led to more retrievals of target words compared to the one gap-fill exercise in the present study. These findings suggest the need for additional research in this area. 
It might be possible to include retrieval as a component of the ILH. Accounting for word retrieval in an exercise could increase the predictive capacity of the ILH in vocabulary learning. Incorporating retrieval may be an effective way to expand evaluation and increase the accuracy of the ILH in predicting task effectiveness.
English Proficiency and Vocabulary Learning
Extending previous studies, the present study reveals English proficiency level as a significant predictor of vocabulary learning performance. In line with Tekmen and Daloğlu (2006), advanced learners gained significantly more words than those with a lower proficiency level. The results did not support Kim’s (2011) study, wherein English proficiency level was not a significant predictor of vocabulary learning and retention through different types of tasks. In addition, in an empirical study (Hulstijn and Laufer, 2001), less-proficient participants performed as well as more-proficient participants under the same ILH-based learning condition. However, participants in Hulstijn and Laufer’s study were students at a US university; Kim’s sample consisted of ESL learners in the US. Neither sample is comparable to one of non-English-major EFL students in a Chinese context where English is not commonly used.
Based on the findings, word-focused exercises with the greatest involvement load might benefit learners more. Yet individuals with a higher level of English proficiency may still achieve better vocabulary learning performance. Arguably, only individuals who have reached a certain level of English proficiency will be prepared to maximize the potential of word-focused exercises for vocabulary learning.
WM and Vocabulary Learning
In this study, WM was found to be a significant predictor of vocabulary learning outcomes across the three participant groups. Although research has shown that phonological and complex WM significantly influence vocabulary learning outcomes (Teng and Zhang, 2021b), the present study sheds light on how learners’ WM and English proficiency can jointly influence vocabulary learning gains. These findings offer a valuable complement to previous work and provide new evidence that EWM, as measured by an n-back task, is a powerful predictor of vocabulary learning. The contribution of phonological short-term memory to learning new vocabulary has been found to decline as learners’ proficiency level increases (Cheung, 1996). Even so, EWM remains a key predictor of vocabulary learning, including for individuals who are at advanced levels and have received task instruction.
This study’s findings do not confirm the significant role of WM in acquiring words from reading comprehension. Yang et al. (2017) pointed out that WM only predicted vocabulary learning gains in the reading-comprehension-only and gap-fill groups; it did not significantly affect participants’ vocabulary learning performance in the sentence-writing group. In the present study, WM significantly affected vocabulary learning performance in the sentence-writing condition. It appears that sentence writing is a cognitive activity in which learners must activate memory resources to connect prior and present information. Under the sentence-writing condition, learners need to process new words and mobilize their WM resources to juggle form and meaning so as to integrate new words with previously known words or information requiring activation. The sentence-writing condition may require learners to draw on abilities essential for online performance; WM is thus also a crucial cognitive device for sentence writing. When WM enables learners to engage in deeper cognitive processing of target words, these individuals may have an advantage over learners in other groups in terms of processing and acquiring new words.
Limitations and Implications
The present study has several limitations. First, based on Wen’s (2016) assertions, phonological short-term memory and executive WM predict different aspects of L2 learning. The two functions of WM may predict distinct types of vocabulary knowledge (i.e. receptive and productive knowledge). Future studies can explore the mechanisms linking the various facets of WM to vocabulary knowledge. Second, vocabulary learning is a gradual and incremental process; a delayed test to measure learners’ retained vocabulary should be conducted in future research. Third, as shown by Teng (2020), word exposure frequency can influence vocabulary learning outcomes. As further argued by Folse (2006), the frequency of exposure to the target words is what matters, not the level of elaboration, in ILH learning conditions. Each target word in the present study appeared only once. Subsequent studies should consider repeated occurrences of target words along with more elaborate word processing. Finally, the vocabulary knowledge test was administered as both the pre-test and the post-test. A practice effect may have occurred even with a four-week interval between the first and second administrations.
Despite these limitations, findings highlight the benefits of word-focused exercises for vocabulary learning. Results also indicate the need for a strong evaluation element in helping learners to process the semantic and grammatical information of target words, thereby consolidating new words’ form–meaning links. Apart from comprehension-oriented exercises, learners may benefit from input–output cycle exercises. The findings demonstrate the predictive role of English proficiency level and WM in vocabulary learning; it is essential to attend to learners’ individual differences in these respects to maximize vocabulary learning outcomes from word-focused exercises.
Footnotes
Availability of Data and Material
The data and materials that support the findings of this study are available on request from the corresponding author.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Social Science Fund of China for the project entitled ‘Cross-sectional effects and longitudinal development of working memory and vocabulary acquisition’ (Grant number: 22BYY182).
Correction (December 2022):
This article has been updated to correct the funding statement since its original publication.
