Abstract
This study examined the effects of involvement load-based tasks on vocabulary learning in a foreign language, as well as the extent to which task effects are predicted by learners’ metacognition (i.e. metacognitive knowledge and regulation). A total of 120 Chinese university students of English as a foreign language (EFL) were randomly assigned to four task conditions: (1) reading; (2) reading + gap-fill; (3) reading + writing; and (4) reading + writing with the use of a digital dictionary. The Vocabulary Knowledge Scale was adapted to measure condition effects. The Metacognitive Awareness Inventory was used to examine learners’ metacognitive knowledge and regulation. Results revealed that the group of learners who completed reading + writing tasks with the use of a digital dictionary demonstrated the best performance in acquiring receptive and productive vocabulary knowledge, followed by the reading + writing group, the reading + gap-fill group, and, finally, the reading only group. Multiple regression analysis supported the predictive effects of metacognitive regulation on task-based vocabulary learning. Structural equation modelling presented an overall profile of task-based vocabulary learning and metacognition. Based on the findings, we proposed a framework to understand the relationship between learners’ metacognition, task type, and L2 vocabulary learning.
Keywords
I Introduction
Vocabulary provides the building blocks for teaching and learning a foreign language (Oxford, 1990; Schmitt, 2008). However, vocabulary acquisition is a complex and incremental process (Schmitt, 2010). One main reason, as argued by Richards (1976), may be that knowing a word requires more than simply recalling word meanings; word knowledge calls for understanding various features, including collocations, associates, derivations, and grammatical functions and constraints. The acquisition of vocabulary knowledge needs to consider both receptive and productive knowledge (Nation, 1990). Receptive vocabulary knowledge captures one’s ability to recognize a word form and understand the word’s meaning, especially in receptive tasks (e.g. listening and reading) (Nation, 1990; Schmitt, 2014). Productive vocabulary knowledge refers to learners’ ability to use a word correctly in various contexts, particularly when speaking and writing (Nation, 1990, 2001; Schmitt, 2014). Vocabulary acquisition can be daunting given the multifaceted nature of words.
Vocabulary knowledge can benefit all language skills, including listening, speaking, reading, and writing, and thus serves as the foundation of learners’ language proficiency (Nation, 2001; Schmitt, 2000). In particular, reading proficiency depends on learners’ background knowledge, strategies, phonological processing/word recognition, vocabulary, and genre knowledge (Bernhardt, 2011). Over the past few decades, research has revealed a great deal of information about how learners acquire words incidentally from reading (Webb & Chang, 2012). The amount of reading is important to incidental vocabulary learning because reading provides learners with repeated or multiple exposures to words and is also one effective means by which learners learn vocabulary in rich contexts. Reading comprehension refers to ‘an active and complex process that involves understanding written text, developing and interpreting meaning, and using meaning as appropriate to the type of text, purpose and situation’ (Paris & Hamilton, 2009, p. 32). This statement highlights the potential in learning vocabulary from reading. In the present study, the basic idea of involvement load tasks focused on reading, which refers to an ability to decode words while attempting to obtain meaning from texts.
In academia, second language (L2) vocabulary learning continues to attract scholars’ attention. Vocabulary acquisition concerns the frequency of word exposure (Teng, 2020a), English proficiency level (Kim, 2008), the quality of tasks involving lexical items (Laufer & Rozovski-Roitblat, 2015), and learners’ efforts in engaging with lexical items (Huang, Eslami, & Willson, 2012). Overall, ‘the more a learner engages with a new word, the more likely he/she is to learn it’ (Schmitt, 2010, p. 26). Vocabulary learning is thus part of a cyclical process where one’s self-regulation of learning leads to the greater involvement with and use of vocabulary learning strategies, which in turn leads to a better mastery of strategy use (Teng & Reynolds, 2019). The promotion of involvement is hence ‘the most fundamental task for teachers and materials writers, and indeed, learners themselves’ (Schmitt, 2008, p. 339).
The idea of involvement comes from Craik and Lockhart’s (1972) levels of processing model. This model laid the groundwork for Laufer and Hulstijn’s (2001) involvement load hypothesis (ILH) by stating that the more attention given to an item, and the more manipulation involved with that item, the greater the likelihood of item recall. According to the ILH, the acquisition of unfamiliar words is contingent upon a task’s involvement load. Studies on learning and teaching vocabulary in a foreign language based on the ILH have continued to evolve, particularly in exploring the extent of learners’ mental effort or engagement in processing new words. Despite criticism (e.g. Huang, 2018; Keating, 2008; Nation & Webb, 2011), the ILH has been adopted in many areas of second language learning, especially regarding vocabulary acquisition (e.g. Hulstijn & Laufer, 2001; Kim, 2008). Laufer and Hulstijn (2001) claimed that task effectiveness was based on the assumption that other factors are equal, and they acknowledged the influence of individual differences in task effectiveness. However, given the predictive effects of learners’ self-regulated capacity in vocabulary learning (Tseng & Schmitt, 2008), it is still necessary to examine task-induced vocabulary learning from a cognitive perspective (e.g. the amount and type of attention or mental effort needed to decipher an unknown word in different task conditions) (Qin & Teng, 2017). Researchers and classroom practitioners have also become interested in how learning vocabulary in a foreign language can be enhanced within task conditions and how task and learner variables may shape the vocabulary learning outcome (Hulstijn, 1993).
For example, several factors correlate with success in vocabulary learning, such as high motivation (Tseng & Schmitt, 2008), rich target language input (Perez & Desmet, 2012), and learners’ metacognitive awareness of their regulatory capacity (Qin & Teng, 2017). Researchers have increasingly emphasized the need to enhance learners’ metacognitive awareness of their self-regulatory capacity when learning vocabulary. Metcalfe (2008) contended that learners’ ability to reflect on their own thoughts (i.e. metacognition) is a recent result of evolution and a factor in learners’ beliefs about and efforts to learn. One could therefore assume that vocabulary learning relies on principles related to learners’ metacognitive development (e.g. the activation of prior knowledge, reflections on what and how to learn, and involvement in setting goals in learning tasks). It is thus essential to enhance learners’ awareness of their vocabulary acquisition processes and increase their readiness for related tasks. As such, learners must be presented with tasks that encourage them to be metacognitively active in exploring vocabulary learning requirements, in reflecting on their abilities, and in evaluating how they can maximize their learning affordances.
Nevertheless, limited research has taken metacognition as a framework to elucidate task-induced vocabulary learning performance in the English as a foreign language (EFL) field. An exploration of metacognition, which is an overarching aspect of how learners regulate their learning, suggests that learners’ judgments about what they know or do not know in terms of accomplishing a task may influence their performance. Metacognitive knowledge and regulation can also affect their ability to learn vocabulary from reading tasks (Teng & Reynolds, 2019). More specifically, learners’ varying levels in terms of possessing an awareness of related tasks, and of themselves as learners, may predict their performance. Learners who are more familiar with different strategies for learning, thinking, and solving problems when processing a task may perform better (Zohar & Adi Ben, 2009). Accordingly, we collectively consider individual differences in metacognition and task effectiveness in this article. A focus on metacognition can provide a fresh perspective on L2 vocabulary acquisition.
1 Metacognition
Flavell (1976) defined metacognition as ‘one’s knowledge concerning one’s own cognitive processes and products or anything related to them’ (p. 232) or the ‘active monitoring and consequent regulation and orchestration’ of cognitive processes for an individual to achieve cognitive goals (1979, p. 252). Metacognition thus encompasses learners’ ongoing experiences or judgments while accomplishing a task. Flavell (1979) identified three domains of metacognition: metacognitive knowledge, metacognitive experiences, and metacognitive strategies. The metacognitive knowledge domain refers to a combination of three types of knowledge, namely person, task, and strategy knowledge. Person knowledge involves knowing oneself and others as cognitive processors (e.g. beliefs about what individuals think they can and cannot do well). Task knowledge denotes learners’ awareness of how to manage a task and achieve a learning goal. Strategy knowledge represents learners’ beliefs about which strategies are essential to goal achievement. The second domain, metacognitive experiences, refers to ‘any conscious cognitive or affective experience[s] that accompany and pertain to any intellectual enterprise’ (Flavell, 1979, p. 906). Lastly, learners depend on metacognitive strategies to control their own cognition.
Anderson (2002) interpreted metacognition as including planning for learning, selecting and adopting learning strategies, monitoring the effectiveness of adopted strategies, orchestrating relevant strategies, and evaluating strategy use and learning performance. Therefore, metacognition involves ‘the knowledge and control’ that learners have over ‘their own thinking and learning activities’ (Cross & Paris, 1988, p. 131) or ‘the monitoring and control of thought’ (Martinez, 2006, p. 696). As construed from such arguments, metacognition represents a learner’s ability to deploy strategies taught within similar but new contexts. To do so, learners need executive control involving monitoring and self-regulation. The transfer of metacognitive strategies to new learning situations can ignite learners’ cognition and possibly lead to better performance. Thus, metacognition is a multidimensional set of general, rather than domain-specific, skills (Schraw, 1998). While processing learning tasks, learners’ understanding and control of cognitive processes may affect their performance (Schraw, Crippen, & Hartley, 2006). Therefore, in terms of task performance, it is necessary to consider learners’ metacognitive abilities with regard to selecting and using metacognitive strategies along with a conscious awareness of their learning processes relative to task effectiveness.
Following Brown (1987), Teng (2016) described metacognition as including both the knowledge and regulation of metacognition. Metacognitive knowledge involves learners’ awareness of their own cognitive processes, while the regulation of metacognition involves learners’ abilities to self-regulate their own learning. Metcalfe and Shimamura (1994) discussed three types of metacognitive knowledge: declarative knowledge (i.e. how learners perceive themselves as learners and the inherent factors that can influence their academic success); procedural knowledge (i.e. learners’ awareness of strategies when performing tasks); and conditional knowledge (i.e. learners’ effective selection of strategies and allocation of resources to facilitate learning). The regulation of metacognition entails three skills: planning (i.e. the ability to select appropriate strategies and allocate resources when completing a task), monitoring (i.e. an awareness of one’s comprehension and performance in task conditions), and evaluation (i.e. the appraisal of one’s efficiency while performing a task) (Schraw, 1998). In relation to vocabulary learning, the regulation of metacognition refers to learners’ conscious regulation of vocabulary learning through managing cognitive loads and applying metacognitive vocabulary learning strategies. In particular, the planning, monitoring, and evaluation processes in vocabulary learning have been identified as major regulatory components and serve as links to vocabulary learning sub-processes (Rasekh & Ranjbary, 2003).
The value of learners’ metacognitive awareness in vocabulary learning is well established (Teng & Reynolds, 2019; Tseng & Schmitt, 2008). Individual differences in the metacognitive characteristics of vocabulary acquisition may explain why students are not equally successful in learning vocabulary. Some learners may not apply metacognitive knowledge due to cognitive constraints. For example, the process of lexical retrieval consumes a large part of learners’ cognitive capacity (McCutchen, 1996). Cognitive constraints may also inhibit some students from activating the metacognitive knowledge needed for vocabulary learning. Additionally, the knowledge and regulation of metacognition interact and may influence learners’ task engagement. Vocabulary learning can thus be conceptualized as a metacognitive selection process: vocabulary learning resources are chosen via executive activities (e.g. planning, monitoring, and evaluating), underscoring the importance of metacognitive regulation in second language learning. As Teng (2020b) argued, the regulation of metacognition plays a unique role in predicting writing proficiency, and such a prediction is much more accurate than that based on a knowledge of metacognition. In a study of vocabulary acquisition (Tseng & Schmitt, 2008), self-regulating capacity was identified as an important mechanism that ‘functions to maintain learners’ intention to learn and to generate support for the implementation of learning behaviors’ (p. 361). Therefore, learners’ orchestration of various metacognitive strategies is vital to vocabulary learning. Tseng and Schmitt (2008) also noted that involvement in vocabulary learning ‘helps organize a learner’s strategic options and helps learners gain mastery over the learning tactics’ (p. 366).
2 The involvement load hypothesis (ILH)
The task-induced involvement construct is grounded in the depth of processing theory (Craik & Lockhart, 1972). Researchers have paid particular attention to traditional components of effective tasks, such as noticing, attention, elaboration, and motivation. Laufer and Hulstijn (2001) presented the ILH as a new formula for vocabulary instruction, wherein the effective acquisition of new words depends on the mental effort (involvement) learners devote to processing new words. They proposed a motivational-cognitive construct of involvement, which consists of three components, namely need, search, and evaluation. These three components, which can be quantified, can be applied in predicting word learning and retention. Specifically, need is the motivational, non-cognitive dimension of involvement. Need is considered to be moderate if the learning is task-imposed, and strong if learner-imposed (i.e. when learners are intrinsically motivated to communicate a concept for which they lack a word). The other two components, search and evaluation, are the cognitive dimensions of involvement. Their focus is on how learners process information and memorize word form and meaning. Search indicates that learners can use resources to determine the meaning of unknown words during a task; search is absent when such effort is not required. An example of search is checking the meaning of an unfamiliar word in a dictionary. Evaluation refers to a learner’s attempt to compare a new word with those already known, such as when deducing the particular meaning of a word from its other meanings or when assessing its suitability in a given context. For example, when learners look up a homonym in a dictionary, they need to choose the most appropriate meaning after comparing all its meanings based on the specific context.
Evaluation is moderate when learners are only required to recognize differences between words in a given context, such as during a fill-in-the-blank task; evaluation is strong when learners must make decisions about the meanings of unknown words and combine them with known words in a new context, such as when writing a sentence or composition. In this study, we applied the ILH when scoring tasks to determine the effects of task difficulty on vocabulary learning.
The combination of these three dimensions (need, search, and evaluation) determines the weight of involvement in a task. A task may not include all three dimensions. Each dimension is scored as 0 when absent, as 1 when moderately present, and as 2 when strongly present; the task’s involvement load is the sum of the three scores. When the task involvement level is high, learners may achieve greater word learning (Laufer & Hulstijn, 2001). This hypothesis, with some basic assumptions, implies that differences in task completion time reflect varying task demands (Keating, 2008). The three involvement dimensions are weighted equally, such that none is prioritized. It is the nature of the tasks, not the strategic approach taken by the learners, that brings about effective vocabulary learning. As noted, however, the ILH is not without criticism. Huang (2018) suggested that rather than ILH-induced effects, learners’ inferences and the repetition of occurrences affect incidental vocabulary acquisition. Keating (2008) pointed out that the ILH does not differentiate task types (i.e. input or output tasks) and highlights the degree of task involvement as the sole determinant of vocabulary acquisition. Nation and Webb (2011) argued that the highest retention of words in the composition writing task in Hulstijn and Laufer’s (2001) study may have resulted from learners spending the longest time on that task, rather than from the nature of the task. Furthermore, according to Kim (2008), individual differences, e.g. L2 proficiency and cognitive involvement, might be important to consider when implementing pedagogic tasks. Kim’s findings suggest that a deeper level of processing of the new words facilitated L2 vocabulary learning, which is evidence for the evaluation component of tasks.
3 Previous research on the ILH
Many scholars have investigated the effects of task involvement load on L2 vocabulary learning using the ILH. For example, Hulstijn and Laufer (2001) classified their study participants into three groups. Each group was required to complete tasks with different degrees of mental effort and involvement: (1) reading (no involvement load); (2) reading plus vocabulary fill-in task (moderate involvement load); and (3) composition writing (strong involvement load). Results showed that word retention in the composition-writing group was better than in the other two groups, especially the reading group. As illustrated in Table 1, although Keating (2008) and Kim (2008) considered different participant groups from diverse foreign language learning contexts, these studies all indicated that retention was highest for groups with strong mental effort and involvement load. Such conclusions confirm the effects of task involvement load on L2 vocabulary acquisition.
Table 1. Three involvement load hypothesis (ILH) empirical studies.
In the original ILH, Laufer and Hulstijn (2001) distinguished tasks according to their involvement load. Zou (2017) later conducted an empirical study and extended the evaluation component of the ILH by exploring how cloze exercises (moderate effort), sentence writing (strong effort), and composition writing (very strong effort) affected L2 word learning. Findings revealed that the two writing tasks with greater involvement load contributed to better word retention. Zou further differentiated between the involvement load in sentence writing and composition writing, noting that these tasks had distinct effects on learners’ L2 vocabulary acquisition. Specifically, evaluation in the composition-writing task was stronger than in the sentence-writing task, leading to better vocabulary retention in the composition-writing group.
Overall, according to research related to the original ILH, involvement load has clear effects on learners’ L2 vocabulary acquisition; that is, higher involvement loads contribute to better vocabulary retention. Scholars have also attempted to consider other factors that may affect L2 vocabulary acquisition. For example, based on Vygotsky’s (1978) sociocultural theory as well as Craik and Lockhart’s (1972) levels of processing, Jahangiri and Abilipour (2014) considered whether and how collaboration and exercise type can affect vocabulary retention. Focusing on word-learning strategies, Nassaji and Hu (2012) verified a significant relationship between the degree of task-induced involvement load and learners’ use of lexical inferencing strategies, showing that high involvement loads contributed to greater use of word-based strategies. However, evaluation, which refers to a process that requires learners to choose the most appropriate meaning for an unknown word based on the critical examination and comparison of different meanings, has not been considered from the perspective of metacognition. Search, also a cognitive dimension of the ILH, needs to be further explored. The engagement of learners in metacognitive thinking is considered necessary, as the development of metacognitive skills helps learners to become thoughtful with regard to their learning process (Metcalfe, 2008; Metcalfe & Shimamura, 1994).
4 Rationale and research questions
Despite research on task-induced involvement load and vocabulary learning, exactly how learners’ metacognitive awareness predicts vocabulary learning performance in different ILH-based tasks remains to be explored. Metacognition, referring to the effective coordination of cognitive resources, is a driving factor in learners’ vocabulary acquisition (Qin & Teng, 2017; Teng, 2017). The effects of learners’ metacognitive awareness on vocabulary acquisition may shed further light on the extent to which vocabulary can be acquired from tasks. For instance, learners who adopt metacognitive strategies are more likely to remain engaged in and complete a task. Learners who engage in metacognitive control via self-regulation strategies may demonstrate a willingness to comply with task norms or features and thus maximize the afforded task benefits. An understanding of how metacognition predicts vocabulary learning performance in tasks with different involvement loads may reveal the extent to which a task can capture learners’ interest, stimulate their metacognitive functioning, and encourage engagement in vocabulary learning processes. Studies have shown that metacognitive knowledge and metacognitive regulation conjointly influence learners’ academic performance (Bandura, 1986; Brown & Kinshuk, 2016). Considering that learning outcomes are influenced by learners’ ‘self-generated thoughts and behaviors that are systematically oriented toward the attainment of their learning goals’ (Schunk & Zimmerman, 2012, p. 59), learners’ performance in involvement-induced tasks may be influenced by their general metacognitive abilities.
This study explored learners’ metacognitive awareness of knowledge and regulation and their vocabulary learning performance under multiple task conditions. The focus was on the predictive effects of metacognitive awareness on vocabulary learning performance under varying degrees of task-induced involvement. We addressed the following research questions:
How does one’s degree of task involvement affect vocabulary learning?
To what extent do learners’ metacognitive knowledge and regulation predict the effects of task conditions on word learning?
II Method
1 Participants
Participants were recruited from a comprehensive university in southwestern mainland China. The initial participant pool included 150 Chinese EFL students. They were enlisted for this study through university email invitations. Each participant was paid 50 RMB after completing the study. However, 20 participants already knew some of the target words on the pre-test. Their data were excluded from the final analyses because pre-existing knowledge of target words would contaminate our results. In addition, ten students withdrew from the study after taking the pre-test. The final participant pool thus included 120 students who were randomly and equally assigned to four task conditions (see Section II.3). All the participants were English majors between 18 and 20 years old. Although the age at which students begin learning English in China varies regionally, all the participants had been learning English for at least nine years. In addition, the participants met the cut-off score of 26 out of 30 at the 2,000- or 3,000-word level of the Vocabulary Levels Test, indicating mastery in reading a text written with words within the 2,000- or 3,000-word level (Schmitt et al., 2001). We adopted the Vocabulary Levels Test to assess the written receptive vocabulary knowledge of learners of English (Kremmel & Schmitt, 2017).
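The participant flow and the random, equal assignment described above can be illustrated with a short sketch. The article does not report how the randomization was carried out, so the shuffle-and-deal approach, the function name `assign_equally`, the numeric participant IDs, and the seed are all hypothetical choices made for illustration only.

```python
import random

# Condition labels taken from the four task conditions in this study.
CONDITIONS = [
    "reading",
    "reading + gap-fill",
    "reading + writing",
    "reading + writing + digital dictionary",
]


def assign_equally(participant_ids, conditions, seed=None):
    """Shuffle participants, then deal them out so each condition gets the same number.

    Hypothetical helper: the study does not describe its randomization procedure.
    """
    ids = list(participant_ids)
    if len(ids) % len(conditions) != 0:
        raise ValueError("participants cannot be split equally across conditions")
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    rng.shuffle(ids)
    groups = {condition: [] for condition in conditions}
    for index, pid in enumerate(ids):
        groups[conditions[index % len(conditions)]].append(pid)
    return groups


# 150 recruited, minus 20 with prior knowledge of target words, minus 10 withdrawals.
retained = 150 - 20 - 10
groups = assign_equally(range(1, retained + 1), CONDITIONS, seed=2024)
group_sizes = {condition: len(members) for condition, members in groups.items()}
```

With 120 retained participants and four conditions, each condition receives 30 participants under this scheme.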
2 Reading materials and test items
When selecting reading materials, we considered genre, learners’ vocabulary size, and learners’ reading comprehension ability. Materials included three texts from a textbook. Each text contained around 800 tokens, i.e. the total number of words in a text. The texts functioned as three parts of an adventure story, adapted from The Adventures of Tom Sawyer. This story, about an imaginative and mischievous boy named Tom Sawyer, was assumed to be more interesting than scientific and academic texts; that is, students may be more motivated to read this text, leading to more engagement in reading activities (Wang & Guthrie, 2004). According to the VocabProfile (http://www.lextutor.ca/vp/comp) section of Compleat Lexical Tutor (www.lextutor.ca; Vocabprofile Compleat, 2020), more than 95% of the words in the texts were at the 2,000-word level. Considering participants’ mastery at the 2,000-word level, these texts were deemed appropriate for participants to read and understand based on their vocabulary knowledge (Laufer & Nation, 1999; see more in Section II.1).
However, we could not identify enough difficult words to evaluate students’ vocabulary learning outcomes because the texts mainly consisted of high-frequency words. We therefore decided to replace some words with low-frequency words. Three experienced English teachers jointly discussed the three texts, after which eight words in each were replaced with low-frequency words with similar meanings. A native English speaker then checked the language flow of the revised texts. The replaced words were at the 5,000-word level or above. Based on the pre-test (see Section II.4), participants had no prior knowledge of any of these target words (Table 2).
Table 2. The 24 target words.
3 Task types
The tasks involved reading conditions that differed from one another in their degree of involvement (Laufer & Hulstijn, 2001). Participants were randomly and equally assigned to one of the following treatment conditions: (1) reading texts with marginal glosses (reading only); (2) reading texts with fill-in-the-blanks using a given word list containing the target words, low-frequency words, and high-frequency words (reading + gap-fill); (3) reading texts with marginal glosses and writing a composition (reading + writing); and (4) reading texts with marginal glosses and writing a composition while having a digital dictionary on hand (reading + writing with the use of a digital dictionary). In the present study, as Table 3 shows, all tasks involved a moderate need because participants’ motivation to understand and learn unknown words was attributable to the task requirement. Search was absent in Tasks 1–3 because participants were not provided with tools to identify the meanings of target words; however, they could use a digital dictionary in Task 4. Evaluation was absent in Task 1 as learners could find word meanings in marginal glosses. Evaluation was moderate in Task 2; learners had to determine word meanings from a word list. Evaluation was very strong in Tasks 3 and 4, where learners had to determine word meanings from a word list and evaluate word use via writing. Following the original ILH and Zou’s (2017) augmented evaluation framework, the involvement load of Task 1 was 1 (moderate need). The involvement load of Task 2 was 2 (moderate need plus moderate evaluation). In this study, Tasks 3 and 4 produced ‘connected discourse [that] involves more elaborate processing of the target words than producing disconnected sentences’ (Keating, 2008, p. 379). Therefore, the involvement load of Tasks 3 and 4 was 4 and 5, respectively.
Table 3. Task-induced involvement load index.
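The involvement load index for the four tasks can be reproduced as a sum of the need, search, and evaluation components. The sketch below is illustrative only: the weight of 3 for ‘very strong’ evaluation and the weight of 1 for search with a digital dictionary are assumptions inferred from the reported totals of 4 and 5 for Tasks 3 and 4, and the dictionary keys and function name are hypothetical.

```python
# Component weights. ABSENT, MODERATE and STRONG follow the 0/1/2 scheme of the ILH.
# VERY_STRONG = 3 is an assumption inferred from the totals reported for Tasks 3 and 4.
ABSENT, MODERATE, STRONG, VERY_STRONG = 0, 1, 2, 3
SEARCH_PRESENT = 1  # assumed weight for search when a digital dictionary is available

TASKS = {
    "Task 1: reading": {
        "need": MODERATE, "search": ABSENT, "evaluation": ABSENT,
    },
    "Task 2: reading + gap-fill": {
        "need": MODERATE, "search": ABSENT, "evaluation": MODERATE,
    },
    "Task 3: reading + writing": {
        "need": MODERATE, "search": ABSENT, "evaluation": VERY_STRONG,
    },
    "Task 4: reading + writing + digital dictionary": {
        "need": MODERATE, "search": SEARCH_PRESENT, "evaluation": VERY_STRONG,
    },
}


def involvement_load(components):
    """Sum the three involvement components into a single index."""
    return components["need"] + components["search"] + components["evaluation"]


loads = {task: involvement_load(components) for task, components in TASKS.items()}

for task, load in loads.items():
    print(f"{task}: involvement load = {load}")
```

Under these assumed weights, the computed indices match the values of 1, 2, 4, and 5 stated for Tasks 1 to 4.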
4 Measures
a Metacognitive assessment
Participants completed the Metacognitive Awareness Inventory (MAI) (Schraw & Dennison, 1994). Self-report questionnaires are often used to measure learners’ metacognitive awareness and regulation (Metcalfe & Shimamura, 1994). The MAI included two metacognition subscales: the knowledge of metacognition (17 items) and the regulation of metacognition (35 items). Schraw and Dennison (1994) created these subscales from a larger pool of items based on established theory (Flavell, 1979; Jacobs & Paris, 1987). The knowledge subscale addresses declarative, procedural, and conditional knowledge, and the regulation subscale addresses planning, information management strategies, monitoring, evaluation, and debugging strategies. Exploratory factor analysis reflected the validity of the scale’s internal structure (Harrison & Vallin, 2018). Händel, Artelt, and Weinert (2013) further confirmed the MAI’s reliability and applicability, particularly for metacognitive knowledge. Cronbach’s alphas for the knowledge and regulation subscales were .71 and .78, respectively, indicating sufficient reliability in the present study.
Items were scored on a 5-point Likert-type scale ranging from always false (0) to always true (4). The sums of all the scores on each subscale were calculated separately to capture learners’ awareness of metacognitive knowledge and metacognitive regulation. The maximum possible score on each subscale was 68 and 140 points, respectively.
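The subscale totals follow directly from the item counts and the 0–4 rating range. The following sketch shows the arithmetic; the function name `subscale_score` and the validation rules are hypothetical and are not part of the MAI itself.

```python
KNOWLEDGE_ITEMS = 17   # knowledge of metacognition subscale
REGULATION_ITEMS = 35  # regulation of metacognition subscale
MIN_RATING, MAX_RATING = 0, 4  # always false (0) to always true (4)


def subscale_score(ratings, expected_items):
    """Sum the item ratings for one subscale after basic validation."""
    if len(ratings) != expected_items:
        raise ValueError(f"expected {expected_items} ratings, got {len(ratings)}")
    if any(not MIN_RATING <= rating <= MAX_RATING for rating in ratings):
        raise ValueError("ratings must lie between 0 and 4")
    return sum(ratings)


# Maximum possible scores: every item rated 'always true' (4).
max_knowledge = subscale_score([MAX_RATING] * KNOWLEDGE_ITEMS, KNOWLEDGE_ITEMS)
max_regulation = subscale_score([MAX_RATING] * REGULATION_ITEMS, REGULATION_ITEMS)
```

The two maxima computed here, 17 × 4 = 68 and 35 × 4 = 140, agree with the values reported for the two subscales.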
b Vocabulary tests
We adapted Wesche and Paribakht’s (1996) Vocabulary Knowledge Scale (VKS) to measure participants’ word-learning growth (Table 4) and explore aspects of vocabulary knowledge. As Nation (2001) argued, vocabulary knowledge involves word form, meaning, and use. The VKS in our study included receptive knowledge and productive knowledge and served as a pre- and post-test. The pre-test was completed four weeks prior to the study and showed that learners had no prior knowledge of the target words. As Table 4 indicates, learners were required to demonstrate word knowledge receptively (i.e. knowing the word’s meaning) and productively (i.e. being able to use the word in productive tasks). However, exposing learners to the pre-test could indirectly cause learners to focus on the target words. We thus adopted the following procedures to minimize the test–retest effect: (1) 50 high-frequency words were mixed in with target words and (2) learners had a 4-week break between tests to minimize the deliberate memorization of the target words. The post-test was administered immediately after the intervention.
Vocabulary knowledge test.
As explained above, knowing a word is not a dichotomy in which a word is either known or unknown (Schmitt, 2014); word knowledge follows a continuum from receptive to productive knowledge. As displayed in Table 4, productive knowledge was tested first, followed by receptive knowledge, to minimize carryover effects between tests. In the productive knowledge test, learners were given 0.5 points for providing a correct sentence using the target word and 0.5 points for providing a synonym or definition reflecting the word’s meaning. In the receptive knowledge test, participants earned 1 point for choosing the right option. No points were awarded for incorrect or partial answers on either test. The maximum possible score on each of the receptive and productive tests was 24 points. Cronbach’s alpha values for the receptive and productive knowledge tests were .78 and .83, respectively, indicating sound reliability.
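For readers less familiar with the reliability coefficient reported above, the following is a minimal sketch of how Cronbach’s alpha is computed from an item-by-respondent score matrix. It uses the standard formula with sample variances; the function name and any data passed to it are hypothetical and are not the study’s data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one row per test-taker and one column per item.
    Uses alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)),
    with sample variances (n - 1 in the denominator).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two test-takers")
    # Variance of each item (column) across test-takers.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of each test-taker's total score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

The function raises a `ZeroDivisionError` if every test-taker has the same total score, since alpha is undefined in that case.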
5 The procedure
The experiment was conducted during classes. The participants completed a pre-test in the first week, which was intended to determine whether they had prior knowledge of the target words. They were required to complete four tasks in the fifth week of the study. Huang et al. (2012) found that inconsistent completion times on involvement load-based tasks may lead to contaminated results; thus, one hour was allocated for each of the four tasks. Participants took a 30-minute post-test immediately after completing each task. The purpose of this test was to understand the level of vocabulary knowledge (i.e. receptive and productive knowledge) mastered after each task. Participants completed the MAI after the test to assess their metacognitive knowledge and regulation. All tests were in paper-and-pencil format.
6 Data analysis
Participants’ scores on the VKS and MAI were analysed in SPSS and AMOS. One-way independent analyses of variance (ANOVA), followed by Tukey post-hoc comparisons, were conducted to reveal group differences in vocabulary learning. Multiple linear regression analyses were employed to identify models demonstrating the extent to which the two types of metacognition were predictive of participants’ task-based vocabulary learning scores. Multiple regression analysis was adopted to predict the value of a dependent variable (e.g. vocabulary learning scores) from a collection of independent variable values (e.g. metacognitive awareness). Finally, structural equation modelling was performed in AMOS to provide an overview of how metacognition predicted participants’ task-based vocabulary learning performance.
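The between-group comparison described above was run in SPSS. As a rough illustration of what a one-way independent ANOVA computes, the sketch below derives the F ratio, its degrees of freedom, and eta squared from raw group scores. The function name and any input data are hypothetical, and the Tukey post-hoc step is not covered here.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups.

    groups: list of lists, each inner list holding one group's scores.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared
```

With four groups and 120 participants in total, this formula gives 3 and 116 degrees of freedom, which matches the degrees of freedom reported in the Results.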
III Results
Our first research question pertained to how the four groups (Group 1: reading only; Group 2: reading + gap-fill; Group 3: reading + writing; Group 4: reading + writing with the use of a digital dictionary) differed in terms of vocabulary learning. Table 5 summarizes descriptive statistics for learning receptive and productive vocabulary knowledge in the four task conditions. Task conditions with a higher involvement index seemed to yield better scores for receptive vocabulary knowledge (Groups 1–4: 8.300, 14.167, 18.567, and 22.233, respectively) and productive vocabulary knowledge (Groups 1–4: 3.567, 9.133, 13.567, and 17.333, respectively). The fourth group yielded the best results (receptive knowledge: 22.233; productive knowledge: 17.333).
Descriptive results for vocabulary learning in the four conditions.
We used Levene’s test to evaluate whether the variances in the four groups were significantly different. Results showed homogeneity of variance (i.e. the significance of Levene’s test was greater than .05). This result indicated that the homogeneity assumption was met and supported the use of the one-way ANOVA (Table 6).
Results for the one-way independent ANOVA.
The ANOVA results indicated significant differences in mean scores for the post-intervention receptive knowledge test across the groups [F(3, 116) = 413.035, p < .001, η² = .914]. Significant differences were also detected between the four groups’ productive knowledge test scores [F(3, 116) = 358.678, p < .001, η² = .903]. We then performed post-hoc comparisons (Table 7) using the Tukey HSD test, which controls the probability of making one or more Type I errors across the set of pairwise comparisons.
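As a consistency check, the reported effect sizes can be recovered from the F ratios and their degrees of freedom. The identity below holds for a one-way design, where eta squared and partial eta squared coincide; the symbols for the between- and within-group sums of squares and degrees of freedom are introduced here for illustration only.

```latex
\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}}
       = \frac{F \cdot df_{\text{between}}}{F \cdot df_{\text{between}} + df_{\text{within}}}

% Receptive knowledge: F(3, 116) = 413.035
\eta^2 = \frac{413.035 \times 3}{413.035 \times 3 + 116}
       = \frac{1239.105}{1355.105} \approx .914

% Productive knowledge: F(3, 116) = 358.678
\eta^2 = \frac{358.678 \times 3}{358.678 \times 3 + 116}
       = \frac{1076.034}{1192.034} \approx .903
```

Both values agree with the effect sizes reported above.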
Post-hoc comparisons of vocabulary learning tests across the four groups.
Notes. * The mean difference is significant at the 0.05 level. Group 1: Reading only; Group 2: Reading + gap-fill; Group 3: Reading + writing; Group 4: Reading + writing with use of a digital dictionary.
Post-hoc comparisons suggested that Group 4 (reading + writing with the use of a digital dictionary) was significantly better at acquiring both receptive and productive knowledge than the other three groups (p < .001). These results held for the receptive knowledge and productive knowledge tests.
To answer our second research question regarding how learners’ metacognitive awareness predicts vocabulary learning gains under different learning conditions, we adopted multiple linear regression analysis. Table 8 presents descriptive statistics for each group’s scores on metacognitive knowledge, metacognitive regulation, and receptive and productive knowledge tests. The groups’ metacognitive knowledge scores ranged from 25.233 to 29.533, with their metacognitive regulation, receptive vocabulary knowledge, and productive vocabulary knowledge scores ranging from 48.667 to 54.100, from 8.300 to 22.233, and from 3.567 to 17.333, respectively. We therefore observed substantial between- and within-group variation among the four variables. We did not include pre-test scores as a covariate in the subsequent multiple regression analysis because participants exhibited no prior knowledge of target words in the pre-test.
Descriptive results for metacognition and vocabulary learning in four conditions.
Next, we ran a correlation analysis for the predictor and criterion variables (Table 9). The results revealed that metacognitive knowledge was not significantly correlated with test outcomes for receptive knowledge (p = .185) or productive knowledge (p = .139). However, metacognitive knowledge was significantly correlated with metacognitive regulation (p < .001). In addition, metacognitive regulation was significantly correlated with receptive (p < .001) and productive (p < .001) vocabulary knowledge. Finally, receptive knowledge was significantly correlated with productive knowledge (p < .001).
Correlation analysis results for the variables.
Note. ** Correlation is significant at the 0.01 level (2-tailed).
In terms of regression analysis, the dependent or outcome variables consisted of receptive and productive vocabulary knowledge test scores for each group. The predictor or independent variables were metacognitive knowledge and metacognitive regulation. This analysis was performed to ascertain whether metacognition variables uniquely contributed to treatment effects. We checked the assumptions of multiple regression analysis; Durbin–Watson statistics, which were between 1 and 3, showed that the ‘independence of errors’ assumption was met. The variance inflation factor, which was below 10, indicated that the assumption of no multicollinearity between predictor variables was not violated. This step was essential to ensure multiple regression analysis was appropriate for the data analysis (Field, 2013). We also ran a backward variable selection method, in which all variables were entered into a model as potential predictors and then progressively removed when the probability of the associated F-value was larger than 0.10 (Field, 2013). The standardized regression coefficient (β), denoting the change (in standard deviation units) in the dependent variable associated with a change of one standard deviation in the predictor, and the adjusted R² (i.e. the percentage of variance in the dependent variable explained by the predictors, adjusted for the number of predictors) were reported for each prediction model.
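The diagnostics mentioned above were obtained from SPSS. The sketch below shows, under simplifying assumptions, what each quantity measures: the Durbin–Watson statistic for residuals, the variance inflation factor for a model with exactly two predictors (where it reduces to a function of their correlation), and the adjusted R². All function names and inputs are hypothetical.

```python
from math import sqrt


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest uncorrelated errors."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2).

    With two predictors, both share the same VIF because each is
    regressed on the other alone.
    """
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

For example, `durbin_watson` would be applied to the residuals of a fitted regression model, and `vif_two_predictors` to the metacognitive knowledge and regulation scores of one group.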
Table 10 displays prediction model results for each group’s post-test scores. Metacognitive regulation was a significant predictor of immediate post-test scores in all four groups. The predictive effects were positive. However, metacognitive knowledge was not significantly predictive of participants’ post-test scores. These findings were consistent for the receptive knowledge and productive knowledge tests.
Results for the multiple regression analyses.
Finally, we conducted structural equation modelling in AMOS to develop an overall profile of how metacognition predicted task-based receptive and productive knowledge learning. Although multiple regression analysis provided a basic picture, a more comprehensive profile clarified how metacognition functioned in learning receptive and productive knowledge in our task conditions. The model is depicted in Figure 1, with the coefficient results listed in Table 11.

Structural model of the relationships between metacognition, vocabulary knowledge, and task conditions.
Path coefficients of metacognition, vocabulary knowledge and task conditions (n = 120).
Standardized parameter estimates for metacognitive regulation were significant at the .001 level for receptive and productive knowledge. Unexpectedly, standardized parameter estimates for metacognitive knowledge were not significant for receptive and productive knowledge. Task conditions also had significant direct effects on receptive knowledge (β = 0.927, p < .001) and productive knowledge (β = 0.911, p < .001). These findings indicate the significant predictive effects of task-induced involvement on vocabulary learning. The results also showed the predictive effects of learners’ individual differences in metacognitive regulation on task-based vocabulary learning performance.
IV Discussion
We investigated the extent to which metacognitive knowledge and metacognitive regulation were associated with the efficacy of task-induced involvement in vocabulary learning. ANOVA analyses were employed to examine differences among the four task conditions in learning receptive and productive vocabulary knowledge. Multiple linear regression analyses were then conducted to determine whether the two types of metacognition were significantly correlated with participants’ post-test scores. Finally, structural equation modelling was performed to obtain a general picture of how metacognition predicted task-induced vocabulary learning. Findings revealed that: (1) significant differences in learning both receptive and productive vocabulary knowledge existed between the four conditions, and tasks with higher levels of involvement yielded better scores in learning both types of knowledge, and (2) metacognitive regulation was a significant positive predictor of the effects of task conditions on learning receptive and productive vocabulary knowledge. These findings offer insights into theory and practice. One issue to be mentioned is that the four task conditions required the learners to read three texts. The reason for focusing on reading materials, rather than on listening ones, is that learners demonstrated lower incidental vocabulary learning gains in listening than in reading (van Zeeland & Schmitt, 2013).
In line with Hulstijn and Laufer (2001), word learning and retention is contingent upon the involvement load of a task, i.e. the amount of need, search, and evaluation that a task imposes. In particular, the main differences in task involvement load among the four tasks in our study involved the evaluation dimension. It is essential to understand the predictive influence of this dimension. For example, Kim (2008) suggested learners’ proficiency levels did not influence the effectiveness of different involvement load task conditions on vocabulary learning performance. It was learners’ cognitive involvement that determined their vocabulary learning outcome. In particular, the evaluation component of task-induced involvement determines learners’ vocabulary learning. Zou (2017) proposed an augmented evaluation framework to differentiate degrees of evaluation in L2 vocabulary retention. This framework categorized evaluation as follows: (1) moderate evaluation at the phrase level; (2) strong evaluation at the sentence level; and (3) very strong evaluation at the composition level. According to Zou (2017), tasks with strong evaluation at the sentence level require participants to make sentences and generate original content. At the composition level, evaluation should be stronger than in sentence writing because the task involves generating original content and making all content coherent. During this process, learners may make more concerted efforts regarding the use of new words.
The findings revealed significant differences in vocabulary learning outcomes between task conditions. Between Tasks 1 and 2 (i.e. reading vs. reading + gap-fill), the results indicated that differences in the search component yielded significantly different vocabulary learning outcomes. Similarly, Keating (2008) noted significant differences in vocabulary retention between a reading-only group and a fill-in-the-blanks group. However, our finding diverged from other research, such as that by Hulstijn and Laufer (2001) and Kim (2008). In their studies, no significant differences were identified between the reading-only group and the fill-in-the-blanks group. Kim (2008) argued that the different degrees (moderate and strong) of each individual component (need, search, and evaluation) might not lead to significant differences in vocabulary learning. That is, differences in search (i.e. no search vs. with search) might not lead to significant vocabulary learning differences. One explanation for the divergent findings may be the timing of the post-test. While students in our study took the test immediately after the task-based learning, the test in Kim’s (2008) study was administered two weeks after the treatment. The form-meaning link that was established during the task condition learning may have decayed during these two weeks. Expanding on this line of ILH studies (e.g. Huang et al., 2012; Hulstijn & Laufer, 2001; Keating, 2008; Zou, 2017), Task 4 (reading + writing plus the use of a dictionary), which contained a strong evaluation and search requirement, yielded the best performance in terms of learning receptive and productive vocabulary knowledge.
Additionally, we identified the predictive effects of metacognition, especially the effects of metacognitive regulation on task-induced vocabulary learning. Previous studies featured the importance of learners’ regulatory abilities in L2 vocabulary learning (Teng, 2017; Teng & Reynolds, 2019; Qin & Teng, 2017). This ability, as the researchers stated, could significantly predict task-induced L2 vocabulary learning performance. According to Pintrich, Wolters, and Baxter (2000), metacognitive regulation includes (1) metacognitive monitoring, such as planning and monitoring learning tasks; and (2) self-regulation and control, including the management of learning time and the environment. Taking Task 4 as an example, strong metacognitive regulation may have enabled some learners to take control of the reading passage, look up unknown words in the dictionary, and complete the essay-writing task.
Presumably, learners’ metacognitive regulation abilities may be associated with tasks with different involvement loads. For instance, the task completed by learners in Group 4 (reading + writing plus the use of a dictionary) may have helped them develop a better self-regulatory capacity to plan, monitor, and evaluate their vocabulary learning through ‘chunking, hierarchical organization and pre-task planning’ (Zou, 2017, p. 54). However, metacognitive knowledge was not a significant predictor of vocabulary post-test results. This finding deviates from earlier work (e.g. Boulware-Gooden, Carreker, Thornhill, & Joshi, 2007) substantiating associations between metacognitive knowledge and vocabulary learning. Scholars have offered the following explanations for these connections: (1) learners with a better awareness of metacognitive knowledge strategies are better at consolidating knowledge gleaned from the treatment condition and (2) metacognitive knowledge becomes apparent when external assistance with reading is unavailable. We must acknowledge the disparity between our study’s findings and theirs. While previous studies focused on strategy training, we focused on treatment tasks. Therefore, training in metacognitive knowledge in prior research may have helped learners understand the potential of different strategies. As documented by Teng (2020c), training in metacognitive knowledge could help learners better comprehend the use of various strategies, thus leading to better writing performance. Training learners’ metacognitive knowledge could also potentially enhance L2 vocabulary learning. In addition, we must note a flaw in Schraw and Dennison’s (1994) MAI: the measure contained 17 items on metacognitive knowledge but 35 items on metacognitive regulation. This imbalance is likely to yield a clearer picture of learners’ metacognitive regulation while overlooking some dimensions of metacognitive knowledge.
Finally, it may be challenging for learners to display awareness of their metacognitive knowledge, which covers knowledge relating to the conditions for using appropriate strategies, the extent to which strategies are effective, and learners’ strengths and weaknesses (Flavell, 1979; Pintrich, 2002). When learners cannot activate relevant situational or conditional knowledge to perform a task in a certain context, they may not be able to prepare themselves for the task. We are not arguing that metacognitive knowledge is unimportant in task-based vocabulary learning, but replication studies are needed to examine the effects of metacognitive knowledge on vocabulary learning in greater depth. Based on the literature and our study results, we propose an extension of the original ILH (Figure 2).

An extended framework of involvement load hypothesis (ILH).
On the one hand, this framework takes into account the predictive effect of learners’ metacognitive regulation on L2 task-induced vocabulary learning. On the other hand, tasks with different involvement loads could lead to different levels of vocabulary learning performance. As Figure 2 illustrates, our proposed framework highlights (1) the importance of involvement load in L2 vocabulary learning, in line with the original ILH; and (2) the effect of metacognitive regulation on predicting vocabulary learning. Moreover, it predicts the potential influences of learners’ metacognitive regulation awareness and competence on task-induced involvement load. Within this framework, we highlight the importance of learners’ task involvement. Although Laufer and Hulstijn (2001) argued that more involvement in a task leads to greater chances for vocabulary learning, we want to underscore the notion of how to enhance learners’ involvement. The search and evaluation components may be essential for enhancing task involvement. We further acknowledge that learners’ strategic behaviour may influence learning. In particular, one’s awareness of metacognitive regulation may lead to more involvement with and use of vocabulary learning, which can foster better mastery of receptive and productive vocabulary knowledge. In addition to the outcomes of vocabulary learning in tasks with different involvement loads, this conceptual framework highlights the importance of learners’ innate metacognitive awareness of their self-regulatory capacity, which can fuel their effort to search for and apply personalized strategic learning mechanisms (Teng & Reynolds, 2019; Tseng & Schmitt, 2008). In line with contemporary theories of metacognition and self-regulation in educational psychology (e.g. Zeidner, Boekaerts, & Pintrich, 2000), this framework targets core learner differences that distinguish self-regulated learners from peers who do not engage in strategic learning, even in the same task with the same involvement load.
V Limitations
Some limitations exist in the present study. First, although we tried to adapt the VKS from Wesche and Paribakht (1996) by making it multidimensional to measure different aspects of word knowledge, we did not explore the longitudinal development of learners’ vocabulary learning. Although we also planned to assess learners’ attention ability, only a few students could finally take the test. We thus did not analyse learners’ vocabulary retention. Second, we did not explore causal relationships between receptive and productive vocabulary knowledge, which may have clarified their development. Third, repeated encounters with target words are a key determinant in vocabulary learning (Teng, 2020a). Future studies should explore the interaction between word encounters and word-focused tasks and how this interaction is predicted by metacognition. Fourth, our metacognition measure may not have measured learners’ awareness of metacognitive knowledge and regulation in a balanced way (i.e. 17 items focused on metacognitive knowledge while 35 focused on metacognitive regulation). A measure that better reflects learners’ awareness of metacognition could be applied in future work. Fifth, treatment sessions were brief; subsequent studies involving longer sessions could refine our theoretical model, which associates metacognition with the processing of new words in different post-reading word-focused tasks. Finally, reading and listening are two correlated receptive tasks. However, we did not compare vocabulary learning performance through reading and listening, which should be a focus in future studies.
VI Concluding remarks
Overall, tasks with higher levels of involvement yielded better scores in vocabulary learning. Metacognitive regulation was a significant predictor of the effects of task conditions on vocabulary learning. The present study contributes to an understanding of the ILH and vocabulary learning. The proposed framework sheds light on the relationship between metacognitive regulation, task-induced involvement load, and L2 vocabulary retention. The framework also helps us better understand the importance of learners’ metacognitive regulation ability and task involvement from a strategic and cognitive standpoint. This study also provides pedagogical implications. Our findings, which provide some evidence of the validity of the original ILH, could guide language teachers in emphasizing the roles of tasks in language education and in encouraging instructors to design and integrate tasks with high involvement loads into vocabulary teaching. It is also essential to develop learners’ metacognitive awareness to monitor and regulate their vocabulary learning. Students will benefit from effectively orchestrating various metacognitive strategies to coordinate the resources available for vocabulary learning tasks. In addition, learners may find themselves in highly unpredictable settings when being asked to complete a task. For example, students may have varying abilities and motivations when placed in conditions with changing dynamics. Teachers must therefore be cognizant of evolving task requirements and plan, implement, monitor, and reflect on their own teaching as well as students’ learning. To gain a clearer understanding of these factors, scholars could continue investigating how individual differences in metacognition influence learners’ L2 vocabulary acquisition through tasks.
