Abstract
This study investigates the impact of task-induced involvement and time on task on incidental second language (L2) vocabulary acquisition. Using a 3 (task-induced involvement) × 2 (time on task) × 2 (post-test time) research design, the study employed three task-induced involvement conditions based on the Involvement Load Hypothesis (ILH): a reading and gap-fill task, a reading and sentence-making task, and a reading and translation task, with corresponding involvement load (IL) indices of 2, 3, and 4, respectively. Two time-on-task conditions were implemented: uncontrolled time on task, where participants in different groups completed tasks of varied durations, and controlled time on task, where participants in different groups completed tasks of roughly equal duration. Five intact classes comprising 256 Chinese middle school students participated, and each class was randomly assigned to one of five tasks designed for learning 10 carefully selected target words. The results of a three-way repeated measures ANOVA indicate a significant three-way interaction effect among task-induced involvement, time on task, and post-test time, as well as a significant two-way interaction effect between task-induced involvement and time on task. These findings demonstrate that task-induced involvement and time on task interact to significantly influence both initial acquisition and retention of incidental L2 vocabulary. Specifically, under uncontrolled time conditions, tasks with higher ILs and longer durations yield better initial vocabulary gains and retention, partially supporting the ILH. Conversely, under controlled time conditions, tasks with lower ILs exhibit superior initial vocabulary gains and retention, contradicting the predictions of the ILH. Relevant implications are also discussed.
Plain Language Summary
This study delves into the dynamics of second language (L2) vocabulary acquisition, scrutinizing the impact of task-induced involvement and time spent on the learning process. Three distinct tasks—reading and gap-fill, reading and sentence-making, and reading and translation—were administered, each representing different levels of involvement. Furthermore, two time-on-task conditions were explored: uncontrolled time, allowing varying task durations, and controlled time, with approximately equal durations. The study involved 256 Chinese middle school students randomly assigned to tasks focused on mastering 10 specific target words. Results uncovered a significant three-way interaction effect among task-induced involvement, time on task, and post-test time, alongside a noteworthy two-way interaction effect between task-induced involvement and time on task. These findings underscore the crucial interplay between task engagement and time investment in shaping the initial acquisition and retention of incidental L2 vocabulary. Particularly, in uncontrolled time conditions, tasks demanding higher involvement and extended durations proved more effective in vocabulary gains and retention, aligning with aspects of the Involvement Load Hypothesis. However, under controlled time conditions, tasks with lower involvement exhibited superior performance, challenging the predictions of the hypothesis. Nevertheless, certain limitations, including participant challenges with the sentence-making task and the exclusive focus on Chinese ESL learners, merit acknowledgment. To enhance the reliability of results, future replication studies should involve participants from diverse language backgrounds. Additionally, the transferability of these findings to various input tasks necessitates further exploration. 
The article not only delves into the theoretical and pedagogical implications of these results but also advocates for continued research to refine our understanding of effective language.
Introduction
Second language (L2) vocabulary acquisition plays a crucial role in successful language acquisition, particularly during the early stages of language development (Hughes, 2011). It is commonly recognized that reading can contribute to vocabulary expansion for L2 learners (Mohsen & Almudawis, 2021; Webb et al., 2023; Zhang & Ma, 2021). However, the effectiveness of incidental vocabulary acquisition from reading is disappointingly low (Schmitt, 2008). To address this issue, teachers can employ different word-focused reading tasks to enhance vocabulary acquisition (Hulstijn, 2005). Nevertheless, there is a need for research that specifically identifies the effective tasks for L2 vocabulary learning, providing much-needed guidance for teachers in this area.
The Involvement Load Hypothesis (ILH) posits that the effectiveness of a task is linked to the degree of involvement it induces in learners. Higher Involvement Loads (ILs), reflecting the cognitive engagement and processing demands placed on learners, are believed to yield superior vocabulary acquisition outcomes (Laufer & Hulstijn, 2001). Meanwhile, tasks with higher ILs have also been found to be more time-consuming (Hill & Laufer, 2003; Yanagisawa & Webb, 2021), and time on task has been suggested as a potential contributing factor to vocabulary learning gains (Hazrat & Read, 2022; Nation & Webb, 2011). This raises the question of whether task-induced involvement or time on task is the determinant of task effectiveness.
Previous studies have examined the effects of task-induced involvement and time on task separately, either by controlling time on task to compare the effects of the different ILs induced by different tasks (Folse, 2006; Keating, 2008; Kim, 2008; Webb, 2005) or by controlling the IL to compare the effects of different amounts of time on task (Taheri & Rezaie Golandouz, 2021). However, these studies have yielded inconsistent results and interpretations, with some emphasizing the importance of task-induced involvement (Kim, 2008) and others highlighting the influence of time on task (Folse, 2006; Keating, 2008; Taheri & Rezaie Golandouz, 2021; Webb, 2005). Consequently, the relationship among task-induced involvement, time on task, and vocabulary acquisition remains inconclusive. Because language acquisition is typically influenced by multiple factors simultaneously, it is crucial to investigate the potential interaction between task-induced involvement and time on task. To address this gap, the present study investigates whether task-induced involvement and time on task interact in their effect on initial incidental vocabulary acquisition and retention. In doing so, it aims to provide guidance for English vocabulary acquisition among L2 learners and to inform teachers’ instructional practices.
Literature Review
The ILH is based on the processing depth theory proposed by Craik and Lockhart (1972). According to this theory, the extent to which newly acquired words are retained in long-term memory depends on the level of cognitive analysis applied during the processing stage. In other words, the more learners engage in higher levels of semantic or cognitive analysis to comprehend the meaning of unfamiliar words, the greater their enhancement in vocabulary retention (Craik & Lockhart, 1972). However, operationalizing distinct levels of processing has presented challenges within the scope of the processing depth theory (Craik & Tulving, 1975; Hulstijn & Laufer, 2001). To address these challenges and establish a more practical and measurable definition of “depth of processing” in L2 vocabulary acquisition, Laufer and Hulstijn (2001) developed the ILH.
The ILH specifically focuses on incidental L2 vocabulary acquisition through word-focused reading tasks. It introduces the concept of involvement, which comprises three essential dimensions: need, search, and evaluation. Laufer and Hulstijn (2001) define need as the motivational aspect of the involvement, categorized into two levels: moderate need (indexed as 1) and strong need (indexed as 2). Moderate need arises when learners are motivated by the task requirements, while strong need occurs when learners possess an internal drive to meet those demands. Search and evaluation belong to the cognitive aspect of involvement. Search refers to the effort learners put into seeking the meaning or form of new words, utilizing resources like dictionaries, or seeking guidance from teachers (indexed as 1 if the search is present; otherwise, 0). Evaluation involves comparing the form or meaning of a new word with other potential words or meanings to determine the most suitable choice based on contextual requirements. Evaluation is absent in situations where the selection of a word or its specific sense is unnecessary (indexed as 0). It becomes moderate when a specific context is provided, such as in gap-fill exercises where the optimal word must be chosen from multiple options (indexed as 1). Evaluation is strong when a word is used in a learner-generated context, exemplified by the sentence-making task using targeted vocabulary words (indexed as 2). The IL is calculated by summing the involvement indexes of the three components, and tasks with higher ILs are expected to yield improved performance in L2 vocabulary initial acquisition and retention. Over the past two decades, the ILH has played a significant role in research on L2 vocabulary acquisition, primarily due to its ability to predict the effectiveness of various tasks in facilitating L2 vocabulary acquisition.
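The index arithmetic described above can be sketched in a few lines. This is a minimal illustration only: the class and variable names are hypothetical, and the three component triples are the ones this study assigns to its own tasks (need, search, evaluation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Involvement:
    """Component indices of a task under the ILH (names are illustrative)."""
    need: int        # 1 = moderate, 2 = strong
    search: int      # 0 = absent, 1 = present
    evaluation: int  # 0 = absent, 1 = moderate, 2 = strong

    def load(self) -> int:
        # The involvement load is the sum of the three component indices.
        return self.need + self.search + self.evaluation


# Component triples as assigned to the three task types in this study.
tasks = {
    "reading and gap-fill": Involvement(need=1, search=0, evaluation=1),
    "reading and sentence-making": Involvement(need=1, search=0, evaluation=2),
    "reading and translation": Involvement(need=1, search=1, evaluation=2),
}

il_by_task = {name: inv.load() for name, inv in tasks.items()}

for name, il in il_by_task.items():
    print(f"{name}: IL = {il}")
```

Under these triples the three tasks receive ILs of 2, 3, and 4, so the ILH would rank translation above sentence-making above gap-fill.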
Hulstijn and Laufer (2001) conducted the earliest experimental study presenting evidence for ILH’s predictive capability. In their study, three word-focused reading tasks were designed to investigate incidental vocabulary acquisition. Incidental vocabulary acquisition occurs when learners encounter new words without prior knowledge of a subsequent posttest (Godfroid et al., 2013; Laufer & Hulstijn, 2001; Silva et al., 2021). The tasks were reading with marginal glosses (1, 0, 0) (index for need, search, and evaluation respectively), reading with gap-fill exercises (1, 0, 1), and composition writing using a provided word list (1, 0, 2). Task duration was not controlled in this study, and the primary goal was to compare the effectiveness of these tasks in promoting vocabulary knowledge in Israeli and Dutch settings. The findings indicated that the composition writing task outperformed the other two tasks in both settings, providing strong support for the ILH. In the Israeli setting, the reading with gap-fill exercises was significantly more effective than reading with marginal glosses, while in the Dutch setting, this difference did not yield significant results, thus offering partial support for the ILH.
Subsequent studies have mostly offered partial support for the ILH (Ansarin & Khabbazi, 2021; Rahmani et al., 2018; Taheri & Rezaie Golandouz, 2021; Zou & Teng, 2023), with only a minority of studies fully endorsing the ILH (Kim, 2008; Min, 2008; Teng, 2022; Teng & Zhang, 2023). However, some studies have presented contrasting evidence that challenges the ILH by demonstrating that tasks with lower ILs yield superior performance compared to those with higher ILs (Bao, 2015; Folse, 2006; Lu, 2013). Recently, several meta-analyses have been conducted to examine the ILH. Two of these meta-analyses revealed that language learners who engage in tasks with a higher IL achieve greater gains in vocabulary (Huang et al., 2013; Yanagisawa & Webb, 2021). Additionally, two meta-analyses focused on the overall predictive ability of the ILH and indicated that while the ILH significantly predicts vocabulary acquisition, its predictive power is relatively weak, explaining only about one-third of the variance in incidental vocabulary acquisition and retention (Liu & Reynolds, 2022; Yanagisawa & Webb, 2021). These findings suggest that while the ILH serves as a useful framework, there may be additional factors beyond task-induced involvement that influence vocabulary acquisition.
One such factor examined by researchers is time on task. Time on task refers to the duration participants spend engaged in assigned word-focused exercises, such as gap-fill vocabulary exercises, sentence-making, or composition with target words. This time duration excludes other activities such as study introductions and pretest/posttest assessments of the target words (Huang et al., 2013). Some researchers have suggested that the outcomes of Hulstijn and Laufer’s (2001) study could potentially be explained by considering the variable of time on task. In their study, three groups of students were allocated varying amounts of time to complete their respective tasks: 40 to 45 min for the reading comprehension task, 50 to 55 min for the reading and gap-fill task, and 70 to 80 min for the reading and composition task. Therefore, the observed superior outcomes associated with the composition task may be a result of the greater amount of time allocated to this task, rather than solely the inherent effectiveness of the IL associated with it (Hazrat & Read, 2022; Nation & Webb, 2011). It is worth noting that positive correlations were found between time on task and learning gains (Huang et al., 2013; Yanagisawa & Webb, 2021), as well as between the IL and time on task (Yanagisawa & Webb, 2021), suggesting that tasks with higher ILs generally require more time for completion, leading to improved vocabulary gains. This complexity adds to the challenge of comprehending the effects of task-induced involvement and time on task on vocabulary acquisition.
Consequently, researchers have delved into the investigation of which factor, the task-induced involvement or time on task, contributes to learning gains. Some researchers argue that it is the IL that predicts the effectiveness of tasks for acquisition, rather than the time spent on tasks (Hill & Laufer, 2003; Hulstijn & Laufer, 2001). For example, Hulstijn and Laufer (2001) did not analyze time on task as a separate variable, as they maintained that differences in time on task are inherent to each task, and retention is believed to be more closely connected to the depth of initial processing rather than the duration of information stored in primary memory (Craik & Lockhart, 1972; Craik & Tulving, 1975). This claim finds support in a meta-analysis that concluded task-induced involvement outweighs time on task as the more reliable predictor of task effectiveness, and time on task does not moderate the effect of task-induced involvement (Yanagisawa & Webb, 2021). However, several studies have yielded opposing conclusions. For instance, in a recent study comparing three tasks with similar ILs, the most time-consuming task produced better results than the other two, suggesting that time on task might play a determining role in task effectiveness (Taheri & Rezaie Golandouz, 2021). Some studies have controlled for time on task to eliminate its influence on the effect of task-induced involvement. These studies have observed that the superiority of tasks with higher ILs faded and even disappeared when gain scores were adjusted to reflect words learned per minute (Keating, 2008) or when an equal amount of time was allocated to the tasks (Folse, 2006; Webb, 2005). However, there was also a study that arrived at the opposite conclusion, finding that tasks with higher ILs remained superior when an equal amount of time was assigned, fully supporting the ILH framework (Kim, 2008). 
Thus, despite the extensive exploration of task-induced involvement and time on task, the effect of these factors on vocabulary acquisition remains inconclusive. It remains unclear whether time on task significantly affects vocabulary acquisition, task-induced involvement significantly affects vocabulary acquisition, or both factors interact in complex ways. One potential explanation for the inconsistent findings in previous studies is that they treated the approximate time limit set for all participants as the time on task, which may not accurately reflect the time actually invested by each individual. The absence of precise quantification of task duration as a variable can directly affect research outcomes, potentially leading to the divergent results mentioned earlier. Therefore, it is crucial to improve the measurement of time on task to obtain more reliable and valid data.
Previous studies on the effects of task-induced involvement and time on task on vocabulary acquisition have often been conducted in isolation, neglecting the potential interaction between the two factors. However, vocabulary acquisition is influenced by multiple interconnected factors. Therefore, this study aims to explore whether there is an interaction effect between task-induced involvement and time on task on initial incidental vocabulary acquisition and retention. Furthermore, previous studies in the ILH framework have predominantly focused on participants with intermediate to advanced proficiency levels, while paying limited attention to those with elementary proficiency levels (Silva & Otwinowska, 2018). However, vocabulary acquisition plays a vital role in early language learning stages (Laubscher & Light, 2020; Silva & Otwinowska, 2018). Thus, this study focuses on Chinese ESL middle school students as research participants, to provide practical recommendations and guidance for L2 vocabulary acquisition, as well as insights into teachers’ instructional practices.
Objectives of the Study
By examining the interplay between task-induced involvement and time on task, this study seeks to contribute to a better understanding of the factors influencing vocabulary acquisition and inform instructional practices in L2 vocabulary acquisition.
Research Question
The primary research question addressed in this study is:
How do the task-induced involvement and time on task impact incidental L2 vocabulary acquisition in word-focused reading tasks?
Method
The study utilized a three-way repeated measures ANOVA with a 3 [task-induced involvement: reading and gap-filling (1, 0, 1), reading and sentence-making (1, 0, 2), reading and translation (1, 1, 2)] × 2 [time on task: controlled, uncontrolled] × 2 [post-test time: immediate, delayed] design. Post-test time was treated as a within-subject repeated measure, while the IL and time on task were considered as between-subject independent variables.
Participants
The participants were 256 eighth-grade ESL students from five intact classes at a middle school in China. Their ages ranged from 13 to 15. The five classes of participants shared similar English educational settings and had received formal English instruction for 6 years. None had spent more than 2 months in an English-speaking country. Initially, there were 263 participants; however, some were excluded from the analysis due to incomplete tasks or tests.
To assess participants’ prior knowledge, their scores from the most recent English final exam were compiled as a comprehensive measure of overall L2 attainment. Additionally, the Vocabulary Size Test (VST) (Nation & Beglar, 2007) and the Vocabulary Knowledge Scale (VKS) (Paribakht & Wesche, 1997) were administered to assess vocabulary breadth and prior familiarity with the target vocabulary. Mean scores and standard deviations for the three tests across the five classes are presented in Table 1, with the maximum total scores for the English final exam and VKS being 120 and 50 points, respectively. The VST serves as an estimate of the number of words mastered by learners. Subsequent sections will provide detailed explanations of the VST and the VKS. Employing a one-way ANOVA with preceding tests for normality (p > .05) and homogeneity of variance (p > .05), we found no statistically significant differences among the five classes in terms of overall L2 achievement [F(4, 251) = 0.70, p = .594], vocabulary size [F(4, 251) = 0.14, p = .967] or prior knowledge of the target vocabulary [F(4, 251) = 0.069, p = .991].
Table 1. Descriptive Statistics of Participants’ Prior Knowledge.
The five classes were randomly assigned to five distinct treatment groups, denoted as Group Gap-fill A (N = 52, Mage = 13.67), Group Gap-fill B (N = 53, Mage = 13.55), Group Sentence-making A (N = 51, Mage = 14.19), Group Sentence-making B (N = 50, Mage = 13.80), and Group Translation (N = 50, Mage = 14.02). Subsequent sections will provide detailed explanations of the assigned tasks for each group.
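As a quick consistency check, the group sizes listed above can be tallied against the total sample and against the degrees of freedom reported for the one-way ANOVAs, F(4, 251). The sketch below uses only the reported Ns; the variable names are illustrative.

```python
# Group sizes as reported for the five treatment groups.
group_sizes = {
    "Gap-fill A": 52,
    "Gap-fill B": 53,
    "Sentence-making A": 51,
    "Sentence-making B": 50,
    "Translation": 50,
}

n_total = sum(group_sizes.values())  # total number of participants
k = len(group_sizes)                 # number of groups

# Degrees of freedom for a one-way between-subjects ANOVA.
df_between = k - 1
df_within = n_total - k

print(f"N = {n_total}, F({df_between}, {df_within})")
```

The tally gives N = 256 and degrees of freedom of 4 and 251, matching the values reported for the prior-knowledge comparisons.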
Instruments
Vocabulary Size Test
The Vocabulary Size Test (VST), developed by Nation and Beglar (2007), is a tool for assessing English vocabulary knowledge. The test samples from the first to the 14th 1,000 word families of English, as compiled by Nation (2006) using the British National Corpus (BNC). These word families are categorized into 14 levels based on their frequency, with each level comprising 1,000 word families. To construct the test, Nation and Beglar randomly selected ten word families from each level. Participants are presented with a word-meaning selection test, in which they choose the correct meaning for each word. The total score is multiplied by 100 to estimate learners’ total receptive vocabulary size.
Aside from the monolingual version, bilingual versions of the VST have been developed for various languages, including Chinese, Japanese, Korean, Vietnamese, Russian, and Persian (Elgort & Warren, 2014; Nguyen & Nation, 2011; Zhao & Ji, 2018). These versions incorporate multiple-choice options in the native language of the test takers. Considering that our participants were English beginners, we utilized the bilingual English-Mandarin version of the test. Moreover, research by Zhao and Ji (2018) has demonstrated the effectiveness of both the complete bilingual English-Chinese version and a subset of its questions in assessing vocabulary size. In line with the Chinese curriculum, which stipulates that junior high school students should acquire a vocabulary of 1,600 words from the textbook and an additional 100 to 300 words beyond the textbook by graduation, our study employed the first three levels of the VST English-Chinese version. This selection included a total of 30 words to measure participants’ vocabulary size according to these established standards.
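The scoring rule can be illustrated with a short sketch. It assumes that the same multiply-by-100 scaling is applied to the 30-item subset (ten items for each of the first three 1,000-family levels); the function name and the example score of 17 are hypothetical and are not data from the study.

```python
ITEMS_PER_LEVEL = 10      # ten word families sampled per 1,000-family level
FAMILIES_PER_ITEM = 100   # so each item stands for 100 word families


def estimate_vocabulary_size(correct: int, levels_used: int = 3) -> int:
    """Estimate receptive vocabulary size from the number of correct VST items."""
    max_items = ITEMS_PER_LEVEL * levels_used
    if not 0 <= correct <= max_items:
        raise ValueError(f"correct must be between 0 and {max_items}")
    return correct * FAMILIES_PER_ITEM


# Hypothetical learner with 17 of 30 items correct.
example_estimate = estimate_vocabulary_size(17)
# Ceiling of the three-level version used here.
max_estimate = estimate_vocabulary_size(30)

print(example_estimate, max_estimate)
```

With three levels the estimate is capped at 3,000 word families, which comfortably covers the curriculum target of roughly 1,700 to 1,900 words.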
Vocabulary Knowledge Scale
To assess participants’ word-learning growth, we adapted the Vocabulary Knowledge Scale (VKS) developed by Paribakht and Wesche (1993) as our measurement tool. The VKS exhibits high reliability, with a Cronbach’s alpha of 0.89, and effectively captures different levels of vocabulary mastery (Wesche & Paribakht, 1996). It is essential to note that word knowledge is not a dichotomous construct in which individuals either know a word or do not (Schmitt, 2014). Instead, it exists along a continuum ranging from receptive knowledge to productive knowledge (Teng & Zhang, 2021). The VKS evaluates participants’ familiarity with words, their understanding of word meanings, and their ability to use words accurately in terms of grammar and semantics, covering both receptive and productive knowledge.
We employed the Chinese version of the VKS as both a pre-test and a post-test, which helps mitigate potential performance issues stemming from language factors. However, it should be considered that the pre-test may direct learners’ attention to the target words during the training phase. To minimize the test-retest effect, the pre-test was administered 3 weeks before the experiment, taking only 5 min at the beginning of a class. Students were unaware of the test’s purpose, and regular teaching activities commenced immediately afterward, leaving no time for students to look up words; additionally, a three-week break between the pre-test and post-test helped minimize deliberate memorization of the target words through dictionary consultation or discussions with classmates outside of class.
The VKS incorporates detailed scoring criteria. In our study, VKS scores ranged from 1 to 5 points for each target word, resulting in a maximum total score of 50 points (with a total of 10 target words).
Reading Materials and Target Words
Three potential reading materials were carefully selected from the widely adopted English textbook Go for It (ninth-grade), which is commonly used in Chinese middle schools. These texts were deliberately selected for their moderate length and the absence of a requirement for specific cultural background knowledge, ensuring that they were accessible to the participants. To further validate the appropriateness of these texts, five proficient English teachers from grade 8 independently evaluated them and confirmed that the texts aligned with the grammatical scope outlined in the eighth-grade curriculum and that vocabulary would likely be the primary challenge for eighth-grade students. Two of the selected materials were titled “Rethink, Reuse, Recycle!” and “Save the Sharks!”; the third, which had no specific title, addressed the topic of middle school graduation ceremonies in Unit 14.
One month before the experiment, the participants selected for the pilot study were asked to identify any unknown words encountered within the three potential materials. The three texts, namely “Rethink, Reuse, Recycle!,” “Save the Sharks!,” and the text in Unit 14, consisted of 341, 217, and 319 words, respectively. It was found that, on average, 2.77%, 3.68%, and 1.39% of the words within these texts, respectively, were unknown to the participants. Prior research has indicated that learners need to comprehend around 98% of the words within a given text to attain sufficient understanding (Hu & Nation, 2000). Therefore, in line with this finding, the text from Unit 14 was chosen as the reading material for this study. It is important to note that when calculating the percentage of unknown words within each potential reading material, 10 words that were unfamiliar to all participants were regarded as potential target words and excluded from the calculation. This decision was based on the fact that the target words were either explicitly glossed or the participants were permitted to consult a dictionary during the completion of reading tasks.
Subsequently, 10 target words were deliberately selected from the text in Unit 14, ensuring that all participants were unfamiliar with them. These target words included “attend,” “ceremony,” “knowledge,” “support,” “consider,” “ahead,” “ability,” “responsible,” and “separate.”
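The text-selection reasoning can be made explicit with a small sketch that converts the reported unknown-word percentages into lexical coverage and compares each text against the 98% criterion. The dictionary and variable names are illustrative; the percentages are those reported above.

```python
# Mean percentage of unknown words per candidate text, as reported above
# (potential target words excluded from the calculation).
unknown_pct = {
    "Rethink, Reuse, Recycle!": 2.77,
    "Save the Sharks!": 3.68,
    "Unit 14 text": 1.39,
}

COVERAGE_THRESHOLD = 98.0  # approximate coverage needed (Hu & Nation, 2000)

# Lexical coverage is the share of running words that learners already know.
coverage = {title: round(100.0 - pct, 2) for title, pct in unknown_pct.items()}

# Keep only the texts that meet the coverage criterion.
eligible = [title for title, cov in coverage.items() if cov >= COVERAGE_THRESHOLD]

for title, cov in coverage.items():
    print(f"{title}: {cov:.2f}% known")
print("Eligible:", eligible)
```

Only the Unit 14 text clears the threshold, at about 98.61% coverage, which is consistent with its selection as the reading material.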
Task-Induced Involvement and Time on Task
Table 2 presents an overview of the five treatment groups, highlighting the two factors that set them apart: task-induced involvement and time on task. We first describe the tasks and the involvement each induces. In this study, three types of tasks were designed: a reading and gap-fill task (referred to as the gap-fill task), a reading and sentence-making task (referred to as the sentence-making task), and a reading and translation task (referred to as the translation task). The first two tasks included a glossed table of target words appended at the end of the reading material. This glossed table provided phonetic transcriptions, parts of speech, Chinese meanings, and illustrative example sentences for each of the 10 target words. In contrast, the translation task did not include a glossed table but allowed participants to refer to a dictionary. The dictionary used by the participants was the widely adopted “New English-Chinese and Chinese-English Dictionary” published by the Commercial Press in China. This dictionary caters to the needs of Chinese middle school students and consists of two sections: one providing detailed information about English words, including phonetic transcriptions, parts of speech, Chinese meanings, and illustrative example sentences for each word (the glossed table in the other two reading tasks was extracted from this section), and another listing the English equivalents of Chinese words, enabling students to look up English words with the same literal meaning through Chinese. Based on Laufer and Hulstijn’s (2001) examination of the IL of diverse reading tasks, all three types of tasks are expected to induce a moderate need, as this need is imposed by the external agent of the tasks themselves.
The gap-fill task, with the presence of glossed target words, does not require extensive search efforts but involves moderate evaluation, as participants need to select the most appropriate word from a provided pool of options based on contextual clues. Similarly, the sentence-making task, with glossed target words, does not necessitate search efforts but requires strong evaluation, as participants need to decide how additional words will combine with the target words to create original sentences. In contrast, the translation task induces search as participants consult the dictionary to overcome unfamiliar vocabulary, and it also induces strong evaluation, as the target words in the translation task need to be assessed concerning suitable collocations within the original sentences. Consequently, in this study, the translation task is expected to invoke the highest level of ILs, followed by the sentence-making task, while the gap-fill task is anticipated to have the lowest ILs.
Table 2. The Design of Word-Focused Reading Tasks.
An additional differentiating factor among the five treatment groups is the manipulation of time on task. The experiment encompassed two conditions concerning the time allocated for the tasks. When time on task was left uncontrolled, task durations varied across the three task types; these groups were gap-fill A, sentence-making A, and the translation group. Conversely, when time on task was controlled, no significant differences were observed in the duration of the three task types; these groups were gap-fill B, sentence-making B, and the translation group. The pilot experiment revealed that, in the absence of time control, the translation task demanded the longest duration, whereas with time control the durations of the gap-fill B and sentence-making B tasks closely approximated that of the translation task. Consequently, only one group was assigned to the translation task, and it served in both conditions.
Word-Focused Reading Tasks
This section provides an overview of the reading tasks assigned to each group. In the gap-fill tasks, two groups of participants were presented with either 10 gaps (task A) or 30 gaps (task B) to be filled. For task A (N = 52), the 10 target words were removed from the text, and a list of these words, along with five distractors, was provided at the bottom. Participants were instructed to select the appropriate words from the list and fill in the corresponding gaps in the text. Because the list presented each word in its base form, participants also had to consider the contextual cues and supply the correct form of the word. Task B (N = 53) expanded upon task A by introducing an additional 20 gaps in 20 separate sentences that were unrelated to the main text. The design and instructions for task B were otherwise identical to those of task A. It is worth noting that gap-fill tasks in texts (Kim, 2008; Yang & Cao, 2021) and in independent sentences (San Mateo-Valdehita & de Diego, 2021; Yang & Cao, 2021) are widely employed in research on task-induced involvement. In this study, both gap-fill tasks were expected to induce a moderate need, no search, and moderate evaluation.
Regarding the sentence-making tasks, participants were instructed to generate either 10 sentences (task A) or 15 sentences (task B) using the 10 target words. In task A (N = 51), participants were specifically instructed not to directly copy sentences from the original text or utilize example sentences from the glossed table. Task B (N = 50) built upon task A by allowing participants to select an additional five target words for sentence-making, with the condition that the content of the sentences should not overlap.
In the translation task (N = 50), sentences containing the target words were extracted from the original text, and their corresponding Chinese translations were provided. These translated Chinese sentences closely reflected the content of the original English sentences. Participants were instructed to read the text and translate the given Chinese sentences into English using the provided target words. For instance:
(1) ____________ (responsible) (做出明智的选择,并对你的决定负责。)
During the translation task, participants were explicitly instructed to employ the target word “responsible” to translate the Chinese sentence enclosed in parentheses. The corresponding English translation of the sentence is “Choose wisely and be responsible for your decisions and actions.” Some sentences in the original text were challenging or overly complex, so suitable simplifications were introduced to enhance task comprehension and feasibility.
Procedure
One week before the formal experiment, a pilot study was carried out involving 53 students who would not participate in the formal experiment. The students completed five different tasks: gap-fill A (N = 11), gap-fill B (N = 10), sentence-making A (N = 11), sentence-making B (N = 10), and translation (N = 11). The participants were asked to keep track of how much time they spent on each task. The analysis of the results revealed that the time taken for translation (M = 20.33) was significantly higher than that for sentence-making A (M = 12.86) (p < .001). Furthermore, the time taken for sentence-making A tasks was significantly higher than that for gap-fill A (M = 6.17) (p < .001). In contrast, there was no significant difference in the time taken for gap-fill B (M = 20.63), sentence-making B (M = 20.53), and translation (M = 20.33) (p = .902). The average time mentioned above is in minutes. Therefore, it is predicted that there will be no significant difference in the time taken for tasks among the three groups under controlled time conditions during the formal experiment.
Before the formal experiment, a pre-test was conducted 3 weeks in advance with five treatment groups. The pre-test took place during regular class hours without the participants being informed of its purpose. The formal experiment was conducted during the self-study class period, with each class being instructed by an English teacher. The experimental procedure and purpose were not disclosed to the participants. Initially, the participants completed the VST, followed by the corresponding reading tasks assigned to each group. The teacher supervised the students in each group to ensure thorough completion of the tasks, and each student recorded the time taken to finish their reading tasks. Finally, an unannounced immediate post-test for the target words was administered. In line with previous research indicating that delayed tests conducted after 2 weeks or more can provide relatively stable retention rates (Laufer, 2011), a delayed test for the target words was conducted 2 weeks after the experiment. To prevent participants from rushing through certain questions or not taking the tests seriously due to time constraints, no time limits were imposed for any of the tests or tasks conducted during the experiment.
Ethical Considerations
The researchers attended to ethical concerns, as advised. We explained the study’s purpose to those who took part and made sure they were under no obligation to participate. Moreover, we assured them that the information would be used only for analysis. To protect the participants’ anonymity during data analysis, we assigned them numerical codes.
Consent
The participants gave their consent and were informed that their answers would be kept confidential and that their performance on any assignments would not affect their final grades for this course. Those who agreed to participate in the study voluntarily filled out a consent form indicating that they agreed to have their responses reported.
Data Analysis and Results
Firstly, a one-way ANOVA was conducted on the time duration of completing the reading tasks of the five treatment groups. Preceding the ANOVA, tests for normality (p > .05) and homogeneity of variance (p > .05) were conducted. The results revealed significant differences in the time taken to complete gap-fill A, sentence-making A, and translation tasks (p < .001). Specifically, participants spent significantly more time on translation tasks (M = 20.48) than on sentence-making A tasks (M = 13.59), and participants spent significantly more time on sentence-making A tasks than on gap-fill A tasks (M = 7.53) (translation > sentence-making A > gap-fill A).
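For readers who want to see what the one-way ANOVA on completion times computes, the following is a minimal sketch of the F statistic written in plain Python. The function name is hypothetical and the minute values are invented for illustration only; they are not the study's data, and this is not necessarily the software procedure the authors used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented completion times in minutes (illustration only, not study data).
illustrative_times = [
    [7.0, 8.0, 7.5, 8.5],      # a short task
    [13.0, 14.0, 13.5, 14.5],  # a medium task
    [20.0, 21.0, 20.5, 21.5],  # a long task
]
f_stat, df_b, df_w = one_way_anova_f(illustrative_times)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to its degrees of freedom corresponds to the pattern reported for the uncontrolled-time groups, whereas group means that nearly coincide produce an F close to zero, as in the controlled-time comparison.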
In contrast, no significant differences were found in the time taken to complete gap-fill B, sentence-making B, and translation tasks (p = .935), with participants spending approximately the same amount of time on all three tasks (M = 20.72 for gap-fill B, M = 20.59 for sentence-making B, and M = 20.48 for translation). Therefore, when time on task was not controlled, the three groups spent varying amounts of time completing their respective tasks; when time on task was controlled, the three groups spent roughly equal amounts of time completing their tasks. Then, a three-way repeated measures ANOVA was conducted after confirming normal distribution (p > .05) of scores and homogeneity of variances (p > .05) in preceding tests. Table 3 displays the results, indicating a nonsignificant main effect of task-induced involvement [F(2,300) = .614, p = .542,
Results of the Three-Way Repeated Measures ANOVA.
Descriptive Statistics in a Three-Way Repeated Measures ANOVA.
Effects of Task-Induced Involvement Moderated by Time on Task
Although the main effect of task-induced involvement did not reach statistical significance, it was overshadowed by a significant interaction effect between task-induced involvement and time on task. Subsequent post-hoc analysis revealed that the direction of task-induced involvement effects on vocabulary acquisition was affected by time on task. Specifically, when time on task was not controlled, the translation task demonstrated higher vocabulary gains than the sentence-making A and gap-fill A tasks (translation > sentence-making A ≥ gap-fill A, p < .001). Conversely, when time on task was controlled, the gap-fill B task resulted in higher vocabulary gains than the sentence-making B and translation tasks (gap-fill B > sentence-making B ≥ translation, p < .001).
Moreover, additional simple effects tests were conducted to examine the effects of task-induced involvement under different time-on-task conditions. When time on task was not controlled, task-induced involvement significantly influenced both the immediate post-test (p = .044,
Furthermore, when time on task was controlled, the simple effects tests indicated that task-induced involvement had no significant effect on the immediate post-test (p = .160,
Based on these findings, it can be concluded that both the direction and the effect size of task-induced involvement are influenced by time on task.
Effects of Time on Task Moderated by Task-Induced Involvement
The analysis revealed that both the main effect of time on task and the interaction effect between time on task and task-induced involvement were significant. These findings indicate that time on task plays a significant role in vocabulary acquisition and its impact is moderated by task-induced involvement.
Further examination through simple effects tests demonstrated that time on task had a significant effect on both the immediate and delayed post-tests for the gap-fill groups (p = .002,
Based on these findings, it can be concluded that the direction of the effect of time on task remains consistent and is not moderated by task-induced involvement. The effect size of time on task, however, is influenced by task-induced involvement.
Effects of Post-Test Time Moderated by Task-Induced Involvement and Time on Task
The analysis revealed a significant main effect of post-test time, indicating a notable difference between immediate and delayed post-test results. Furthermore, the interaction effect between task-induced involvement, time on task, and post-test time was also significant, highlighting the moderating role of task-induced involvement and time on task in the vocabulary learning process.
Simple effects tests were conducted to examine the impact of post-test time on each task-induced involvement condition and time on task condition. The results indicated a significant effect of post-test time across all conditions: gap-fill A (p < .001,
The moderating effect of task-induced involvement and time on task on post-test time was further observed through vocabulary retention rates, which varied across different time on task and task-induced involvement conditions. Notably, the effect size of post-test time is inversely related to vocabulary retention rates. In other words, the smaller the difference between immediate and delayed post-test scores, the higher the vocabulary retention rate. Therefore, the vocabulary retention rates for each task group were ranked as follows: gap-fill B > translation > sentence-making B > sentence-making A > gap-fill A. This suggests that when time on task was not controlled, tasks with higher ILs demonstrated higher vocabulary retention rates (translation > sentence-making A > gap-fill A). Conversely, when time on task was controlled, the task with the lowest ILs exhibited the best vocabulary retention rates (gap-fill B > translation > sentence-making B).
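The relationship between score loss and retention described above can be illustrated with a short sketch. It assumes that the retention rate is computed as the delayed post-test mean divided by the immediate post-test mean, which is one common definition and may differ from the formula used in this study. The function name, group labels, and scores are hypothetical placeholders, not reported values.

```python
def retention_rate(immediate_mean, delayed_mean):
    """Assumed definition: share of the immediate score kept at the delayed test."""
    if immediate_mean <= 0:
        raise ValueError("immediate_mean must be positive")
    return delayed_mean / immediate_mean


# Hypothetical (immediate, delayed) mean scores; placeholders, not study data.
hypothetical_scores = {
    "task_x": (8.0, 6.0),  # loses 2 points, retention 0.75
    "task_y": (6.0, 3.0),  # loses 3 points, retention 0.50
}

# Rank groups from highest to lowest retention rate.
ranked = sorted(
    hypothetical_scores,
    key=lambda name: retention_rate(*hypothetical_scores[name]),
    reverse=True,
)
print(ranked)
```

In this toy case the group with the smaller drop between the two post-tests ranks first, mirroring the inverse relation between the effect of post-test time and retention.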
To summarize the findings of this study, it can be concluded that task-induced involvement and time on task interacted with each other, significantly influencing the incidental L2 vocabulary acquisition among Chinese middle school students. The interaction between the two factors was evident in their reciprocal influence on effect size, and in the moderating effect of time on task on the direction of task-induced involvement’s impact on L2 incidental vocabulary acquisition. Specifically, when time on task was not controlled, tasks with higher ILs and longer durations exhibited better vocabulary gains (translation > sentence-making A ≥ gap-fill A). Conversely, when time on task was controlled, tasks with lower ILs demonstrated a more effective vocabulary acquisition (gap-fill B ≥/> sentence-making B ≥ translation). Additionally, the effects of task-induced involvement and time on task on vocabulary retention rates were also observed. Similarly, when time on task was not controlled, tasks with higher ILs showed higher vocabulary retention rates compared to tasks with lower ILs (translation > sentence-making A > gap-fill A). In contrast, when time on task was controlled, tasks with the lowest ILs exhibited the highest vocabulary retention rate (gap-fill B > translation > sentence-making B).
Discussion
This study aims to investigate the impact of task-induced involvement and time on task on L2 vocabulary gains. This examination was conducted through the application of a three-way repeated measures ANOVA, featuring a 3 (task-induced involvement) × 2 (time on task) × 2 (post-test time) design. A notable discovery in our findings is the identification of a two-way interaction effect between task-induced involvement and time on task. Additionally, a three-way interaction effect surfaced among task-induced involvement, time on task, and post-test time. These outcomes underscore the intricate dependencies shaping incidental L2 vocabulary acquisition, wherein both task-induced involvement and time on task exhibit sustained effects over time. Crucially, our findings highlight that the impacts of both task-induced involvement and time on task on incidental vocabulary acquisition are not independent. They mutually constrain each other, implying that neither should be disregarded.
Prior research has consistently underscored the substantial impact of task-induced involvement on incidental vocabulary learning (Ansarin & Khabbazi, 2021; Rahmani et al., 2018; Taheri & Rezaie Golandouz, 2021). Our study extends these findings by affirming that time on task is also a critical factor when assessing the overall efficacy of vocabulary learning tasks. Moreover, it shows that the influence of task-induced involvement on vocabulary acquisition is significantly moderated by time on task. This finding challenges Hulstijn and Laufer’s (2001) and Hill and Laufer’s (2003) arguments that task involvement load, not the amount of time spent on task, predicts learning. The present findings also stand in contrast to the results reported by Yanagisawa and Webb (2021) and Ansarin and Khabbazi (2021), who did not observe a main effect of time on task (Ansarin & Khabbazi, 2021) or any moderating effect of time on task for the IL (Yanagisawa & Webb, 2021). One possible explanation for this discrepancy lies in differences in the measurement and control of time on task. Yanagisawa and Webb’s analysis relied on the approximate times or periods reported in 26 studies to calculate time on task, while in the current research, the duration spent by each participant to complete the corresponding task was recorded, providing a more precise measure. Additionally, Ansarin and Khabbazi’s study incorporated time on task as a covariate in their design, whereas the present study equalized the duration of the tasks to control for this variable.
As the results of this study show, time on task significantly affects the effect of task-induced involvement on L2 incidental vocabulary acquisition. When exploring the impact of task-induced involvement on incidental vocabulary acquisition, it is therefore essential to control for time on task. It is worth noting that, owing to uncertainties regarding the effects of time on task, some previous studies did not control for it (Bao, 2015; Hu & Nassaji, 2016; Silva & Otwinowska, 2018; Teng & Zhang, 2023), while others recognized its potential influence on experimental outcomes and chose to implement controls (Ansarin & Khabbazi, 2021; Folse, 2006; Keating, 2008; Kim, 2008). This discrepancy in handling time on task may contribute to the inconsistent results observed in prior research. It highlights the necessity for a standardized approach to account for time on task in future studies, ensuring a more comprehensive and reliable understanding of the factors influencing incidental vocabulary acquisition in L2 learning contexts.
Specifically, the moderating effect of time on task extends to the size and direction of task-induced involvement effects on L2 incidental vocabulary acquisition. When time on task was not controlled, the translation task yielded better results than the sentence-making and gap-fill tasks in both initial vocabulary acquisition and retention. This partially supports the ILH and is consistent with a number of previous studies (Ansarin & Khabbazi, 2021; Rahmani et al., 2018; Taheri & Rezaie Golandouz, 2021; Zou & Teng, 2023), suggesting that the ILH applies to Chinese ESL middle school students, albeit in a less stringent manner. When time on task was controlled, the gap-fill task yielded better results than the sentence-making and translation tasks in both initial vocabulary learning and retention. This finding contradicts the ILH but is consistent with previous research by Folse (2006) and Kim (2008), indicating that tasks with low ILs are more effective for vocabulary acquisition when the time devoted to the task is taken into account. The variable of time on task changes both the amount of contact time and the number of times learners are exposed to the target words. On the one hand, when time on task was not controlled, the translation task’s superior outcome was possibly due to its higher IL and longer exposure. Spending more time on a task allows learners to better connect newly encountered vocabulary with their existing foreign language lexical system, leading to stronger performance in the translation task (Huang et al., 2013). The superior vocabulary retention observed in the translation task may also be attributed to its strong evaluation component, which has been identified as a significant factor in vocabulary acquisition performance according to previous research (Teng & Zhang, 2021; Yanagisawa & Webb, 2021; Zou & Teng, 2023).
On the other hand, when time on task was controlled, all tasks had the same duration, but gap-fill tasks provided more retrieval opportunities for target words compared to the sentence-making and translation tasks. This may explain the superior performance of gap-fill tasks. Repeated encounters with a word during reading or through gap-fill tasks enhance learners’ attention and retrieval of learned information about the word from previous encounters. The ILH exclusively centers on the process of learning unknown words, neglecting to incorporate retrieval opportunity as a crucial component (Laufer, 2020); however, including it as a variable enhances the ability to predict incidental vocabulary acquisition (Nation & Webb, 2011). The frequency of exposure to target words seems to be the crucial factor for acquisition (Folse, 2006; Yanagisawa & Webb, 2021). Another reason for the effectiveness of vocabulary acquisition through gap-fill exercises could be hierarchical organization. Hierarchical organization involves structuring information into units and sub-units, establishing relationships between them, and aiding in systematic searching within memory (Zou, 2017). Participants who connected target words coherently by organizing their hierarchical relationships demonstrated the use of the hierarchical organization. However, sentence-making lacks hierarchical organization as the uses of target words are independent and lack systematicity between contexts. Similarly, the translation contexts in this study also exhibit relatively weak systematicity. Practicing individual sentences out of context proves to be less conducive to the long-term retention of acquired information (Llach, 2009). Conversely, gap-fill exercises require learners to organize the entire article and construct a context that hierarchically connects all the target words, providing memory cues and aiding information retrieval (Anderson, 2010).
In general, when time on task was not controlled, tasks characterized by higher ILs exhibited superior vocabulary gains. Conversely, under controlled time conditions, tasks featuring lower ILs demonstrated more effective vocabulary acquisition. However, it is essential to note exceptions to this trend. The performance of the sentence-making group did not yield the expected results. When time was not controlled, there was no significant difference between the participants’ performance in the sentence-making group and the gap-fill group. Similarly, when time was controlled, the participants’ performance in the sentence-making group was not significantly better than that of the translation group. This outcome could potentially be attributed to some participants not meeting the requirements of the experimental design for sentence-making tasks. While nearly all students completed the sentence-making exercises under teacher supervision, a subset of students either copied the original sentences (excluding data from those who copied all the original sentences) or merely replaced a few nouns or pronouns within the original sentences. According to the Output Hypothesis (Swain, 1985), sentence-making involves a process that includes semantic and syntactic processing, as well as hypothesis-making, verification, and reflection on the output language to effectively facilitate language acquisition. Copying and vocabulary substitution do not require syntactic processing and lack the elements of hypothesis-making, verification, and reflection on the output language. Consequently, some students may not have achieved the assumed depth of processing for vocabulary, as proposed by the Output Hypothesis for sentence-making tasks. The experimental results revealed that five students either did not complete the sentence construction task or copied the original text entirely. 
Upon contacting these students, researchers inquired about their reasons, and the students mentioned difficulties such as “I don’t know how to make a sentence” and “I didn’t have enough time, but since I saw everyone else turning in their papers, I also submitted mine.” These responses indicate that the sentence-making tasks posed greater difficulties for some students. Another factor that potentially influenced the results is the disparity in the frequency of target word retrievals between the sentence-making and translation tasks when time on task was controlled. The sentence-making task involved 1.5 times as many encounters with target words as the translation task (15 sentences versus 10). However, this difference did not result in any significant variation in vocabulary acquisition between the two tasks. This may be attributed to the limited range of exposure frequencies represented in the tasks used in this study. Previous research examining the impact of frequency on incidental vocabulary acquisition during reading has indicated that numerous exposures are necessary for substantial learning to take place (Elgort & Warren, 2014; Pellicer-Sánchez & Schmitt, 2010; Yanagisawa & Webb, 2021). A higher number of repetitions may be required to have a meaningful effect on long-term retention. Further investigation is warranted to ascertain the extent to which the frequency of exposure to target words influences incidental vocabulary acquisition in reading tasks.
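The retrieval-frequency argument can be made concrete with a small worked calculation. The sketch below assumes that every gap, sentence, or translation item in the controlled-time tasks requires retrieving exactly one target word, which is a simplification introduced here rather than a measurement reported in the study; the variable names are illustrative.

```python
TARGET_WORDS = 10  # number of target words in the study

# Items per task under controlled time, taken from the task descriptions.
# Assumption: each item involves exactly one target-word retrieval.
items_per_task = {
    "gap-fill B": 30,         # 10 gaps in the text plus 20 in separate sentences
    "sentence-making B": 15,  # 10 sentences plus 5 additional sentences
    "translation": 10,        # one sentence per target word
}

# Average number of retrievals per target word in each task.
retrievals_per_word = {
    task: items / TARGET_WORDS for task, items in items_per_task.items()
}

for task, rate in retrievals_per_word.items():
    print(f"{task}: {rate:.1f} retrievals per target word")
```

On these assumptions the gap-fill B task offers about three retrievals per word, against 1.5 for sentence-making B and one for translation, which is the asymmetry the discussion appeals to.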
Limitations and Implications
It is important to acknowledge the limitations of the study. The inclusion of a sentence-making task, which proved to be difficult for participants, may have influenced the experimental results. Future studies should consider selecting tasks with moderate difficulty to address this limitation. Additionally, all participants in the study were Chinese ESL learners. Replication studies with subjects from different L1 or L2 backgrounds are needed to obtain more reliable and generalizable results. Furthermore, the focus of this study was on output tasks, as previous research has shown that task type (output or input) affects the predictive power of the ILH (Bao, 2015; Taheri & Rezaie Golandouz, 2021; Yanagisawa & Webb, 2021). Therefore, the applicability of the findings to input tasks requires further exploration.
This study has implications for language researchers, teachers, and learners. Time on task plays a crucial moderating role in shaping the effects of task-induced involvement on vocabulary acquisition. Neglecting the influence of time on task may account for the variations observed in prior research. Thus, we recommend that researchers treat time on task as a variable influencing research outcomes and exert control over it, enabling a more accurate assessment of the impact of task-induced involvement on vocabulary acquisition. Integrating time on task as a variable within the framework of the ILH holds promise for enhancing the ILH’s predictive efficacy regarding incidental vocabulary acquisition. However, further investigation is necessary to determine appropriate quantification methods that accurately capture the influence of time on task. Moreover, the study has practical implications for language teaching and learning. It highlights the importance of considering both task-induced involvement and time on task when selecting appropriate tasks to promote L2 vocabulary acquisition. When the time available for a task is fixed, teachers are advised to prioritize tasks with low ILs, such as gap-fill exercises, which in this study were more effective in facilitating incidental vocabulary acquisition than tasks with higher ILs, such as sentence-making and translation, completed in the same amount of time. Leveraging these findings can optimize vocabulary learning strategies, fostering efficiency and effectiveness in the classroom. It is important to note that this study specifically examines the effects of different reading tasks on vocabulary acquisition and does not encompass all aspects of language abilities.
Conclusion
This study contributes valuable insights into the intricate relationship among task-induced involvement, time on task, and incidental L2 vocabulary acquisition from word-focused reading tasks. The observed interaction effect between task-induced involvement and time on task sheds light on the nuanced manner in which they collectively exert a significant influence on both the initial acquisition and retention of incidental L2 vocabulary. To elaborate, tasks characterized by higher Involvement Loads (ILs) generally result in improved acquisition outcomes, albeit requiring a more extended completion time. In contrast, tasks with lower ILs demonstrate superior vocabulary retention within the same time frame.
This study challenges the ILH, which asserts that task-induced involvement predicts learning independently of the time spent on the task. The empirical evidence presented in this study emphasizes the substantial role of time on task in incidental L2 vocabulary acquisition and indicates that the impact of task involvement load on this process is moderated by time on task. Therefore, we propose that integrating time on task into the ILH framework is expected to enhance its predictive power, although further exploration is necessary to precisely quantify this effect. This finding also suggests that in future experimental designs for incidental L2 vocabulary acquisition, time on task should be treated as a variable to control. Moreover, this holistic understanding is also crucial for educators and curriculum designers, as it provides a more nuanced perspective on the multifaceted nature of language learning dynamics.
Footnotes
Acknowledgements
None.
Ethical Approval
The research reported is in compliance with Zhengzhou University’s ethical standards involving human participants.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
Data will be made available on request.
