Abstract
This study investigates the effects of task interactivity (monologic versus dialogic tasks) and willingness to communicate (WTC) on the speech fluency of 88 Hong Kong undergraduate English as a second language learners. Participants, categorised into high- and low-WTC groups based on a WTC questionnaire, completed either a monologic or a dialogic collaborative “spot-the-picture-difference” task. Fluency was assessed through speech rate, frequency of filled pauses and unfilled pauses at different locations, and repairs. Results indicated that learners with high WTC demonstrated significantly faster speech rates and produced fewer repairs than those with low WTC. Monologic tasks elicited significantly more mid-clause pauses and slightly more filled pauses compared to dialogic tasks. Crucially, a significant interaction effect revealed that the disparity in speech rate between high- and low-WTC learners was more pronounced during monologic tasks. This suggests that the detrimental effect of lower WTC on speed fluency is particularly marked when learners engage in independent speaking, while dialogic tasks may offer a more supportive environment for learners with lower WTC to maintain a fluent speech rate. These findings highlight the complex interplay among affective factors, task design and second language oral performance, with implications for language pedagogy and assessment.
Introduction
In second language (L2) acquisition, speech production remains a critical yet challenging skill for learners, influenced by both cognitive and affective factors (Lambert et al., 2023). Among these, willingness to communicate (WTC) has emerged as a pivotal construct, reflecting a learner's readiness to engage in L2 communication at a specific moment (MacIntyre et al., 1998). Its influence on L2 communication has long intrigued researchers in the field (e.g., Aubrey and Yashima, 2023; Cao, 2011; Cutrone and Siewkee, 2024). However, the interplay between WTC and fluency, a key dimension of both L2 communication and proficiency, is under-researched. Previous studies have primarily examined WTC or fluency in isolation, with limited focus on how task interactivity, namely monologic versus dialogic tasks, affects these constructs (Wood, 2016). This study is an attempt to address these gaps by investigating how self-reported WTC influences L2 fluency among Hong Kong English learners across monologic and dialogic tasks.
Literature review
L2 fluency
Fluency in L2 speech is a multi-dimensional construct, typically defined as the smooth, rapid and efficient production of speech under real-time constraints (Segalowitz, 2010). Utterance fluency, the focus of this research, is broadly categorised into three interrelated dimensions: speed fluency (rate and density of speech production), breakdown fluency (frequency, duration and location of pauses) and repair fluency (instances of false starts, replacements, reformulation and repetitions) (Skehan, 2003; Tavakoli and Skehan, 2005). These dimensions represent measurable aspects of speech fluency, which are underpinned by cognitive fluency, the efficiency of the underlying linguistic and cognitive processing resources (Segalowitz, 2010; Suzuki and Kormos, 2023).
In task-based language teaching (TBLT) research, utterance fluency is most commonly operationalised through temporal measures derived from learners’ oral performances (Bui and Huang, 2018). Speed fluency is measured via speech rate (syllables/words per minute, including pauses) (Tavakoli and Wright, 2020). Breakdown fluency is indexed by silent and filled pause frequency and duration, with a growing emphasis on pause locations—mid-clause pauses are interpreted as disruptions in formulation, while end-clause pauses are linked to conceptualisation (De Jong, 2016; Tavakoli, 2011). Repair fluency is quantified through counts of false starts, replacements, reformulation and repetitions per 100 syllables/words (Suzuki and Kormos, 2023). Composite measures such as mean length of run (syllables/words between pauses) are also used, although they conflate speed and breakdown dimensions (Bosker et al., 2013).
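The temporal measures described above can be illustrated with a short sketch. It assumes a coded speech sample is already available as simple counts; the `SpeechSample` container, its field names and the example values are hypothetical and are not taken from any particular coding tool or dataset.

```python
from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Hypothetical container for one coded speech sample."""
    words: int               # total words produced
    duration_s: float        # total speaking time in seconds, pauses included
    mid_clause_pauses: int   # silent pauses inside a clause
    end_clause_pauses: int   # silent pauses at clause boundaries
    filled_pauses: int       # e.g. "um", "er"
    repairs: int             # false starts, replacements, reformulations, repetitions


def speech_rate_wpm(sample):
    """Speed fluency: words per minute, with pause time kept in the denominator."""
    return sample.words / (sample.duration_s / 60.0)


def per_100_words(count, words):
    """Normalise a raw count (pauses or repairs) to a rate per 100 words."""
    return 100.0 * count / words


def mean_length_of_run(sample):
    """Composite measure: mean number of words between silent pauses."""
    runs = sample.mid_clause_pauses + sample.end_clause_pauses + 1
    return sample.words / runs


# Illustrative values only (not data from any study).
sample = SpeechSample(words=240, duration_s=180.0, mid_clause_pauses=12,
                      end_clause_pauses=11, filled_pauses=18, repairs=6)
print("speech rate (wpm):", speech_rate_wpm(sample))
print("mid-clause pauses per 100 words:",
      per_100_words(sample.mid_clause_pauses, sample.words))
print("mean length of run (words):", mean_length_of_run(sample))
```

Normalising counts per 100 words keeps breakdown and repair measures comparable across samples of different length, while mean length of run depends on both how much is said and how often the speaker pauses, which is why it is treated as a composite measure.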
Variation in L2 fluency performance is shaped by a confluence of cognitive, task-related and affective factors. Cognitive fluency, which comprises lexical retrieval speed, syntactic encoding efficiency and articulatory control, directly influences utterance fluency (De Jong et al., 2013; Kahng, 2020). Kahng's (2020) research demonstrates that L2-specific cognitive skills, particularly syntactic encoding speed, are strong predictors of certain disfluencies, such as the frequency of mid-clause silent pauses and self-corrections. This suggests that difficulties in rapidly accessing and assembling grammatical structures manifest directly in temporal aspects of speech. Task characteristics such as interactivity (monologic versus dialogic), planning time and content familiarity modulate fluency by altering cognitive load and resource allocation (Bui, 2014). Dialogic tasks, for instance, may reduce breakdown fluency due to shared cognitive load but can increase repair fluency due to online monitoring demands (Michel et al., 2007). Pre-task planning time as a task condition has been consistently shown to enhance L2 fluency. Planning allows speakers to pre-assemble linguistic content, reducing the cognitive load during online speech production. This typically results in faster speech rates, longer runs between pauses and a decrease in the frequency and length of mid-clause silent pauses, which are associated with lexical and syntactic encoding difficulties (Tavakoli and Skehan, 2005). Furthermore, Bui and Huang (2018) found that learners speaking on familiar topics consistently had a higher speech rate and fewer mid-clause pauses. However, such topic familiarity had no impact on repair fluency.
In addition, affective variables such as WTC interact with task conditions to influence fluency, although this interplay remains underexplored (Wood, 2016; Zhang et al., 2018). Idiodynamic case studies such as Wood (2016) investigate the complex, moment-to-moment relationship between WTC and L2 speech fluency. Cognitive struggles (e.g., difficulty retrieving vocabulary) led to lower WTC, which in turn negatively impacted fluency. Conversely, situational factors such as an interlocutor's non-verbal cues or a learner's sense of relief could raise or lower WTC, subsequently affecting fluency. Bui (2014) found that familiar topics elicited more extensive discourse (i.e., a higher total number of words produced) and phonation time than unfamiliar topics, suggesting a higher level of WTC in the L2. Nuanced findings also highlight proficiency-related differences: for intermediate learners, WTC correlates most strongly with repair fluency (fewer unnecessary self-corrections) because their linguistic competence is sufficient to act on communicative intent, whereas beginners’ fluency remains constrained by lexical/grammatical limitations even with high WTC (Dörnyei and Kormos, 2000). Despite these interesting findings, few studies have explored the causal link between WTC and L2 fluency through experimental designs, which are essential to understand how affective factors shape L2 fluency.
Willingness to communicate
WTC is broadly understood as the probability that a learner will choose to initiate spoken interaction in the L2 when the opportunity arises (MacIntyre et al., 1998; McCroskey and Baer, 1985). Early definitions treated WTC as a relatively stable, trait-like predisposition (Burgoon, 1976), but more recent conceptualisations emphasise its dynamic, situational nature (Aubrey and Yashima, 2023). MacIntyre et al.'s (1998) six-layer pyramid model locates WTC at the interface of enduring influences (e.g., personality, intergroup attitudes) and transient, moment-to-moment factors, such as state self-confidence and the immediate desire to speak to a specific interlocutor. In this view, WTC fluctuates rapidly as learners weigh approach motives (e.g., affiliation, task orientation) against restraining forces (e.g., anxiety, face-protection, cultural silence norms) (MacIntyre, 2007; Wen and Clément, 2003). Consequently, WTC is better viewed as a volitional state, an “on-the-spot” decision to speak or remain silent.
Measurement of WTC in TBLT research typically involves a mixed-methods toolkit. Quantitatively, researchers employ self-report scales derived from McCroskey and Baer's (1985) original WTC questionnaire. These instruments require learners to indicate the percentage of time (0–100%) they would choose to speak in 12 hypothetical situations (dyads, small groups, public presentations) with friends, acquaintances or strangers. Modified versions have been validated for L2 contexts (Lee and Drajati, 2020; MacIntyre et al., 2003) and for digital environments (Lee and Liu, 2024). Qualitative methods, such as stimulated recall interviews, reflective journals and classroom observations, capture the micro-level, moment-to-moment dynamics that questionnaires cannot (Cao, 2011). Together, these approaches allow researchers to triangulate learners’ stated intentions with their actual participation.
Variation in WTC during task-based L2 learning activities is shaped by an interrelated set of cognitive, affective, social and contextual factors. At the cognitive level, perceived L2 competence and real-time lexical retrieval demands directly affect state self-confidence (Zhang et al., 2018). Affective variables, especially communication anxiety and enjoyment, exert strong, sometimes non-linear influences; moderate facilitating anxiety can enhance WTC, whereas debilitating anxiety suppresses it (Dewaele and MacIntyre, 2014; MacIntyre, 1995). Social factors include group cohesiveness, interlocutor familiarity and culturally conditioned face-concerns; collectivist learners (e.g., Chinese English as a foreign language (EFL) students) often report lower WTC due to fear of losing face in front of peers or authority figures (Wen and Clément, 2003). Task characteristics further moderate these relationships: dialogic tasks that distribute cognitive load and promote peer support can raise WTC and, potentially, fluency, whereas high-stakes monologic tasks may heighten anxiety and reduce WTC (Michel et al., 2007). Finally, instructional variables, such as pre-task planning time, teacher immediacy and positive feedback, have been shown to boost WTC by lowering anxiety and enhancing perceived competence (Vongsila and Reinders, 2016; Weaver, 2005). Recognising this multi-dimensional variability is essential for designing task-based activities that not only elicit fluent L2 production but also cultivate learners’ willingness to engage.
Task interactivity
Task interactivity distinguishes monologic tasks, where learners produce speech independently, from dialogic tasks, which require real-time co-construction of meaning with an interlocutor (Gilabert, 2023). A growing body of empirical evidence indicates that this distinction has systematic consequences for L2 fluency. Monologic tasks in Michel et al.'s (2007) study are associated with higher speech rate, longer mean length of run and fewer mid-clause pauses. Learners can allocate working-memory resources almost exclusively to formulation and articulation, yielding “clean” temporal profiles that indicate automatised processing (Bui and Huang, 2018). In contrast, dialogic tasks distribute cognitive load across interlocutors but introduce pressures of turn-taking, negotiation of meaning and mutual face-management. Tavakoli's (2016) study with 35 upper-intermediate learners found that dialogue produced significantly faster speech rates, longer continuous runs and shorter silent pauses than matched monologues. These gains are attributed to collaborative scaffolding: interlocutors supply lexical items, complete syntactic frames and signal comprehension, thereby reducing formulation pauses. Nevertheless, dialogic settings also generated more filled pauses and overlapping speech, reflecting online monitoring and the interpersonal functions of hesitation phenomena (Michel et al., 2007). These inconsistent findings of the effects of task interactivity warrant further research.
Task interactivity may also influence WTC. As highlighted by Han and Li (2025), dialogic tasks foster greater WTC in supportive environments where positive peer relationships enhance engagement. This is corroborated by Chegini (2023), who found that while both cooperative (small group) and paired work (dyad) instructions enhanced WTC, the cooperative format was more effective among Iranian EFL learners. In contrast, monologic tasks present different cognitive and social demands than dialogic tasks, potentially influencing WTC and fluency differently (Michel et al., 2007). These findings underscore the importance of considering task interactivity in task-based lesson designs.
Research gaps and research questions
Despite these advances, few studies have used experimental designs to examine the causal relationship between WTC and L2 fluency. In addition, few studies have systematically compared this relationship across monologic and dialogic tasks. This study addresses these gaps by examining Hong Kong university students’ L2 task performance to explore how WTC and task interactivity influence L2 fluency. The following research questions (RQs) guided the present study:
RQ1: What is the influence of WTC on speech fluency in L2 learning tasks?
RQ2: What is the influence of task interactivity (monologic versus dialogic tasks) on speech fluency in L2 learning tasks?
RQ3: Does task interactivity moderate the effects of WTC on speech fluency in L2 learning tasks?
Methodology
Participants
Eighty-eight undergraduate students (38 male, 50 female) enrolled in a freshman English public speaking course at a private university in Hong Kong voluntarily took part in this research. Participants were selected from 199 registered volunteers on the basis of their performance in an International English Language Testing System (IELTS) mock speaking test, administered one week prior to the main study, and their responses to the MacIntyre et al. (1999) WTC questionnaire, administered immediately before the speaking tasks. The participants were aged between 18 and 21 years (mean = 19.34). All participants spoke Chinese as their mother tongue (Cantonese: 76, Putonghua: 12) and had an average of 13.47 years of L2 English learning history. Based on their self-reported Diploma of Secondary Education (DSE) or IELTS scores, their English proficiency corresponded to the B1 level of the Common European Framework of Reference for Languages (CEFR). The majority of students (62, or 70.45%) were enrolled in various business and management majors, while the remaining 26 (29.55%) came from other disciplines (journalism: 6, Asian studies: 5, computing: 5, media and technology: 3, translation: 2, others: 5). As an incentive, each participant received an honorarium of HK$50 upon completion of the speaking tasks.
Study design
As shown in Table 1, this study employed a 2 × 2 between-participant factorial design to investigate the main and potential interaction effects of WTC and task interactivity on L2 speech fluency. To ensure a clear distinction between groups, cut-off points for WTC were established based on a pilot study: a score of ≤33 defined the low-WTC group (n = 44) and a score of ≥38 defined the high-WTC group (n = 44). From this pool, 44 students (22 high WTC, 22 low WTC) were assigned to a monologic task, while the remaining 44 students formed 22 pairs (11 high-WTC pairs, 11 low-WTC pairs) for a dialogic task. This stratification ensured that WTC levels were evenly distributed across task conditions, allowing for a robust comparison of its effects on speech fluency. The study's design, which included homogeneous WTC pairing for the dialogic task, aimed to control for potential confounding effects of differing WTC levels within a dyadic interaction, thereby isolating the impact of the interactivity variable more effectively.
Table 1. Study design.
WTC: willingness to communicate.
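The grouping logic of the 2 × 2 design can be sketched as follows. The cut-off scores (≤33 and ≥38) come from the description above; everything else is an assumption made for illustration, including the participant identifiers, the generated scores and, in particular, the use of seeded random allocation, since the procedure used to allocate participants to task conditions is not specified here.

```python
import random

LOW_CUTOFF = 33   # score <= 33 -> low-WTC group (cut-off reported in the study)
HIGH_CUTOFF = 38  # score >= 38 -> high-WTC group (cut-off reported in the study)


def classify_wtc(score):
    """Return 'low', 'high', or None for scores falling between the cut-offs."""
    if score <= LOW_CUTOFF:
        return "low"
    if score >= HIGH_CUTOFF:
        return "high"
    return None


def assign_conditions(scores, seed=0):
    """Split each WTC group in half: one half monologic, the other paired for dialogue.

    Random allocation is an assumption of this sketch; the seed only makes it repeatable.
    """
    rng = random.Random(seed)
    cells = {}
    for level in ("high", "low"):
        ids = sorted(pid for pid, score in scores.items()
                     if classify_wtc(score) == level)
        rng.shuffle(ids)
        half = len(ids) // 2
        monologic, dialogic = ids[:half], ids[half:]
        cells[(level, "monologic")] = monologic
        # Same-WTC pairs; an unpaired leftover participant would be dropped.
        cells[(level, "dialogic")] = [
            (dialogic[i], dialogic[i + 1]) for i in range(0, len(dialogic) - 1, 2)
        ]
    return cells


# Hypothetical scores: 44 high, 44 low, and 5 in the excluded 34-37 band.
scores = {f"H{i:02d}": 38 + i % 10 for i in range(44)}
scores.update({f"L{i:02d}": 33 - i % 10 for i in range(44)})
scores.update({f"M{i:02d}": 34 + i % 4 for i in range(5)})

cells = assign_conditions(scores)
for cell, members in sorted(cells.items()):
    print(cell, len(members))
```

Leaving a band of unclassified scores between the two cut-offs is what separates the groups clearly, and pairing only within a WTC level mirrors the homogeneous pairing used for the dialogic condition.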
A two-way analysis of variance (ANOVA) revealed significant differences in WTC levels between high- and low-WTC groups (F(1, 1) = 611.51, p < .001, ηp2 = .88). In contrast, no significant difference in WTC emerged between monologic and dialogic task conditions (F(1, 1) = 0.82, p = .37, ηp2 = .01). At the same time, a one-way ANOVA for the speaking pre-test revealed no significant differences between the four groups (F = 1.55, df = 3, p = .21, ηp2 = .052). Taken together, these checks indicate that the groups differed in WTC as intended while remaining comparable in speaking proficiency, which supports the soundness of the study design.
The two independent variables were WTC, with two levels (high WTC and low WTC), and task interactivity, also with two levels (monologic task and dialogic task). The dependent variables were various measures of L2 speech fluency validated by Bui and Huang (2018). Table 2 in the Data Analysis section details all the measures and their operationalisation.
Table 2. Coding scheme for fluency measures (dependent variables).
Instruments
The primary instrument for assessing participants’ baseline communicative disposition was a WTC questionnaire, adapted from MacIntyre et al. (1999). The original questionnaire measures both productive and receptive skills, but this study included only the items related to speech production in order to measure participants’ WTC in speaking. The questionnaire was administered in English, the medium of instruction at the university. Participants rated their level of willingness in various scenarios on a 5-point Likert scale (1 = lowest, 5 = highest), with a maximum possible score of 50. Appendix 1 shows the adapted WTC questionnaire. A “spot-the-difference” picture task with 10 differences was used to assess participants’ speech fluency. Appendix 2 details the picture prompts of the speaking task.
Procedures
The 88 selected participants undertook the communicative tasks in person in a quiet classroom, either individually (monologic) or in same-WTC pairs (dialogic), with only the researcher or a trained research assistant present. Participants were briefed on the nature of the research, assured of the anonymity of both their identities and their data, and informed of their right to withdraw from the study at any time. They then all signed a consent form.
Without pre-task planning time, participants commenced the speaking task immediately upon receiving the instructions. Task durations differed according to condition. Participants in the monologic condition were shown the two variant pictures and given 3 minutes to identify 10 differences. In the dialogic condition, each pair member received one of the two pictures and collaborated—describing, comparing and contrasting—to locate the same 10 differences. Four minutes were allotted to allow for the extended interaction required. The slightly longer duration for the dialogic task was based on pilot study findings, which indicated the need for additional time to accommodate discussion strategies, such as turn-taking, clarification and confirmation.
Data analysis
Speech samples collected from monologic and dialogic tasks were transcribed verbatim and coded for the fluency measures following the conventions of Bui and Skehan's (2016) online speech analytic tool called CALF, with an unfilled pause defined as a silence lasting 0.4 s or longer. Table 2 details the coding scheme of the fluency measures. The coded data were then automatically analysed using CALF to ensure consistency and accuracy. A two-way ANOVA was conducted to examine the main and interaction effects of WTC and task interactivity on L2 fluency. The significance level was set at p < .05, and the effect size (ηp2) thresholds were defined as small (.01), medium (.06) and large (.138), following Pallant (2013: 218).
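The analysis described above can be sketched in plain Python for the special case of a balanced 2 × 2 between-participant design. This is an illustrative sketch, not the software used in the study: the function names and the toy scores are hypothetical, p-values are omitted because they require the F distribution, and only the effect size thresholds (.01, .06, .138) come from the text.

```python
from statistics import mean


def two_way_anova_balanced(cells):
    """Two-way between-participant ANOVA for equal cell sizes.

    cells maps (a_level, b_level) -> list of scores, every list of length n.
    Returns F, degrees of freedom and partial eta squared for A, B and A x B.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_err = sum((x - cell_mean[key]) ** 2
                 for key, scores in cells.items() for x in scores)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err

    results = {}
    for name, ss, df in (("A", ss_a, df_a), ("B", ss_b, df_b),
                         ("AxB", ss_ab, df_a * df_b)):
        results[name] = {
            "F": (ss / df) / ms_err,
            "df": (df, df_err),
            "eta_p2": ss / (ss + ss_err),  # partial eta squared
        }
    return results


def effect_size_label(eta_p2):
    """Label partial eta squared using the thresholds .01, .06 and .138."""
    if eta_p2 >= 0.138:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


# Toy data (three hypothetical scores per cell); A = WTC, B = task interactivity.
toy = {
    ("high", "monologic"): [10, 11, 12],
    ("high", "dialogic"): [9, 10, 11],
    ("low", "monologic"): [4, 5, 6],
    ("low", "dialogic"): [7, 8, 9],
}
for effect, stats in two_way_anova_balanced(toy).items():
    print(effect, stats, effect_size_label(stats["eta_p2"]))
```

With equal cell sizes the three sums of squares are orthogonal, so the order in which effects are entered does not matter. Unequal cell sizes would call for a general linear model routine rather than this shortcut.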
Findings
This section presents the effects of WTC and task interactivity on L2 fluency, including their main and interaction effects.
As Table 3 shows, the speech rate varies systematically between the WTC groups. The group exhibiting higher WTC demonstrates a substantially faster mean speech rate compared to the group with lower WTC across both monologic and dialogic conditions. Although the main effect of task interactivity on speech rate is not significant, it nevertheless becomes a significant moderator of the effect of WTC, as indicated by the significant interaction effect (p < .001 with a large effect size, ηp2 = .34).
Table 3. Willingness to communicate (WTC) and task interactivity effects on speech rate (words per minute).
IVs: interventions.
Figure 1 indicates that while the performance gap between WTC groups persists in both interactivity conditions, the disparity in speech rate is more pronounced during monologic tasks. This suggests that lower WTC has a particularly detrimental effect on speech fluency in the absence of an interlocutor.

Figure 1. Interaction effects of willingness to communicate (WTC) and task interactivity on speech rate.
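The pattern described for Figure 1 amounts to a difference of differences: the high-versus-low WTC gap is computed within each task condition and the two gaps are then compared. The sketch below illustrates that arithmetic only. The cell means are hypothetical placeholders chosen for illustration and are not the values reported in Table 3.

```python
def wtc_gap(cell_means, task):
    """High-minus-low WTC difference in mean speech rate for one task condition."""
    return cell_means[("high", task)] - cell_means[("low", task)]


def interaction_contrast(cell_means):
    """Difference of differences: monologic WTC gap minus dialogic WTC gap."""
    return wtc_gap(cell_means, "monologic") - wtc_gap(cell_means, "dialogic")


# Hypothetical mean speech rates in words per minute (placeholders, not study data).
hypothetical_means = {
    ("high", "monologic"): 110.0,
    ("low", "monologic"): 70.0,
    ("high", "dialogic"): 100.0,
    ("low", "dialogic"): 85.0,
}

print("monologic gap:", wtc_gap(hypothetical_means, "monologic"))
print("dialogic gap:", wtc_gap(hypothetical_means, "dialogic"))
print("interaction contrast:", interaction_contrast(hypothetical_means))
```

A contrast of zero would correspond to parallel lines in an interaction plot. A positive contrast, as in these placeholder values, corresponds to a WTC gap that is wider in the monologic condition, which is the direction of the effect reported here.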
Table 4 presents the results of the effects of WTC and task interactivity on the frequency of mid-clause pauses per 100 words. No statistically significant difference was found between the high- and low-WTC groups. In contrast, a significant main effect for task interactivity was observed. Participants produced significantly more mid-clause pauses during monologic tasks compared to dialogic tasks, F(1) = 12.00, p < .001, with a moderate effect size (ηp2 = .13). The interaction effect between WTC and task interactivity on mid-clause pause counts was not significant.
Table 4. Willingness to communicate (WTC) and task interactivity effects on number of mid-clause pauses per 100 words.
IVs: interventions.
Table 5 concerns the influence of WTC and task interactivity on the number of end-of-clause pauses per 100 words. Neither main effect reached statistical significance. The WTC × interactivity interaction was not significant either, indicating that the effect of WTC on end-of-clause pausing did not differ between monologic and dialogic conditions.
Table 5. Willingness to communicate (WTC) and task interactivity effects on number of end-of-clause pauses per 100 words.
IVs: interventions.
Table 6 describes the effects of WTC and task interactivity on the frequency of filled pauses, that is, non-silent hesitation phenomena such as “err” and “hmm”. Participants with high WTC do not differ from their low-WTC counterparts in terms of filled pauses. However, a significant main effect emerged for task interactivity. Monologic tasks elicited more filled pauses than dialogic tasks (p = .04), but only with a small effect size (ηp2 = .05). The WTC × interactivity interaction was non-significant, suggesting that the effect of task interactivity on filled pauses was comparable across high- and low-WTC speakers.
Table 6. Willingness to communicate (WTC) and task interactivity effects on the number of filled pauses per 100 words.
IVs: interventions.
The effects of WTC and task interactivity on the total frequency of repairs, that is, false starts, reformulations, repetitions and replacements, are reported in Table 7. A significant main effect emerged for WTC (p = .003), with a medium effect size (ηp2 = .10): speakers reporting high WTC produced fewer repairs than those with low WTC. Task interactivity had no discernible impact, as monologic and dialogic tasks yielded almost identical repair rates. The WTC × interactivity interaction did not reach statistical significance, indicating that the observed WTC-related differences in repairs were consistent across both monologic and dialogic conditions.
Table 7. Willingness to communicate (WTC) and task interactivity effects on total number of repairs per 100 words.
IVs: interventions.
Discussion
The findings reveal a complex relationship where WTC significantly influences speech rate and repair fluency, while task interactivity primarily affects breakdown fluency, specifically frequency of mid-clause pauses. Interestingly, an interaction effect was observed, indicating that the impact of WTC on speech rate is moderated by task interactivity. This section will interpret these findings through the lens of existing theories and prior research.
Effects of WTC on L2 fluency
The results of this study demonstrate a significant effect of WTC on speed and repair fluency. These findings align with the theoretical underpinnings of WTC, which posit that learners with higher WTC are often characterised by greater self-confidence and lower communication anxiety. Such positive affective states can reduce the cognitive load associated with online speech production, allowing for more efficient retrieval of lexical items and syntactic structures, and consequently, a faster and smoother delivery. The reduced need for repairs among high-WTC learners might also suggest a greater sense of control over their language production or a higher tolerance for minor errors, enabling them to maintain fluency without frequent self-correction. This is consistent with Wood's (2016) suggestion of a potential link between WTC and fluency gains, although the current study provides more specific evidence regarding speech rate and repairs. The lack of a significant WTC effect on pausing frequency (both mid-clause and end-of-clause) and filled pauses suggests that while WTC influences the overall speed and smoothness of speech, it may not directly govern the finer-grained aspects of pausing behaviour.
The observed positive relations between higher WTC and higher speech rate with a lower repair frequency can be further interpreted through the lens of MacIntyre et al.'s (1998) heuristic model of WTC. This model conceptualises WTC as a volitional process influenced by both enduring individual traits (such as personality and intergroup attitudes) and transient situational factors (such as state self-confidence and the desire to communicate with a specific person). Undeniably, L2 fluency can be achieved either through more advanced cognitive fluency (Segalowitz, 2010) or through the avoidance of complex utterances that pose processing challenges for L2 learners (Bui, 2021). The results of this research suggest that learners with high WTC are more likely to experience a more favourable balance of approach and avoidance motives. Higher state self-confidence, a critical component of the WTC model, can mitigate the anxiety and cognitive interference that often impede fluent L2 production. When learners feel more confident in their L2 abilities and less apprehensive about communication, their cognitive resources are less depleted by affective factors, allowing for more efficient processing and articulation. This efficiency translates directly into a faster speech rate. Similarly, a lower frequency of repairs among high-WTC individuals could stem from a combination of factors: greater linguistic automaticity due to more frequent communication attempts, a more positive self-perception of L2 competence leading to fewer perceived errors or a strategic decision to prioritise fluency over other aspects in communicative contexts.
The finding that WTC did not significantly affect pausing frequency suggests that these disfluency markers are primarily reflexes of resource-intensive cognitive processes (e.g., lexical retrieval, syntactic encoding), which may operate automatically during speech formulation and are thus less susceptible to the motivational variance captured by WTC (De Jong, 2016; Kahng, 2020). While high-WTC learners speak faster and with fewer overt corrections, the fundamental need to pause for conceptualisation and formulation, especially in a cognitively demanding L2 task, might remain relatively constant across WTC levels. This nuanced impact of WTC highlights its role as a facilitator of overall oral performance rather than a determinant of all specific fluency sub-components.
Effects of task interactivity on L2 fluency
Task interactivity emerged as a significant factor influencing breakdown fluency, particularly the frequency of mid-clause pauses. Participants engaged in monologic tasks produced a significantly higher number of mid-clause pauses compared to those in dialogic tasks. The moderate effect size (ηp2 = .13) suggests that the cognitive demands inherent in monologic production, where the speaker bears the sole responsibility for conceptualising and formulating their message without interlocutor support, lead to more frequent disruptions in speech flow. Mid-clause pauses are often interpreted as indicators of formulation difficulty or disruptions in syntactic planning (Tavakoli, 2011). Conversely, dialogic tasks, by their interactive nature, allow for a distribution of cognitive load. Interlocutors can provide scaffolding through back-channelling, clarification requests or collaborative completion of utterances, which can alleviate some of the formulation pressure on the individual speaker (Szyszka and Lintunen, 2025). This shared responsibility can reduce the need for prolonged or frequent mid-utterance pauses as speakers can rely on their partner for support or time to plan.
Furthermore, a smaller but still significant effect of task interactivity was found for filled pauses, with monologic tasks eliciting more filled pauses than dialogic tasks. Despite the small effect size (ηp2 = .05), this finding still suggests that monologic contexts place greater self-monitoring demands on the speaker. Filled pauses (e.g., “um”, “er”) can serve as stallers, allowing speakers to hold the floor while they plan their next utterance or search for the right word. In a monologue, speakers might rely more on these vocalised hesitations to manage planning pressure. In contrast, dialogic interaction offers more dynamic turn-taking and opportunities for negotiation, potentially reducing the reliance on filled pauses as sole floor-holding devices. Interestingly, task interactivity did not significantly influence speech rate, end-of-clause pauses or repair frequency, which contrasts with some previous research (e.g., Tavakoli, 2016, who found faster speech rates in dialogue). This discrepancy might be attributed to the specific task type used: the “spot-the-difference” task requires careful description and comparison in both conditions, which might have led to comparable overall speech rates when averaged.
The differential impact of task interactivity on the location of pauses is noteworthy. While mid-clause pauses significantly increased in monologic tasks, the frequency of end-of-clause pauses did not differ significantly between monologic and dialogic tasks. This pattern supports the theoretical distinction between these two types of pauses (Tavakoli, 2011). End-of-clause pauses are generally considered to be more planful, occurring at natural syntactic boundaries and often reflecting higher-level conceptualisation processes or discourse planning (Bui and Huang, 2018). These pauses might be necessary for chunking information into coherent units regardless of the interactive context. In contrast, mid-clause pauses are more likely to signal local difficulties in lexical retrieval or grammatical encoding. The increased cognitive load of monologic tasks appears to exacerbate these local formulation challenges, leading to more frequent disruptions within clauses. Dialogic tasks, by offering opportunities for collaborative meaning construction and shared lexical search (Cutrone and Siewkee, 2024), can mitigate these specific difficulties. The finding that repairs were not significantly affected by task interactivity suggests that the need for self-correction might be driven more by individual differences such as WTC, or by the inherent demands of the “spot-the-difference” task itself, rather than by the presence or absence of an interlocutor.
Interaction effects between WTC and task interactivity on L2 fluency
A significant interaction between WTC and task interactivity was found for speech rate, revealing that WTC's impact on learners’ speaking pace can be moderated by how interactive the task is (Figure 1). This suggests that the cognitive demands of a monologue, that is, requiring independent speech planning and execution, heighten the benefits of high WTC. Learners with high WTC, likely due to greater confidence, lower anxiety or more efficient cognitive processing (Peng, 2025), manage the solo demands of monologic speech more effectively, maintaining a faster rate. In contrast, learners with low WTC struggle more under the higher cognitive load of monologues, slowing their speech. Dialogic tasks, however, appear to mitigate this disparity; the interactive support, even among similarly predisposed peers, may buffer cognitive strain, narrowing the speech rate gap between high- and low-WTC learners.
This interaction effect for speech rate aligns with theories linking affective factors (e.g., WTC) and cognitive load (Lambert et al., 2023). Monologic tasks impose heavier demands on conceptualisation, formulation and self-monitoring, with no interlocutor to share the burden. High WTC, associated with lower anxiety and greater perceived competence (Li, 2025), may enhance cognitive resource management, enabling faster speech. Conversely, low-WTC learners face a “double burden”, whereby limited processing capacity is compounded by anxiety, resulting in greater fluency breakdowns in monologues. Dialogic interaction, through turn-taking and scaffolding, distributes processing load, reducing performance gaps.
The absence of significant interaction effects for mid-clause pauses, end-of-clause pauses, filled pauses and repairs suggests that WTC and task interactivity influence these aspects of fluency more independently, or that their combined influence does not manifest as a statistical interaction in this dataset. This pattern implies that the mechanisms through which WTC affects pausing and repair behaviour might be distinct from those influencing speech rate, or that the moderating effect of task interactivity is most potent for speed fluency.
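The moderation pattern described in this section can be expressed as a difference of differences: the high-minus-low WTC gap in speech rate under the monologic condition, minus the same gap under the dialogic condition. The sketch below is a minimal illustration only. The function name `interaction_contrast` and all cell values are hypothetical and are not the study's data; a non-zero contrast would still need an inferential test before it could be called an interaction effect.

```python
from statistics import mean


def interaction_contrast(cells):
    """Return (monologic gap, dialogic gap, difference of the two gaps).

    `cells` maps (wtc_group, task) to a list of speech-rate scores.
    Each gap is the high-WTC cell mean minus the low-WTC cell mean.
    """
    means = {key: mean(values) for key, values in cells.items()}
    gap_mono = means[("high", "monologic")] - means[("low", "monologic")]
    gap_dial = means[("high", "dialogic")] - means[("low", "dialogic")]
    return gap_mono, gap_dial, gap_mono - gap_dial


# Hypothetical speech-rate scores, invented purely for illustration.
hypothetical_cells = {
    ("high", "monologic"): [150.0, 160.0],
    ("low", "monologic"): [110.0, 120.0],
    ("high", "dialogic"): [150.0, 154.0],
    ("low", "dialogic"): [138.0, 142.0],
}

gap_mono, gap_dial, contrast = interaction_contrast(hypothetical_cells)
print(f"Monologic gap: {gap_mono}, dialogic gap: {gap_dial}, contrast: {contrast}")
```

A positive contrast in such a layout corresponds to the pattern reported for speech rate, where the WTC gap is wider in monologic tasks; a contrast near zero corresponds to the pattern reported for pauses and repairs, where the two factors do not interact.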
Theoretical and pedagogical implications
The significant interaction effect between WTC and task interactivity suggests that the influence of a learner's WTC is not static but rather is modulated by the cognitive architecture of the task. This finding bridges affective models of L2 communication with cognitive models of speech production (e.g., Levelt, 1989), indicating that the “volitional process” of WTC can buffer against the high cognitive load inherent in monologic tasks, thereby protecting speech rate from degradation. This moves the field beyond viewing WTC and task interactivity as separate influences and towards an integrated model where their interplay is a key predictor of performance. Also, the results show that WTC primarily influences speed and repair fluency, while task interactivity targets breakdown fluency (mid-clause pauses and filled pauses). This differentiation advances our understanding of L2 fluency as a multi-dimensional construct and provides a more granular theoretical model for predicting how affective and task-related factors interact to shape distinct components of L2 speech production.
The findings also offer implications for L2 pedagogy and assessment. Firstly, for instructors, the results underscore the importance of a differentiated approach to task design. While monologic tasks are necessary for developing independent speaking skills, they disproportionately challenge learners with low WTC, potentially reinforcing anxiety and hindering fluency. Therefore, a balanced pedagogical sequence is recommended: beginning with collaborative, dialogic tasks to build confidence and scaffold fluency for all learners, particularly those with lower WTC, before progressing to more demanding monologic performances. Secondly, the study highlights that fostering WTC is not merely about creating a positive classroom atmosphere; it is also a direct investment in fluency. Pedagogical interventions aimed at boosting WTC, such as building rapport, providing positive feedback and normalising communication errors, may yield tangible gains in students’ speech rate and reduce unnecessary repairs. Finally, for assessment, the results caution against over-reliance on monologic tasks in high-stakes testing, as they may not provide a fair measure of a low-WTC learner's underlying linguistic competence. Incorporating dialogic or collaborative tasks can offer a more comprehensive and equitable profile of a learner's L2 oral proficiency.
Conclusion
This study investigated the effects of WTC and task interactivity on the speech fluency of Hong Kong university English as a second language (ESL) students. The findings reveal that high WTC was associated with faster speech rate and fewer repairs. Task interactivity, by contrast, primarily influenced breakdown fluency: monologic tasks led to a higher frequency of mid-clause pauses and, to a lesser extent, filled pauses, compared to dialogic tasks. Moreover, the positive effect of WTC on speech rate was moderated by task interactivity, with the advantage particularly pronounced in the monologic condition. The study contributes to a more nuanced understanding of how affective and task design factors shape L2 speech production.
That said, this research is not without limitations. Firstly, only learners’ self-reported WTC and a limited set of fluency indices were measured. Future research could capture real-time WTC dynamics through the idiodynamic methodology (Lambert et al., 2023) alongside a wider range of fluency measures (such as the mean length of pauses). Secondly, the study utilised a single task type which, while effective for controlling cognitive demand, limits the generalisability of the findings. Future studies should incorporate more diverse task types to enhance their applicability to other contexts. Finally, the operationalisation of the dialogic condition involved pairing learners with homogeneous WTC levels (high–high or low–low). While this design choice effectively controlled for the confounding variable of interlocutor WTC, it may not fully represent authentic classroom dynamics. Future studies could employ a mixed design to compare the effects of homogeneous versus heterogeneous WTC pairings on fluency outcomes.
Footnotes
Acknowledgements
This manuscript was proofread by ChatGPT 5.0 for language accuracy; however, all research ideas, data and interpretations of the findings are entirely the author's own. The author would also like to thank his research assistant, Mr Andrew Wong, for his help with data collection and materials preparation. Thanks are also due to Dr Roby Marlina, Editor of RELC Journal, and the two anonymous reviewers, whose insightful comments substantially improved the quality of this article.
Ethical approval and informed consent statements
This research has received ethical clearance from the university research committee. All participants have given their written consent to participate.
Funding
The author disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the Research Grants Council, University Grants Committee (grant number UGC/FDS14/H13/20).
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data is available upon request.
