Abstract
This study seeks to address a gap in our understanding of how corrective feedback (CF) influences second language (L2) learning by examining the specific impacts of oral and written CF on acquiring the third person singular -s in the simple present tense. The study examines these effects on both explicit and implicit knowledge. The research was conducted in five intermediate adult English as a second language classrooms in Peru (N = 101), using a pretest–posttest design with one control group (n = 24) and four experimental groups: oral recast (n = 21), oral metalinguistic CF (n = 18), written direct CF (n = 16), and written metalinguistic CF (n = 22). The results revealed no significant difference between oral and written CF; however, differences were observed based on measurement types and CF subtypes used. This study’s findings carry theoretical and pedagogical implications, contributing valuable insights to both second language writing research and pedagogy.
I Introduction
While second language acquisition (SLA) research has traditionally focused on the impact of corrective feedback (CF) on various aspects of language acquisition, including noticing and changes in linguistic competence, second language (L2) writing researchers have primarily concentrated on how CF can enhance writing performance (Ellis, 2010; Sheen, 2010, 2011). As Ellis (2010) explained, oral CF has moved from predominantly descriptive studies aimed at developing taxonomies of the CF strategies to experimental studies that investigate the effects of different types of CF strategies on second language development. Written CF studies, however, have progressed from studies concerned primarily with the effects of CF on learners’ revision of their original texts to experimental studies that investigate the effect written CF has on the accuracy with which learners use specific grammatical structures in new pieces of writing. Consequently, studies examining oral and written CF have historically been conducted separately, often guided by distinct theoretical frameworks, and employing different research methodologies.
This separation between oral and written CF research has created a gap that must be bridged to reach a more comprehensive understanding of CF’s influence on language learning and writing development (Sheen, 2010) and on both explicit and implicit knowledge (Ellis, 2010; Li & Vuono, 2019). Integrating insights from both SLA and L2 writing research can provide a more holistic perspective on the role of CF in language development, contributing to the development of more robust theoretical frameworks.
To bridge the gap, the present study was designed to examine the differential effects of both oral and written CF. By exploring the impacts of both feedback types, the study responds to the need for research that systematically compares the effects of different CF types across oral and written modalities. In doing so, the study seeks to extend and build upon prior research, specifically Sheen’s (2010) study which also explored similar aspects of CF. However, our study distinguishes itself by examining these effects on learners’ explicit and implicit knowledge. By examining the differential effect of oral and written CF on both implicit and explicit knowledge, the study not only provides insights into how different types of feedback interact with the development of distinct knowledge types but also informs the design of effective feedback strategies tailored to the specific knowledge being acquired, which can consequently contribute to the ongoing refinement of language teaching practices.
II Literature review
Oral and written CF research has grown dramatically in recent years. This is not only because the study of CF has served as a means of testing the claims of SLA theories concerning the importance and need of CF for acquisition but also because of the practical and pedagogical concerns that motivate such research (Ellis, 2010). CF refers to strategies that indicate implicitly or explicitly to learners that their output is erroneous, non-target-like, and/or not appropriate or ambiguous in some way (Nassaji, 2016). It can be provided on oral, written, and technology-mediated output, and in response to a range of errors, including linguistic, content, organization, discourse, and pragmatic errors (Nassaji & Kartchava, 2017; Oliver & Adams, 2021).
In the literature, a theoretical distinction has been made between the effect of oral and written corrective feedback. For example, it has been argued that oral CF is typically immediate whereas written CF is delayed (i.e. it becomes available to the learner some time after the error is committed). Based on this difference, Doughty (2001, cited in Ellis, 2010) claimed that oral CF may be more effective in promoting interlanguage restructuring precisely because it occurs in what she calls a window of opportunity for connecting form to meaning. However, not all oral CF is immediate since teachers can sometimes elect to delay feedback provision on oral errors until the learners have completed an oral activity. There are also individual differences and contextual factors that can mediate the effectiveness of feedback.
Both oral and written CF research has focused on the investigation of the strategies that can be employed to correct learner errors. There are, however, debates as to what types of CF are more beneficial to L2 learning and in what ways. In this relation, Ellis (2005) emphasized the importance of distinguishing between implicit and explicit knowledge in CF research. Implicit knowledge involves intuitive awareness of linguistic norms, procedural knowledge, and access through automatic processing. In contrast, explicit knowledge entails conscious awareness, declarative knowledge, and access through controlled processing and verbalizable self-report. In this respect, the effectiveness of CF involves a complex interplay between conscious and unconscious processes. Explicit knowledge may be primarily developed through CF that prompts conscious language processing. In contrast, the acquisition of implicit knowledge is more complex and may also vary among learners, influenced by divergent cognitive processes, learning styles, and prior language learning experiences (Nassaji, 2020b).
Oral CF research has extensively examined the effects of various types of implicit and explicit feedback, such as recasts versus metalinguistic feedback. The findings of such research – documented in numerous individual research projects, reviews, and meta-analyses (e.g. Goo & Mackey, 2013; Li, 2010; Lyster & Ranta, 2013; Lyster & Saito, 2010; Lyster et al., 2013; Mackey & Goo, 2007; Nassaji, 2015, 2016; Russell & Spada, 2006) – have consistently demonstrated that both feedback types play a role in facilitating L2 acquisition. However, their effectiveness varies depending on the manner, context, and conditions in which the feedback is provided. For instance, certain studies suggest that recasts are generally less effective than explicit or metalinguistic feedback (Nassaji, 2009; Sheen, 2010). Nevertheless, there is evidence that recasts can become highly effective when they are made more explicit by isolating the error and adding stress, thereby increasing the explicitness of the correction (Nassaji, 2007, 2015). This nuanced understanding of the impact of different oral feedback types emphasizes the importance of considering the specific nature of the correction in studying their effectiveness within the language learning process.
In the domain of written CF, two primary types have been identified: direct feedback, where the correct form is provided, and indirect feedback, where an error is indicated without specifying the correct form (Ferris, 2006). The effectiveness of explicit and implicit feedback in written form can be influenced by the nature of the correction. Direct feedback methods encompass various approaches such as crossing out superfluous words or phrases, inserting missing words, bracketing misplaced words, and indicating their proper place in a sentence; additionally, the correct form may be written above the error or in the margin (Ferris, 2006; Lira-Gonzales & Nassaji, 2023). The choice between these types of written feedback can significantly influence its impact on language learners. Indirect feedback does not provide the correct form and can be provided through codes indicating the type of error (e.g. ‘SS’ for sentence structure error) or through other techniques like circling, underlining, inserting arrows or question marks, and using metalinguistic cues (Lee, 2004). Regardless of the method used, indirect feedback places the responsibility on learners to diagnose the error for themselves. This approach is considered to encourage more active engagement with the correction process, fostering discovery-based learning (Ferris, 2006) that potentially contributes to the development of implicit knowledge.
Studies on written CF have compared direct CF with various types of indirect CF across different contexts. Chandler (2003) was one of the first that investigated direct and indirect written CF, comparing four types: direct correction, and three indirect methods (underlining with error codes, error codes without indicating error locations, and simple underlining). The study found both direct and indirect CF (underlined) feedback to be significantly more effective than descriptive feedback in reducing long-term errors. Subsequent studies (e.g. Bitchener, 2008; Bitchener et al., 2005; Gholaminia et al., 2014; Sheen, 2007; Shintani & Ellis, 2013; Suzuki et al., 2019), however, produced mixed results. Notably, Suzuki et al. (2019) found indirect feedback to be effective with a mediating role for error type. Kim et al. (2020) discovered direct CF to be more useful for accuracy, but both types were effective in promoting learning through collaborative writing. Van Beuningen et al. (2008) found similar effects on revisions, with direct feedback having a greater impact on new text accuracy. Sheen (2007) observed indirect metalinguistic feedback outperforming direct feedback in a delayed posttest, while Shintani et al. (2014) reported a more durable effect for direct written CF.
While existing studies examined either oral or written CF, few combined both, and none separately evaluated their effectiveness. Bitchener et al. (2005) compared different direct CF combinations and found explicit written CF with one-on-one conferences more effective in enabling students to use the past simple tense and the definite article in new pieces of writing. Bitchener (2008) also investigated different direct CF combinations (e.g. direct error correction with written and oral meta-linguistic explanation in the form of a thirty-minute classroom lesson; direct error correction with written meta-linguistic explanation; direct error correction; no corrective feedback) on two functional uses of the English article system (‘a’ and ‘the’). The study revealed positive results for error correction when written and oral meta-linguistic explanation were combined. Bitchener and Knoch (2009) found no difference in accuracy between various direct written CF options (i.e. direct corrective feedback, written and oral metalinguistic explanation; direct corrective feedback and written meta-linguistic explanation; direct corrective feedback only).
III The research gap
Although previous research has examined the effect of combining oral and written feedback, a gap in the literature is the lack of research analysing the independent effects of oral and written CF in studies involving both modalities. In other words, the studies that have incorporated both oral and written feedback have blended these modalities together by primarily focusing on their combined impact, not comparing and analysing their distinct contributions. A comparison of the effects of the two feedback modalities can provide insights not only into their differential effects, if any, but also into the interplay between feedback modalities.
To date, Sheen’s (2010) study is the only study that directly compared CF types across oral and written modalities. Her investigation into four feedback types – oral recasts, oral metalinguistic CF, written direct CF, and written metalinguistic CF – on English articles revealed that explicitness, rather than modality, significantly influenced effectiveness. While valuable, it is crucial to recognize that one study is insufficient for definitive conclusions, given the complexities of language learning and the diverse factors influencing feedback effectiveness.
Similar to Sheen’s (2010) study, our study also examined the impact of oral versus written CF. However, while Sheen focused on English definite and indefinite articles, our investigation examined the effects on the third person singular -s in the simple present tense. This broader examination of various target structures aims to deepen our understanding of feedback’s role in language learning. Unlike Sheen’s research, we also controlled for learners’ first language (L1) background, focusing solely on Spanish L1 learners. This allows us to eliminate potential confounding effects on the research results.
Furthermore, our study goes beyond Sheen’s work by distinguishing between the impacts of oral and written CF on both explicit and implicit knowledge. As noted above, we focused the investigation on the third person singular -s in the simple present tense – a structure recognized as ‘inherently easy to learn as explicit knowledge but difficult to acquire as implicit knowledge’ (Ellis & Sheen, 2006, p. 432). While adding an ‘s’ to the base form of the verb for he/she/it may be simple to grasp explicitly, it does not guarantee the development of implicit knowledge (Ellis, 2006; Loewen et al., 2009). This study, thus, addressed this issue by using measures that tap into these different types of knowledge.
IV Research questions
The following two questions and sub-questions guided our study:
• Research question 1: Is there any difference in the effect of oral versus written CF on the accurate use of third-person singular -s in the simple present tense (irrespective of types of oral or written feedback)? And, if so, does this differential effect (if any) vary depending on the specific type of outcome measure used (i.e. explicit versus implicit knowledge measures)?
• Research question 2: Is there any difference in the effect of different types of oral and written CF on the accurate use of third-person singular -s in the simple present tense (i.e. oral recasts, oral metalinguistic CF, direct written CF, written metalinguistic CF)? And, if so, does this differential effect (if any) vary depending on the type of outcome measures used (i.e. explicit versus implicit knowledge measures)?
V Methods
1 Research context
The present study was carried out in five intermediate adult English as a second language (ESL) classrooms in a private university in Lima, Peru. This university has developed a curriculum that includes program-specific optional courses in English. There are six ESL proficiency levels ranging from I (beginner) to VI (advanced). When students reach English IV, they can pursue bilingual studies.
ESL classes at this university used to involve four to six hours of classroom instruction and four hours of monitored online study each week. However, because of the pandemic context in which the study took place, all classroom instruction was provided online, using Zoom for the synchronous sessions as well as Canvas LMS and MyELT for the asynchronous activities provided in the course. The synchronous sessions were delivered through Zoom for six hours weekly. The online sessions began by presenting the learning outcomes, activating prior knowledge, and delivering an engaging warm-up activity that led the students to the main purpose of the class. During these sessions, the students were able to interact in real time with their peers and teacher by using interactive boards, playing online, watching videos, and participating in breakout rooms, among other interactive online activities, while using their microphones and cameras. At the end of each class, the students were informed about the flipped activity they had to do before the next session. The asynchronous hours were devoted to completing the activities provided in Canvas (videos, links, questionnaires, forums). As for MyELT, a platform provided by the publisher NatGeo, students accessed it to complete the activities in the online workbook. The students were also assigned different types of homework that they had to complete individually and sometimes in groups. These assignments were uploaded through Canvas LMS before the deadline set by the teacher. As for evaluations, the students had to sit for a quiz (in Canvas) every two weeks in their synchronous sessions. Thus, they had to be connected to Zoom and Canvas LMS at the same time.
2 Research participants
The research recruited students from the context described above. They were 101 intermediate adult ESL students registered in five ESL intact classes. Following past studies (e.g. Loewen, 2004; Mackey, 2006; Mackey & Oliver, 2002; Nassaji, 2007, 2013) students’ language proficiency was determined by the institutional placement test. In the present study participants’ English proficiency level was used as an indicator to ensure that they shared a similar language proficiency level. Since the study focus was on investigating the impact of the treatments on language learning outcomes within this proficiency range, language proficiency level was not a major independent variable and was solely used to determine comparability among participants. Therefore, reliance on the internal placement test, while not ideal in terms of external validation, was considered sufficient.
As noted earlier, the aim of the study was to compare the effectiveness of two modalities of feedback (oral vs. written CF) and the extent to which their effects are mediated by the nature of feedback within each modality. Thus, five intact classes were randomly assigned to one of the five different groups in this study as follows.
■ Group 1: Oral recast group (henceforth OR), n = 21;
■ Group 2: Oral metalinguistic group (henceforth OM), n = 18;
■ Group 3: Written direct group (henceforth WD), n = 16;
■ Group 4: Written direct metalinguistic group (henceforth WM), n = 22;
■ Group 5: Control group, n = 24.
Table 1 shows information about the participants in each group and their characteristics. Table 2 shows the operationalization of the types of feedback addressed in this study.
Table 1. Participants’ information.
Table 2. Types of feedback.
3 Target linguistic structure
The target structure was the English simple present tense third person singular -s. There were three reasons behind this selection. First, although this target structure is easy to explain, L2 learners experience problems with its production, even at advanced levels (Nassaji, 2016). Second, since the third person singular -s in the simple present tense has little communicative value, it is highly possible that students will not notice it in naturalistic communication, which makes it a good target structure for feedback provision. Third, the teachers teaching these students also stated that students had serious difficulty using this target structure, particularly in spontaneous communication.
4 Treatment procedures
The study used a pretest–treatment–posttest design following Sheen (2010). In each study group, there were two treatment sessions, each involving a 30-minute narrative task used to elicit errors in the third person singular -s in the simple present tense from the students. The first narrative task described the daily routine of Sophie, a British teenager, whereas the second task described the daily routine of Albert, an American college student. Both tasks were chosen from an ESL website and were then adapted with the help of the teachers, who were also the study co-researchers. The teachers considered both narratives appropriate for the level of their students and reported that their students often made errors using the third person singular -s.
5 Procedure for Oral CF Treatment
The oral CF treatment sessions used the two narrative tasks mentioned above (Sophie’s daily routine and Albert’s daily routine). For each task, students were first asked to read the narrative silently. Then, the teacher took back the narratives and read the story aloud just once to refresh the students’ memory as they noted down the key words. The teacher then gave the students 5 minutes to practice telling the story in teams of three or four (Zoom breakout rooms). Finally, each team retold the story to the entire class, with each student in the team providing only one or two sentences before passing the speaker role to the next team member. Whenever a student made an error in the use of the third-person singular -s in the simple present tense, the teacher corrected the error using the type of feedback assigned to each group (oral recast or oral metalinguistic feedback).
6 Written CF Treatment
The written CF treatment involved the same story tasks that were used in the oral treatment sessions, but students were asked to reproduce the story in writing. First, the teacher showed the story in Canvas and told the students that they were going to read the story and then rewrite it. Then, students were asked to read the short story silently. The teacher then removed the stories from the screen. Before asking the students to rewrite the story, the teacher read the story aloud once while the students noted down key words. The students were then asked to rewrite the story as closely as they could remember, after which they submitted their written narratives. In the following class (3 days later), students received their narratives with corrections according to their CF groups (written direct or written metalinguistic CF). The students were asked to look over their errors and the corrections carefully for at least five minutes. However, as in Sheen (2010), the teacher did not comment further on their errors or give any additional explanation.
7 Testing instruments and scoring procedures
Three types of tests were used: an error correction test, a speeded dictation test, and a free writing task. Two versions of each of the tests were developed and used as a pretest and a posttest.
a The Error Correction Test
There were two versions of the error correction test (pretest and posttest). Each version of the test consisted of 15 items adapted from tests used in Sheen (2007) and Lira-Gonzales and Nassaji (2023). Each item contained two related statements, one of which was underlined and contained an error that the students were asked to correct in writing. In both the pretest and posttest versions of the error correction test 10 items involved the use of simple present tense, and five distractor items involved the use of there is/are and subject–verb agreement. These distractors were strategically included to reduce the participants’ awareness of the study’s focus, and also to create a more natural testing environment. The following are two examples taken from the pre- and posttest, followed by their corrections:
Jean is an English teacher at my school. He teach in my friends’ class every Tuesday.
He teaches in my friends’ class every Tuesday.
There are a special toy in my little brother’s room. It is a puppet that can say ‘Good morning!’ when you touch it.
There is a special toy in my little brother’s room.
My sister is a great cook. Every year she bake a delicious chocolate cake for my birthday.
Every year she bakes a delicious chocolate cake for my birthday.
There are a new student in my class. He and his family come from Montreal.
There is a new student in my class.
The error correction test was not time pressured and therefore allowed students to draw on their explicit grammatical knowledge; it was thus taken to tap more into explicit knowledge than implicit knowledge (Nassaji, 2020b). This type of test requires learners to consciously and deliberately apply their knowledge of language to identify and correct errors. Each version of the error correction test (pre- and posttest) was scored on a discrete item basis. One point was given for each correct suppliance of the third person singular -s in the 10 obligatory contexts.
For the reliability, a second researcher checked 25% of the error correction tests (pre- and posttests) independently for the scoring of the test related to the third person singular -s in simple present tense. The agreement rate was 98% in the pretest and 100% in the posttest.
b The Speeded Dictation Test
The speeded dictation test was adapted from Sheen (2007, 2010) and Lira-Gonzales and Nassaji (2023). This test was designed to create a time-pressured environment, which limits learners’ ability to apply their grammatical knowledge consciously and deliberately. Therefore, it was considered a test tapping into implicit knowledge (Ellis, 2006). Previous research has also utilized speeded dictation tests as an implicit knowledge test (e.g. Lira-Gonzales & Nassaji, 2023; Sheen, 2007, 2010).
In the speeded dictation test, learners were asked to listen to a short text and write it down as quickly as possible. Each version of the speeded dictation test (pre- and posttest) consisted of 15 items, each of which contained one sentence involving the use of simple present tense, as applied to daily routines. Ten items in each version had one or two stimuli involving the use of -s for the third person singular in present tense obligatory contexts. The remaining five items in each version had one stimulus involving the use of the base form of the first-, second-, and third-person plural in simple present tense.
Examples of items: (1) Marie kisses her little sister every morning. (2) My parents wake up early during the week.
Example 1 measures learners’ receptive and productive knowledge of -s affixed to a referential third person singular verb with a simple present tense function (‘Marie kisses’ is the stimulus in the item). Example 2 measures knowledge of the verb form for the first-, second-, and third-person plural in simple present tense (‘my parents wake up’ is the stimulus). As in Sheen (2007), we provided each student with a small notebook when administering the pre- and posttest versions of the speeded dictation test. The teachers first explained the procedures to the students, then read two sample sentences so that the students could familiarize themselves with the approach. The items were read at a natural pace and the students were directed to write each item as quickly as possible and exactly as they heard it. In order to prevent the students from consciously reworking what they had originally written, they were told that once they had turned the page for the next item, they were not allowed to return. Since this test was time pressured, it limited learners’ ability to apply their knowledge of the language consciously and deliberately. Therefore, the speeded dictation test was taken to assess learners’ implicit knowledge rather than explicit knowledge (Nassaji, 2020b).
For the analysis, accuracy scores in the form of percentages were used. For this purpose, the total numbers of accurate and inaccurate uses of the target structure were first calculated. Target-like use (TLU) scores were then calculated (Pica, 1991, in Sheen, 2007) and analysed to measure learners’ knowledge of the third person singular -s in the present simple tense. Each test was first scored for correct use in obligatory contexts. This score was then inserted as the numerator of a ratio for which the denominator was the sum of the number of obligatory contexts for the third person singular -s and the number of non-obligatory contexts in which the third person singular -s was supplied inappropriately. In this way, consideration was also given to overuse of the target form. The scoring formula taken from Sheen (2007, p. 266) is shown in the following equation:
TLU score = [n correct suppliances in obligatory contexts / (n obligatory contexts + n suppliances in non-obligatory contexts)] × 100
The same formula and manner of scoring was used to calculate the accuracy of the target structure in the free written production tests. For the reliability, a second researcher checked 25% of the dictation tests (pre- and posttests) independently for the identification of obligatory contexts for the third person singular -s in the simple present tense and scoring; the agreement rate was 95% in the pretest and 98% in the posttest.
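The target-like use scoring described above can be illustrated with a minimal Python sketch. The function name, parameter names, and the counts in the usage line are hypothetical and are not taken from the study's materials; the sketch only assumes the ratio as it is described in the text (correct suppliances in obligatory contexts divided by the sum of obligatory contexts and suppliances in non-obligatory contexts, expressed as a percentage).

```python
def tlu_score(correct_in_obligatory: int,
              obligatory_contexts: int,
              supplied_in_non_obligatory: int) -> float:
    """Return a target-like use (TLU) score as a percentage.

    All three arguments are counts for one learner on one test.
    The names are illustrative assumptions, not the study's own labels.
    """
    denominator = obligatory_contexts + supplied_in_non_obligatory
    if denominator == 0:
        # No obligatory contexts and no overuse: the score is undefined.
        raise ValueError("TLU is undefined when there are no contexts to score")
    return 100.0 * correct_in_obligatory / denominator


# Hypothetical learner: 8 correct uses in 10 obligatory contexts,
# plus 2 overuses of -s where the base form was required.
example = tlu_score(8, 10, 2)
print(round(example, 1))
```

Because overuse enters the denominator, a learner who adds -s indiscriminately scores lower than one who supplies it only where it is required, which is the reason the text gives for considering overuse of the target form.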
c The Free Writing Tasks
Both free writing tasks (pre- and posttest) were adapted from Lira-Gonzales and Nassaji (2023). The tasks involved a sequence of eight pictures that conveyed the stories of a typical day in the lives of two different people. Students had to write at least two sentences about each picture and provide as many details as possible when describing each picture. The topic of the pre-writing test was the daily life of Mr. Smith, a middle-aged man who goes to work. The topic for the post-writing test was the daily life of John, a teenager who goes to school.
In the free writing tasks, learners were asked to use the language in a spontaneous way, producing sentences with a focus on conveying or communicating meaning. This type of task, thus, requires learners to focus more on meaning than form and use their knowledge more implicitly and unconsciously. Therefore, such tasks have been taken to tap more into learners’ implicit knowledge of the language than explicit knowledge tests such as error correction tests (Nassaji, 2020a).
The scoring was performed by the teachers. Initially, all the writings including the revised drafts were scored for the accuracy of the target structure (present tense singular -s). For this purpose, the correct use of the target structure in obligatory contexts was scored first (each correct response received ‘1’ and each incorrect response received ‘0’). Accuracy scores in the form of percentages were then used (Sheen, 2010): the total numbers of accurate and inaccurate uses of the target structure were calculated, and the same formula and manner of scoring used to calculate the accuracy of the target structure in the speeded dictation test were applied. For the reliability, as in the speeded dictation tests, a second researcher checked 25% of the writings (pre- and posttests) independently for the identification of obligatory contexts for present tense and scoring; the agreement rate was 97% in the pretest and 98% in the posttest.
VI Data analysis and results
1 The effects of CF in general
We first addressed research question 1 and its subpart that concerned the effect of CF in general. The question was: Is there any difference in the effect of oral versus written CF on the accurate use of third-person singular -s? (irrespective of types of oral or written feedback) and, if so, does this differential effect (if any) vary depending on the specific type of outcome measure used (i.e. explicit versus implicit knowledge measures)? For the outcome measures, following past research (Karim & Nassaji, 2020; Sinha & Nassaji, 2022) we calculated an accuracy gain score for each participant.
Gain scores were calculated as the difference between the ratio of correct responses in the posttest and the ratio of correct responses in the pretest for each of the tests, using the following formula:
Gain score = posttest accuracy ratio − pretest accuracy ratio
One advantage of gain scores is that they help control for initial differences among participants by measuring the change from the pretest to the posttest. In doing so, they focus on how much individuals or groups improved, rather than their absolute performance. Gain scores are also relatively easy to interpret. A positive gain score indicates improvement, while a negative one suggests a decline. One limitation of gain scores, however, is that they may be influenced by the initial values of the pretest scores. For example, if the gain score suggests that group A has a larger improvement compared to group B and group A has started with lower pretest scores, the improvement might not be because of the treatment alone but because group A had more room for improvement, leading to a larger gain score (Table 3).
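A minimal sketch of the gain score computation described above, assuming that a gain is the posttest percentage minus the pretest percentage. The function name and all values are hypothetical and serve only to illustrate the caveat about initial pretest scores.

```python
from statistics import mean


def gain_scores(pretest_pct, posttest_pct):
    """Per-learner gain: posttest percentage minus pretest percentage."""
    if len(pretest_pct) != len(posttest_pct):
        raise ValueError("each learner needs both a pretest and a posttest score")
    return [post - pre for pre, post in zip(pretest_pct, posttest_pct)]


# Hypothetical accuracy percentages for two small groups (not study data).
group_a = gain_scores([20.0, 30.0, 25.0], [50.0, 55.0, 60.0])
group_b = gain_scores([70.0, 75.0, 80.0], [80.0, 85.0, 95.0])

# Group A starts lower and gains more, which illustrates the caveat about
# room for improvement: the larger gain need not reflect the treatment alone.
print(mean(group_a), mean(group_b))
```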
Table 3. Descriptive statistics of feedback types and outcome measures.
To examine the main effects of feedback types and outcome measures as well as their interaction effects, we conducted a two-way mixed-model repeated-measures ANOVA on the gain scores, with feedback group (written, oral, and no feedback [control]) as a between-group variable and type of outcome measure (speeded dictation, writing, and error correction tasks) as a within-group variable. Where the results were statistically significant, following previous studies, post hoc pairwise comparisons were used to determine where the differences lay. For the repeated-measures ANOVA, we checked the assumption of equality of covariance matrices using Box’s test, and that assumption was not met (p = .005). Therefore, Pillai’s trace was the appropriate multivariate statistic to use (Leech et al., 2005). Levene’s Test of Equality of Variance was not significant for the speeded dictation test, F(2, 98) = 2.65, p = .075, or the error correction test, F(2, 98) = 1.98, p = .143, but it was significant for the free writing test, F(2, 98) = 3.62, p = .030. Therefore, the assumption of equality of variances was met for the first two tests but not for the last one. Given that this assumption was not met for the free writing task, a separate one-way ANOVA was conducted on this test using the Welch adjustment, to be followed by the Games-Howell post hoc test if the results were significant. We also checked the sphericity assumption, and Mauchly’s test indicated that the assumption was met, as the p-value was greater than .05, W = χ2(2) = .095, p = .953. Therefore, we did not need to correct the F-values for the main and interaction effects. For effect sizes (partial eta squared), the following interpretation criteria were used: small (η2 = 0.01), medium (η2 = 0.06), and large (η2 = 0.14). For pairwise comparisons, Cohen’s d was calculated and interpreted using the following criteria: small (d = 0.2), medium (d = 0.5), and large (d = 0.8) (Cohen, 1988).
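The effect size criteria above can be illustrated with a short sketch. It assumes Cohen’s d is computed with a pooled standard deviation and treats the benchmark values as lower bounds of each band; both choices are assumptions for illustration, since the text does not specify them, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)


def label_d(d):
    """Label |d| against the benchmarks 0.2, 0.5, and 0.8 (treated as lower bounds)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


def label_partial_eta_squared(eta_sq):
    """Label partial eta squared against the benchmarks 0.01, 0.06, and 0.14."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# Hypothetical gain scores for two groups (not study data).
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(d, label_d(d))
```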
The descriptive statistics for the analysis (i.e. the means and standard deviations) are shown in Table 4.
Table 4. Descriptive statistics of gain scores for each outcome measure.
Table 4 shows the descriptive statistics of gain scores for each outcome measure (also see Figure 1). As can be seen, across all outcome measures the learners’ gains from the pretest to the posttest are greater for the feedback groups than for the control group. The results of the two-way mixed-model repeated-measures ANOVA showed a significant main effect for feedback group, F(2, 98) = 9.55, p < .001, η2 = 0.16, indicating significant differences among the groups (oral feedback, written feedback, and no feedback groups) irrespective of the outcome measures. To determine where the differences were, pairwise comparisons using the Bonferroni adjustment for multiple comparisons were conducted. The results showed a significant difference between the control group and the written CF group (p < .001, d = 1.2) and between the control group and the oral CF group (p = .01, d = .68), with both the written CF group and the oral CF group outperforming the control group. No significant difference was found between oral and written CF. These results provide evidence for the effectiveness of CF (both oral and written) compared to no CF, irrespective of the type of outcome measures used.
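For readers unfamiliar with the Bonferroni adjustment mentioned above, a minimal sketch follows. It assumes the common form in which each raw p-value is multiplied by the number of comparisons and capped at 1; the raw p-values are hypothetical, and the function name is illustrative only.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for the three pairwise comparisons among three
# groups (control vs. oral, control vs. written, oral vs. written).
raw = [0.01, 0.02, 0.5]
adjusted = bonferroni_adjust(raw)
print(adjusted)
```

Comparing each adjusted value with .05 is equivalent to comparing each raw p-value with .05 divided by the number of comparisons.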

Figure 1. Mean gain scores of the oral, written, and no feedback groups across outcome measures.
The results also showed a main effect for outcome measures, using Pillai’s trace, F(2, 97) = .106, p = .004, η2 = .106, suggesting that there were differences among the three types of measures used (the error correction test, the speeded dictation test, and the free writing task) irrespective of the types of feedback. Pairwise comparisons using the Bonferroni adjustment for multiple comparisons showed a significant difference between the speeded dictation test and the error correction test (p = .003, d = .37), with the estimated marginal mean score (EMM) for the error correction test (EMM = 17.92) being significantly larger than that of the speeded dictation test (EMM = 5.85). No difference was found between the free writing task and the speeded dictation test, or between the free writing task and the error correction test. If we assume that error correction tests tap more into explicit knowledge than speeded dictation tests and free writing tasks, these results can be taken to indicate that CF overall contributed more to the development of explicit knowledge than to that of implicit knowledge.
The results also showed a significant interaction effect between feedback group and outcome measures, using Pillai’s trace, F(2, 97) = .100, p = .039, η2 = .050. Pairwise comparisons using the Bonferroni adjustment for multiple comparisons examined these interaction effects by comparing the effect of the feedback groups within each outcome measure. The results showed no pairwise differences between the estimated marginal mean scores of any of the groups (oral, written, and no feedback) in the speeded dictation test or in the free writing task (for the free writing task, the Welch test was not significant, p = .230). However, significant differences in gains were found among the groups in the error correction test. These differences were between the control group and the oral feedback group (p = .040, d = .58), between the control group and the written feedback group (p < .001, d = 1.3), and between the oral feedback and the written feedback groups (p = .032, d = .63). The estimated marginal mean scores of both the oral and written feedback groups were greater than that of the control (no feedback) group (Control: EMM = .833, Oral: EMM = 18.46, Written: EMM = 34.47), and the estimated marginal mean score of the written feedback group was significantly larger than that of the oral feedback group. If we assume that error correction tests tap more into explicit knowledge than implicit knowledge, these findings suggest that both oral and written CF contributed significantly to the development of explicit knowledge. The greater mean score of the written CF group compared with the oral CF group in the error correction test further indicates that, of these two types of feedback, written CF contributed more to the development of explicit knowledge than oral CF (see Figure 1).
2 The effect of subtypes of oral and written feedback
We then addressed research question 2 and its subpart, which concerned the effect of different subtypes of oral and written CF (i.e. oral recasts, oral metalinguistic feedback, written direct feedback and written metalinguistic feedback). The question was:
• Is there any difference in the effect of different types of oral and written CF on the accurate use of third-person singular -s in the simple present tense (i.e. oral recasts, oral metalinguistic CF, direct written CF, written metalinguistic CF)?
• And, if so, does this differential effect (if any) vary depending on the type of outcome measures used (i.e. explicit versus implicit knowledge measures)?
As in the previous analysis, we calculated gain scores from the pretest to the posttest for each participant. The descriptive statistics are shown in Table 4. One notable observation in the table is the presence of high standard deviations. This can be attributed to the calculation of gain scores based on percentage correct in the pre- and posttests. When percentage correct is computed for a group of individuals or a series of tests, there may be a wide range of performance levels within that group. Some learners may have very high percentages correct (e.g. 70%), while others may have very low percentages correct (e.g. 2%). This variability contributes to higher standard deviations.
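The point about variability can be illustrated with a small sketch. The two sets of gain scores below are hypothetical (not study data) and were chosen to have the same mean; only the spread of individual performance differs.

```python
from statistics import mean, stdev


def describe(gains):
    """Return the mean and sample standard deviation of a list of gain scores."""
    return mean(gains), stdev(gains)


# Hypothetical gain scores in percentage points (not study data).
homogeneous = [10.0, 12.0, 14.0, 16.0, 18.0]
heterogeneous = [-20.0, 0.0, 14.0, 30.0, 46.0]

# Both groups have the same mean gain, but the group whose learners range
# from a decline to a large improvement has a much larger standard deviation.
print(describe(homogeneous))
print(describe(heterogeneous))
```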
To address our research questions, we conducted a two-way mixed-model repeated-measures ANOVA with CF subtypes (i.e. oral recasts, oral metalinguistic, written metalinguistic, written direct, and control) as a between-group variable and type of outcome measure (speeded dictation test, free writing task, and error correction test) as a within-group variable. For the repeated-measures ANOVA, we checked the assumption of equality of covariance matrices using Box’s test, and this assumption was not met (p = .003). Therefore, Pillai’s trace was used as the appropriate multivariate statistic (Leech et al., 2005). Levene’s Test of Equality of Variance was not significant for any of the outcome measures (speeded dictation: F(4, 96) = 1.67, p = .162; free writing task: F(4, 96) = 2.36, p = .059; error correction: F(4, 96) = 2.08, p = .089), and therefore the assumption of equality of variance was met for the three dependent variables.
The results of the repeated-measures ANOVA showed a significant main effect for feedback subtypes, F(4, 96) = 10.85, p < .001, η2 = .31, suggesting a significant difference among the feedback groups (i.e. oral recasts, oral metalinguistic, written metalinguistic, written direct, and control). Pairwise comparisons using the Bonferroni adjustment for multiple comparisons showed a significant difference between the control group and both the written metalinguistic (p < .001, d = 1.5) and the oral metalinguistic groups (p < .001, d = 1.2), with the estimated marginal mean scores for the written metalinguistic feedback group (EMM = 23.99) and for the oral metalinguistic group (EMM = 22.66) being significantly larger than the mean score of the control group (EMM = 1.56). No difference was found between the control group and the oral recast group or between the control group and the written direct group, suggesting that these feedback types did not have a significant effect on the learners’ test performance. However, there was a significant difference between the oral recast and the written metalinguistic feedback groups (p < .001, d = 1.6) and between the oral recast group and the oral metalinguistic feedback group (p = .002, d = 1.2), with the estimated marginal mean scores of the written metalinguistic feedback and oral metalinguistic feedback groups being larger than that of the oral recast group (EMM = 5.19). Altogether, these results suggest that the most effective CF subtype was metalinguistic feedback (both oral and written), and that this feedback type was more effective than oral recasts and written direct feedback.
The results of the mixed-model repeated-measures ANOVA also showed a main effect for outcome measures (i.e. speeded dictation test, free writing task, and error correction test), using Pillai’s trace, F(2, 95) = 9.06, p < .001, η2 = .160, and a significant interaction between subtypes of CF and outcome measure, using Pillai’s trace, F(8, 192) = 2.17, p = .031, η2 = .083. Pairwise comparisons for the main effect, using the Bonferroni adjustment for multiple comparisons, showed a significant difference between the speeded dictation test and the error correction test (p < .001, d = .37), with the estimated marginal mean score for the error correction test (EMM = 21.34) being significantly larger than that of the speeded dictation test (EMM = 6.75). They also showed a significant difference between the free writing task and the error correction test (p = .017, d = .25), with the estimated marginal mean score of the error correction test (EMM = 21.34) being larger than that of the free writing task (EMM = 11.49). No difference was found between the mean score of the free writing task and that of the speeded dictation test. These results suggest that the effect of feedback subtypes in general was more pronounced on the test that tapped into explicit knowledge (i.e. the error correction test) than on those that tapped more into implicit knowledge (i.e. the speeded dictation test and the free writing task).
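As a consistency check on the omnibus effect sizes reported in this section, partial eta squared can be recovered from an F value and its degrees of freedom. The sketch below uses the standard identity for univariate F tests; applying it to the F values reported with Pillai’s trace is an assumption made here for illustration, and the function name is hypothetical. The F values and degrees of freedom are those reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F: (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the subtype analysis.
reported = [
    ("feedback subtypes", 10.85, 4, 96),
    ("outcome measures", 9.06, 2, 95),
    ("subtypes x outcome measures", 2.17, 8, 192),
]

for effect, f_value, df_effect, df_error in reported:
    eta_sq = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{effect}: partial eta squared = {eta_sq:.3f}")
```

Under these assumptions, the recovered values round to the effect sizes reported for the three omnibus tests (.31, .160, and .083).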
The pairwise comparisons for the interaction effects, using the Bonferroni adjustment for multiple comparisons, showed no significant difference between the different CF subtypes in the speeded dictation test, indicating no effect of CF on this particular type of implicit knowledge test. However, in the free writing task, the results showed a significant difference between the written metalinguistic group and the control group (p = .049, d = .70), with the estimated marginal mean score of the written metalinguistic feedback group (EMM = 22.84) being significantly larger than that of the control group (EMM = 1.87), suggesting that written metalinguistic CF contributed positively to the learners’ accuracy of the target form in this particular type of test. No significant difference was found between the other groups. If we assume that free writing tasks tap more into implicit knowledge, these results can be taken to suggest that written metalinguistic CF contributed significantly to the development of this type of knowledge.
In the third outcome measure (error correction test), the results showed a significant difference between the control group and all CF groups except the oral recast group: control group (EMM = .83) vs. written metalinguistic CF (EMM = 40.90), p < .001, d = 1.46; control group (EMM = .83) vs. oral metalinguistic CF (EMM = 35.55), p < .001, d = 1.13; and control group (EMM = .83) vs. written direct CF (EMM = 25.62), p = .027, d = .90. If we assume that error correction tests tap more into explicit knowledge, these results suggest that, except for oral recasts, all CF subtypes contributed significantly to the development of learners’ explicit knowledge of the target form.
In the error correction test, there was also a significant difference between oral recasts and written metalinguistic CF (oral recasts: EMM = 3.81 vs. written metalinguistic: EMM = 40.90, p < .001, d = 1.81) as well as between oral recasts and oral metalinguistic CF (p = .001, d = 1.30), with the oral metalinguistic group (EMM = 35.55) outperforming the oral recast group (EMM = 3.81). These findings suggest that metalinguistic CF (both oral and written) contributed significantly more to the development of explicit knowledge of the target form than oral recasts. No difference was found between oral recasts and written direct CF (EMM = 25.62, p = .09), suggesting no differential effect in the contributions of these two feedback types to the development of explicit knowledge of the target form (for the mean gains of feedback types across the three outcome measures, see Figure 2).

Figure 2. Mean gain scores of the different types of oral and written feedback across outcome measures.
VII Discussion
In the following section, we will discuss the results of the two research questions and sub-questions that guided the study.
As for research question 1, our findings showed no difference in the effect of oral versus written CF on the acquisition of third person singular -s in the simple present tense. There was, however, a significant accuracy gain for the groups that received corrective feedback in comparison with the control group. These findings suggest that CF is effective overall, as shown by previous literature (e.g. Al-Rubai’ey & Nassaji, 2013; Bitchener & Knoch, 2008; Chandler, 2003; Ferris, 2006; Ferris & Roberts, 2001; Lalande, 1982; Lee, 2004; Sheen, 2007; Suzuki et al., 2019). These findings not only confirm the effectiveness of CF but also, because the study examined the effect of CF in a specific context and on a specific target form not previously explored, contribute to the generalizability of the effects of CF to a broader range of contexts and language structures.
The results regarding the effect of CF on different types of outcome measure (i.e. explicit versus implicit knowledge measures) suggest (1) that written CF affected explicit knowledge more than oral CF did, (2) that CF contributed more to the development of explicit knowledge than implicit knowledge, and (3) that both explicit and implicit knowledge may have contributed to the observed outcomes. These findings are important because most previous studies have employed only measures of explicit knowledge such as grammaticality judgment tests or error correction tests (e.g. Ellis et al., 2006a; Endley & Karim, 2022; Shao & Liu, 2022). Those that have used free writing measures (e.g. Shintani et al., 2014) have combined the data across tests without examining whether written or oral CF facilitates the development of the learner’s implicit knowledge versus explicit knowledge of the target structure. The present study contributes to resolving this uncertainty by measuring explicit and implicit knowledge simultaneously.
The finding that written CF had a more beneficial effect than oral CF on learners’ explicit knowledge, as measured through the written error correction test, can be partly explained by the difference between the written and oral registers. When students receive written CF, they often have more time to think and to process the feedback more consciously (Sheen, 2010); also, when they are asked to do error correction tests that are not time pressured, they rely more on their explicit knowledge. Therefore, both written CF and the error correction tests involved more conscious knowledge, and the effect of written CF on error correction tests suggests that more explicit written CF contributes more to explicit knowledge than implicit feedback does. Additionally, writing in general is a more explicit task than oral production. In writing, learners have time to correct their mistakes as they are more aware of what they have written (Kenworthy, 2006). In oral production, the discourse is spontaneous, which means that there is less time to think in detail about what to say. Therefore, the probability of committing errors without having the opportunity to reflect is higher (Picón Jara, 2015).
A point worth considering when discussing the findings regarding the effect of CF on different types of outcome measure (i.e. explicit versus implicit knowledge measures) is the potential influence of the pretests and focused CF on participants’ awareness of the study’s objectives. In situations where participants are explicitly exposed to the target structure through pretests and CF, it becomes more challenging to assert that implicit knowledge alone is being measured. However, it is important to note that the posttests in this study were designed with distractors to assess participants’ understanding of the target structure in a more subtle manner. These distractors were strategically included to reduce the participants’ awareness of the study’s focus on the target structure, aiming to create a more natural testing environment. Nevertheless, it is possible that both explicit and implicit knowledge contributed to the observed outcomes in this study.
Research question 2 concerned the effect of different subtypes of oral and written CF on the acquisition of third person singular -s in the simple present tense. Our findings showed that metalinguistic CF was more effective in the acquisition of the target structure in both feedback modalities (i.e. oral and written) than recasts and written direct CF. The efficacy of metalinguistic CF on the acquisition of third person singular -s in the simple present tense can be explained by its saliency and noticeability since it explicitly provides learners with the opportunity to diagnose their ungrammatical utterances (Nassaji, 2017). Furthermore, since metalinguistic feedback is output-triggering by nature, it has the potential to promote the grammatical accuracy of L2 learners (Rassaei, 2015). Previous studies comparing metalinguistic feedback with more implicit feedback such as indirect feedback or even with direct feedback alone have also shown more positive effect for metalinguistic feedback (e.g. Bitchener & Knoch, 2009; Bitchener et al., 2005; Sheen, 2007). For example, Bitchener et al.’s (2005) findings of a study on advanced L2 learners suggested that oral metalinguistic feedback when combined with direct feedback was more effective than direct feedback alone. Sheen’s (2007) findings also provided evidence that written as well as oral metalinguistic CF improved learners’ accuracy.
In terms of the differential effects of different subtypes of CF on different types of outcome measures, the results showed no significant difference between the different CF subtypes in the speeded dictation test, indicating no differential effect of CF subtypes on this particular type of knowledge test. However, in the free writing task, the results showed a significant effect for written metalinguistic feedback, suggesting that written metalinguistic CF contributed positively to the learners’ accuracy of the target form in this particular type of knowledge test. No significant effect was found for the other feedback subtypes. If we assume that free writing tasks tap more into implicit knowledge, these results can be taken to suggest that written metalinguistic CF contributed significantly to the development of this type of knowledge.
In the third outcome measure (error correction test), the results showed a significant effect for all CF types except oral recasts. If we assume that error correction tests tap more into explicit knowledge, these results suggest that, except for oral recasts, all CF subtypes contributed significantly to the development of learners’ explicit knowledge of the target form. In the error correction test, there was also a larger effect of both written metalinguistic CF and oral metalinguistic CF than of oral recasts, and the effects of oral recasts and written direct feedback were comparable. If we assume the error correction test tapped more into explicit knowledge, these findings suggest that metalinguistic CF (both oral and written) contributed significantly more to the development of explicit knowledge of the target form than oral recasts.
In short, we can see two main results. First, there was no differential effect of CF subtypes on the speeded dictation test (an implicit knowledge test), but there was a differential effect of CF on the explicit knowledge test (error correction test), with both oral and written metalinguistic CF contributing more to this explicit knowledge test than oral recasts. Second, written metalinguistic CF also contributed significantly to the free writing task. This latter finding is important in that it shows that metalinguistic CF has the potential to contribute to the development of both explicit and implicit knowledge (Ellis et al., 2006b).
The results that all CF subtypes contributed to the error correction task except for recasts, and that the effects of oral recasts were similar to those of direct feedback, align with Shintani and Ellis (2013), who also found no significant effect for direct CF on the accurate use of the target feature (English articles). These findings suggest that correcting learners’ errors either orally through recasts or in written form through direct correction alone may not be a beneficial corrective strategy because, although these corrective feedback strategies provide the learner with the target form, learners may not understand the correction. When direct correction is not accompanied by an understanding of why the sentence is erroneous, the learner may not be able to learn from the correction. In Shintani and Ellis’s study, metalinguistic explanation led to gains in accuracy in the error correction test as well as in a new piece of writing completed immediately after the treatment. Shintani and Ellis interpreted these results as indicating that the metalinguistic explanation helped learners to understand the form. However, the effect of this type of feedback was not evident in the delayed writing task, which suggests that these effects were not durable. This suggests that, while error correction can have a significant impact on learning, no matter how it is delivered, it may not have a lasting effect if the feedback is not ongoing or if it is not supported by other instructional strategies (Nassaji, 2016).
VIII Conclusions and implications
One of the important contributions of this study lies in its systematic approach to comparing the differential impacts of different types and subtypes of CF delivered through two distinct modalities: oral and written. By investigating the effects of oral and written corrective feedback, particularly focusing on the acquisition of the third person singular -s in the simple present tense, this study seeks to address a gap in our understanding of how corrective feedback (CF) influences L2 learning. Through examining the specific impacts of oral and written CF, the research sheds light on effective strategies for enhancing language learning in this specific linguistic context.
The study found significant differences between the control group and both the written and oral corrective feedback groups, suggesting that both types of feedback were more effective than no feedback at all. This implies that language instructors can use oral corrective feedback, written corrective feedback, or both, as both are effective in improving learner performance.
The study’s findings also highlight the importance of distinguishing between explicit and implicit knowledge in written corrective feedback research. They suggest that written corrective feedback (CF) may exert a more pronounced influence on learners’ explicit knowledge, involving their conscious understanding of grammar rules. Instructors therefore need to consider the varied impacts of different feedback modalities on learners’ linguistic development. They should design their lessons with a clear understanding of whether they aim to develop learners’ explicit or implicit knowledge. For example, if the goal is to enhance explicit knowledge of grammar rules, written CF might be more suitable than oral feedback.
The study found no effect of different subtypes of feedback on speeded dictation. However, in a free writing task, the group receiving written metalinguistic feedback showed significantly higher accuracy compared to the control group, suggesting a positive impact on implicit knowledge development. This implies that written metalinguistic CF may contribute to the development of implicit knowledge, particularly in tasks like free writing. Therefore, language instructors should consider incorporating written metalinguistic feedback into their teaching methods to enhance learners' accuracy in producing target forms in tasks requiring implicit knowledge application, such as free writing (Nassaji, 2009).
Empirically, the study opens opportunities for further research into the nuanced effects of CF on explicit and implicit language knowledge across various linguistic structures and learning contexts. Researchers can explore the specifics of how different types of CF impact language learning outcomes. For example, while past research has explored the effects of oral and written CF to some extent, this research goes a step further by examining these effects on different outcome measures. These outcome measures, designed to assess both explicit and implicit language knowledge, tap into diverse facets of language learning. This expansion of the research scope enables us to gain a more comprehensive understanding of how CF influences language learners, acknowledging that language acquisition involves both conscious rule-based understanding and intuitive, implicit language use.
Nevertheless, it is important to acknowledge certain limitations of this study. The research was conducted exclusively with adult intermediate ESL learners, and their performance was solely assessed using the target structure in pre- and posttests. No additional individual factors that might have influenced the learners’ performance in both tests were explored. Thus, further research is needed to investigate the role of other learner characteristics that were not accounted for in this study but that are known to mediate the effects of CF for individual learners (e.g. motivation, aptitude, skill level, anxiety level, age).
In addition, this study employed a limited number of CF types (oral recasts, written metalinguistic, oral metalinguistic and written direct). Given the wide range of CF types (including oral explicit correction, oral explicit correction with metalinguistic explanation, oral clarification requests, repetition, elicitation, written reformulation, indirect metalinguistic written correction) that constitute both explicit correction and prompts, further research is also needed to identify the components of these CF types that might contribute to their effectiveness. Overall, as we move forward, researchers can build upon the findings of this and other similar studies to uncover the intricate factors that influence language learning and writing development, providing a more comprehensive and effective approach to second language research and education.
As previously mentioned in Section V, students in the oral CF groups were asked to retell a story orally, while students in the written CF groups were asked to reproduce a story in writing. We acknowledge that this difference in the modality of practice could potentially complicate the interpretation of our results. However, while the modality of practice was not held constant between the groups, we believe this decision was justified for the following two reasons. The first is ecological validity. In real-world language learning settings, students often receive corrective feedback in different modalities depending on the context (e.g. oral discussions for oral corrective feedback, written assignments for written corrective feedback). By replicating this context in our study, we aimed to enhance the ecological validity of our findings. While controlling practice tasks may enhance the internal validity of the study, it may also limit the study’s external validity, as it may not accurately reflect real-world educational contexts where practice tasks and CF modalities are often interrelated. Second, our study’s research design is consistent with research methodologies employed in prior studies investigating CF modalities. For instance, Sheen (2010) conducted a similar investigation that compared oral CF to written CF. In that study, too, oral feedback was provided on oral tasks and written feedback was provided on written tasks, hence allowing for the differences in practice modalities. The adoption of a design consistent with previous research allows comparison of results with those studies.
