This study investigated the efficacy of artificial intelligence-based dynamic written corrective feedback on second language writing accuracy, fluency, complexity, and functional adequacy, while also examining user sentiment among teachers and students. Utilizing Claude 3 Opus as the primary artificial intelligence tool, the research compared artificial intelligence-generated feedback with traditional teacher-provided dynamic written corrective feedback within a 15-week intensive English program involving intermediate-high learners of English as a second language (n = 41). Using a quasi-experimental design, participants were randomly assigned to control (teacher-based feedback) and treatment (artificial intelligence-based feedback) groups. Second language writing development was assessed with multiple metrics, including the error-free clause ratio, fluency, syntactic complexity (mean length of T-unit and clauses per T-unit), and rubric-based functional adequacy. Findings from a repeated measures analysis of variance indicated that although both groups receiving dynamic written corrective feedback improved in writing accuracy, the teacher feedback group outperformed the artificial intelligence group in fluency and functional adequacy. No significant differences were observed for measures of syntactic complexity. Sentiment analysis revealed mixed reactions: although most students found the artificial intelligence-based feedback helpful and easy to use, 27% of their commentary expressed concerns about feedback accuracy and clarity. Teachers echoed these concerns, citing some inconsistencies and student confusion. Additionally, the study compared Claude 3 Opus, Claude 3.5 Sonnet, and ChatGPT-4 in their ability to identify errors. Results suggest that Claude 3.5 Sonnet may outperform Claude 3 Opus and ChatGPT-4, although unexpected autocorrections by the Claude models introduced reliability concerns. These findings suggest that although artificial intelligence tools such as Claude 3 Opus may facilitate writing accuracy gains comparable to those achieved through teacher feedback, they could inadvertently hinder other aspects of writing development. Given ongoing advancements in generative artificial intelligence, further research is warranted to explore whether newer models employing test-time compute or generative reasoning can offer improved dynamic written corrective feedback quality without compromising fluency or functional adequacy.