Abstract
Recent studies have highlighted the potential of generative artificial intelligence, such as ChatGPT, to address challenges in providing accurate and pedagogically relevant feedback on second language (L2) writing. However, empirical evidence on how prompt engineering shapes feedback quality remains limited. This study examined how zero-shot, few-shot and chain-of-thought prompting strategies influenced the accuracy and depth of ChatGPT-generated qualitative feedback on L2 essays. A total of 176 essays written by Filipino and Thai learners of intermediate English proficiency were evaluated with ChatGPT-4o under each of the three prompting strategies. Few-shot prompting achieved the highest accuracy, while chain-of-thought prompting produced the most elaborated feedback, particularly in addressing grammatical complexity. Zero-shot prompting lagged behind on both accuracy and depth, with notable weaknesses in grammatical feedback. Implications for L2 writing instruction, assessment and research are discussed.
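For readers unfamiliar with how the three strategies differ in practice, the sketch below issues one prompt of each type through the OpenAI Python SDK. The prompt wording, the placeholder essay and the model parameters are illustrative assumptions for exposition only; they are not the instruments or settings used in the study.

```python
# Minimal sketch of zero-shot, few-shot and chain-of-thought prompting.
# All prompt text here is hypothetical, not the study's actual rubric.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ESSAY = "<an L2 learner essay to be evaluated>"

# Zero-shot: the task instruction alone, with no examples or reasoning scaffold.
zero_shot = (
    "Give qualitative feedback on the grammar, vocabulary and organization "
    f"of this essay:\n{ESSAY}"
)

# Few-shot: one or more worked essay-feedback pairs precede the target essay.
few_shot = (
    "Example essay: <sample essay>\n"
    "Example feedback: <model feedback on the sample>\n\n"
    f"Now give feedback in the same style for this essay:\n{ESSAY}"
)

# Chain-of-thought: the model is asked to reason step by step before responding.
chain_of_thought = (
    "First analyse the essay step by step (grammar, vocabulary, coherence), "
    f"then give qualitative feedback based on that analysis:\n{ESSAY}"
)

for name, prompt in [("zero-shot", zero_shot),
                     ("few-shot", few_shot),
                     ("chain-of-thought", chain_of_thought)]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(name, "->", response.choices[0].message.content[:200])
```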
