Abstract
AI chatbots have emerged as innovative educational tools and have drawn increasing attention from educators and researchers in programming education. Although previous research has highlighted the potential of applying AI chatbots in programming education, empirical evidence on their overall effects on programming learning, and on the critical factors that influence those effects, remains scarce. To fill this gap, this study conducted a meta-analysis of 32 empirical studies published between 2015 and 2025 to estimate the overall effect size of AI chatbots on programming learning performance and to identify significant moderators. Based on robust variance estimation models, the results indicated a small-to-medium effect on posttest performance (g+ = 0.538, 95% CI [0.202, 0.873], p < .01) and a medium-to-large effect on practice performance (g+ = 0.650, 95% CI [0.330, 0.970], p < .001). Moderator analyses revealed that research design and the AI chatbot-to-student ratio significantly influenced posttest performance: true experimental designs yielded significantly larger effects than quasi-experimental designs, and a 1:1 chatbot-to-student ratio was substantially more effective than a 1:N ratio. These findings underscore the potential of AI chatbots in programming education and offer practical insights for integrating them into instructional design.
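For readers unfamiliar with how pooled effect sizes such as g+ are derived, the sketch below illustrates the general mechanics with a standard DerSimonian-Laird random-effects model. This is a deliberate simplification: the study itself uses robust variance estimation to handle dependent effect sizes, and the study summaries below are hypothetical, not data from the meta-analysis.

```python
import math

def hedges_g(m_treat, sd_treat, n_treat, m_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference with Hedges' small-sample correction."""
    df = n_treat + n_ctrl - 2
    s_pooled = math.sqrt(((n_treat - 1) * sd_treat**2 +
                          (n_ctrl - 1) * sd_ctrl**2) / df)
    d = (m_treat - m_ctrl) / s_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = j * d
    # Approximate sampling variance of g
    v = (n_treat + n_ctrl) / (n_treat * n_ctrl) + g**2 / (2 * (n_treat + n_ctrl))
    return g, v

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate; simpler stand-in for the study's RVE models."""
    k = len(effects)
    w = [1 / v for v in variances]
    g_fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sum(w)
    q = sum(wi * (gi - g_fixed)**2 for wi, gi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance estimate
    w_star = [1 / (v + tau2) for v in variances]
    g_plus = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return g_plus, (g_plus - 1.96 * se, g_plus + 1.96 * se)

# Hypothetical study summaries: (mean, SD, n) for chatbot and control groups
studies = [
    ((78.0, 10.0, 30), (72.0, 11.0, 30)),
    ((81.0,  9.0, 45), (76.0, 10.0, 44)),
    ((70.0, 12.0, 25), (68.0, 12.5, 26)),
]
gs, vs = zip(*(hedges_g(*t, *c) for t, c in studies))
g_plus, ci = dersimonian_laird(list(gs), list(vs))
print(f"g+ = {g_plus:.3f}, 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```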
