Abstract
This study evaluates the performance of three leading AI chatbots—OpenAI’s ChatGPT, Google’s Gemini, and Microsoft Bing Copilot—in answering multiple-choice questions (MCQs) from the UGC-NET Education paper. Using 150 randomly selected questions from examination cycles between June 2019 and December 2023, the chatbots’ accuracy was assessed against the official answer key. Copilot demonstrated the highest accuracy (86%), followed by Gemini (79.33%) and ChatGPT (78.67%). Unit-wise analysis revealed distinct strengths: Copilot excelled in “Pedagogy and Technology in Education,” Gemini performed best in “Research in Education,” and ChatGPT showed balanced performance across units. Chi-square analysis indicated no statistically significant differences among the chatbots. These findings highlight AI’s potential as a supplementary educational tool while underscoring the need for improvements in handling complex topics. The study offers recommendations for enhancing chatbot algorithms to improve their effectiveness in academic contexts, providing valuable insights for educators and developers regarding AI integration in education.
