Abstract
This study evaluates the performance of three leading AI chatbots—OpenAI’s ChatGPT, Google’s Gemini, and Microsoft Bing Copilot—in answering multiple-choice questions (MCQs) from the UGC-NET Education paper. Using 150 randomly selected questions from examination cycles between June 2019 and December 2023, the chatbots’ accuracy was assessed against the official answer key. Copilot demonstrated the highest accuracy (86%), followed by Gemini (79.33%) and ChatGPT (78.67%). Unit-wise analysis revealed distinct strengths: Copilot excelled in “Pedagogy and Technology in Education,” Gemini performed best in “Research in Education,” and ChatGPT showed balanced performance across units. Chi-square analysis indicated no statistically significant differences among the chatbots. These findings highlight AI’s potential as a supplementary educational tool while underscoring the need for improvements in handling complex topics. The study offers recommendations for enhancing chatbot algorithms to improve their effectiveness in academic contexts, providing valuable insights for educators and developers regarding AI integration in education.
