Abstract

System architecture
The Intelligent Teaching Team of the Shanghai Institute (Laboratory) of AI Education and the Institute of Curriculum and Instruction of East China Normal University collaborated to develop the High-Quality Classroom Intelligent Analysis Standard system. The system evaluated classes along the dimensions of Class Efficiency, Equity, and Democracy and was referred to as the CEED system. It applied artificial intelligence (AI) to identify, infer, interpret, and comprehend data in class videos. This encompassed discourse classification, behavior recognition, dialogue analysis, and the application of the high-quality classroom intelligent analysis standards.
Discourse classification
AI utilized vast linguistic datasets collected from databases for machine learning, developing capabilities in speech recognition, semantics, pragmatics, and contextual understanding. By employing voice tone and voiceprint recognition, AI could differentiate between teachers and students, facilitating automatic discourse differentiation. A text convolutional neural network (TextCNN) classification algorithm was employed, in which the text data were first represented in matrix form. Convolutional layers and pooling layers then processed the matrix, extracting local features of the text. Finally, the processed features were input into fully connected layers for classification, enabling further analysis of the distribution of teacher–student interaction, the classification of classroom segments, and the extraction of class keywords.
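The matrix, convolution, pooling, and fully connected steps described above can be illustrated with a minimal forward pass in plain Python. This is only a sketch under stated assumptions, not the system's implementation: the vocabulary, the embedding size, the kernel widths, the random (untrained) weights, and the two labels "presentation" and "interaction" are all hypothetical.

```python
import math
import random


def conv1d_max_pool(matrix, kernel, bias=0.0):
    """Slide a kernel covering len(kernel) consecutive tokens over the text
    matrix, apply ReLU, and keep the largest activation (max-over-time pooling)."""
    width = len(kernel)
    activations = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        total = bias + sum(
            w * x
            for kernel_row, token_row in zip(kernel, window)
            for w, x in zip(kernel_row, token_row)
        )
        activations.append(max(0.0, total))  # ReLU
    # A text shorter than the kernel yields no window; fall back to 0.0.
    return max(activations, default=0.0)


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    denom = sum(exps)
    return [e / denom for e in exps]


def text_cnn_forward(tokens, embeddings, kernels, dense_weights, dense_bias):
    """Text matrix -> convolution + pooling -> fully connected layer -> probabilities."""
    matrix = [embeddings[t] for t in tokens]  # one row per token
    features = [conv1d_max_pool(matrix, k) for k in kernels]
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(dense_weights, dense_bias)
    ]
    return softmax(logits)


# Hypothetical toy setup with random, untrained weights.
rng = random.Random(0)
DIM = 4
vocab = ["what", "do", "you", "think", "open", "your", "book"]
labels = ["presentation", "interaction"]  # assumed label set
embeddings = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
kernels = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(width)]
    for width in (2, 3)  # kernels spanning 2 and 3 tokens
]
dense_weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in labels]
dense_bias = [0.0 for _ in labels]

probs = text_cnn_forward(
    ["what", "do", "you", "think"], embeddings, kernels, dense_weights, dense_bias
)
```

Because the weights are random, the resulting probabilities carry no meaning; in practice the kernels and the fully connected layer would be learned from the manually coded transcripts.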
Behavior analysis
Leveraging deep learning algorithms such as OpenPose and AlphaPose, a library for human body pose estimation was constructed, enabling target detection and pose recognition. This involved using a human skeletal model to infer human poses accurately by detecting and estimating key points and pose information from images or videos. Through the recognition of classroom poses and calibration of directional datasets, AI could analyze specific classroom behaviors, such as raising hands, standing up, nodding or bowing, writing, and reading.
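To make the step from skeletal key points to a classroom behavior concrete, the following sketch applies a simple geometric rule for hand raising. It is an illustration only: the key point names, the dictionary format, and the rule itself are assumptions and do not reflect the output format of OpenPose or AlphaPose or the behavior classifier used in the system.

```python
def is_hand_raised(keypoints, margin=0.0):
    """Return True if either wrist is above the shoulder on the same side.

    keypoints: dict mapping a hypothetical key point name to (x, y) image
    coordinates, where y grows downward as in most image formats.
    margin: extra vertical distance (in pixels) the wrist must clear.
    """
    for side in ("left", "right"):
        wrist = keypoints.get(f"{side}_wrist")
        shoulder = keypoints.get(f"{side}_shoulder")
        if wrist is None or shoulder is None:
            continue  # key point not detected in this frame
        if wrist[1] < shoulder[1] - margin:
            return True
    return False


# Hypothetical detections for two students in one frame.
student_a = {
    "left_shoulder": (100, 200), "left_wrist": (95, 120),
    "right_shoulder": (160, 200), "right_wrist": (170, 290),
}
student_b = {
    "left_shoulder": (300, 210), "left_wrist": (290, 300),
    "right_shoulder": (360, 210), "right_wrist": (365, 305),
}

raised = [is_hand_raised(s, margin=10) for s in (student_a, student_b)]
```

A trained classifier over the full skeleton would be more robust than a single threshold, but the rule shows how pose coordinates can be mapped to a behavior label.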
Dialogue analysis
By integrating word vector analysis models with machine learning techniques, AI was able to categorize dialogue attributes and their respective levels, facilitating deeper semantic classification of classroom conversations. For instance, the system could identify whether a teacher's question was closed-ended or open-ended, which helped it capture the underlying meaning of the conversations. For the automatic classification of classroom dialogue, this study constructed a systematic framework for classroom analysis using machine learning models and AI technology. This framework primarily included the IRE grading coding for teacher–student interaction, the nine-event analysis of Gagne's teaching stages, the S-T analysis of classroom interaction, and the RT-CH analysis of classroom types. Specifically, IRE is a classical structural model of classroom dialogue: “Initiation (teacher)–Response (student)–Evaluation (teacher).” The nine-event analysis, based on Gagne's theory of teaching stages, automatically classified teachers’ classroom discourse. The S-T analysis described the interaction between teacher and students. The RT-CH analysis described the proportion of teachers’ and students’ behaviors: RT referred to the proportion of teachers’ behaviors, so a larger value meant that teachers’ behaviors took up more of the class, and CH was the number of conversions between teachers’ behaviors and students’ behaviors, so a larger value meant more frequent switching between the two. Together, these two values were used to describe the classroom teaching mode.
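As a worked illustration of the RT-CH description above, the sketch below computes both values from a sequence of S-T codes. The input format (one "T" or "S" code per sampling interval) and the function name are assumptions. The normalized conversion rate is included because S-T analysis is often reported that way, although the text above defines CH simply as the number of conversions.

```python
def rt_ch(sequence):
    """Compute RT and CH from a sequence of 'T' (teacher) / 'S' (student) codes."""
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must contain at least one code")
    teacher = sum(1 for code in sequence if code == "T")
    # A conversion is any position where the code differs from the previous one.
    conversions = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    return {
        "rt": teacher / n,            # proportion of teacher behavior
        "conversions": conversions,   # CH as defined in the text
        "ch_rate": conversions / n,   # normalized variant (assumption)
    }


# Hypothetical eight-interval excerpt of a class.
result = rt_ch("TTTSSTTS")
# rt = 5/8 = 0.625, conversions = 3, ch_rate = 3/8 = 0.375
```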
The team transcribed the audio of teacher–student dialogues in teaching videos into text and assigned 20 researchers to independently complete coding based on the encoding rules of each analysis model. Inconsistencies in coding content were discussed and corrected, providing standards and guidelines for subsequent machine learning. Subsequently, corresponding classification models for each classroom analysis framework were built, and the processed data were divided into a “training set,” a “validation set,” and a “test set.” The team employed BERT (Bidirectional Encoder Representations from Transformers) pre-trained language models. After training, the model's accuracy was validated, and the optimal model was selected to predict the classification of classroom data, thus realizing the construction of an intelligent classroom analysis system. Through machine learning training, automatic encoding of teacher–student dialogue and learning interaction was achieved. This made it possible to analyze the quality of teacher–student dialogue and classroom activities, which enhanced the efficiency and quality of classroom analysis.
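The data division and model selection steps described above can be sketched as follows. The split ratios (80/10/10), the random seed, the sample format, and the candidate model names are assumptions made for illustration; the text does not report these details, and fine-tuning of the BERT models is not shown.

```python
import random


def split_dataset(samples, train=0.8, validation=0.1, seed=42):
    """Shuffle coded samples and split them into training, validation, and test sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * validation)
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


def select_best_model(validation_accuracy):
    """Pick the candidate with the highest validation accuracy."""
    return max(validation_accuracy, key=validation_accuracy.get)


# Hypothetical coded utterances: (text, label) pairs.
coded = [(f"utterance {i}", "I" if i % 3 == 0 else "R") for i in range(100)]
train_set, val_set, test_set = split_dataset(coded)

# Hypothetical validation accuracies of two fine-tuned candidates.
best = select_best_model({"candidate_a": 0.81, "candidate_b": 0.88})
```

Keeping the test set untouched until the final model is chosen is what lets its accuracy serve as an estimate of performance on new classroom data.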
High-Quality Classroom Intelligent Analysis Standard system
The CEED analysis framework was built on the theoretical and practical development of classroom teaching research, combined with the capacity of AI technology to process multimodal data (such as speech, behavior, and psychology), in order to make up for the shortcomings of traditional classroom analysis in scale, standardization, and specialization. This standard placed greater emphasis on students’ learning outcomes, engagement, achievement of learning objectives, and learning methods. It was divided into three levels, class efficiency, class equity, and class democracy, comprising nine dimensions (as outlined in Table 1). Through AI technology, classroom data could be collected and analyzed automatically along multiple dimensions, which addressed the one-sidedness of analyzing the classroom with a single technology and supported a comprehensive consideration of the three levels. With its heightened focus on students’ learning situations, this standard provided new insights for classroom evaluation.
Table 1. Standards for high-quality classroom intelligent analysis.
Research findings
The classroom videos for this study originated from the nationwide open-class video data collected by the project team, and the study was conducted with the informed consent of the respective schools. Among these, we selected Chinese language subject videos as our research focus, encompassing video data from nine grades of Chinese language classroom instruction, spanning from Grade 1 to Grade 9. Each grade comprises a total of 112 Chinese language classes, adding up to 1,008 classes in total, with a cumulative video duration of approximately 674 hr.
Using the classroom intelligent analysis system, automatic coding analysis was applied to the 1,008 Chinese primary and secondary school classroom videos to determine the type of time segment for each minute. The results revealed that, on average, “teacher presentation” accounted for 51.9% of the total time in the 1,008 classes. “Teacher presentation” thus continued to occupy the majority of classroom time, with the class structure primarily consisting of teachers delivering knowledge, explaining steps, and presenting content. In addition, the average proportion of “teacher–student interaction” time was 30.5%, while “individual tasks” and “group activities” (individual reading, highlighting words and phrases, in-class exercises, group discussions on specific topics, and collaborative tasks) had very limited durations, accounting for 12.3% and 5.3%, respectively.
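The per-minute coding described above lends itself to a simple tally. The sketch below shows how one class's minute-level segment labels could be turned into time proportions; the label strings and the example class are hypothetical, and averaging across all 1,008 classes is not shown.

```python
from collections import Counter


def segment_shares(minute_labels):
    """Return the share of class time for each segment type,
    given one label per minute of the class."""
    if not minute_labels:
        raise ValueError("at least one minute label is required")
    counts = Counter(minute_labels)
    total = len(minute_labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical 40-minute class, one label per minute.
one_class = (
    ["teacher presentation"] * 21
    + ["teacher-student interaction"] * 12
    + ["individual tasks"] * 5
    + ["group activities"] * 2
)
shares = segment_shares(one_class)
```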
Considering one class session to be 40 min, the distribution of classroom time in these 1,008 classes is approximately 21 min for teacher presentation, 12 min for teacher–student interaction, 5 min for individual tasks, and 2 min for group activities.
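The minute figures above follow directly from the reported percentages, as the short check below shows. The 40-min session length is the assumption stated in the text; the mean duration computed at the end uses the reported totals of 674 hr and 1,008 classes and is consistent with that assumption.

```python
# Average share of class time per segment type, in percent (from the findings).
SHARES_PERCENT = {
    "teacher presentation": 51.9,
    "teacher-student interaction": 30.5,
    "individual tasks": 12.3,
    "group activities": 5.3,
}
CLASS_MINUTES = 40  # assumed length of one class session

# Convert each percentage into whole minutes of a 40-min session.
minutes = {
    segment: round(percent / 100 * CLASS_MINUTES)
    for segment, percent in SHARES_PERCENT.items()
}

# Cross-check: mean video length implied by 674 hr over 1,008 classes.
mean_class_minutes = 674 * 60 / 1008  # about 40.1 min
```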
Upon further analysis, it was observed that the overall distribution of classroom time did not exhibit significant differences across different grade levels (as shown in Figure 1). However, the duration of teacher instruction in the first, eighth, and ninth grades was significantly higher than in other grades (p < .001). Additionally, the proportion of time allocated to individual tasks in the first, second, eighth, and ninth grades was significantly lower than that in the third and fourth grades (p < .001). Regarding “group activities,” the first grade had the lowest proportion of time at 2.6%, significantly lower than in the third, fourth, sixth, and seventh grades.

Figure 1. Distribution proportion of classroom time segments across different grades.
This study empirically validates the commonly held perception that “primary school classes involve more interaction, while secondary school classes involve more lecturing.” The findings indicate that the proportion of teaching time for seventh-, eighth-, and ninth-grade teachers increases progressively with grade level, while the proportions of teacher–student interaction, individual tasks, and group activities decrease steadily as grade level rises. Teachers in higher grades clearly tend to adopt an instructional approach dominated by “lecturing.”
Further analysis of the coded data reveals a fluctuating but overall decreasing trend in the average number of open-ended questions from the first grade to the ninth grade, as illustrated in Figure 2. Notably, the average number of open-ended questions in the ninth grade is significantly lower than that in the first, second, fourth, and sixth grades (p < .01). This suggests that teachers in higher grades tend to prefer “closed” and “safe” types of questions and approaches, rather than opting for more open-ended and challenging questions to engage students in participation and interaction.

Figure 2. Number of open-ended questions in different grades.
This also reveals a result that contradicts common impressions. Students in higher grades are generally assumed to have higher cognitive abilities, so middle school students would be expected to handle more demanding questions than primary school students. However, the study finds that in actual classrooms, teachers tend to pose more closed, simpler, and safer questions to middle school students, even when addressing content with higher cognitive complexity. This contradiction suggests that classroom interactions may not entirely align with the cognitive development patterns of students.
Classroom intelligent analysis systems make it possible to analyze and statistically process large samples of classroom data. Traditional manual analysis of a class takes approximately 1.5 hr, whereas system-based analysis averages 20 min, which significantly improves the efficiency of classroom analysis. In terms of quality and accuracy, researchers have checked the results produced by the system, and the accuracy rate is up to 90%. However, classroom audio and video are difficult to collect for subjects such as physical education, and because of these subject characteristics, the system is currently unable to analyze such classes.
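The efficiency and accuracy figures above can be put into perspective with a short calculation. The sketch scales the per-class analysis times to the 1,008 classes in this study and shows one simple way an accuracy rate could be computed, as the share of segments where the system's code matches the researcher's. The text does not state how accuracy was measured, so this agreement measure and the example codes are assumptions.

```python
def agreement_rate(human_codes, system_codes):
    """Share of items for which the system's code matches the human code."""
    if not human_codes or len(human_codes) != len(system_codes):
        raise ValueError("code lists must be non-empty and of equal length")
    matches = sum(h == s for h, s in zip(human_codes, system_codes))
    return matches / len(human_codes)


N_CLASSES = 1008
manual_hours = N_CLASSES * 1.5      # 1.5 hr per class by hand
system_hours = N_CLASSES * 20 / 60  # 20 min per class by the system
speedup = manual_hours / system_hours

# Hypothetical check on ten coded segments (nine matches).
human = ["T", "T", "S", "T", "S", "S", "T", "T", "S", "T"]
system = ["T", "T", "S", "T", "S", "T", "T", "T", "S", "T"]
rate = agreement_rate(human, system)
```

Under these figures, manual analysis of the full sample would take about 1,512 hr compared with about 336 hr for the system, a factor of 4.5.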
To broaden the system's scope of application and influence, the team is committed to expanding its potential user group. Researchers can use the classroom intelligent analysis system to conduct precise statistical analysis of vast amounts of classroom data, including Chinese and English classes, and thereby produce high-quality research outcomes; teachers and school administrators can also benefit greatly from it. Specifically, teachers can use the system to evaluate their classroom behaviors and performances. This evaluation is not limited to traditional classroom effectiveness but can also cover multiple dimensions such as classroom atmosphere, student engagement, and teacher–student interaction. By quantifying these indicators, teachers can more accurately grasp the quality of the classroom and obtain scientific evidence for improving their own teaching. Furthermore, these detailed data can not only reveal the strengths and weaknesses in the teaching process but also provide strong support for targeted teacher training. By analyzing in depth different teachers’ teaching styles, student interaction patterns, and classroom effects, administrators can design more targeted training programs to help teachers optimize their teaching strategies and enhance teaching quality.
Although the application of AI in classroom analysis offers advantages such as scalability and standardization, educational researchers should approach it with caution. It is necessary to remain vigilant about technical errors and biases inherent in AI technology in order to prevent systematic misjudgments caused by uninterpretable algorithms and biased data. Moreover, further research is needed to substantiate the exploration of phenomena and patterns in large-scale classroom teaching using AI technology.
Takeaway message
• Traditional classroom observation relied heavily on manual annotation, which consumed a significant amount of time and human resources, making it challenging to systematically code and analyze large-scale classroom videos. The team developed the High-Quality Classroom Intelligent Analysis Standard (CEED) system, which utilized AI for classroom intelligent analysis.
• This paper used the CEED system to conduct a statistical analysis of 1,008 recorded classroom videos from primary and secondary schools in China.
• The findings revealed that teacher presentation occupied 51.9% of the total time, teacher–student interaction accounted for 30.5%, individual tasks made up 12.3%, and group activities constituted 5.3%. “Teacher presentation” still occupied the majority of classroom time.
• This study demonstrated that using AI for big data annotation can effectively reduce the time and human resources required for traditional classroom analysis, enabling systematic statistical analysis of large-scale classroom data.
Footnotes
Contributorship
Yihe Gao is responsible for writing the article. Xiaozhe Yang is responsible for system development.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical statement
This study has received written informed consent from the participating schools.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper was supported by the China National Social Science Foundation (BHA220144).
