Abstract
While student-centered pedagogies have been widely recognized for improving mathematical learning outcomes, most existing studies focus on short-term interventions. This study explored the long-term effectiveness of student-centered pedagogies on students’ mathematics performance and confidence by comparing two Chinese secondary schools (Grades 7–9) that are similar demographically but differ significantly in the extent to which they used student-centered pedagogies (p < 0.01). A total of 1029 students were assessed after they had experienced secondary education for nearly one and three consecutive academic years, respectively. Contrary to findings on the short-term effectiveness of student-centered pedagogies reported in existing literature, no significant overall difference in mathematics performance was found between students exposed long-term to student-centered versus teacher-centered pedagogies. However, the student-centered school showed clear benefits for low-performing students in both Grades 7 and 9, who significantly outperformed their peers in the teacher-centered school. In terms of mathematics confidence, the initial positive effects of student-centered pedagogies were found to diminish over time: a two-week implementation significantly boosted students’ mathematics confidence, a one-academic-year implementation showed modest potential, and a three-academic-year implementation exhibited no advantage as compared to teacher-centered pedagogies. This study calls for a more nuanced understanding of when and for whom student-centered pedagogies are effective in fostering cognitive and affective outcomes.
Keywords
1. Introduction
Extensive research has demonstrated that student-centered pedagogies (SCPs), including the use of cooperative grouping, problem-based learning, and inquiry-based learning, improve several kinds of mathematical learning outcomes as compared to teacher-centered pedagogies (TCPs) (e.g., Boom-Cárcamo et al., 2024; Hendriana et al., 2018; Karamustafaoğlu & Pektaş, 2023; Ridlon, 2009; Saragih & Habeahan, 2014). With respect to cognitive outcomes, SCPs have been found to be effective in improving mathematics performance across many grade levels. For example, using a quasi-experimental design, Ridlon (2009) found that American sixth-grade students in a student-centered environment had significantly higher gains in mathematics performance than did those in a teacher-centered environment. With respect to affective outcomes, SCPs have been found to be effective in fostering mathematics confidence. For instance, in a quasi-experimental study, Kandil and Işıksal-Bostan (2019) found that SCPs led to greater improvement in students’ mathematics confidence as compared to TCPs.
However, two critical gaps remain in the existing literature about the effectiveness of SCPs on cognitive and noncognitive outcomes. First, the SCPs interventions used in most studies are relatively short in duration, leaving the long-term benefit of their implementation largely unexplored. For instance, Chan's (2011) intervention spanned only 5 hours, making it unclear whether the positive effects of SCPs on mathematics confidence would persist in the long run if SCPs were continuously implemented. Second, while prior studies on the effectiveness of SCPs have been conducted in many regions of the world, SCPs remain relatively new in China and have not been as widely studied in that context.
To address these gaps, the present study explored the relationship between long-term SCPs and the constructs of mathematics performance and mathematics confidence within the Chinese context. Two Chinese secondary schools (Grades 7–9), similar in many ways but differing significantly in the pedagogy used in mathematics classes, were compared. Both the mathematics performance and mathematics confidence of seventh- and ninth-grade students from the two schools were investigated after they had experienced their respective pedagogical environments for nearly one and three consecutive academic years, respectively. This study sought to answer the following two research questions:
2. Student-Centered Pedagogy and its Relationship to Performance and Confidence
2.1 Conceptualizations and Effectiveness of SCPs
Student-centered pedagogies have been found to be effective in terms of both mathematics performance and confidence (e.g., Boom-Cárcamo et al., 2024; Klang et al., 2021; Ridlon, 2009). As a result, this instructional approach has been advocated in the curriculum and policy documents in many countries (e.g., Ministry of Education of People's Republic of China, 2011, 2022; National Council of Teachers of Mathematics, 2000; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010).
The term “student-centered pedagogies” often refers to approaches that put students in the center of the learning process, incorporating scaffolding that provides students with support that is gradually removed as they develop expertise (Hmelo-Silver et al., 2007; Savery & Duffy, 1995). Student-centered pedagogies provide an environment in which students act as a learning community where shared understanding is constructed and where teachers act as facilitators. In the teaching of mathematics, SCPs refer to an environment where (a) student mathematical thinking is made public, (b) students actively engage with each other's mathematical thinking, and (c) student mathematical sense-making, conjecturing, and justifying drive instruction (Thanheiser & Melhuish, 2023).
It is important to note that there are a variety of different terms used in the literature to describe SCPs. Some of these terms refer to the instructional philosophy behind the student-centered pedagogical approach—such as the use of the phrase “constructivist teaching” (Baviskar et al., 2009). Other terms refer to general pedagogical approaches that are considered to be student-centered, such as cooperative grouping (Gillies, 2003), reformed teaching (Sawada et al., 2002), or ambitious instruction (Stroupe, 2016). Other terms refer to specific, highly structured forms of SCPs, such as problem-based learning (Savery & Duffy, 1995), inquiry-based learning (Maaß & Artigue, 2013), and project-based learning (Guo et al., 2020). Following Brush and Saye (2000), Clark et al. (2012), and Glasgow (1997), here we consider all of these exemplars of a constructivist learning environment to be SCPs.
Relatedly, there is a smaller set of phrases that are typically used to describe TCPs, including conventional teaching (Li, 2016), fully guided instruction (Clark et al., 2012), and direct instruction (Adams & Carnine, 2003). Teacher-centered pedagogies are often described as being based upon a model of an active teacher and a passive student and are often aligned with transmission models of teaching. Within such a teacher-centered environment, instruction refers to the process of transmitting information to the learners (Duffy & Cunningham, 1996), during which teachers provide explicit instructional guidance to fully explain the concepts and skills that students are required to learn (Clark et al., 2012).
In practice, actual pedagogy is inadequately characterized by the simple dichotomy between SCPs and TCPs. In fact, the concept of guided participation, or transitional pedagogies, which is considered by some to be a component of both SCPs and TCPs (Rogoff et al., 1993), emerged from the recognition that learning can neither be all teacher-centered nor all student-centered (Budd et al., 2013; Rogoff et al., 1993). In the present study, we endorse a conceptualization of pedagogy that aligns with the continuum proposed by Sawada, Piburn and colleagues (Piburn et al., 2000; Sawada et al., 2002), which categorizes pedagogy into three distinct yet interconnected approaches: TCPs, transitional pedagogies, and SCPs.
Many studies have demonstrated the effectiveness of short-term SCPs on mathematics performance. For example, after implementing a SCP (cooperative learning) intervention for 15 weeks, Klang et al. (2021) found significant effects of the intervention on students’ overall performance in geometry, as compared to a business-as-usual control condition. As another example, Ridlon (2009) conducted a nine-week experimental study to compare the mathematics performance of American sixth-grade students in a SCP (problem-based learning) classroom with those in an environment using TCPs, finding that students using SCPs had significantly higher gains in achievement than did those in a teacher-centered environment. Similarly, using an experimental design that incorporated assessments, surveys, and observations, Boom-Cárcamo et al. (2024) found that SCP (problem-based learning) significantly enhanced the mathematics performance of university students compared to those who did not receive this educational approach. Karamustafaoğlu and Pektaş's (2023) mixed-method study in Turkey found that SCP (inquiry-based learning) conducted in an out-of-school learning environment enhanced 11th-grade students’ problem-solving skills in STEM. In Polly et al. (2015), multilevel analyses revealed that teachers’ transition from TCPs to SCPs was associated with statistically significant improvement in student mathematics achievement. In a quasi-experimental study, Saragih and Habeahan (2014) found that students who received a SCP (problem-based learning) intervention increased their mathematical problem-solving skills compared to students who received TCPs. As these studies indicate, there seems to be strong evidence that SCPs enhance mathematics performance across grade levels.
Although fewer empirical studies focus on the effects of SCPs on students’ confidence in mathematics, there is nevertheless strong evidence of such a connection. For example, Riegle-Crumb et al. (2019) found that increased use of SCP (inquiry-based learning) was significantly associated with greater confidence in mathematics. Similarly, Siregar and Maat (2020) conducted a quasi-experimental study using a pretest–posttest design to examine the effectiveness of a student-centered pedagogical approach (discovery learning) in geometry on improving secondary students’ confidence in mathematics. Their findings revealed that students in the SCPs condition had higher confidence than those in the TCPs condition. Kandil and Işıksal-Bostan (2019) conducted a quasi-experimental study to explore the effect of SCP (inquiry-based instruction) enriched with origami activities on seventh-grade students’ self-efficacy in geometry, that is, students’ belief in their ability to successfully perform geometry-related tasks. Their results indicated that there was a statistically significant positive effect of SCPs on seventh-grade students’ self-efficacy in geometry. These studies provide consistent evidence of SCPs’ effectiveness in enhancing students’ confidence in mathematics.
While evidence in support of SCPs grows, there is also a body of literature that highlights the disadvantages of this approach as compared to TCPs, leaving some uncertainty in the overall effectiveness of SCPs. For instance, Ma (2023) found that the use of SCP (inquiry-based learning) was negatively associated with students’ scientific performance across countries. She advised caution against the overuse of SCPs, highlighting that student learning outcomes may depend more on the quality rather than the quantity of such practices. Similarly, Clark et al. (2012) criticized SCPs for lacking sufficient instructional guidance and imposing excessive cognitive load on learners, which may hinder students’ ability to acquire and retain knowledge. Clark et al. (2012) stated that knowledge must first be transmitted and absorbed, particularly for novices, before students can engage in independent application or exploration. They argued that TCPs are more effective for building foundational knowledge and ensuring efficient learning outcomes. Similarly, Fan and Zhu (2007) showed evidence that explicit instruction on problem-solving heuristics would offer greater benefit to students’ mathematics performance. These studies exploring the relative disadvantages of SCPs suggest that their effectiveness on learning outcomes should continue to be explored. As a result, one additional goal of this study was to confirm the results of prior studies that have documented the effectiveness of short-term implementations of SCPs on learning outcomes.
2.2 Limitations of Existing Research About SCPs
As mentioned above, a key limitation of the existing literature on the effectiveness of SCPs is the short-term or unspecified duration of many instructional implementations, leaving the long-term benefits of their use largely unexplored. For example, in studies exploring the effect of SCPs on students’ confidence in mathematics, Siregar and Maat (2020) did not specify the duration of their SCP (discovery learning) interventions, while Kandil and Işıksal-Bostan (2019) examined the effect of only three weeks of a SCP (inquiry-based learning) intervention. Zakariya (2022) conducted a systematic review of intervention studies—including SCPs interventions—focused on improving mathematics confidence and found that the duration of interventions ranged from as short as 10 min to as long as 16 weeks. With respect to the relationship between SCP and mathematics performance, the same is true. For example, Boom-Cárcamo et al. (2024) omitted the duration of their SCPs intervention; in other studies that did specify the duration of their interventions, such as Klang et al. (2021) and Ridlon (2009), the duration ranged from only a few weeks to less than one semester.
This lack of evidence on the long-term effects of SCPs on students’ mathematics performance and confidence is concerning, as neglecting long-term impacts may lead to misleading conclusions about the overall effectiveness of this approach. Given that research suggests both academic achievement and motivation fluctuate across grade levels (e.g., Fu et al., 2016; Nagy et al., 2010), students’ mathematics performance and confidence under long-term use of SCPs may evolve differently from the trends observed in short-term implementations. With respect to performance, Jimerson et al. (1999) found only a moderate correlation (r = 0.6) between early/middle childhood and middle childhood/high school, suggesting that mathematics performance fluctuates widely across stages. Similarly, Fu et al. (2016) explored the trajectories of Chinese elementary students’ mathematics performance and identified four distinct patterns: low-stable (low initial level that remained stable), high/moderate-decreasing (moderately high initial level that then decreased), high-increasing (high initial level that then increased), and high-stable (high initial level that remained stable), suggesting substantial heterogeneity in Chinese students’ trajectories of mathematics performance. With respect to confidence, students’ perceived mathematics ability was found to generally decline as they grew older and advanced across grade levels (Jacobs et al., 2002; Nagy et al., 2010). Interestingly, studies have also indicated that mathematics confidence increases over time during the short-term student-centered pedagogical implementation (Masitoh & Fitriyani, 2018). It thus remains unclear whether students’ mathematics confidence would continue to grow or eventually decline under long-term use of SCPs. These findings highlight the variability in students’ mathematics performance and confidence over time, underscoring the need for further exploration of SCPs’ long-term impact.
Another weakness in the existing literature is that although studies of the effectiveness of SCPs have been conducted in many regions of the world, very few have focused on the Chinese context. For example, the studies by Klang et al. (2021), Polly et al. (2015), and Ridlon (2009) were conducted in Western countries, while the studies of Kandil and Işıksal-Bostan (2019), Karamustafaoğlu and Pektaş (2023), Saragih and Habeahan (2014), and Siregar and Maat (2020) were conducted in Southeast or Western Asia. The omission of studies that are situated in China is potentially problematic, as neglecting the Chinese perspective may cause important insights to be overlooked.
In particular, prior to 2001, Chinese mathematics instruction was almost exclusively based on TCPs. Following the Ministry of Education's recommendation of SCPs after 2001 (Ministry of Education of the People's Republic of China, 2001), several Chinese schools claimed to adopt SCPs (Liu, 2021). However, because of fundamental differences between characteristics of typical mathematics teaching in China as compared to many other countries in the world (Wang & Lin, 2005), the features of SCPs as implemented in China appear to differ somewhat from the features of SCPs in other countries (Zhang, 2025). For example, mathematics instruction in Hong Kong SAR (Hiebert et al., 2005) and mainland China more generally (Cai & Ding, 2017; Zhang, 2022) has been found to strongly emphasize procedures, even in more student-centered forms of teaching (Zhang, 2025). In most studies of SCPs in Western mathematics classrooms, by contrast, the student-centered approach tends to be synonymous with more conceptual instruction (Teasdale et al., 2017). Indeed, SCPs have also been reported to be culturally incongruent in African classes, where teachers had to maintain authority to address students’ resistance to participate, even when those teachers aspired to promote democratic and student-centered learning (Perumal, 2008). Similarly, Cheung (2016) found that Chinese teachers adapted Western-defined frameworks in ways that reflect local norms. Acknowledging the possible cultural bias embedded in dominant notions of student-centeredness is important, as the subtle differences in the implementations of SCPs across countries may lead to different outcomes in different cultural contexts. Therefore, it would be problematic for our understanding of the effectiveness of SCPs to neglect their Chinese implementation, given the possible differences between what SCPs look like in China as compared to most of the rest of the world.
Similarly, differences in Chinese students’ beliefs (e.g., Li, 2003; Sun et al., 2013) and motivation (e.g., Kao, 2022; Salili, 1996) compared to their Western counterparts may further contribute to variations in affective outcomes associated with SCPs, requiring further exploration. For example, research (e.g., Li, 2003) has shown that while American students emphasized ability and ability-based motivation as primary contributing factors to success, Chinese students viewed effort as the primary determinant. In a similar vein, Western students tended to engage in self-enhancement, whereas East Asian students were more likely to engage in self-criticism (Markus & Kitayama, 1991). Given that cultural factors shape the motivation that individuals develop, which in turn indirectly influences their self-confidence (Ryan & Deci, 2000), these culture-induced differences in motivation may result in varying levels of mathematics confidence among Chinese learners under SCPs and warrant further investigation in the Chinese context.
2.3 The Present Study
The present study addresses both gaps noted above (the lack of evidence on the effectiveness of long-term SCPs implementation and the absence of research in the Chinese context) by evaluating the effectiveness of a longer-term SCPs implementation (one and three academic years, respectively) on mathematics performance and confidence within the Chinese context. As a first step in this investigation, the pedagogical approaches of two Chinese lower secondary schools (Grades 7–9), one of which claimed to implement student-centered reformed pedagogies and one of which did not, were investigated using classroom observations and teacher surveys. After confirming significant differences in pedagogies, we measured the mathematics performance and confidence of students in the two schools using standardized tests and surveys, respectively.
3. Methods
Beyond the theoretical distinction between TCPs and SCPs, operationalizing these terms for research purposes, particularly in classroom observation studies, is a complex undertaking. As a result, a necessary first step of this study was to assess the relative degree and extent to which each of the two schools’ pedagogies was aligned with a student-centered or a teacher-centered approach. We begin by describing the two schools and our procedure for determining the predominant mode of instruction for each.
3.1 School Contexts and Pedagogical Approach
School A and School B were initially selected because they had comparable features but different stated pedagogical approaches. In terms of similarities, first, both schools are in nearby (20 km apart) towns within the same county. As a result, they follow the same educational policies and have the same content standards for mathematics. Second, both schools have a similar student demographic and socioeconomic profile. Third, both schools assign students to classrooms randomly within each grade level rather than tracking based on achievement. Fourth, as public schools, teachers in both schools receive government-funded salaries, ensuring comparable teacher compensation and resources across the two schools.
School A positions itself in public documents (e.g., on Wikipedia) as a site where SCP is implemented in all subject areas (including mathematics). More specifically, School A states on Wikipedia (blinded) that its students have transitioned from being passive recipients of knowledge to independent learners, its teachers have shifted from knowledge transmitters to planners and designers, and its instruction has evolved from conventional learning to inquiry-based learning. In contrast, School B makes no explicit claims about its pedagogy on any website; in conversations with the first author, School B's administrators stated that they follow conventional teaching.
A selection of mathematics lessons in Grades 7–9 from each school was observed to confirm the extent to which each school's claims about its instructional approach for teaching mathematics were valid. All observed lessons took place in 2022 and 2023.
Observed lessons were rated using a well-established observation protocol that has been widely used to investigate student-centered pedagogical approaches in mathematics: the Reformed Teaching Observation Protocol (RTOP; Sawada et al., 2002). The RTOP was specifically designed to evaluate the extent to which mathematics and science classrooms adopt reform-based teaching (Piburn et al., 2000). It has been found to have high reliability (0.954) and high validity including face validity, construct validity, and predictive validity (Sawada et al., 2002). The RTOP contains five subscales (lesson design and implementation, propositional knowledge, procedural knowledge, communicative interactions, and student–teacher relationships) for evaluating a classroom lesson (see Sawada et al., 2002). Each subscale contains five items, and each item is rated on a scale from 0 to 4, where 4 means the item is “very descriptive” of the lesson while 0 indicates that the item “never occurred.” The total score for a lesson rated using the RTOP ranges from 0 to 100. The higher the score, the more student-centered the observed pedagogy is (Sawada et al., 2002), while the lower the score, the more teacher-centered the pedagogy is. RTOP scores can be considered along a continuum, where score ranges indicate the degree of alignment between observed instruction and student- or teacher-centered instruction (where higher scores point to more student-centered instruction). In particular, scores between 0 and 30 provide the strongest evidence of teacher-centered instruction while scores between 61 and 100 provide the strongest evidence of student-centered instruction (Madsen & Richards, 2022). Scores between 31 and 60 indicate transitional pedagogies (Madsen & Richards, 2022).
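The scoring logic described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' scoring procedure: the function names, the subscale keys, and the example ratings are hypothetical, and the default cut-offs are simply the score bands quoted in the paragraph above (0 to 30, 31 to 60, 61 to 100). They are exposed as parameters because other banding schemes exist.

```python
from typing import Dict, List

# Hypothetical keys for the five RTOP subscales named in the text.
SUBSCALES = [
    "lesson_design_and_implementation",
    "propositional_knowledge",
    "procedural_knowledge",
    "communicative_interactions",
    "student_teacher_relationships",
]


def rtop_total(ratings: Dict[str, List[int]]) -> int:
    """Sum 5 subscales x 5 items, each rated 0-4, into a 0-100 total."""
    total = 0
    for name in SUBSCALES:
        items = ratings[name]
        if len(items) != 5:
            raise ValueError(f"{name}: expected 5 items, got {len(items)}")
        if any(not 0 <= r <= 4 for r in items):
            raise ValueError(f"{name}: item ratings must be between 0 and 4")
        total += sum(items)
    return total


def classify_rtop(total: int, teacher_max: int = 30,
                  transitional_max: int = 60) -> str:
    """Map a total score to a band; cut-offs are assumptions taken from the text."""
    if not 0 <= total <= 100:
        raise ValueError("RTOP total must be between 0 and 100")
    if total <= teacher_max:
        return "teacher-centered"
    if total <= transitional_max:
        return "transitional"
    return "student-centered"


# Hypothetical lesson in which every item was rated 2.
example_lesson = {name: [2, 2, 2, 2, 2] for name in SUBSCALES}
example_total = rtop_total(example_lesson)
print(example_total, classify_rtop(example_total))
```

Keeping the cut-offs as arguments makes it easy to re-run the same classification under a different banding convention without touching the scoring itself.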
In School A, we observed a total of 16 lessons, including at least one lesson from each of School A's seven mathematics teachers. In School B, we randomly selected eight of the 17 teachers and observed a total of 16 of their lessons. The RTOP scores for all 32 lessons are shown in Figure 1, where the y-axis represents the RTOP scores. Across all lessons, School A's average RTOP score was 22.1 points higher than that of School B (z = −4.435, p < 0.01), indicating that School A's instructional pedagogy in mathematics was significantly more student-centered than that of School B (Zhang, 2025). Specifically, all observed lessons in School B were rated as teacher-centered (see Figure 1). In contrast, seven lessons in School A were rated as student-centered (RTOP > 50), and nine as transitional, based on Budd et al. (2013).

Figure 1. Total Reformed Teaching Observation Protocol (RTOP) scores for each observed lesson. Note: Higher scores indicate more student-centered instruction.
Observed lessons were also analyzed qualitatively, the results of which align with the RTOP findings in that School A mathematics lessons exhibited more student-centered features compared to those at School B (Zhang, 2025). All three features of SCPs in mathematics classrooms (student thinking is made public, students actively engage with each other's thinking, and sense-making and conjecturing drive instruction; Thanheiser & Melhuish, 2023) were observed to a much greater extent in School A's lessons, whereas none were evident in School B's.
As additional evidence about the pedagogy used in each school, the Teacher-Student Orientation Survey, a questionnaire designed by PISA 2012 (OECD, 2014), was also administered to compare both schools’ teachers’ perceptions of their pedagogies. This questionnaire contains questions for teachers such as, “I ask students to help plan classroom activities or topics.” All seven School A mathematics teachers and nine out of seventeen School B mathematics teachers completed the survey. The results showed that teachers at School A scored higher on each statement (p < 0.05 or d > 0.56), indicating that the differences between School A and School B mathematics teachers were either statistically significant or practically meaningful, with large effect sizes observed despite the small sample size. This suggests that School A teachers perceived School A's pedagogy to be more student-centered than School B teachers perceived School B's pedagogy to be (Zhang, 2025).
Given the finding that School A relied more on SCPs than School B, in the remaining sections we use “SCP School” and “TCP School” to refer to Schools A and B, respectively.
3.2 Participants
A total of 1029 students, including 365 SCP School students and 664 TCP School students, participated in this study.
Mathematics performance was assessed using a longitudinal design at two time points for an entire cohort of students from the two schools, involving 220 SCP School and 524 TCP School students. Specifically, we obtained both the Grade 7 (year 2020–2021) and Grade 9 (year 2022–2023) mathematics final exam scores for all students in this cohort enrolled in the two schools. A total of 212 SCP School and 523 TCP School students had scores for both exams, indicating that they remained in the same school from Grade 7 to Grade 9 without transferring or dropping out.
Mathematics confidence was assessed using both longitudinal and cross-sectional designs at four time points across three cohorts (see Figure 2). For cohort A, we longitudinally assessed the confidence of 102 SCP (2 classes) and 92 TCP (2 classes) students at two time points, at the very beginning (before receiving their first math lesson) and again two weeks into their Grade 7 year. Cohorts B and C had their confidence assessed cross-sectionally. Cohort B provided data on the confidence of students during the third quarter of their seventh-grade year, where 45 SCP (1 class) and 48 TCP (1 class) students were assessed at nearly one academic year after secondary school had started. Cohort C provided data on the confidence of students in the ninth-grade near the conclusion of their lower secondary education, approximately three academic years after entering secondary school; 81 SCP (2 classes) and 76 TCP (2 classes) students were assessed for mathematics confidence toward the end of Grade 9. For all cohorts, we randomly selected the classes that participated in the confidence survey.

Figure 2. Four time points collected for confidence data.
3.3 Measures
Students’ mathematics performance was evaluated using the Grade 7 and Grade 9 final examinations, both of which were standardized tests that aimed to evaluate the mathematics performance of students at the end of the Grade 7 and Grade 9 courses. Specifically, the same Grade 7 final exam was administered across all secondary schools in the county, including the two schools in this study. The Grade 9 final exam (known as Zhongkao) was a national, compulsory, standardized, government-administered assessment in China used to evaluate Grade 9 students’ achievement across required subjects at the end of their lower secondary education and to determine high school (Grades 10–12) admission across the country. The distribution of easy, moderate, and difficult tasks in these tests is usually in a ratio of 7:2:1 to examine a broader coverage of problem solving as defined in the Chinese educational context (Ministry of Education of People's Republic of China, 2022).
Confidence toward mathematics was assessed using a questionnaire from TIMSS 2019 (Mullis et al., 2020). This assessment has been found to have high validity and reliability across multiple countries (IEA, 2020). The Cronbach's alpha for all responses in this study was 0.892, indicating high internal consistency reliability for the items. Sample items include “I usually do well in mathematics,” “Mathematics is more difficult for me than for many of my classmates,” “I am good at working out difficult mathematics problems,” “Mathematics is not one of my strengths.” All items included a 4-point Likert scale, with 4 indicating strong agreement with the item statement and 1 indicating strong disagreement with the item statement.
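For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how Cronbach's alpha can be computed from item-level responses using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). It is not the authors' analysis code. The function names and the toy responses are hypothetical, and the reverse-coding helper reflects an assumption that negatively worded items (such as “Mathematics is not one of my strengths”) are flipped before scoring, which the text does not state explicitly.

```python
from statistics import variance
from typing import Sequence


def reverse_code(score: int, low: int = 1, high: int = 4) -> int:
    """Flip a Likert score (assumed step for negatively worded items)."""
    return low + high - score


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """responses[i][j] is the score of respondent i on item j."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses from four students on three items (1-4 scale).
toy_responses = [
    [4, 3, 4],
    [2, 2, 1],
    [3, 3, 3],
    [1, 2, 1],
]
print(round(cronbach_alpha(toy_responses), 3))
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which the reported coefficient reflects high internal consistency.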
3.4 Data Analysis
With respect to mathematics performance, the final exam scores (maximum score of 120) of students in both schools were compared both overall and via a quartile analysis, using t-tests, to determine how the distributions of grades differed. To mitigate the multiple comparisons problem (Barnett et al., 2022), this study applied the Holm–Bonferroni Correction, which adjusted the alpha threshold to a level lower than 0.05 and required stronger evidence for significance.
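As a brief illustration of the correction named above, the sketch below implements the standard Holm–Bonferroni step-down procedure: p-values are sorted in ascending order, the smallest is compared with α/m, the next with α/(m − 1), and so on, stopping at the first comparison that fails. The function name and the example p-values are hypothetical and are not taken from this study's results.

```python
from typing import List, Sequence


def holm_bonferroni(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value in its original position, whether it is rejected."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Threshold relaxes step by step: alpha/m, alpha/(m-1), ..., alpha/1.
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            # Once one test fails, all larger p-values are also retained.
            break
    return reject


# Hypothetical p-values for four quartile comparisons (Q1 to Q4).
print(holm_bonferroni([0.001, 0.04, 0.03, 0.20]))
```

With four comparisons, the smallest p-value must fall below 0.05/4 = 0.0125 to be declared significant, which is why a result with p < 0.05 can lose significance after correction.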
With respect to mathematics confidence, the total confidence score for each student (on the nine-item instrument) was calculated (maximum score of 36), and the average total confidence scores within each school were plotted to visualize confidence changes across the four time points. For the longitudinal data (time points 1 and 2, see Figure 2), a repeated measures ANOVA was conducted, with time and school as fixed effects, to evaluate whether confidence changed differently in the two schools between these two time points. For the cross-sectional comparisons (time points 1 vs. 3; time points 1 vs. 4), a two-way ANOVA was conducted to evaluate whether there were significant differences before and after the nearly one- and three-academic-year implementations of SCPs.

Figure 3. Boxplot for Grade 7 and Grade 9 mathematics performance.
For interpreting the results, a significant main effect of time would indicate that participants’ confidence changed over time. A significant main effect of school would suggest an overall difference in confidence levels between the two schools. A significant time × school interaction would indicate that the change in confidence differed between schools, which is a necessary condition for providing direct evidence of the effectiveness of SCPs over TCPs.
4. Results
4.1 Mathematics Performance (RQ 1)
With respect to our first research question, no statistically significant differences were found when comparing the overall Grade 7 (t = 0.190, p = 0.849) and Grade 9 (t = 0.133, p = 0.894) final exam performance between students from the two schools. Both schools had similar mean scores for both exams (SCP School mean = 66.01 for Grade 7 and 50.53 for Grade 9, TCP School mean = 65.50 for Grade 7 and 50.24 for Grade 9).
Our investigation of the distribution of performance scores (see Figure 3) suggested that TCP School students had a wider overall range of performance than students from SCP School. To explore these differences further, a quartile analysis was conducted to compare the performance data in greater detail. Students’ grades were ranked from lowest to highest within each school and then were divided into four equal quartiles: Q1 (25th percentile), Q2 (50th percentile/median), Q3 (75th percentile), Q4 (100th percentile/highest value). Table 1 presents the mean, standard deviation, and p-value for each quartile comparison between the two schools.
Table 1. Quartile Analysis of Grade 7 and Grade 9 Performance Scores.
SCP = student-centered pedagogies; TCP = teacher-centered pedagogies.
At both grade levels, lower performers (both Q1 and Q2) had higher mean scores in the SCP School than in the TCP School, whereas top performers (both Q3 and Q4) had higher mean scores in the TCP School than in the SCP School. After applying the Holm–Bonferroni Correction, only the differences at Q1 remained statistically significant for both grade levels (p = 0.002, p = 0.011).
Consequently, the results in Table 1 suggest that by the end of Grade 7 and Grade 9, the two schools differed significantly in the quartile involving lowest performers (Quartile 1), where the bottom 25% of students at SCP School performed better than those at TCP School.
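The ranking and splitting step of the quartile analysis can be sketched as follows. The scores and the function name are hypothetical, and the sketch stops at comparing quartile means; it does not reproduce the t-tests or the reported values.

```python
from statistics import mean


def split_into_quartiles(scores):
    """Rank scores from lowest to highest and cut them into four groups."""
    ranked = sorted(scores)
    n = len(ranked)
    bounds = [n * q // 4 for q in range(5)]  # integer cut points
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(4)]


# Hypothetical final exam scores (maximum 120) for each school.
scp_scores = [38, 45, 52, 60, 66, 71, 80, 95]
tcp_scores = [20, 31, 50, 58, 68, 75, 88, 110]

for label, scp_q, tcp_q in zip(
    ["Q1", "Q2", "Q3", "Q4"],
    split_into_quartiles(scp_scores),
    split_into_quartiles(tcp_scores),
):
    # Positive values mean the SCP quartile has the higher mean.
    print(label, mean(scp_q) - mean(tcp_q))
```

Because students are ranked within each school, Q1 in one school is compared with Q1 in the other, so each comparison involves students at the same relative position in their own school.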
4.2 Mathematics Confidence (RQ 2)
Figure 4 illustrates the average total confidence scores for each school at different time points. The two schools had no significant differences in mathematics confidence at T1, the start of their secondary education. However, the TCP School had higher mean scores than the SCP School.

Figure 4. Mean total confidence scores for four time points.
4.2.1 Two-Week Implementation Effect (T1 vs. T2)
The assumption of sphericity was automatically satisfied because the within-subject factors included only two time points. The assumption of normality was checked by examining a Q-Q plot of the model's standardized residuals, which showed that the points adhered closely to the line of expected normal values. With both key assumptions met, the planned repeated measures ANOVA was deemed appropriate and reported as follows.
Regarding the short-term implementation effect of SCPs (T1 vs. T2), the repeated measures ANOVA revealed a significant main effect of time (F(1, 192) = 10.038, p = 0.002, partial η² = 0.050) and a nonsignificant main effect of school (F(1, 192) = 1.014, p = 0.315, partial η² = 0.005). Importantly, the time × school interaction was significant (F(1, 192) = 9.633, p = 0.002, partial η² = 0.048), with SCP students increasing their mean total scores by 1.97, while TCP students decreased theirs by 0.06. This indicates that the pattern of change differed between the SCP and TCP Schools during the first two weeks. A follow-up t-test suggested that SCP School participants exhibited a statistically significant increase in total confidence two weeks after secondary school began, compared to their baseline levels (p = 0.005, d = 0.390), while TCP School participants showed no significant change in confidence within the first two weeks.
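The partial eta squared values reported above can be cross-checked from the F statistics and their degrees of freedom, using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader's check under that identity, with a hypothetical function name; it is not the authors' computation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for T1 vs. T2.
reported = {
    "time": (10.038, 1, 192),
    "school": (1.014, 1, 192),
    "time x school": (9.633, 1, 192),
}
for effect, (f_value, df1, df2) in reported.items():
    print(effect, round(partial_eta_squared(f_value, df1, df2), 3))
```

Rounded to three decimals, the recovered values agree with the reported 0.050, 0.005, and 0.048.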
4.2.2 One-Year Implementation Effect (T1 vs. T3)
The Shapiro–Wilk test suggested that the assumption of normality was met for all groups except TCP School at Time 1 (W(93) = 0.966, p = 0.017). However, given the large sample size of this group (n = 93), this result may reflect the test's sensitivity to minor deviations. Inspection of Q-Q plots indicated that all groups, including TCP School at Time 1, adhered sufficiently to the diagonal line. Therefore, the assumption of normality was considered reasonably satisfied for the two-way ANOVA.
The assumption of homogeneity of variances was violated based on Levene's test (F(3, 288) = 4.128, p = 0.007). To address this violation, the model was reestimated using HC3 robust standard errors. The robustness of the findings was further supported by nonparametric bootstrapping with 5000 samples.
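The idea behind the nonparametric bootstrap check can be sketched as a percentile bootstrap of the interaction contrast, that is, the difference between the two schools' changes in mean confidence. The sketch makes several assumptions that the source does not confirm: students are resampled with replacement within each school-by-time cell, the interval is a percentile interval, and the sign convention is SCP change minus TCP change. All names and data are hypothetical, and the HC3 step is not reproduced here.

```python
import random
from statistics import mean


def interaction_contrast(cells):
    """(SCP change) minus (TCP change) in mean total confidence."""
    scp_change = mean(cells[("SCP", "T3")]) - mean(cells[("SCP", "T1")])
    tcp_change = mean(cells[("TCP", "T3")]) - mean(cells[("TCP", "T1")])
    return scp_change - tcp_change


def bootstrap_ci(cells, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling within each cell."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    estimates = []
    for _ in range(n_boot):
        resampled = {
            key: rng.choices(values, k=len(values))
            for key, values in cells.items()
        }
        estimates.append(interaction_contrast(resampled))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical total confidence scores (maximum 36) per cell.
cells = {
    ("SCP", "T1"): [22, 25, 19, 28, 24, 21],
    ("SCP", "T3"): [26, 27, 23, 30, 25, 24],
    ("TCP", "T1"): [27, 24, 29, 22, 26, 25],
    ("TCP", "T3"): [24, 23, 28, 20, 25, 22],
}
print(interaction_contrast(cells), bootstrap_ci(cells, n_boot=1000))
```

An interval that excludes zero is read as agreeing with a significant interaction; the sign of the estimate depends on how school and time are coded.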
With respect to the one-academic-year implementation effect of SCPs (T1 vs. T3), the two-way ANOVA results revealed a significant interaction effect (F(1, 288) = 4.532, p = 0.034, partial η² = 0.015). As Figure 5 (left) illustrates, SCP School participants exhibited an increasing pattern of confidence, while TCP School participants showed a decreasing trend. This significant effect was consistent when using bootstrap resampling (B = −3.263, p = 0.029, 95% CI [−6.261, −0.305]) and HC3 robust standard errors (B = −3.26, p = 0.034, 95% CI [−6.275, −0.251]). Neither the main effect of time (F(1, 288) = 0.461, p = 0.498, partial η² = 0.002) nor the main effect of school (F(1, 288) = 0.079, p = 0.779, partial η² < 0.001) was significant. However, follow-up t-tests suggested that neither the change in confidence within the SCP School nor that within the TCP School from T1 to T3 was statistically significant.

Figure 5. Interaction effect for one-year (T1 vs. T3) and three-year (T1 vs. T4) implementation.
4.2.3 Three-Year Implementation Effect (T1 vs. T4)
Results of the Shapiro–Wilk test suggested that the assumption of normality was satisfied for all groups, with the exception of TCP School at Time 1 (W(93) = 0.966, p = 0.017) and SCP School at Time 4 (W(81) = 0.964, p = 0.022). Nonetheless, the Q-Q plots for all groups indicated that the data points adhered sufficiently to the diagonal line. Therefore, the significant Shapiro–Wilk results were likely due to the large sample sizes, and the assumption of normality was deemed to be met for the purpose of the two-way ANOVA. In addition, the assumption of homogeneity of variances was violated based on Levene's test (F(3, 351) = 4.541, p = 0.004). To address this violation and ensure valid inference, the model was refitted using HC3 robust standard errors. The robustness of the findings was further confirmed using nonparametric bootstrapping with 5000 samples.
With respect to the three-academic-year implementation effect of SCPs (T1 vs. T4), the two-way ANOVA results revealed a significant main effect of school (F(1, 351) = 5.565, p = 0.019, partial η² = 0.016) and a significant main effect of time (F(1, 351) = 4.813, p = 0.029, partial η² = 0.014). This means that, when considering both time points (T1 and T4), the confidence levels in SCP School were significantly different from those in TCP School. It also indicates that, when considering both schools together, students’ confidence at T1 was significantly higher than at T4. As shown in Figure 5 (right), both SCP and TCP School students exhibited a decrease in confidence from T1 to T4. Notably, the school × time interaction was not significant (F(1, 351) = 0.197, p = 0.657, partial η² = 0.001). This nonsignificant interaction was corroborated by the analysis using HC3 robust standard errors (B = −0.59, p = 0.662, 95% CI [−3.22, 2.04]) and the bootstrap analysis (B = −0.59, p = 0.645, 95% CI [−3.16, 1.93]).
5. Discussion
This study provides important insights into the effectiveness of long-term implementation of SCPs on secondary students’ mathematics performance and confidence. The mathematics performance and confidence of seventh- and ninth-grade students from two Chinese lower secondary schools (Grades 7–9), one of which used significantly more SCPs than the other, were evaluated and compared.
5.1 Student-Centered Pedagogies and Mathematics Performance (RQ 1)
Unlike existing literature that found that short-term implementation of SCPs significantly improved students’ mathematics performance as compared to the implementation of TCPs (e.g., Boom-Cárcamo et al., 2024; Klang et al., 2021; Saragih & Habeahan, 2014), this study did not find a significant overall difference between the two groups at the end of Grade 7 (after one academic year of implementation) or Grade 9 (after three academic years). There are at least two possible explanations for this discrepancy.
First, the effectiveness of long-term implementation of SCPs on mathematics performance may differ from that observed in short-term implementations. Time might be one important factor moderating the role of pedagogy in performance. It is possible that the two groups of students showed significant differences in overall performance after only a few weeks of exposure to distinct pedagogies, but that this difference diminished over time. In fact, this pattern—where the effectiveness of SCPs on overall mathematics performance may diminish over time—is supported by several studies. For example, Klang et al. (2021) found that SCP (cooperative learning) significantly improved students’ overall geometry performance after a short-term implementation, compared to those in a business-as-usual control condition. However, Kogan and Laursen (2014), who investigated the retention effect of SCP (inquiry-based learning) after the intervention ended, found that its long-term impact on students’ subsequent overall mathematics grades was modest relative to the control group.
The second possible explanation for the discrepancy between the findings of this study and existing literature may be that the effectiveness of SCPs on mathematics performance within the Chinese context differs from that observed in other countries. This may be true because the conceptualizations and implementations of SCPs and TCPs in Chinese mathematics teaching may differ from their Western counterparts. In particular, both Chinese SCPs and TCPs have been found to place great emphasis on performance, procedures, and problem solving, more so than these pedagogies typically exhibit when implemented in US contexts (e.g., Cai & Ding, 2017; Hiebert et al., 2005; Zhang, 2025). In other words, Chinese TCP may be more effective than the Western TCP reported in international literature, which could have the effect of reducing or eliminating any differences between Chinese TCP and Chinese SCP. For example, the mathematics performance of 15-year-old students in China has been reported to be equivalent to three school years ahead of the global average (OECD, 2014).
That said, the quartile analysis identified significant differences between the two schools’ low-performing students in both the Grade 7 and Grade 9 performance tests. These findings appear to suggest that prolonged exposure to SCPs contributes to improving the mathematics performance of low achievers. This result, that SCPs may have a particularly strong influence on low-achieving students, is consistent with the short-term intervention study of Ridlon (2009), which compared the mathematics performance of American sixth-grade students in SCPs with those in TCPs for nine weeks, finding that the performance of low achievers who experienced SCPs improved the most. This result is also consistent with Kogan and Laursen (2014), who found that low achievers who consistently experienced SCPs earned higher grades in subsequent college math courses after the intervention ended, as compared to their low-achieving peers taught without SCPs.
The finding that low performers in SCP School significantly outperformed those in TCP School in terms of their mathematics performance may be explained by the greater support and inclusion provided to the low performers in the SCP School. Classroom observations indicated that in nearly every lesson observed in TCP School, students in the back rows appeared disengaged and struggled to follow the teacher's pace. These students received minimal attention and remained isolated from their peers. In contrast, SCP School adopted group seating arrangements that integrated low-performing students into collaborative activities, rather than relegating them to the back of the classroom.
5.2 Student-Centered Pedagogies and Mathematics Confidence (RQ 2)
The effectiveness of SCPs over TCPs on enhancing mathematics confidence within the Chinese context appears to diminish over time: effective during a two-week implementation, potentially effective over the course of nearly one academic year, and ultimately showing no effectiveness after three academic years.
With respect to the short-term implementation (two weeks) of SCPs, the significant time × school interaction suggests that during the first two weeks of implementation, students’ mathematics confidence changed differently across SCP and TCP Schools. Post hoc analysis revealed that the SCP School participants experienced a statistically significant increase in mathematics confidence in the first two weeks (p = 0.005), whereas the TCP School showed no significant change. This suggests that the short-term implementation of SCPs had a positive effect on students’ mathematics confidence, as compared to TCPs. This result is consistent with the findings reported by existing literature (e.g., Kandil & Işıksal-Bostan, 2019; Siregar & Maat, 2020).
Regarding the nearly one-academic-year implementation of SCPs, despite the absence of main effects for time or school, the significant time × school interaction suggests that the direction of change in mathematics confidence varied significantly between the two schools. Specifically, mathematics confidence tended to increase in SCP School while it declined in TCP School, suggesting a potential positive effect of SCPs after nearly one academic year of their implementation.
With respect to the nearly three-academic-year implementation, the significant main effect of time, along with declining mean scores in both schools, reflects a general decrease in student confidence from the beginning to the end of lower secondary education. This aligns with findings reported in many existing studies, which generally suggest a downward trend in students’ mathematics confidence as they grow older and progress through grade levels (e.g., Jacobs et al., 2002; Nagy et al., 2010). Although no significant differences were found between the two schools at either T1 or T4 individually, the significant main effect of school indicates that, when averaging across time points (T1 and T4), TCP School students had higher confidence levels than those from SCP School. In other words, TCP School students consistently reported slightly higher confidence levels across T1 and T4. However, the time × school interaction was not significant, indicating that the effect of long-term implementation of SCPs on confidence did not differ from that of TCPs. In other words, although confidence levels varied between the two schools, the pattern of change over time was similar.
The decline in the effect of SCPs on mathematics confidence may be attributed to two factors. First, it could be linked to the heavy reliance on, or inappropriate application of, extrinsic rewards in SCP School's classrooms. Based on classroom observations at SCP School, extrinsic rewards were consistently used in Grade 7 to encourage students’ active participation. However, in Grades 8 and 9, these rewards were employed less frequently. According to behaviorist principles, when extrinsic rewards are discontinued, the desired behaviors often diminish (Chance, 1992; Kohn, 1993), meaning that ongoing reinforcement is necessary to maintain these behaviors. Therefore, the reduction in extrinsic reinforcement at SCP School over time may have contributed to a decline in students’ intrinsic motivation, which in turn undermined the long-term effectiveness of the SCPs in fostering mathematics confidence. Second, self-efficacy theory may provide an additional explanatory lens (Bandura, 1977). In Grade 7, SCP students may have had higher confidence than their TCP peers due to frequent teacher encouragement and externally reinforced success experiences. However, by Grade 9, as students faced the high-stakes Chinese national examination for high school entrance, these early confidence gains may not have been sufficiently supported by accumulated mastery experiences. As a result, students may have recalibrated their self-beliefs downward when confronted with performance challenges. In contrast, TCP students, who did not experience such early externally driven boosts in confidence, may have coped more steadily with the increasing difficulty, maintaining a more stable level of confidence over time.
5.3 Limitations
Conclusions and inferences drawn from this study should be considered in light of several key limitations. First, unlike the confidence data which included baseline measures, the absence of baseline mathematics performance data prior to students’ entry into lower secondary schools makes it challenging to assess the time × school interaction effect and to evaluate the long-term effectiveness of the SCPs. Therefore, even if t-tests comparing performance across schools at each time point yield no significant differences, the possibility of a time × school interaction effect cannot be ruled out. Furthermore, the absence of baseline performance data limits our ability to determine whether there were initial performance differences between the two schools.
Second, although pedagogy was intended to be the main variable influencing the outcomes of this study, no definitive causal claim can be made between instructional approaches and changes in students’ mathematics confidence, owing to the limitations of comparative case studies. While fluctuations in SCP School students’ confidence were observed, the available data do not provide sufficient explanations to account for variations in mathematics confidence across time points for students at both schools. Nonetheless, the strength of comparative case studies lies in their capacity to examine schools employing these pedagogies over extended periods, which is challenging to achieve with quasi-experimental designs. Few studies have, like ours, explored pedagogical exposure lasting longer than one year (e.g., Chan, 2011; Kandil & Işıksal-Bostan, 2019; Klang et al., 2021; Ridlon, 2009).
Third, while School A was significantly more student-centered than School B, not all of School A's lessons fully met the description of SCPs as elaborated in international literature. This might have narrowed the differences between the SCPs and TCPs in this study. Nonetheless, as more schools adopt hybrid pedagogies that blend TCPs and SCPs (e.g., Zhou et al., 2023), it becomes less feasible for researchers to locate schools that use only SCPs or only TCPs in all mathematics lessons.
Footnotes
Ethical Considerations and Informed Consent
This study has received approval from the Cambridge Research Ethics Committee and has obtained ethics clearance. Informed consent was obtained from all subjects involved in the study.
Contributorship
Ying Zhang led the conceptualization and design of the study, conducted the research, performed the data analysis, and drafted the manuscript. Jon R. Star contributed to the theoretical framing of the study and provided supervision throughout the writing process. Both authors revised the manuscript through multiple rounds and approved the final manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a fieldwork grant awarded by the University of Cambridge.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
