Abstract
This study examined the association between Specifications Grading, an alternative grading system emphasizing clearly defined learning outcomes and revision, and mathematics identity among 846 Latin* Calculus I students at a Hispanic-Serving Institution. Mathematics identity, comprising competence/performance, recognition, and interest, was measured at the beginning and end of the semester. Repeated-measures analyses indicated stable competence/performance and recognition alongside declines in interest. Specifications Grading was associated with increased mathematics identity, and multilingual students experienced smaller declines than their peers overall.
Introduction
For many U.S. college students, Calculus I serves as a launching pad into science, technology, engineering, and mathematics (STEM) careers. For others, particularly Latin* students, it functions as a gatekeeper that limits progression through the STEM pipeline (Bressoud, 2015). Despite gains in STEM degree attainment (Irwin et al., 2021), Latin* students remain underrepresented relative to their college enrollment (Estrada et al., 2016). As institutions enrolling a substantial share of Latin* undergraduates, Hispanic-Serving Institutions (HSIs) are uniquely positioned to move beyond STEM enrollment toward institutional servingness (G. A. Garcia, 2018), defined by the extent to which instructional environments support equity, belonging, and persistence for Latin* students (G. A. Garcia et al., 2019; Ro et al., 2024). Recent syntheses of STEM education research at HSIs further underscore this potential, linking institutional culturally responsive practices to improvements in students’ STEM achievement (Núñez et al., 2021; Kendall et al., 2019), while also noting limited attention to how core instructional structures contribute to these outcomes (Ro et al., 2024).
Among these structures, grading systems play a crucial yet often overlooked role in shaping students’ interpretations of success within STEM (Feldman, 2023). In many Calculus I classrooms, traditional grading remains dominant (Townsley & Lang, 2025), often privileging dominant norms of mathematical communication and competition that marginalize differing ways of demonstrating understanding (Leyva, 2017). As Ro et al. (2024) note, many HSIs continue to reproduce pedagogical norms rooted in historically White institutions, sustaining weed-out cultures and competitive expectations that conflict with the cultural assets of the Latin* students that they aim to serve. In this sense, grades do more than quantify performance; they signal norms about whose ways of knowing are valued. To better understand how grading structures signal legitimacy in mathematics, it is therefore necessary to move beyond grades as sole indicators of success.
Research on mathematics identity, defined as how learners perceive themselves in relation to mathematics (Lerman, 2000; Sfard & Prusak, 2005), offers a theoretically grounded framework for examining these processes (Cribbs et al., 2015; Robnett et al., 2018). Within this framework, mathematics identity is understood as comprising students’ sense of competence and performance, perceptions of recognition by others, and interest in mathematics (Cribbs et al., 2015), subconstructs that have been consistently linked to persistence and success in STEM pathways. Emerging studies further show that grading systems emphasizing feedback and revision are associated with more supportive identity-related experiences (Cribbs et al., 2015; Fernández et al., 2025; Robnett et al., 2018).
Despite this promise, STEM education research at HSIs has largely overlooked how institutional structures shape identity-related outcomes (Ro et al., 2024), with grading practices remaining a particularly underexamined mechanism within servingness frameworks (G. A. Garcia et al., 2019). The present study addresses this gap by examining grading not only as an academic outcome measure, but also as a structural feature with implications for how Latin* students understand their relationship to mathematics. Specifically, this study uses repeated-measures ANOVA to examine associations between Specifications Grading, an alternative grading approach emphasizing reassessment of defined learning outcomes (Nilson, 2014), and mathematics identity among 846 Latin* students enrolled in Calculus I at an HSI between 2022 and 2024. We also examine differences by linguistic background to investigate how it intersects with grading structures within an HSI context. The study was guided by the following research questions: (1) How does enrollment in Specifications Graded Calculus I courses influence Latin* students’ mathematics identity, including competence/performance, recognition, and interest, over time, compared to enrollment in Traditionally Graded Calculus I courses? (2) To what extent do the effects of Specifications Grading on mathematics identity vary by linguistic background among Latin* students?
Literature Review
Grading and Assessment as Gatekeeping in HSI Undergraduate Mathematics
Grading and assessment in the United States began as a communicative tool among teachers, students, and parents but evolved into a bureaucratic mechanism for sorting and comparing students across institutions as public education expanded (Schneider & Hutt, 2014). This sorting function remains most visible in introductory mathematics courses such as Calculus I, which has long served as a gatekeeper to STEM fields (Bressoud, 2015). Despite this, many university mathematics departments continue to rely on traditional grading systems, which consist of point-based structures dominated by high-stakes exams, rigid weighting schemes, and limited opportunities for revision (Townsley & Lang, 2025). These systems tend to reward speed and procedural fluency while providing few avenues to demonstrate conceptual understanding or growth.
Research situated in HSI contexts further illustrates how such course-level grading structures interact with broader organizational conditions that shape students’ classroom experiences and educational outcomes (Becker & Cox, 2022). Most notably, an HSI designation does not inherently ensure that institutional structures adequately meet the needs of the Latin* students that they serve. As Ro et al. (2024) argue, many HSIs retain instructional practices inherited from historically White institutions, effectively “ghosting” the HSI context in everyday classroom practices. Therefore, although intended to preserve rigor, such traditional grading practices may often privilege particular forms of mathematical communication and assessment familiarity that are not fully compatible with Latin* students’ ways of knowing and learning.
For Latin* students navigating diverse cultural and linguistic educational backgrounds, traditional grading systems may fail to capture the depth of their mathematical reasoning or the varied ways they engage in mathematical practices (Moschkovich, 2015; Planas & Civil, 2013). Students whose learning emphasizes collaborative problem-solving, oral explanation, or non-dominant languages may be particularly disadvantaged in courses that rely heavily on time-limited, written assessments (Aguirre & del Rosario Zavala, 2013). Equity-oriented scholarship therefore calls for expanding norms of participation in mathematics by valuing students’ identities, experiences, and social contexts rather than replicating dominant structures (Brown, 2018). Indeed, within HSI contexts, racially affirming instructional practices that disrupt assumptions of neutrality and leverage Latin* students’ cultural values have been shown to create more inclusive opportunities for participation and identity development (Leyva et al., 2025), underscoring the need for grading reforms that more closely align assessment with learning, equity, and access in high-stakes gateway courses.
Specifications Grading as an Equity-Oriented Alternative
Grades in Traditionally Graded (TG) courses often reflect students’ ability to navigate classroom procedures or linguistic conventions rather than their mastery of mathematical ideas (Link & Guskey, 2019). This conflation shifts grading from a measure of learning to an indicator of conformity and access to academic capital, disproportionately disadvantaging historically marginalized students (Feldman, 2023). Specifications Graded (SG) courses, by contrast, redefine how learning is assessed and communicated (Nilson, 2014). Emphasizing clear learning outcomes and a mastery-oriented approach, SG courses shift grading from ranking to documenting understanding and growth (Link & Guskey, 2019; Nilson, 2014). Instead of awarding partial credit, SG evaluates whether students meet established criteria using a satisfactory/unsatisfactory rubric. Additionally, students are provided with ample opportunities to be reassessed, reframing mistakes as integral to the learning process rather than as unmalleable penalties. Therefore, SG promotes a culture of learning that values mastery over speed and growth over competition.
A growing body of higher education research suggests that such alternative grading approaches, often discussed under other related terms like mastery-based learning, standards-based grading, and ungrading (Hackerson et al., 2024), are gaining increased attention within undergraduate STEM contexts, including mathematics courses (Carlisle, 2020; Prasad, 2020). Importantly, evidence from broader higher education research indicates that such grading systems can support more equitable and effective teaching by emphasizing formative feedback and reducing the punitive consequences of early performance (Bonner, 2016; National Academies of Sciences, Engineering, and Medicine, 2025). Within undergraduate mathematics, SG practices have been shown to encourage persistence, reduce test anxiety, and foster growth-oriented views of learning by positioning assessment as part of the learning process rather than as a one-time judgment of ability (Collins et al., 2019; Fernández et al., 2025; Harsy & Hoofnagle, 2020; Henriksen et al., 2020; Lewis, 2022).
Despite these promising findings, Hackerson et al. (2024) found that research on alternative grading systems remains fragmented across disciplines, characterized by inconsistent terminology, varied implementation models, and a lack of common outcome measures. Moreover, most studies emphasize course-level outcomes or student perceptions, with limited attention to equity-oriented analyses or subgroup differences. Within HSI contexts, this limitation is especially consequential. Although equity-oriented reforms at HSIs have documented positive impacts on Latin* students’ mathematics identity-related constructs, this work has largely focused on co-curricular initiatives and undergraduate research experiences rather than everyday instructional practices such as grading (G. A. Garcia et al., 2019; Kendall et al., 2019; Núñez et al., 2021; Ro et al., 2024).
Additionally, while SG is often framed as reducing implicit norms that disadvantage students unfamiliar with dominant academic expectations, few, if any, studies have explicitly examined how SG intersects with diverse students’ multilingual backgrounds (Hackerson et al., 2024). This omission is significant given extensive evidence that language mediates assessment practices, perceptions of competence, and opportunities to demonstrate mathematical understanding (O. García & Kleyn, 2016; Moschkovich, 2015; Planas & Civil, 2013; Sharma & Sharma, 2023). Consequently, grading reforms such as SG remain underexamined as a mechanism through which HSIs may enact servingness within required, high-stakes courses such as Calculus I. Understanding the equity implications of grading reforms, therefore, requires attention not only to academic outcomes but also to how grading practices shape students’ developing relationships with mathematics.
Mathematics Identity, Grading Practices, and Equity in HSI Contexts
While grades are often used as the primary indicator of student success in undergraduate mathematics, they do not capture how grading structures communicate belonging, competence, and legitimacy in mathematics. To capture these broader dimensions of Latin* students’ experiences beyond course performance alone, this study draws on a mathematics identity framework, which attends to how students come to see themselves as “math people” (Cribbs et al., 2015). According to Cribbs et al. (2015), it encompasses three interrelated subconstructs: (a) competence and performance (i.e., students’ confidence in their mathematical ability and perceived success); (b) recognition (i.e., how they believe others view them as mathematics people); and (c) interest (i.e., their enjoyment and engagement with the discipline). Together, these dimensions influence persistence and achievement in STEM pathways (Cribbs et al., 2015; Fernández et al., 2025; Robnett et al., 2018).
Importantly, emerging evidence suggests that mathematics identity develops not only through instructional content but also through evaluative practices that signal what counts as mathematical success and who is recognized as mathematically competent (Martin, 2009; Sfard & Prusak, 2005). Traditional high-stakes grading, for example, may signal exclusion or deficiency, thereby weakening students’ confidence and belonging, particularly in gatekeeping courses such as Calculus I (Ellis et al., 2016). For Latin* students, in particular, mathematics identity is also shaped by linguistic and cultural factors influencing how competence and belonging are recognized (Leyva, 2017; Moschkovich, 2015). Indeed, research conducted in HSI contexts has shown that when instructional environments affirm students’ cultural and linguistic assets, Latin* students report stronger STEM identities and therefore greater persistence (Contreras Aguirre et al., 2020; Kendall et al., 2019). In contrast, TG practices often privilege White- and English-dominant norms that disregard alternative ways of expressing mathematical understanding (Moschkovich, 2015; Planas & Civil, 2013). These norms reward conventional written forms and precise academic language over collaboration or translanguaging (Planas & Civil, 2013).
Within this context, SG emerges as a theoretically promising alternative precisely because of its potential implications for students’ identity-related experiences, not only their grades. By emphasizing transparent expectations, formative feedback, and opportunities for revision, SG may shape how students interpret their competence, recognition, and interest in mathematics over time. However, little research has examined how SG intersects with mathematics identity, particularly for multilingual Latin* students in HSI gateway courses. Addressing this gap, the present study adopts a mathematics identity framework to examine the relationship between SG and students’ identity-related experiences.
Methods
This study employed a quantitative, observational pre- and post-survey design. A validated mathematics identity instrument was administered at two time points to students enrolled in Calculus I sections using SG or TG, and within-student change and between-group differences were analyzed using repeated-measures ANOVA. The sections that follow provide the contextual and methodological foundations necessary for interpreting the analyses and findings.
Data Collection
Data were collected across five consecutive academic semesters (excluding summers), between Fall 2022 and Fall 2024, at an HSI in the southwestern United States. The HSI context is theoretically significant because HSIs are not only sites of concentrated Latin* enrollment but are also increasingly conceptualized in terms of institutional servingness. Moreover, HSIs enroll substantial numbers of students from multilingual households, making them particularly salient contexts for examining how grading structures intersect with linguistic background, a dimension often overlooked in research on alternative grading and undergraduate STEM equity (Hackerson et al., 2024).
Prior to registration, students were unaware whether their Calculus I section used SG or TG, as no distinguishing labels appeared in the catalog. This arrangement reduced the likelihood of self-selection into a grading format, because enrollment decisions could not be based on the format. Calculus courses were not designated by major; consequently, all classes remained open to both STEM and non-STEM students. During the second week of each semester (Time 1), participants completed a 12-item mathematics identity survey developed by Cribbs et al. (2015) and provided demographic information. The same survey was administered during the penultimate week (Time 2). These pre- and post-surveys captured potential changes in students’ mathematics identity over time. For more details on the history and implementation at this HSI, see Villalobos et al. (2025).
Participants
Students were included in the study if they completed both the pre- and post-surveys. Responses showing carelessness or inconsistency were excluded. The final analytic sample consisted of N = 846 undergraduate students (see Table 1). Due to slight variation in item-level completion, sample sizes differed minimally across analyses. Sex was recorded using the gender-identification item “Please select your gender identification,” with the response options Female, Male, Not Listed (write-in), and Prefer Not to Answer. No students selected the latter two categories; therefore, two labels were created to represent students’ selected identities: Male-identifying and Female-identifying. Lastly, all participants self-identified as Latin*, which aligns with the courses’ nearly 100% Latin* enrollment.
Table 1. Descriptive Statistics for Study Participants.
Across five semesters, participants were predominantly male-identifying (68%), full-time (84%), and majoring in engineering (43%) or mathematics, computer science, and the physical sciences (34%). Approximately 72% were enrolled in SG sections. To capture linguistic background, participants responded to “Which language(s) were spoken at home while growing up?”, a measure selected instead of “Are you bilingual?” to avoid assuming fixed linguistic identities (De Houwer, 2015). Home-language background was used as a culturally grounded indicator of linguistic exposure and upbringing. For instance, students reporting Only Spanish at home may not possess high academic Spanish proficiency, while those reporting Only English may still use Spanish functionally through family or community networks (O. García & Kleyn, 2016). This variable thus served as a theoretically valid proxy for multilingual repertoires that shape learning and mathematics identity development (Moschkovich, 2015).
Specifications Grading Format
In SG Calculus I sections, the curriculum was composed of 29 Learning Targets, each assessed through collaborative worksheets, four major exams, and an online homework platform. Each Learning Target appeared in both worksheet and exam contexts, providing repeated practice. The worksheets were intentionally structured to support varied forms of mathematical engagement, including symbolic manipulation, visual representations, and written explanations. Although these design features were not explicitly language-focused, they align with prior research suggesting that students from multilingual backgrounds often rely on multiple linguistic and semiotic resources when constructing mathematical meaning (Moschkovich, 2015; Planas & Civil, 2013). Similarly, because worksheets were completed collaboratively, students routinely worked alongside peers with similar or differing linguistic practices. Research in multilingual mathematics classrooms suggests that such collaboration can support sense-making by allowing ideas to be negotiated, revoiced, and clarified through interaction, gesture, and shared reasoning, particularly when students draw on overlapping but non-identical linguistic repertoires (Aguirre & del Rosario Zavala, 2013; Esmonde & Langer-Osuna, 2013).
A defining feature of the SG format was the opportunity for revision and reassessment. Students could revise and resubmit worksheets or retake Learning Targets during weekly Friday sessions throughout the term. These opportunities created structured space for students to revisit mathematical ideas, incorporate feedback, and refine their reasoning across multiple attempts. From a theoretical standpoint, such iterative feedback structures may be particularly relevant for students from multilingual backgrounds, as prior research indicates that revision can support sense-making when initial explanations draw on nonstandard language, mixed linguistic resources, or informal registers (O. García & Kleyn, 2016). Additionally, the extended time afforded by revision and reassessment may be consequential when language mediates access to mathematical meaning. Prior research has documented that multilingual students often benefit from additional time to coordinate linguistic expression with conceptual understanding, particularly in assessment contexts where language can function as a barrier rather than a resource (Moschkovich, 2015; Sharma & Sharma, 2023). In this sense, revision and reassessment can be understood as creating time and space for students to iteratively align linguistic expression with conceptual understanding, rather than treating initial explanations as definitive judgments of competence.
TG sections followed a conventional model based on quizzes, group work, homework, and exams without opportunities for reassessment. In contrast, the SG grading system evaluated student work on a pass/no-pass basis, with problems credited for meeting clearly specified criteria. This evaluation emphasized whether students met the mathematical criteria specified in each Learning Target. As a result, evaluation was less likely to hinge on surface-level linguistic precision and more likely to center conceptual completeness, which can reduce the extent to which dominant English norms act as gatekeeping mechanisms in showing mathematical understanding. Although instructors were not provided with explicit guidelines regarding linguistic flexibility in assessments, this emphasis on conceptual completeness rather than fine-grained point deductions may reduce the extent to which students’ mathematical understanding is filtered through dominant linguistic norms, a concern raised in prior research on assessment and multilingual learners (Moschkovich, 2015). In combination with collaborative work and multiple assessment attempts, this structure may afford students greater latitude to communicate mathematical understanding across iterations, even when their explanations evolve linguistically over time.
Lastly, all Calculus I sections adhered to a standardized curriculum, pacing, and exam content established by a faculty oversight committee to ensure consistency across formats. Thus, while curricular content was held constant across SG and TG sections, SG differed primarily in its assessment structure and opportunities for revision. From an equity-oriented lens, these structural features may offer conditions under which multilingual Latin* students are better able to sustain engagement with mathematics by framing assessment as iterative feedback rather than one-time linguistic performance, even if the course does not explicitly incorporate translanguaging pedagogies. As such, any connections between SG and the patterns observed across students’ mathematics identity outcomes in this study should be interpreted as suggestive rather than conclusive, and they point to the need for further investigation into how assessment structures interact with linguistic and cultural dimensions of learning.
Survey Instrument
Mathematics identity was measured using Cribbs et al.’s (2015) validated 12-item survey capturing three subconstructs: (a) competence/performance (e.g., “I am confident that I can understand math”), (b) interest (e.g., “I am interested in learning more about math”), and (c) recognition (e.g., “My parents/relatives/friends see me as a math person”), along with one single-item identity statement (“I see myself as a math person”). Items were rated on a 5-point Likert scale from 1 (Strongly Disagree) to 5 (Strongly Agree). Mean scores were computed for each subconstruct at both Time 1 and Time 2. The single-item identity measure was analyzed separately. Thus, each participant had four scores per time point (three subconstructs and one overall identity item), ranging from 1 to 5, with higher values reflecting greater mathematical confidence, recognition, interest, and self-identification as a “math person.” This design allowed consistent within-subject comparisons and analyses across grading systems.
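The scoring step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: the item keys (`cp1`, `int1`, and so on), the number of items per subconstruct, and the rule for skipping unanswered items are all assumptions, since the source does not list the item-to-subconstruct assignment or its missing-data handling.

```python
from statistics import mean

# Hypothetical item keys: the item-to-subconstruct assignment is not given
# in the source, so this mapping is for illustration only.
SUBSCALES = {
    "competence_performance": ["cp1", "cp2", "cp3"],
    "interest": ["int1", "int2"],
    "recognition": ["rec1", "rec2"],
    "identity": ["id1"],  # single item: "I see myself as a math person"
}


def score_subscales(responses, subscales=SUBSCALES):
    """Average the 1-5 Likert ratings within each subconstruct.

    Unanswered items (absent or None) are skipped, and a subconstruct with
    no answered items is scored as None. This missing-data rule is an
    assumption, not something the source specifies.
    """
    scores = {}
    for name, items in subscales.items():
        answered = [responses[i] for i in items if responses.get(i) is not None]
        for value in answered:
            if not 1 <= value <= 5:
                raise ValueError(f"rating out of range for {name}: {value}")
        scores[name] = mean(answered) if answered else None
    return scores


# One (made-up) student's Time 1 responses.
time1 = {"cp1": 4, "cp2": 5, "cp3": 3, "int1": 2, "int2": 4,
         "rec1": 3, "rec2": None, "id1": 4}
time1_scores = score_subscales(time1)
```

Applying the same function to the Time 2 responses gives the second set of four scores, so each student contributes one pre and one post value per outcome.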
Data Analysis
A confirmatory factor analysis (CFA) using Structural Equation Modeling (SEM) validated the three theorized subconstructs (competence/performance, interest, and recognition), confirming alignment between theory and measurement (Cribbs et al., 2015). Assumptions for repeated-measures ANOVA were tested, including normality and sphericity. Given the large sample and symmetric data, Likert-scale means were treated as interval-level data, consistent with methodological recommendations (Westland, 2022). Repeated-measures ANOVAs examined whether mathematics identity scores changed significantly between Time 1 and Time 2 and whether these changes differed by grading method (SG vs. TG). This approach was appropriate for evaluating within-subject change over time. Additional analyses explored whether changes varied by demographic variables, particularly home language, to assess potential moderating effects. When significant effects or interactions emerged, paired-sample t tests were conducted to examine the direction and magnitude of pre–post differences within specific groups. Effect sizes (i.e., partial eta squared for ANOVAs, Cohen’s d for t tests) were reported to assess practical significance alongside statistical results.
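As a rough illustration of the follow-up tests and effect sizes named above, the sketch below computes a paired-samples t statistic, a standardized change score, and partial eta squared recovered from an F ratio. Several choices here are assumptions: the sign convention (post minus pre), the use of the d_z variant of Cohen's d (mean change divided by the SD of the change scores), and the function names. The source does not state which d variant or software was used.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df, d_z) for post-minus-pre change scores.

    d_z is one of several paired-design variants of Cohen's d; which
    variant the study used is not stated in the source.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post need the same length (at least 2)")
    diffs = [b - a for a, b in zip(pre, post)]
    sd = stdev(diffs)  # sample SD of the change scores
    if sd == 0:
        raise ValueError("change scores have zero variance")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1, mean(diffs) / sd


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Made-up pre/post scores for four students.
t_stat, df, d_z = paired_t([3, 4, 2, 5], [4, 4, 3, 7])

# Conversion applied to the F ratios reported in the Results
# (interest: main effect of time; identity: Time x Grading Method).
eta_interest = partial_eta_squared(20.86, 1, 842)
eta_identity = partial_eta_squared(5.52, 1, 834)
```

The conversion in `partial_eta_squared` is an algebraic identity, so it can be applied to any reported F(df_effect, df_error). The results are derived values and should not be read as the authors' reported effect sizes.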
Limitations
Several limitations should be noted. First, reliance on self-reported survey data may introduce social desirability bias. To mitigate this risk, surveys were administered anonymously, participation was voluntary, and students were informed that responses would have no impact on their course grades. Second, administering surveys only at the start and end of the semester limits understanding of identity development across intermediate time points. However, the pre–post design aligns with prior mathematics identity research and was intentionally selected to capture overall directional change across a full instructional period. Third, although all sections followed a standardized curriculum, syllabus, pacing guide, assessments, and final exam, variation in instructor pedagogy may have influenced results despite efforts to promote instructional consistency. Fourth, although the mathematics identity instrument is validated, the Likert-scale format restricts the depth of interpretation. Nevertheless, the instrument assessed multiple identity subconstructs rather than relying on a single composite measure, thereby strengthening interpretive robustness. Finally, the study relied on observational data from existing course sections rather than random assignment. Although students were unaware of grading formats at enrollment and analyses controlled for relevant background variables, causal interpretations are not warranted. Accordingly, findings should be interpreted as associations rather than causal effects of SG.
Results
Validation of the Mathematics Identity Survey Using SEM
Before addressing Research Questions 1 and 2, we validated the mathematics identity survey developed by Cribbs et al. (2015). Structural Equation Modeling (SEM) was employed to examine the relationships among the survey’s latent constructs and their observed indicators. The model demonstrated acceptable goodness-of-fit, with a Comparative Fit Index (CFI) of 0.947 and a Tucker–Lewis Index (TLI) of 0.929, indicating that the theoretical structure proposed by Cribbs et al. (2015) adequately fit the data in this context. The SEM analysis further revealed strong standardized factor loadings across all constructs: all loadings exceeded 0.63 and reached as high as 0.85. Construct reliability values ranged from 0.67 to 0.90 across subscales. Together, these results provide evidence that the mathematics identity survey reliably captures the core components of students’ mathematics identity (competence/performance, interest, and recognition) within this sample.
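For readers less familiar with these indices, the sketch below shows the standard textbook formulas for CFI, TLI, and composite (construct) reliability. It is an illustration under stated assumptions: the chi-square values and loadings are hypothetical, because the source reports only the resulting indices, and the reliability formula assumes standardized loadings with uncorrelated errors. Whether the authors' software computed reliability in exactly this way is not stated.

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index from model and baseline chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_baseline, df_baseline):
    """Tucker-Lewis Index from model and baseline chi-square statistics."""
    ratio_base = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings.

    Assumes uncorrelated errors, so each item's error variance is
    1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    total = sum(loadings)
    error = sum(1 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error)


# Hypothetical inputs, chosen only to exercise the formulas.
example_cfi = cfi(200.0, 50, 3000.0, 66)
example_tli = tli(200.0, 50, 3000.0, 66)
example_cr = composite_reliability([0.63, 0.70, 0.85])
```

Because composite reliability depends on both the size and the number of loadings, subscales with fewer items or weaker loadings tend to fall toward the lower end of a reported range.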
Changes in Students’ Mathematical Identity and Its Subconstructs
Given the conceptual alignment of the research questions, the results are presented together. Changes in Latin* students’ mathematics identity and its subconstructs were examined using two-way repeated-measures ANOVAs, with Time (Time 1 vs. Time 2) as the within-subjects factor and Grading Method (SG vs. TG) and Home Language Profile (language spoken at home growing up) as between-subjects factors. Sex (male-identifying vs. female-identifying) and Prior Calculus I Attempt (first-time vs. repeat) were also included as between-subjects factors but yielded no significant effects across analyses. To examine higher-order effects, three-way repeated-measures ANOVAs were conducted; only one significant three-way interaction emerged (interest). For clarity and parsimony, only statistically significant two-way and three-way effects are reported in detail (see Table 2).
Table 2. Summary of Repeated Measures ANOVA Results Across Constructs.
Significant at *p < .05. ***p < .001.
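With only two time points, a Time × Group interaction has a simple interpretation: it asks whether the average pre-to-post change differs between groups. In the simplest case (one two-level between-subjects factor and no other factors), the interaction F equals the squared pooled-variance t statistic comparing the groups' change scores. The sketch below illustrates that equivalence with made-up data. It is not the study's model, which included additional between-subjects factors, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def time_by_group_interaction(gains_a, gains_b):
    """Return (F, df_effect, df_error) for a 2 (time) x 2 (group) design.

    Inputs are post-minus-pre change scores for each group. F is the square
    of the pooled-variance independent-samples t on those change scores.
    """
    n_a, n_b = len(gains_a), len(gains_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two change scores")
    df_error = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(gains_a)
                  + (n_b - 1) * variance(gains_b)) / df_error
    if pooled_var == 0:
        raise ValueError("change scores have zero pooled variance")
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(gains_a) - mean(gains_b)) / se
    return t * t, 1, df_error


# Made-up change scores for two small groups.
f_value, df_effect, df_error = time_by_group_interaction([1, 0, 1, 2],
                                                         [0, -1, 0, -1])
```

Framing the interaction as a comparison of change scores also explains why follow-up paired tests within each group are a natural next step once an interaction is significant.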
Among the between-subjects factors, Home Language Profile was of particular interest given the study’s focus on Latin* students’ home linguistic diversity. Several groupings of this variable were explored. Ultimately, a dichotomous categorization was adopted based on theoretical considerations: students were classified as either coming from English-dominant homes (i.e., Only English or Mostly English with some Spanish) or Primarily Spanish or Multilingual homes (i.e., Both English and Spanish equally, Mostly Spanish with some English, or Only Spanish). The six students who selected Other were dropped from analyses that used the dichotomized home-language variable. This categorization aligned with our interest in whether students’ mathematics identity development differed based on the degree of alignment between their home language backgrounds and the English-dominant language of instruction in Calculus I. This categorization is not intended to signal a deficit perspective. Rather, it draws on sociolinguistic scholarship on bilingual identity among Latin* students (e.g., O. García & Kleyn, 2016), highlighting the fluidity of language use. While this binary simplifies the complexity of multilingual repertoires, it offers a theoretically grounded and analytically practical lens for examining language-related variation in mathematics identity development.
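The dichotomization described above amounts to a small recoding step. The sketch below uses the response categories named in the text; the record layout, the field names `home_language` and `home_language_profile`, and the function name are hypothetical.

```python
ENGLISH_DOMINANT = "English-dominant"
SPANISH_OR_MULTILINGUAL = "Primarily Spanish or Multilingual"

# Response options as named in the text; "Other" is intentionally absent.
HOME_LANGUAGE_PROFILE = {
    "Only English": ENGLISH_DOMINANT,
    "Mostly English with some Spanish": ENGLISH_DOMINANT,
    "Both English and Spanish equally": SPANISH_OR_MULTILINGUAL,
    "Mostly Spanish with some English": SPANISH_OR_MULTILINGUAL,
    "Only Spanish": SPANISH_OR_MULTILINGUAL,
}


def recode_home_language(records, field="home_language"):
    """Attach the two-level profile and drop records that cannot be coded.

    Records answering "Other" (or anything unrecognized) are excluded,
    mirroring the exclusion described in the text.
    """
    kept = []
    for record in records:
        profile = HOME_LANGUAGE_PROFILE.get(record.get(field))
        if profile is None:
            continue
        kept.append({**record, "home_language_profile": profile})
    return kept


# Made-up records for three students.
sample = [
    {"id": 1, "home_language": "Only English"},
    {"id": 2, "home_language": "Mostly Spanish with some English"},
    {"id": 3, "home_language": "Other"},
]
recoded = recode_home_language(sample)
```

Keeping the mapping in one dictionary makes it straightforward to rerun the analyses under an alternative grouping, in line with the exploration of several groupings mentioned above.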
Students’ Self-Perception of Their Competence/Performance and Recognition
According to Cribbs et al. (2015), competence/performance and recognition are key components of mathematics identity; however, no significant changes over time were observed for either subconstruct (Table 2).
Students’ Self-Perception of Their Interest in Mathematics
Students’ interest is yet another key subconstruct of mathematics identity that reflects the students’ emotional and motivational engagement with the discipline (Cribbs et al., 2015). As seen in Table 2, a significant main effect of time was observed for students’ interest in mathematics, F(1, 842) = 20.86, p < .001,
Table 3. Pre–Post Changes in Math Interest by Grading Method and Home Language Profile.
Significant at **p < .005. ***p < .001.
Students’ Self-Perception of Their Overall Mathematics Identity
Mathematics identity captures students’ holistic sense of themselves as doers of mathematics (Cribbs et al., 2015). In our results, two significant interactions emerged. As seen in Table 2, a significant Time × Grading Method interaction was observed, F(1, 834) = 5.52, p = .019,
Pre–Post Changes in Math Identity by Grading Method.
Significant at *p < .05.
Paired samples t-tests were also conducted separately for English-dominant and Primarily Spanish or Multilingual students to explore the Time × Home Language Profile interaction (see Table 5). Among English-dominant students, mathematics identity significantly increased over the semester, rising from M = 3.23 (SD = 1.07) to M = 3.31 (SD = 1.04), t(333) = −1.60, p = .045, d = 0.08. In contrast, Primarily Spanish or Multilingual students showed no significant change in mathematics identity.
Pre–Post Changes in Math Identity by Home Language Profile.
Significant at *p < .05.
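For readers who wish to see how an effect size of this magnitude can arise from the reported descriptive statistics, the following is a minimal sketch. It assumes that the mean change is standardized by the root mean square of the pre and post standard deviations; the text does not state which variant of d was used, and a variant based on the standard deviation of the difference scores could yield a different value. The function name `cohens_d_av` is hypothetical and introduced only for illustration.

```python
import math

# Descriptives reported in the text for English-dominant students.
pre_mean, pre_sd = 3.23, 1.07
post_mean, post_sd = 3.31, 1.04


def cohens_d_av(m_pre, sd_pre, m_post, sd_post):
    """Standardized mean change using the root mean square of the two SDs.

    Assumption: the article does not specify its formula for d; this is
    one common choice and is shown for illustration only.
    """
    rms_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (m_post - m_pre) / rms_sd


d = cohens_d_av(pre_mean, pre_sd, post_mean, post_sd)
print(round(d, 2))  # → 0.08
```

Under this assumption, the reported means and standard deviations reproduce the reported d = 0.08, a value well below conventional thresholds for a small effect.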
Discussion
The Enduring Nature of Competence/Performance and Recognition
The absence of significant changes in students’ self-perceived competence, performance, and recognition within a single semester suggests that these constructs may be less responsive to short-term interventions than interest, which often fluctuates with immediate experiences (Lee et al., 2024). Competence and performance reflect deeply rooted beliefs about one’s ability to understand and do mathematics (i.e., beliefs shaped by years of schooling and cumulative experiences of success or failure) (Black et al., 2011). One semester may therefore represent only a small segment within broader mathematical trajectories extending across K–16 education. Recognition, similarly, relies on external validation: students’ perceptions of being seen by instructors, peers, or family as “math people.” Meaningful shifts may require sustained exposure to environments where students are publicly positioned as mathematically competent (Esmonde & Langer-Osuna, 2013). Moreover, perceived recognition may diverge from external recognition. Whitcomb et al. (2023), for example, found that first-year physics majors reported higher perceived recognition than non-majors in the same courses, suggesting that major status alone can signal competence, even if self-perception lags. Future research employing longitudinal or mixed-method designs could better capture how recognition evolves across semesters and contexts.
The stability of competence/performance and recognition may also indicate persistent systemic barriers, particularly for students from linguistically minoritized backgrounds. Moschkovich (2015) argues that multilingual learners often demonstrate sophisticated reasoning through translanguaging practices that remain undervalued in English-dominant classrooms. For Latin* students, including those enrolled at large HSIs, racialized perceptions of mathematical ability may be associated with constrained institutional recognition (Martin, 2009). Even English-dominant Latin* students may experience constrained recognition due to racialized assumptions that privilege White-dominant cultural norms. When such recognition norms remain unexamined, HSIs may continue to reproduce historically White standards of mathematical competence (Ro et al., 2024), limiting the extent to which students are publicly positioned as capable mathematics learners despite institutional commitments to access. These findings underscore the need to expand what counts as mathematical competence, recognizing diverse reasoning, communication, and linguistic practices as legitimate forms of mathematics understanding, especially within HSIs meant to serve the needs of Latin* students.
Declines in Interest and the Role of Course Modality
Across both grading systems, students’ mathematics interest declined over the semester, consistent with prior evidence that gateway STEM courses often reduce motivation and belonging (Lee et al., 2024). However, when the two grading approaches were compared, students from primarily Spanish or multilingual homes exhibited smaller declines in interest in SG courses than in TG courses. While causality cannot be inferred, this trend aligns with research suggesting that SG assessment structures can mitigate the stress of high-stakes grading by framing evaluation as an opportunity for growth rather than judgment (Nilson, 2014). The iterative design of SG, allowing reassessment and revision, may be especially beneficial for students who engage in reflective, cross-linguistic, or collaborative sense-making processes.
Indeed, research across multilingual and SG instructional contexts documents associations between opportunities for feedback and revision and students’ motivation, anxiety, and participation (Harsy & Hoofnagle, 2020; Henriksen et al., 2020; Lewis, 2022; Sharma & Sharma, 2023; Tripp et al., 2025). By slowing the tempo of evaluation and reframing mistakes as part of learning, for instance, SG can cultivate a more affirming classroom climate for linguistically diverse learners. In particular, extended time for revision may allow multilingual students to coordinate linguistic expression with conceptual understanding across attempts, rather than experiencing early assessments as definitive judgments of ability.
Interestingly, English-dominant students in SG courses showed a slight decline in interest, suggesting that some students accustomed to traditional grading may find SG models disorienting. Transparent feedback cycles and flexible deadlines, while equitable in design, may be accompanied by heightened awareness of learning gaps and temporary fluctuations in confidence without explicit emotional or instructional scaffolding (Bonner, 2016; Hernandez-Martinez & Williams, 2013). For students who previously excelled under TG systems, SG may disrupt familiar reward structures and require identity renegotiation (Streifer et al., 2024). Consequently, successful SG implementation may benefit from intentional framing and continuous guidance to help students interpret reassessment as evidence of progress rather than deficiency.
These findings underscore that SG is not experienced uniformly across students or contexts. Within HSIs, this highlights the importance of attending not only to the structural features of SG but also to how those features are communicated, framed, and taken up by students. Assessment structures that emphasize revision and extended time for sense-making may create more supportive conditions for multilingual students’ engagement, particularly in high-pressure gateway courses such as Calculus I. However, such structures alone do not guarantee positive experiences or outcomes. Students’ interpretations of reassessment, feedback, and expectations play a critical role in whether these features are experienced as opportunities for growth or as signals of deficiency. Incorporating structured supports, such as guided reflection, goal setting, or metacognitive prompts, may help students across linguistic backgrounds make sense of reassessment as part of learning rather than remediation, thereby strengthening the potential of SG to sustain interest and engagement over time.
Specifications Grading as a Tool for Mathematics Identity Development
Students enrolled in SG Calculus I sections demonstrated a small yet statistically significant increase in overall mathematics identity compared to TG peers, supporting the view that equitable assessment can enhance students’ sense of identifying as a math person (Nilson, 2014). This finding echoes prior research associating SG systems with persistence and confidence by reframing success around effort, feedback, and growth rather than fixed ability (Harsy & Hoofnagle, 2020; Lewis, 2022; Ma et al., 2024). For Latin* students historically excluded from mathematics pathways (Rodriguez et al., 2020), SG may thus serve as a structural mechanism that validates persistence and repositions struggle as productive.
However, linguistic disaggregation complicates this picture. When analyzed by home-language background, mathematics identity gains were concentrated among English-dominant students, while those from primarily Spanish or multilingual homes showed no statistically significant change. This suggests that SG alone may not fully disrupt longstanding linguistic hierarchies. This finding is especially salient for HSIs, where multilingualism is a defining institutional characteristic rather than an exception. Students from multilingual households may continue to experience marginalization within classroom discourse that privileges monolingual norms (Moschkovich, 2015; Planas & Civil, 2013). Even in equitable grading contexts, if students’ contributions are not linguistically or culturally recognized, identity growth may remain constrained.
The absence of a three-way interaction among time, grading method, and language background highlights the complexity of these relationships. For HSIs, this underscores the distinction between enrolling Latin* students and actively enacting servingness through instructional and assessment practices that attend to linguistic diversity. SG may support multilingual students through reduced anxiety and greater autonomy, but these are benefits not directly captured by mathematics identity measures. Alternatively, classroom-level variables such as instructor discourse practices, peer dynamics, and teacher attitudes toward linguistic diversity may moderate SG’s impact. Future research should examine how SG can be integrated with culturally sustaining pedagogies that explicitly affirm students’ multilingual and cultural resources, transforming not only assessment practices but also the relational dynamics that shape recognition and belonging.
Conclusion and Implications
This study extends research on equitable assessment by examining how SG relates to mathematics identity development among Latin* students in Calculus I at an HSI. Using a validated three-factor model, results revealed stable competence/performance and recognition, overall declines in interest, and modest increases in mathematics identity among SG participants relative to TG peers. Disaggregated findings showed that SG courses were associated with smaller declines in interest for students from primarily Spanish or multilingual homes and more favorable identity patterns for English-dominant students. These patterns suggest that institutional contexts characterized by linguistic diversity, such as HSIs, are closely related to how students experience and respond to assessment practices in high-stakes mathematics courses. Importantly, these findings do not support causal claims about the influence of SG on Latin* students’ mathematics identities. Rather, the results are descriptive and reflect outcome differences between Calculus I courses that adopted SG and those that relied on traditional grading approaches.
Interpreted within these design limitations, the findings nonetheless suggest several implications for research and practice. First, SG may be understood as an equity-oriented assessment approach whose structural features, such as transparency, emphasis on mastery, and opportunities for feedback, are consistent with instructional conditions that support positive mathematics identity development. Within HSIs, where Calculus I often represents a critical juncture for STEM persistence, such assessment structures may play an important role in shaping students’ early disciplinary experiences. However, SG alone cannot dismantle the linguistic and cultural inequities embedded in mathematics education. To maximize its impact, SG must be coupled with culturally sustaining pedagogies that affirm students’ diverse ways of communicating and reasoning. Strategies such as reflective goal-setting, metacognitive journaling, and linguistically inclusive feedback can help students internalize progress as competence rather than compliance, thereby strengthening the alignment between assessment practices and institutional commitments to equity and student success. Second, recognition practices such as celebrating multiple modes of explanation, encouraging peer affirmation, and highlighting translanguaging as a mathematical asset can further support students’ identity formation, particularly in multilingual HSI contexts. Faculty professional development efforts should therefore attend not only to the technical implementation of SG, but also to how assessment practices interact with classroom discourse, recognition, and linguistic diversity.
Finally, longitudinal research is needed to trace how sustained exposure to SG influences mathematics identity across multiple semesters and STEM pathways. Future research should more closely examine Latin* students’ own perspectives on how SG and traditional grading practices shape their calculus experiences, including how these assessment approaches influence specific dimensions of mathematics identity such as recognition, interest, and perceived competence. Qualitative and mixed-methods studies are particularly well positioned to illuminate the mechanisms through which assessment practices interact with students’ lived experiences. Such studies can also show whether incremental gains observed within one course accumulate over time to improve persistence, particularly for multilingual and first-generation Latin* students. In this sense, SG holds promise not as a stand-alone solution, but as one component of broader efforts to align assessment, pedagogy, and institutional servingness within HSIs.
Footnotes
Acknowledgements
We thank institutional personnel who supported data collection across multiple semesters.
Ethical Considerations
This study received approval from the Institutional Review Board (IRB) at The University of Texas Rio Grande Valley.
Consent to Participate
All participants provided informed consent prior to participation.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Institutional resources supported data collection and analysis.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The student-level data used in this study contain sensitive academic and demographic information. Per Institutional Review Board (IRB) requirements and university policy, all identifiable and de-identified data will be securely destroyed at the conclusion of the approved retention period. In accordance with these protocols, the dataset is not publicly available and cannot be shared.
Identifying Information Disclosure
All institutions, approvals, and affiliations associated with this research are listed above. No additional identifying information exists that could compromise the anonymity of the peer-review process.
