Abstract
The Algebra Concept Inventory (ACI) is the first large-scale instrument validated to measure the foundational algebraic conceptual understanding of college students. This study uses ACI scores to conduct the first quantitative analysis of the relationship between algebraic conceptual understanding and college outcomes, thus exploring the predictive validity of the ACI. Specifically, we investigate whether ACI scores predict: (1) math course grades; (2) subsequent completion of math courses required for STEM (science, technology, engineering, and mathematics) majors; and (3) completion of STEM versus non-STEM degrees. We also investigate (4) the extent to which differences in these outcomes by race/ethnicity or gender are explained by ACI scores. Results indicate that ACI scores significantly predict math course outcomes and STEM versus non-STEM degree completion, and that they explain significant proportions of the differences in these outcomes by race/ethnicity and gender. This illustrates the importance of providing every student with instruction that supports development of the kinds of foundational algebraic conceptual understanding measured by the ACI.
Higher education researchers have identified algebra as a barrier to college and STEM degrees (e.g., Adelman, 2006; Bailey et al., 2010; Cohen & Kelly, 2019), and researchers in undergraduate mathematics education have documented how algebra knowledge can be critical to higher-level college courses, such as calculus (e.g., Frank & Thompson, 2021; Stewart & Reeder, 2017; Stewart et al., 2018), which are often a required part of STEM degree sequences. Yet, despite the acknowledged importance of algebra to college outcomes, studies investigating college math and STEM outcomes have not generally measured algebra knowledge directly, often relying on imperfect proxies that assess a much wider range of skills or knowledge. Most existing research has used successful completion of various secondary or postsecondary mathematics courses as a proxy for algebra skills and knowledge (e.g., Gipson, 2016; Maltese, 2008; Nicholls et al., 2013). Some studies have measured mathematical skills and knowledge more directly by using course grades or SAT/ACT test scores (e.g., Alkhasawneh & Hargraves, 2014; LeBeau et al., 2012; Sahin et al., 2012; Wolniak, 2016). However, grades “assess a multidimensional construct containing both cognitive and non-cognitive factors” (Brookhart et al., 2016, p. 803), and tests like the SAT/ACT combine a wide array of broader skills and knowledge. Thus, these measures were designed for a different purpose, and are not designed to assess cognitive measures of specific types of algebraic thinking.
Without a better understanding of which types of algebra knowledge are critical to college math and STEM outcomes, it is difficult to design college curricula that better support students in developing this knowledge. One type of mathematics knowledge that has been stressed as important (yet is often overlooked in instruction) is conceptual understanding (e.g., Aly, 2022; Boyce & O’Halloran, 2020; National Research Council, 2001; Richland et al., 2012; Webel et al., 2017). Yet, algebra courses in college tend to focus on procedural practice and memorization disconnected from conceptual understanding (Crooks & Alibali, 2014; Hammerman & Goldberg, 2003; Hodara, 2011; Rittle-Johnson & Schneider, 2014) and research has found that most K–12 students graduate without flexible, conceptual mathematics knowledge (Richland et al., 2012). This may leave students unprepared to apply algebraic knowledge flexibly in novel contexts related to their STEM field, which is critical to success in the major and in the STEM workforce (Quarles & Davis, 2017).
Access to the kinds of algebra instruction that support development of conceptual understanding is also an equity issue. Racial/ethnic minorities, women, and students with disabilities are more likely to be placed into non-credit algebra classes in college (e.g., Chen & Simone, 2016; Hodara, 2019; Sanabria et al., 2020) that tend to rely more on rote procedural instruction (Crooks & Alibali, 2014; Hammerman & Goldberg, 2003; Hodara, 2011; Rittle-Johnson & Schneider, 2014). This may exacerbate existing K–12 inequities, where more marginalized students have been shown to have less access to the kinds of rich instruction that provide opportunities for sensemaking and other higher-level skills needed in college (Aly, 2022; Schoenfeld, 2022; Stepter, 2023; Yeh et al., 2020). This may be one reason why students from groups that have been traditionally underrepresented and underserved in STEM are more likely to change their mind about majoring in STEM in college (e.g., Black et al., 2021; Hatfield et al., 2022; Riegle-Crumb et al., 2019; Wright et al., 2023; Zhang, 2021).
In this study, we explore the extent to which algebraic conceptual understanding predicts math course and STEM degree outcomes in college. The Algebra Concept Inventory (ACI) is the first assessment of its kind to undergo large-scale validation for measuring the foundational algebraic conceptual understanding of college students (Wladis et al., 2024a, 2024b; Wladis, Murray, & Aly, 2025b; Wladis, Murray, Hachey, et al., 2025c; Wladis et al., manuscript under review). Here, we analyze whether ACI scores predict students’ subsequent college math course grades, completion of core math requirements for STEM majors, and completion of STEM versus non-STEM degrees. We also investigate whether ACI scores explain differences in these outcomes by race/ethnicity and gender.
Performing these analyses allowed us to explore how supporting and measuring the development of algebraic conceptual understanding may be important to college math and STEM outcomes, which goes beyond just enrollment in or completion of particular math courses in high school or college. This is an area that has often been hypothesized as important (e.g., Al-Mutawah et al., 2019; Crooks & Alibali, 2014; National Research Council, 2001; Richland et al., 2012), but rarely explored quantitatively in empirical studies. If ACI scores predict math and STEM outcomes in college, then this suggests that it may be particularly important for mathematics educators and researchers to attend to the ways that algebraic conceptual understanding is taught throughout the K–16 curriculum, and to consider whether all students have access to instruction that supports the development of this understanding. This may be one malleable factor that could help to improve enrollment in and completion of STEM degrees, particularly for groups that have been traditionally underrepresented and underserved in STEM fields.
Literature Review
The Leaky STEM Pipeline
Studies have long explored various factors related to enrollment, persistence, and STEM degree completion in college. Data show that students who initially indicate an interest in STEM often leave college with non-STEM degrees (e.g., van den Hurk et al., 2018; Zhang, 2021). Further, students from traditionally underrepresented groups are less likely to both enroll in and complete STEM degrees (e.g., Black et al., 2021; Hatfield et al., 2022; Wright et al., 2023; Zhang, 2021); this persistent discrepancy sets STEM fields apart from non-STEM fields (Riegle-Crumb et al., 2019). The loss of students from STEM at various points in their educational trajectory is commonly referred to as the “Leaky STEM Pipeline,” and it represents a significant potential loss of talent for STEM professions (e.g., van den Hurk et al., 2018). Much of the research on the Leaky STEM Pipeline has focused on equal access concerns, as certain groups are more highly represented in STEM degrees and the STEM workforce than others. For example, Black and Hispanic students, low-income students, women, and students with disabilities are all more likely to switch out of STEM degrees, leading to underrepresentation (e.g., Wang, 2013; Zhang, 2021) and to a loss of the diversity that is critical to innovation (e.g., Freeman & Huang, 2014).
Research has described structural barriers in K–12 education that impact pathways into STEM for low-income students, rural students, and students of color, such as lack of access to equitable school funding, science resources, high-quality teachers, and technology (Charleston et al., 2014; Reardon et al., 2021; Scott & Martin, 2014). Studies consistently suggest that these underrepresented students have less access to precollege experiences that prepare them for college STEM majors, including advanced math and science coursework (e.g., Office for Civil Rights, US Department of Education, 2023; Schoenfeld, 2022). Both prior high school preparation and socioeconomic status have been found to be factors in STEM major choice and persistence (Anderson & Kim, 2006; Chen & Weko, 2009; Ellington, 2006; Herrera & Hurtado, 2011; Tyson et al., 2007; Wang, 2013; Zhang, 2021). Mathematics placement is also a contributing factor, as research indicates that students of color and students of lower socioeconomic status are overrepresented in developmental algebra courses in college (e.g., Hodara, 2019), where passing rates are often below 50% (e.g., Coltharp, 2020); this can delay or derail STEM degree attainment.
Connection Between Algebra Learning and STEM Degree Attainment
Algebra has been studied as a predictive factor in determining whether students enroll in or complete STEM degrees. Substantial research has explored whether the timing or amount of particular math courses taken during K–12 schooling predicts STEM degree enrollment or completion in college (e.g., Anderson & Kim, 2006; Chen & Weko, 2009; Ellington, 2006; Redmond-Sanogo et al., 2016; Tyson et al., 2007; Wang, 2013). For example, Tyson et al. (2007) report that Black and Hispanic students on average complete lower levels of math courses in high school than other groups, but that Black and Hispanic students who take higher level math courses are as likely as White students to pursue STEM degrees. This may be related to the fact that Black and Hispanic students have less access to higher-level math and science courses in school compared to White and Asian students (Civil Rights Data Collection, 2023).
Using a nationally representative dataset, Chen and Weko (2009) found that taking higher level math courses in high school was associated with STEM pathway entrance in college. Wang (2013), also using a nationally representative dataset, found that the effect of high school exposure to math and science on STEM pursuit in college was statistically significant and positive across all racial groups, with math achievement in 12th grade positively associated with intent to pursue STEM fields. Further, Minaya (2021) reports that dual credit enrollment in high school algebra (where students receive both high school and college credit for the course) increases the likelihood of selecting a STEM major in college; this study found a particularly strong relationship with beginning college as a STEM major and persisting in the major for Black and Hispanic students. Cohen and Kelly (2019) report that students who took developmental algebra in college (which is often conceptualized as a proxy for non-taking or non-success in algebra in high school 1 ) were more likely to change to non-STEM majors than those who were labeled by the institution as prepared for “college-level” math at the start of postsecondary education. This is in line with Crisp et al. (2009), who found that students who enrolled for credit in “non-developmental” algebra (or higher) during the first semester of college were more likely to persist toward a STEM degree.
However, all these studies focus on students’ course-taking, rather than on measures of mathematical knowledge that students may have acquired as a result of these courses. This can be problematic because which courses students take is the result of many complex confounding structural factors (e.g., availability of course offerings in school, academic advisement, college placement procedures, etc.), which may be unrelated to students’ mathematical knowledge. Some research has attempted to address this by exploring the relationship between grades or test scores and college outcomes. Studies have demonstrated a correlation between high school math course grades or overall GPA and college STEM outcomes (e.g., Gipson, 2016; Maltese, 2008; Nicholls et al., 2013). Other research has analyzed the correlation between scores on certain standardized assessments such as the SAT or ACT and subsequent STEM enrollment and degree completion in college: students with higher scores on the mathematics portions of these tests are more likely to enroll in and persist in STEM majors (Alkhasawneh & Hargraves, 2014; LeBeau et al., 2012; Sahin et al., 2012; Wolniak, 2016). Additionally, research has noted a correlation between Advanced Placement (AP) exam scores (on the Calculus and other science AP exams) and STEM enrollment/retention (Ackerman et al., 2013; see review in Patel, 2024). These assessments all measure different types of mathematical knowledge that are either more general than or distinct from algebraic conceptual understanding. While these assessments may include some items that measure algebraic conceptual understanding, they do not measure it as a separate construct; for example, many items on these assessments focus on computation or other mathematical topics beyond algebra. Thus, no existing large-scale research has explored the relationship specifically between algebraic conceptual understanding and college math/STEM outcomes.
The Importance of Algebraic Conceptual Understanding
Conceptual understanding is considered to be a critical component of mathematical knowledge (e.g., Al-Mutawah et al., 2019; National Research Council, 2001; Richland et al., 2012), and algebra is considered foundational to much of math and science (Juraev & Bozorov, 2024). However, research suggests that existing algebra instruction has not been particularly successful in helping students to develop conceptual understanding. Algebra courses in college tend to focus on procedural practice and memorization disconnected from conceptual understanding (Crooks & Alibali, 2014; Goldrick-Rab, 2006; Hammerman & Goldberg, 2003; Hodara, 2011; Rittle-Johnson & Schneider, 2014). Further, access to instruction that supports development of conceptual understanding has been found to be an equity issue, with more marginalized students often relegated to mathematics courses that provide fewer opportunities for developing conceptual skills such as sense-making (Schoenfeld, 2022; Stepter, 2023; Yeh et al., 2020). Thus, differences in algebraic conceptual understanding may contribute to differential college math and STEM outcomes.
What research exists on college students’ algebraic thinking shows that understanding of core algebraic concepts is a critical factor in student success at all levels of mathematics, from developmental algebra (Givvin et al., 2011; Stigler et al., 2010) through calculus (e.g., Frank & Thompson, 2021; Stewart & Reeder, 2017; Stewart et al., 2018). Without effective instruction in conceptual understanding, students may interpret algebra as a sequence of arbitrary algorithms that they apply without understanding (e.g., Hiebert & Grouws, 2007; Richland et al., 2012), and as a result, they may be unable to transfer knowledge from one context to another (Blanton Otto, 2018; Rebello et al., 2017; Richland et al., 2012) and may struggle to use procedures flexibly and in the appropriate contexts (e.g., Al-Mutawah et al., 2019; Givvin et al., 2011; Stigler et al., 2010). Conceptual understanding may be particularly important for STEM majors, since STEM students need to be able to apply their algebraic knowledge flexibly in novel contexts related to their particular STEM field (Quarles & Davis, 2017). However, the lack of existing research directly linking algebraic conceptual understanding to students’ college math and STEM outcomes makes it difficult to determine the precise role that algebraic conceptual understanding may play in the leaky STEM pipeline. This study aims to expand existing models of STEM retention to include algebraic conceptual understanding specifically as a potentially important factor that has been overlooked in prior frameworks.
Conceptual Framework: Models of STEM Retention
Most existing models of STEM major retention that include academic factors as predictors rely on broad and more diffuse measures of knowledge, such as highest math course completed in high school, GPA, or scores on general standardized assessments (e.g., Alkhasawneh & Hargraves, 2014; Gipson, 2016; see review in Snyder & Cudney, 2017). These are useful for certain higher-level policy decisions (e.g., supporting students to enroll in more math courses in high school), but they do not provide sufficient information about which types of knowledge are particularly important for college mathematics and STEM courses, and thus are not designed to inform instructional change.
Standardized tests (e.g. ACT, SAT, state math exams) and classroom assignments (used to calculate high school grades) contain many different problem types on different topics, and often focus heavily on computation rather than measuring understanding (Alkhasawneh & Hargraves, 2014; Gipson, 2016; LeBeau et al., 2012; Maltese, 2008; Nicholls et al., 2013; Sahin et al., 2012; Wolniak, 2016). GPA and standardized assessments like these are relatively diffuse measures of more general (mathematics) knowledge, and, as such, it is difficult to tease out which knowledge types are important to STEM enrollment, persistence, and completion, or to measure whether students are being successfully supported to develop those knowledge types in different instructional contexts.
Measures of high school math course completion are particularly problematic as a proxy for knowledge, because this influences college mathematics course placement, which subsequently exerts its own effects on a student’s STEM trajectory in multiple ways that can be unrelated to mathematical knowledge upon college entry. For example, the math course in which a student is placed at college entry can impact the time investment needed to complete a degree. Being able to start college by enrolling in higher-level math courses allows students to complete STEM degrees in fewer terms, which could influence decisions to enroll in or persist in the major, particularly for more marginalized students who have on average less financial or time capital to invest in college (Alkhasawneh & Hargraves, 2014; Crisp et al., 2009; Wladis et al., 2024a, 2024b). Placement may also influence STEM degree enrollment or completion through the quality of instruction and culture in lower- versus higher-level math courses in college. Negative instructional experiences (e.g., Ellis et al., 2016; Hatfield et al., 2022) and stigma in lower- versus higher-level math courses (e.g., Larnell, 2016; Roberts, 2020), may influence student decisions to enroll or persist in STEM degrees. These experiences may lower students’ mathematics self-efficacy or emotional well-being, or a “chilly” climate in certain courses may convince them that they are not a good “fit” for a particular STEM major at the college even when they remain confident of their own skills (Charleston et al., 2014; Jensen & Deemer, 2019; Lin et al., 2018; Palid et al., 2023; Scott & Martin, 2014). A third possibility of how math placement could impact STEM attainment is that the types of learning opportunities afforded in higher- versus lower-level math courses to develop different types of knowledge may be important to STEM outcomes (Berkowitz & Stern, 2018; Quarles & Davis, 2017). 
For example, more advanced math courses often value conceptual skills such as abstraction, generalization, and justification, so students who have access to courses that stress these knowledge types may be better positioned to enroll and persist in STEM degrees (Quarles & Davis, 2017), regardless of the level of students’ mathematics knowledge upon college entry.
To better tease out the unique contribution of algebraic conceptual understanding from other potential impacts of course placement or more diffuse measures of general mathematics knowledge, we theorize a more comprehensive model of STEM major retention that includes algebraic conceptual understanding as one component that has been unexplored in prior large-scale research (Figure 1). This model depicts both previously tested relationships and hypothesized relationships analyzed in this study.

Figure 1. Model of STEM retention accounting for algebraic conceptual understanding.
Prior research has explored several pathways in the model depicted in Figure 1. For example, research that has explored changing college math placement policies (Ngo et al., 2018; Scott-Clayton, 2012; Stancher, 2019) has often considered the entire path from placement to college outcomes without analyzing the individual components between those two boxes separately. This research has also tended to focus more on non-STEM majors (an exception is Park et al. [2018], who found that placement mismatch impacted STEM college outcomes).
Other research has explored specific categories of factors that predict outcomes, with or without including math course placement as a predictive factor. For example, much of the research on co-requisite courses has focused on the pathway from placement to environmental impact and then to outcomes; this research has considered how the specific course sequence offered to students may impact their ability to complete degrees by influencing the number of terms needed to finish a degree, and has tended to focus primarily on non-STEM majors (see reviews in Emblom-Callahan et al., 2019; Ryu et al., 2022). Some studies have considered the impact of the classroom or department climate on students’ STEM outcomes, or the pathway from psychosocial impacts to outcomes depicted in Figure 1 (Charleston et al., 2014; Jensen & Deemer, 2019; Lin et al., 2018; Palid et al., 2023; Scott & Martin, 2014). These studies have tended to focus less on placement. Additional research has explored how various measures of general cognitive skills (math-specific or even broader) may predict STEM outcomes, considering the pathway as a whole from the cognitive impacts box (Figure 1) to outcomes without considering how individual components in the cognitive impacts box might predict STEM outcomes separately (Berkowitz & Stern, 2018; Fagan et al., 2019; Wai et al., 2009). 2
However, we could find no studies that explicitly explore the black path in Figure 1, where measures of algebraic conceptual understanding are used to predict STEM outcomes. In this study, we aim to fill this gap by analyzing the extent to which algebraic conceptual understanding predicts STEM outcomes in college (the rightmost black arrow in Figure 1). We do this while including fixed effects by course (thin black line), thus controlling for the specific course in which a student is enrolled; also included are models with controls (dotted black lines), to control for background variables. Also investigated in this study are various mediation models, to explore the extent to which any observed differences in STEM outcomes by race/ethnicity or gender can be explained by differences in algebraic conceptual understanding. These models are described in more detail in the Method section.
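As a rough illustration of what including fixed effects by course means, the sketch below estimates the slope of an outcome on ACI score using only variation within each course, which for a linear model is equivalent to adding one indicator per course. It is a minimal sketch under stated assumptions, not the estimation procedure used in this study: the linear specification, the function name `within_course_slope`, and the tuple layout of the input are all hypothetical.

```python
from collections import defaultdict

def within_course_slope(rows):
    """Slope of outcome on ACI score after removing course means.

    rows: iterable of (course, aci_score, outcome) tuples.
    The tuple layout is an illustrative assumption, not the study's data format.
    """
    rows = list(rows)
    # Accumulate per-course sums of score and outcome, plus a count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for course, score, outcome in rows:
        entry = sums[course]
        entry[0] += score
        entry[1] += outcome
        entry[2] += 1
    # Demean within course, then compute the ordinary least squares slope.
    sxy = 0.0
    sxx = 0.0
    for course, score, outcome in rows:
        score_sum, outcome_sum, n = sums[course]
        dx = score - score_sum / n
        dy = outcome - outcome_sum / n
        sxy += dx * dy
        sxx += dx * dx
    if sxx == 0:
        raise ValueError("no within-course variation in ACI score")
    return sxy / sxx
```

Because each course's mean is subtracted first, differences between courses (for example, higher average grades in one course than another) do not contribute to the estimated slope.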
Here, we conceptualize STEM outcomes as successes or failures measured at the institutional, rather than student level. The extent to which different students are successfully completing particular college math classes and/or electing to complete a STEM versus non-STEM degree is the result of many institutional and societal-level factors that are often beyond the control of individual students. Our goal in exploring the relationship between algebraic conceptual understanding and STEM-related college outcomes is not to label or categorize students based on their current knowledge states, or to make claims about which students should pursue STEM degrees—any such usage of this research for these purposes is expressly counter to the goals of this study and the validated use of the ACI (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review).
Rather, we frame a student’s current level of algebraic conceptual understanding as the result of the cumulative effect of their prior formal and informal instructional experiences (inside and outside school, in both the K–12 and college context). Exploring the relationship between student scores on the ACI and their subsequent STEM outcomes in college allows for consideration of the extent to which unequal access to opportunities to develop algebraic conceptual understanding may be a critical component of unequal access to STEM degrees. If algebraic conceptual understanding is a strong predictor of STEM outcomes in college, separate from the level of the specific math course in which a student has currently been placed, then this suggests that improving instruction in algebraic conceptual understanding (for instance, in high school and early college) is an essential component of ensuring that every student has the opportunity to pursue a STEM degree.
The Algebra Concept Inventory (ACI)
The Algebra Concept Inventory (ACI) has been validated in recent large-scale studies as a measure of the foundational algebraic conceptual understanding of students in a wide range of math courses in college starting at elementary algebra, 3 and it is the first instrument validated for this purpose (Wladis et al., 2024a, 2024b; Wladis, Murray, & Aly, 2025b; Wladis, Murray, Hachey, et al., 2025; Wladis et al., manuscript under review). “Classical” 4 algebra (in contrast to linear/abstract or “modern” algebra [Cooke, 2008]) denotes algebra focused on the transformation, representation, structure, and properties of algebraic expressions and equations, typically in progressively more complex, generalized and abstract ways as students move through the curriculum. The term “foundational” is used to denote concepts from classical algebra that are important to the college mathematics curriculum but also accessible to students as soon as they start elementary algebra (e.g., solutions of equations; structure of symbolic representations). While classical algebra may grow more advanced by including concepts beyond foundational algebra in subsequent mathematics courses (e.g., logarithmic or trigonometric functions), many core concepts from foundational algebra remain critical throughout all levels of mathematics courses in college (e.g., Frank & Thompson, 2021; Stewart & Reeder, 2017). For brevity, throughout the rest of this paper, we use the term “algebraic conceptual understanding” to refer to conceptual understanding of foundational classical algebra.
On the ACI, conceptual understanding was defined as “understanding related directly to the meaning of [an algebra] concept (Wladis et al., 2018), rather than ability to apply procedures or produce a formal proof” (Melhuish & Hicks, 2019, p. 123). In particular, ACI items (as described in Wladis et al., manuscript under review; Wladis et al., 2024a, 2024b) were designed to measure algebraic conceptual understanding separately, rather than as one component of a larger construct such as algebraic proficiency, as has been done in many national and international assessments (e.g., NAEP [NCES, 2024]; TIMSS [Mullis et al., 2023]). Thus, ACI items were designed to minimize construct-irrelevant variance (e.g., variation in performance due to procedural fluency skills or knowledge in domains outside algebra). ACI items were created both (1) to require mathematical sensemaking to answer correctly (i.e., a correct answer cannot be obtained solely through executing rote procedures or restating memorized facts); and (2) to not require calculation or transformation to answer correctly. Procedural fluency is important and can be related to conceptual understanding, but items measuring this skill were deemed inappropriate on this assessment intended to measure conceptual understanding specifically (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review).
As an illustration, we present two ACI items focused on different areas in algebra: the first explores conceptions of algebraic syntax structure, and the second conceptions of solutions of equations (see Figure 2).

Figure 2. Algebra Concept Inventory example items. 5
For Item 1, students conceptualizing the standard intended structure of the syntax of the expression would typically choose D. Students who choose A, B, or C typically extract the structure of the syntax from non-mathematically salient features drawn from instructional experiences and may conceptualize the meaning and structure of algebraic syntax differently than those who link it to standard operational precedence (e.g., Wladis, Murray, & Aly, 2025a; Wladis, Sencindiver, et al., 2023). For Item 2, students who understand both that valid transformations preserve the solution of an equation and that all values of x make the equation
The ACI is currently the only large-scale validated college concept inventory that focuses explicitly on college students’ conceptual understanding of foundational algebra and therefore does not include any algebra concepts that would be inaccessible to students in an elementary algebra course (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review). Measuring foundational classical algebra across all math courses in college is important because students in elementary and intermediate algebra classes in college have often had little access to rich instruction that supports the development of conceptual understanding in addition to procedural skills (Crooks & Alibali, 2014; Goldrick-Rab, 2006; Hammerman & Goldberg, 2003; Hodara, 2011), and because foundational classical algebra concepts have been found to impact student work in mathematics courses throughout the college curriculum (Frank & Thompson, 2021; Stewart & Reeder, 2017; Stewart et al., 2018; A. Weinberg et al., 2016). A detailed interpretation and use statement for the ACI can be found at www.algebraconceptinventory.org.
Analysis has shown that the ACI has excellent reliability and validity as a measure of the foundational algebraic conceptual understanding of college students across a wide range of mathematics courses (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review). Because the ACI requires only limited prior exposure to algebra (i.e., current/prior enrollment in Algebra I or elementary algebra), covers a wide range of algebra concepts, and has undergone large-scale mixed methods validation with college students, it is an excellent candidate for exploring hypotheses about whether algebraic conceptual understanding can predict college students’ math course outcomes and STEM degree progress or attainment. In addition, because the ACI has been generated based on specific constructs that have been explicitly operationalized (i.e., each item on the inventory is designed to measure specific concepts and conceptions [Wladis et al., manuscript under review; Wladis et al., 2024a, 2024b]), if ACI scores do predict math and STEM outcomes in college, the framework used to generate the ACI provides an actionable starting point for revising curriculum and instruction to better prepare students for college math and STEM courses.
Methods
Data Collection
Data for this study were obtained from the City University of New York (CUNY) Institutional Research (IR) Offices. This included information about all students enrolled in any mathematics course at the largest community college at CUNY between Spring 2019 and Summer 2023. Data included the course in which the student was enrolled, the grade in that course, gender, race/ethnicity, age, GPA at the beginning of the term, number of credits earned by the beginning of the term, major, any degrees earned, and home zipcode (so that this could be merged with household income by zipcode data from the American Community Survey and used as a partial proxy for socioeconomic status [SES]). For more details, see the Measures subsection below. For some robustness tests (see Table A1 in the Appendix), a measure of English-language-learner (ELL) status was also used; this was based on placement assessments administered by the college. Results in Table A1 demonstrate that models have identical or near identical coefficients, standard errors, and p values, regardless of whether ELL status is included in the model.
This study merged institutional records with an existing dataset of scores on the ACI originally generated during ACI validation. Math classes included elementary algebra, intermediate algebra, mathematics for elementary educators I and II, quantitative reasoning, various levels of statistics, mathematics for liberal arts majors, mathematics for health science majors, discrete mathematics, precalculus, Calculus I, II, and III, advanced calculus, linear algebra, abstract algebra, and differential equations. A total of 402 ACI items were taken by students during validation. CUNY provided institutional research data for a total of 6,582 students with ACI scores.
Analytical Methods
Measures
The primary independent variable of interest was a student’s score on the ACI. We explored a variety of dependent variables as potential outcomes that could be predicted by ACI score. First, course grade was generated by translating letter grades into a corresponding GPA scale (0–4); official withdrawals, incomplete grades, and audit grades were not assigned any grade value. Because some courses did not have grades beyond pass/fail designations (e.g., developmental courses), we also computed “successful course completion,” defined as completing the course with a C or better, the typical criterion necessary for transfer or credit in the major. A variable to indicate whether a student obtained a STEM versus non-STEM degree during the study period was generated by considering the CIP code of the associated degree (Manly et al., 2018). In addition to dependent variable data on student course grades, majors, and degree completion, control variables in this study were also taken from Institutional Research datasets and included: gender, race/ethnicity, age, GPA at the beginning of the term, number of credits earned at the beginning of the term, and home zipcode as a partial proxy for SES (Wladis, Hachey, et al., 2024). Home zipcode was used to generate median household income of the zipcode using the U.S. Census Bureau’s pre-pandemic American Community Survey data (because pre-pandemic data was not subject to the issues with data quality that the Census Bureau cited with the 2020 data due to the pandemic; see, for example, Villa Ross et al., 2021).
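The two grade-based outcomes described above can be illustrated with a minimal sketch. The letter-to-point table, the codes for withdrawals, incompletes, audits, and pass/fail grades, and the choice to count a non-graded code as not successfully completed are all assumptions made for illustration; the source states only that grades were placed on a 0–4 scale and that successful completion means a C or better.

```python
# Hypothetical letter-grade table (assumption; the college's actual scale may differ).
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7, "D+": 1.3, "D": 1.0, "D-": 0.7, "F": 0.0,
}
# Hypothetical codes for official withdrawal, incomplete, and audit.
NO_GRADE_VALUE = {"W", "INC", "AUD"}
# Hypothetical code for passing a pass/fail course (e.g., a developmental course).
PASS_CODE = "P"


def grade_points(letter):
    """Return the 0-4 grade value, or None when no grade value is assigned."""
    if letter in NO_GRADE_VALUE or letter == PASS_CODE:
        return None
    return GRADE_POINTS[letter]


def successful_completion(letter):
    """True if the course was completed with a C or better (or passed).

    Treating withdrawals, incompletes, and audits as not successfully
    completed is an assumption of this sketch.
    """
    if letter == PASS_CODE:
        return True
    points = GRADE_POINTS.get(letter)
    return points is not None and points >= GRADE_POINTS["C"]


for code in ["B+", "C", "C-", "W", "P"]:
    print(code, grade_points(code), successful_completion(code))
```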
There were virtually no missing data in IR variables; the only variables with missing data were race/ethnicity, gender, and age (missing under 0.1%) and U.S. zipcode (missing 1.6%). Zipcode missingness included some data entry errors (e.g., zipcode entered into wrong field) and some foreign students with no U.S. address on file (96.3% of students with a foreign address on file also had a U.S. address on file, so this represented a small percentage of foreign students). Because of the small number of missing IR variables, and the fact that we cannot assume that zipcode is missing at random, we used listwise deletion. As a result, full model results (which use control variables) may be less applicable to the small group of foreign students who do not have U.S. addresses on file.
For mediation analyses only, race/ethnicity was dichotomized into a single variable to provide sufficient statistical power to detect effects: underrepresented minority status (URM) was assigned to students who identified as Black, Hispanic, or American Indian/Native Alaskan, since these groups have historically been underrepresented and underserved in STEM majors (e.g., Black et al., 2021; Hatfield et al., 2022; Wright et al., 2023; Zhang, 2021). While there are limitations to dichotomizing such data, categories were collapsed into a binary variable to increase power so that important differences could be better detected by statistical models, and because these categories reflect to some extent politically and socially constructed categories that are at the root of much structural marginalization and discrimination (Balestra & Fleischer, 2018). This variable was not dichotomized in other regression models.
For each math course in which a student was enrolled during the study period, we classified courses two ways: (1) by commonly-used categories typical to U.S. STEM degree math course sequences (elementary algebra, intermediate/college algebra, precalculus, Calculus I, and Calculus II); and (2) by course prerequisite sequence (see Table 1). “Elementary” and “intermediate” algebra are often labeled as “below-college-level” and are often designed to review procedures from first- and second-year K–12 algebra courses, respectively. “College” algebra may be practically indistinguishable from intermediate algebra, or may be more advanced, and typically carries college credit. We refer to these designations not to endorse them (issues with these definitions are discussed at length elsewhere; Wladis, Bjorkman, et al., 2023; Wladis, Makowski, et al., 2023) but simply to describe existing course designations and prerequisite sequences.
Course Sequence Classification Based on Prerequisites
Analyses controlled for specific course in which a student was enrolled or the sequence number of the enrolled course, typically by including the level in the course sequence as a fixed effect (unless analysis was already limited to a single course or sequence level), because scores on the inventory are correlated with the sequence level of the course in which a student is enrolled (for more details see Wladis et al., 2024a, 2024b, manuscript under review; Wladis, Murray, & Aly, 2025a; Wladis, Murray, Hachey, et al., 2025b). Random effects were also used to control for clustering by instructor, as students enrolled in math courses with the same instructor are more likely to have similar outcomes. More details on how analysis was carried out are provided in the next subsections.
ACI scores
ACI scores were originally generated for each student during validation studies of the ACI (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review), using two-parameter logistic (2PL) item-response-theory (IRT) models (Birnbaum, 1968). These were estimated using marginal maximum likelihood (MML) estimation on each wave, using the R package “mirt” (Chalmers, 2012). 2PL models are commonly used to analyze dichotomous items and consist of logistic models where items are allowed to differ on both difficulty level and discrimination (which represents how well they distinguish between students with higher versus lower levels of “algebraic conceptual understanding”). The equation for 2PL models is:
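$$P(X_{ij} = 1 \mid \theta_j) = \frac{\exp\left[a_i(\theta_j - b_i)\right]}{1 + \exp\left[a_i(\theta_j - b_i)\right]}$$
where $X_{ij}$ is the dichotomous response of student $j$ to item $i$, $\theta_j$ is the student's latent level of algebraic conceptual understanding, $a_i$ is the discrimination of item $i$, and $b_i$ is its difficulty.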
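To make the roles of difficulty and discrimination concrete, the following is a minimal sketch of the 2PL response function, assuming the standard difficulty/discrimination parameterization. The function name and the item parameters are hypothetical, and this is not the estimation procedure itself, which was carried out in "mirt".

```python
import math


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model.

    theta: latent ability of the student
    a:     item discrimination (slope)
    b:     item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Two hypothetical items of equal difficulty but different discrimination.
for theta in (-1.0, 0.0, 1.0):
    low = p_correct_2pl(theta, a=0.8, b=0.0)
    high = p_correct_2pl(theta, a=2.0, b=0.0)
    print(f"theta={theta:+.1f}  a=0.8: {low:.3f}  a=2.0: {high:.3f}")
```

When ability equals the item difficulty the probability is one half regardless of discrimination; a more discriminating item separates students above and below that point more sharply.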
Regression models
Multilevel fixed effects linear and logistic regression were used to assess the validity of the ACI as a predictor of college math and STEM outcomes. While we give a few descriptive results without fixed effects, most analyses reported here control for fixed effects by course level (where outcomes are only compared among students at the same level in the course sequence) and random effects by course instructor (to control for the fact that students taught by the same instructor may have more similar outcomes). The linear multilevel model equations used were:
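$$y_{ijk} = \beta_{0j} + \beta_1\,\mathrm{ACI}_{ijk} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ijk} + \alpha_k + \varepsilon_{ijk}, \qquad \beta_{0j} = \beta_0 + u_j$$
for student $i$ taught by instructor $j$ in a course at sequence level $k$, where $\alpha_k$ is the fixed effect for course level, $u_j \sim N(0, \sigma_u^2)$ is the random intercept for instructor, $\mathbf{x}_{ijk}$ is the vector of control variables (omitted in base models), and $\varepsilon_{ijk} \sim N(0, \sigma^2)$ is the residual.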
For all models with a binary dependent variable, we ran both logistic and linear probability fixed-effects models. Both produced similar results, so we report linear probability models because readers are much less likely to misinterpret the coefficients of linear probability models than odds ratios (e.g., Norton & Dowd, 2018) and because we cannot compare coefficients across logistic regression models due to rescaling (Buis, 2010; Erikson et al., 2005; Long, 1997; Winship & Mare, 1984; Wooldridge, 2002).
For logit models (e.g., successful completion of an intermediate algebra course; STEM degree attainment), the same equations were used as in linear regression, but with a logit link employed to model the probability distribution:
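$$\operatorname{logit}(p_{ijk}) = \ln\left(\frac{p_{ijk}}{1 - p_{ijk}}\right) = \eta_{ijk}$$
where $p_{ijk}$ is the probability that student $i$ (taught by instructor $j$ in a course at sequence level $k$) attains the binary outcome, and $\eta_{ijk}$ is the same linear predictor used in the linear models (ACI score, the course-level fixed effect, the instructor random intercept, and, in full models, the control variables).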
Mediation models
This paper also considers models that aim to understand to what extent disparities in dependent variables such as math course grade or STEM degree attainment by race/ethnicity (or gender) can be explained by disparities in ACI score by race/ethnicity (or gender). Following VanderWeele and Robinson (2014), we conceptualize ACI score as a mediator and employ statistical models from the mediation literature to decompose disparities into two parts. The first part is an “indirect” disparity that is a function of the combined effects of: (a) the relationship between ACI score and race/ethnicity (or gender) and (b) the relationship between ACI score and the dependent variable (e.g., math course grade; STEM degree attainment). The second part is a “direct” disparity, which quantifies the proportion of the total disparity that would remain even if the distribution of ACI score were to be equalized across racial/ethnic (or gender) groups. Direct and indirect disparities, as we call them here, are the same coefficients that are often referred to in the mediation literature as direct and indirect “effects”; we avoid the term “effect” because this is an observational rather than causal study. To improve readability for readers more familiar with the terminology of the mediation literature, tables include the term “effect” in quotes inside a parenthetical reference, coupled with reminders that these results do not support causal inference.
For mediation analysis, we used the KHB decomposition method, available in Stata (Buis, 2010). This is a general decomposition method based on an SEM framework that can be used with either linear or logistic regression. In linear regression mediation is modeled as follows:
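$$Y = \alpha_R + \beta_R X + \boldsymbol{\delta}_R^{\top}\mathbf{C} + \varepsilon_R$$
where $Y$ is the dependent variable (e.g., math course grade), $X$ is the group indicator (URM status or gender), $\mathbf{C}$ is the vector of covariates (course-sequence fixed effects and, in full models, the control variables), and $\beta_R$ is the total disparity,
and then
$$Y = \alpha_F + \beta_F X + \theta M + \boldsymbol{\delta}_F^{\top}\mathbf{C} + \varepsilon_F$$
Adding the mediator $M$ (ACI score) to the model decomposes the total disparity into a direct disparity, $\beta_F$, and an indirect disparity, $\beta_R - \beta_F$, so that $\beta_R = \beta_F + (\beta_R - \beta_F)$.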
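The decomposition of a total disparity into direct and indirect parts can be illustrated for the linear case with a small self-contained sketch. The data below are made up purely for illustration, the helper `ols` is a hypothetical stand-in for a regression routine, and the sketch omits the weighting, fixed effects, and instructor clustering used in the study; it is not the KHB implementation in Stata.

```python
def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y, via Gauss-Jordan."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up toy data: group indicator, mediator (a test score), outcome (a grade).
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [0.2, 0.5, -0.1, 0.4, -0.6, -0.2, -0.5, 0.1]
grade = [3.1, 3.4, 2.6, 3.0, 2.2, 2.9, 2.4, 2.8]

# Reduced model: grade ~ group. The group coefficient is the total disparity.
_, total = ols([[1, g] for g in group], grade)
# Full model: grade ~ group + score. The group coefficient is the direct disparity.
_, direct, slope = ols([[1, g, s] for g, s in zip(group, score)], grade)
# Auxiliary model: score ~ group. The group coefficient is the gap in the mediator.
_, gap = ols([[1, g] for g in group], score)

indirect = total - direct
print("total:", total, "direct:", direct, "indirect:", indirect)
print("slope * gap:", slope * gap)  # matches the indirect disparity in linear models
```

In linear models the indirect disparity computed as a difference of coefficients equals the product of the mediator's slope and the group gap in the mediator, which is why the two printed values agree up to rounding.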
Descriptive Statistics and Weighting
Students who consent to have their scores on a math assessment included in research may differ from those who do not; for example, they might be more likely to have higher math self-efficacy or achievement. Thus, it was important to check for potential differences between students for whom ACI scores are available and those for whom they are not. Descriptive statistics comparing these two groups are provided in Table 2.
Summary Statistics Before Weighting, Comparing Those for Whom ACI Score Was Available to Total Math Course Population
Note. SE = Standard Error; CI = 95% Confidence Interval; PI = Pacific Islander; AI = American Indian; NA = Native American.
Students for whom ACI scores were available were more likely to be Asian and less likely to be Hispanic; were more likely to be women; were slightly older on average; had earned slightly fewer credits by the start of the term; had slightly higher GPAs on average; and were enrolled in a slightly higher math course in the sequence. However, there was no difference in terms of other race/ethnicity categories or median household income of zipcode. We note that despite the observed differences, students at all GPA and course levels and in all racial/ethnic and gender groups were well-represented in the ACI data, and, thus, weighting can be used to address variation on observables between students for whom ACI scores are available and the full sample. To address this variation, we performed entropy balancing (Hainmueller, 2012). Entropy balancing is a weighting method, like other propensity score weighting methods, where the goal is to re-weight data points to improve covariate balance so that the sample is more reflective of the population, or so that a treatment variable (in this case, whether a student took the ACI) becomes independent of measured background characteristics (Powell et al., 2019).
Entropy balancing differs from other weighting methods because it allows the researcher to impose balance constraints, requiring that covariate distributions of the sample and the population match on all prespecified moments (Hainmueller, 2012). In contrast, other propensity score methods often require the researcher to manually go through multiple rounds of weighting and checking the data for covariate balance, and improving the balance on some covariates may occur at the cost of worsening balance on others (Thomas et al., 2023). Weighting using entropy balancing resulted in a dataset of students for whom ACI scores were available for which the mean/proportion on all control variables of the new weighted dataset matched the characteristics of the population of all students enrolled in math courses with standardized mean differences reduced to zero across all variables (see Table 3).
Summary Statistics After Weighting, Comparing Those for Whom ACI Scores Were Available to Total Math Course Population
Note. SD = Standard Deviation; SMD = Standardized Mean Difference; PI = Pacific Islander; AI = American Indian; NA = Native American.
Table 3 illustrates that the resulting weighted dataset used for this study has covariate distributions that reflect the full population of students enrolled in math courses in the study population.
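The idea behind entropy balancing can be sketched for a single covariate and a single moment (the mean). With uniform starting weights, the weights that stay closest to uniform in the entropy sense while matching a target mean take an exponential form, so only one tilting parameter has to be found. The function name, the toy GPA values, and the target are hypothetical; the study balanced all control variables at once, and this sketch is not the software used there.

```python
import math


def entropy_balance(x, target_mean, lo=-50.0, hi=50.0):
    """Weights w_i proportional to exp(lam * x_i) whose weighted mean of x hits target_mean.

    Assumes uniform base weights and min(x) < target_mean < max(x).
    The tilting parameter lam is found by bisection, since the weighted
    mean increases monotonically in lam.
    """
    def weights(lam):
        shift = max(lam * xi for xi in x)          # guards against overflow
        raw = [math.exp(lam * xi - shift) for xi in x]
        total = sum(raw)
        return [r / total for r in raw]

    def weighted_mean(lam):
        return sum(w * xi for w, xi in zip(weights(lam), x))

    for _ in range(200):
        mid = (lo + hi) / 2.0
        if weighted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return weights((lo + hi) / 2.0)


# Toy sample whose mean GPA (3.0) is above a hypothetical population mean (2.8).
sample_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
w = entropy_balance(sample_gpa, target_mean=2.8)
print([round(wi, 4) for wi in w])
print(sum(wi * xi for wi, xi in zip(w, sample_gpa)))
```

Students with lower GPAs receive more weight, so the weighted sample mean moves down to the target while every weight stays positive.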
Results
We explored whether and to what extent scores on the ACI predict mathematics course grades, completion of key courses in the STEM mathematics sequence, and completion of a STEM versus non-STEM degree. We also considered whether ACI scores explain any differential outcomes by race/ethnicity or gender.
ACI Score as Predictor of Mathematics and STEM Outcomes in College
ACI score as a predictor of math course grades
Table 4 reports results of overall regression models on the full sample of all courses, using course sequence as a fixed effect; it also reports the results of separate regression models for each course type. Table 4 includes the results of base models (without controls) and full models (with controls). Base models have the benefit that they illustrate existing relationships between variables based on characteristics of the students who are actually enrolled in current courses; full models have the benefit of showing us what the relationship would theoretically be if all students in each course were compared only to other students with the same characteristics on the control variables in that course. Thus, each type of model provides different information about existing patterns.
ACI Score as a Predictor of Math Course Grade (Grade Scale: 1–4)
Note. Multi-Level Weighted Regression, fixed effects by course (first row) or separate models by courses (listed in first column), random effects by instructor, math course grade as dependent variable, ACI score as independent variable for which coefficients are reported.
Full models include control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity, gender, age, and median household income of zipcode.
Table 4 indicates that on average, a one standard deviation (SD) increase in ACI score correlates with a roughly 0.4 grade point increase (on a scale of 1–4) for the math course in which a student is enrolled.
When looking at individual courses, ACI scores significantly predicted grades in intermediate/college algebra, precalculus, and Calculus I in models both with and without controls, and in differential equations once all controls are added.
ACI as predicting completion of math courses in the STEM major sequence
Next, we investigated the extent to which students’ scores on the ACI are predictive of whether they successfully completed math courses in the standard STEM math sequence during the study period (Table 5). We considered both base and full fixed effects regression models predicting whether students ever successfully completed elementary algebra, intermediate or college algebra, precalculus, Calculus I, or Calculus II, if enrolled. We only considered those students who had not yet successfully completed these courses prior to taking the ACI, so that scores are only used to predict future (not past) course completion.
ACI Score as a Predictor of Successful Completion of Core Math Courses in STEM Major Requirements
Note. Linear Probability Models Using Multi-Level Weighted Regression: fixed-effects by course sequence number of current course, random effects by instructor, ever completed (if attempted) particular math course (listed in first column) as dependent variable, ACI score as independent variable for which coefficients are reported.
Full models include control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity, gender, age, and median household income of zipcode.
Table 5 illustrates that ACI scores significantly predicted whether a student would ever successfully complete each of the five core STEM major math requirements if they attempted them, both in base and full models; this was significant at the
Thus, ACI score was strongly predictive of whether a student would successfully complete courses in the standard STEM major math course sequence, with the strongest relationship present for elementary algebra. This aligns with the design of the ACI to focus on concepts first introduced in elementary algebra. Students were 16 percentage points more likely to successfully complete elementary algebra for every 1 SD increase in ACI score (15 percentage points after controls). Additionally, students were also significantly more likely to successfully complete intermediate/college algebra, precalculus, Calculus I or Calculus II if they ever enrolled in these courses, with every 1 SD increase in ACI score correlating with 11–15 percentage point increases in successful course completion of these classes (8–11 percentage points with controls). This illustrates the importance of algebraic conceptual understanding to completing the core mathematics courses required for a STEM major.
ACI as predicting STEM versus non-STEM degree completion
Next, we explored whether ACI scores predicted student degree attainment in STEM versus non-STEM fields. This analysis only considered those students who completed a 2- or 4-year degree during the study period.
ACI Score as Predictor of STEM Versus Non-STEM Degree Completion
Note. Linear Probability Models Using Weighted Regression: fixed-effects by course sequence number of current course, ever attained STEM (vs. non-STEM) degree as dependent variable, ACI score as independent variable.
Full models include control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity, gender, age, and median household income of zipcode.
Table 6 shows that for every 1 SD increase in ACI score, students were 5.8 percentage points more likely to complete a STEM versus non-STEM degree when including fixed effects for course sequence number. Thus, for students at the same level of the math course sequence, those with higher ACI scores were significantly more likely to graduate with a STEM degree. Adding in controls did not change this relationship, altering its magnitude only slightly (primarily reducing the coefficient only for course grade as a predictor).
Even more notable is that ACI score separately predicted STEM degree completion, above and beyond math course grade alone. ACI score was at least as good as math course grade at predicting STEM versus non-STEM degree completion: in base models, an increase in 1 SD in course grade correlated with a 3.3 percentage point increase in STEM versus non-STEM degree completion and an increase in 1 SD in ACI score correlated with a 5.9 percentage point increase (this shifted to 5.1 and 4.0 percentage points, respectively, in full models). Adding in course grade as a predictor in addition to ACI score only slightly altered the coefficient for the predictive relationship between ACI score and STEM degree completion, with 1 SD increase in ACI score still predicting over a five percentage point greater likelihood of graduating with a STEM versus non-STEM degree (when comparing only those students with the same grade in the same math course to one another). Thus, the ACI is measuring something important to STEM versus non-STEM degree completion that is not currently captured by course grades alone.
We also performed an incremental validity analysis as a robustness check of these results (see Table A3 in the Appendix). Comparisons of Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) for all full models in Tables 5, 6, and 7 illustrate that there is “very strong” (Raftery, 1995, p. 140) evidence that ACI scores add significant information in predicting course grade, whether students ever completed a math course in the STEM sequence, or whether they graduated with a STEM versus non-STEM degree. Specifically, for all models except the one full model from Table 5 predicting Calculus III course grade there is positive evidence that ACI scores add predictive information, and for all models except the one full model from Table 5 predicting Calculus II course grade there is “very strong” (Raftery, 1995, p. 139) evidence that ACI scores add predictive information above GPA, math course grade, and other SES and demographic variables. The “conservative” (1995, p. 139) cutoff suggested by Raftery to demonstrate “very strong” evidence of added information from an additional predictor is a difference in BIC scores greater than 10; all the BIC differences in Table A3 other than those for the full models in Table 5 for predicting Calculus II and III course grades are well above 10, ranging from 21 to 2,568. This provides further evidence that ACI scores measure something beyond course grade.
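The BIC comparison described above can be sketched as a small helper that grades the evidence for the larger model. The bands follow Raftery's (1995) commonly cited grades of evidence (0–2 weak, 2–6 positive, 6–10 strong, above 10 very strong); the function name and the example BIC values are hypothetical and are not taken from Table A3.

```python
def evidence_for_added_predictor(bic_without, bic_with):
    """Grade the evidence that an added predictor improves the model.

    A lower BIC is better, so the difference is BIC(without) - BIC(with).
    Bands follow Raftery's (1995) grades of evidence.
    """
    diff = bic_without - bic_with
    if diff <= 0:
        return "none (BIC favors the model without the predictor)"
    if diff > 10:
        return "very strong"
    if diff > 6:
        return "strong"
    if diff > 2:
        return "positive"
    return "weak"


# Hypothetical BIC values for a model without and with ACI score.
print(evidence_for_added_predictor(5021.0, 5000.0))  # difference of 21
print(evidence_for_added_predictor(5004.0, 5000.0))  # difference of 4
```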
ACI Score as Mediator Between Underrepresented Minority Status (URM) a or Gender and Math Course Grade
Note. Mediation using the KHB method: Mediation of the relationship between underrepresented minority status or gender and math course grade by ACI score, fixed-effects by course sequence number, clustering by instructor.
a Underrepresented minority status indicates students of color who have been traditionally underrepresented and underserved in STEM fields; this includes Black, Hispanic, and American Indian/Native Alaskan students, but not White or Asian/Pacific Islander students.
b Full models included control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity (for gender models), gender (for URM models), age, and median household income of zipcode, instructor.
ACI Score as a Predictor of Differential Outcomes by URM Status and Gender
Next, we considered whether ACI score could explain some of the observed gaps in math or STEM outcomes by URM status or gender.
Math course grades
We begin by considering whether ACI scores mediate the relationship between URM status or gender and math grades. The results of these mediation models are reported in Table 7.
Table 7 reveals several trends. First, for URM status, the direct and indirect discrepancies are both highly significant in base and full models, suggesting that both ACI score and other factors correlated with URM status explain the lower average math course grades of URM students. ACI score accounts for 17% of the difference in math course grades by URM status in base models (i.e., the proportion of the total discrepancy equal to the indirect discrepancy).
Mediation models for gender show different trends: while both the direct and indirect discrepancies are highly significant in both base and full models, they point in opposite directions. Women on average earn higher grades in math courses despite having lower ACI scores on average than men, so considering a mediation model that includes ACI scores reveals a suppressor “effect” where women on average earn even higher grades than men with equivalent ACI scores. Thus, ACI scores do not explain women’s higher math course grades; rather, women’s higher average math course grades mask the fact that they have lower mean scores on the ACI.
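The two patterns described in this subsection (a share of the total discrepancy explained by the mediator, versus a suppressor pattern in which direct and indirect discrepancies have opposite signs) can be told apart with a short sketch. The function name and all numbers are hypothetical and are not values from Table 7.

```python
def summarize_mediation(total, direct):
    """Split a total discrepancy into direct and indirect parts.

    Returns (indirect, share_explained, is_suppression). The share is
    indirect / total; it is reported as None under suppression (direct
    and indirect parts of opposite sign) or when the total is zero,
    because a proportion is not a meaningful summary in those cases.
    """
    indirect = total - direct
    is_suppression = direct * indirect < 0
    if is_suppression or total == 0:
        return indirect, None, is_suppression
    return indirect, indirect / total, is_suppression


# Hypothetical case 1: both parts negative, so a share can be reported.
print(summarize_mediation(total=-0.30, direct=-0.25))
# Hypothetical case 2: positive direct part, negative indirect part (suppression).
print(summarize_mediation(total=0.10, direct=0.16))
```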
Successful completion of core STEM major math courses
Next, we considered whether ACI scores mediate the relationship between URM status or gender and successful completion of commonly required math courses for STEM majors, including elementary algebra, intermediate/college algebra, precalculus, Calculus I, and Calculus II. The results of these mediation models are reported in Table 8.
ACI Score as Mediator Between Underrepresented Minority Status (URM) a or Gender and Successful Completion of Various Core Math Courses in STEM Degree Requirements
Note. Linear probability model mediation using the KHB Method: Mediation of the relationship between underrepresented minority status or gender and successful completion of core math courses in the standard STEM major course sequence (for those students who enrolled) by ACI score, fixed-effects by course sequence number, clustering by instructor.
a Underrepresented minority status indicates students of color who have been traditionally underrepresented and underserved in STEM fields; this includes Black, Hispanic, and American Indian/Native Alaskan students, but not White or Asian/Pacific Islander students.
b Full models included control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity (for gender models), gender (for URM models), age, and median household income of zipcode, instructor.
For URM status, both direct and indirect discrepancies are highly significant in base and full models for all courses (except full models for elementary algebra). This suggests that both ACI score and other factors associated with URM status explain the lower probability of successfully completing courses above elementary algebra for URM students. ACI scores explain about 16%–18% of the difference for intermediate/college algebra through Calculus I in base models and 7%–11% in full models (i.e., the proportion of total discrepancy equal to the indirect discrepancy in each model).
For gender, these models again suggest a suppressor “effect” (significant for all courses except elementary algebra and Calculus II full models), where women are, on average, more likely to successfully complete core math course requirements for STEM majors, despite scoring lower on average on the ACI than men. For some models in Table 8 (e.g., for some precalculus, Calculus I and II models), the total discrepancy is not statistically significant, but the direct and indirect discrepancies are, yet in opposite directions. Thus, these relationships stay hidden when looking only at average effects, when we do not consider the mediating role of ACI score.
STEM versus non-STEM degree completion
Finally, we considered whether ACI scores mediate the relationship between race/ethnicity or gender and STEM versus non-STEM degree completion. Results are reported in Table 9.
ACI Score as Mediator Between Underrepresented Minority Status (URM) a or Gender and STEM Versus Non-STEM Degree Completion
Note. Linear probability model mediation using the KHB method: Mediation of the relationship between underrepresented minority status or gender and STEM versus non-STEM degree completion by ACI score, fixed-effects by course sequence number, clustering by instructor.
a Underrepresented minority status indicates students of color who have been traditionally underrepresented and underserved in STEM fields; this includes Black, Hispanic, and American Indian/Native Alaskan students, but not White or Asian/Pacific Islander students.
b Full models included control variables: GPA at start of term, number of prior credits earned by start of term, race/ethnicity (for gender models), gender (for URM models), age, and median household income of zipcode, instructor.
For URM status, neither the total nor the direct discrepancy (Table 9) is significant in base or full models, while the indirect discrepancy is. In fact, the direction of the direct discrepancy has reversed sign—thus, after accounting for the indirect relationship of ACI score as an explanatory variable for differences in STEM versus non-STEM major attainment by URM status, URM students are more likely to complete a STEM versus non-STEM degree (although not statistically significantly so). Thus, ACI score explains 100% of the total discrepancy in STEM versus non-STEM degree attainment by URM status (although absolute differences are small, at about 2 percentage points).
Exploring mediation models for gender shows a different trend: the total, direct, and indirect discrepancies are all statistically significant, with ACI score explaining about 10% of the total discrepancy in both base and full models. Thus, about 10% of differences in STEM versus non-STEM degree attainment between men and women can be explained by women’s lower average scores on the ACI, but the remaining 90% must be accounted for by other factors correlated with gender.
Limitations
While a wide range of students were represented in the data, students who completed the ACI did differ somewhat on average from the total population of students who were enrolled in math courses. However, data in this study still included good representation of all racial/ethnic and gender groups, as well as students across all GPA bands and course levels, and, for all analysis, weighting was used to adjust for all observed differences between those with an ACI score and the general population of students enrolled in math courses, with excellent balance achieved (Table 3). Further, control variables were used across all research questions to allow for comparison of students with the same values on these characteristics.
This dataset also has some limitations in how gender and race/ethnicity were recorded. CUNY institutional data had at the time only a binary category for gender and used limited federal race/ethnicity categories; therefore, these variables may not accurately represent how all students self-identify. Further research that includes more nuanced race/ethnicity and gender categories is necessary. However, in the meantime, this study may provide a starting point for understanding how certain politically and socially constructed categories can relate to structural marginalization and discrimination (Balestra & Fleischer, 2018) in higher education.
The population at CUNY, where this study was conducted, is more diverse than the average U.S. college. It includes a higher proportion of ethnic/racial minorities, foreign-born students, first-generation students, lower-SES students, and students requiring developmental coursework. We note that we did not directly analyze or control for all of these groups in this particular study (e.g., we do not include models here that consider how foreign-born or first-generation status relate to ACI scores)—considering the relationship between ACI scores and college outcomes for other groups of students (beyond gender and race/ethnicity) is an important area for future research. Further, because of CUNY’s diversity, we cannot claim that these data are nationally representative. However, these results are likely generalizable to a wider population nationally, and will be of particular interest to colleges with diverse populations. Further, this makes CUNY an excellent context for investigating the relationship between algebraic conceptual understanding and college math/STEM outcomes for groups traditionally underrepresented and underserved in higher education.
Discussion
ACI scores significantly predicted math course grade, both overall, and for almost every individual course that was analyzed in this study. For courses that depend heavily on algebra (e.g., elementary algebra, intermediate algebra, precalculus) this result was not surprising. However, the strong predictive power of ACI scores for grades in courses that do not explicitly focus on algebra, such as statistics, Calculus I, and courses for future primary teachers and medical professionals, illustrates how important algebraic concepts may be to a range of mathematical domains across different STEM fields.
We note that the content of the ACI was generated to be accessible to any student who had enrolled in Algebra I in high school or elementary algebra in college and so it did not include algebraic concepts or objects (e.g., function notation, exponential growth, trigonometric functions) that might be more relevant to higher-level mathematics courses. Thus, the observed pattern in which ACI scores were more predictive for lower-level courses (e.g., intermediate/college algebra and precalculus) than some higher-level courses (e.g., Calculus II and III) provides validity evidence for the intended interpretation of the ACI. An expanded instrument that includes concepts beyond those relevant to elementary algebra may be more predictive of grades in upper-level math courses. However, the results found here show that algebraic conceptions tested by the ACI significantly predict course grades in a wide variety of mathematics courses that are important to both STEM and non-STEM majors. This suggests that a more explicit focus in instruction at all levels on conceptual understanding of foundational classical algebra may be a critical component of preparing students to succeed in college math courses and the degrees that require these courses.
ACI scores also significantly predicted the likelihood of ever subsequently completing each of the courses on the standard STEM major math course sequence from elementary algebra to Calculus II, independent of the course in which a student was currently enrolled. This suggests that algebraic conceptual understanding as measured by the ACI is important to student completion of many different required math courses for STEM majors, even though the ACI does not include algebra concepts introduced in courses beyond elementary algebra. Findings point to the importance of algebra courses including an explicit focus on conceptual understanding if they are to prepare students to be successful in the college math courses needed for STEM degrees.
ACI scores additionally predicted the likelihood of a student completing a STEM versus non-STEM major during the study period, independent of the math course in which they were currently enrolled. It was particularly noteworthy that ACI scores predicted STEM versus non-STEM degree completion independently of math course grade, and about equally well. This suggests that ACI scores capture critical information about students’ likelihood of completing STEM degrees that is not captured by existing math course grades. Perhaps existing course grades depend heavily on accurate procedural fluency and less on conceptual understanding, and/or perhaps they measure conceptual understanding of concepts outside foundational classical algebra. Regardless of the reason for the difference, this study suggests that it is critical to better understand how to provide every student interested in STEM ample opportunities to cultivate the kinds of foundational classical algebraic conceptual understanding measured by the ACI. This is likely an important task throughout the K–16 mathematics curriculum, if we want every student to have access to STEM degrees.
Further, our results suggest that instruction in algebraic conceptual understanding is a critical equity issue. ACI score explained significant differences in math course grades and STEM versus non-STEM degree completion by URM status, suggesting that differences in opportunities to develop algebraic conceptual understanding explain a significant proportion of college STEM outcome differences by race/ethnicity. It is already known that URM status correlates with differential access to high-quality mathematics instruction, including opportunities for reasoning and sensemaking (Schoenfeld, 2022; Stepter, 2023; Yeh et al., 2020). URM students may also have more marginalized experiences in mathematics classes (e.g., McGee, 2015; Ridgeway & McGee, 2018), which may reduce their opportunities to develop conceptual understanding. Thus, improving access for all students to meaningful opportunities to develop algebraic conceptual understanding in mathematics courses at all levels may be particularly crucial to addressing inequities in college STEM outcomes—this is a critical area for future research.
For gender, analysis revealed a suppressor effect, where women on average earned higher grades in math courses (and were more likely to successfully complete courses in the STEM major math sequence) despite having lower ACI scores. This may be because women on average spend more time on their studies than comparable men (Conway et al., 2021; Wladis et al., 2024a, 2024b), and therefore may earn higher grades because of the added time spent studying. Or it may relate to gendered norms that reinforce “compliance” for women and “risk-taking” for men during problem-solving (Lubienski & Ganley, 2017; Lubienski et al., 2021; Miller et al., 1996). When instruction stresses procedures (as some research suggests is common in algebra courses in college; Crooks & Alibali, 2014; Hammerman & Goldberg, 2003; Hodara, 2011), compliance may correspond to learning rote procedures that can then lead to higher grades but limit opportunities for developing conceptual understanding, whereas risk-taking may correspond to pursuing alternative problem-solving routes that are more likely to lead to increased conceptual understanding. More research is needed to explore these hypotheses and to better understand this trend.
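The suppressor pattern described above can be illustrated with a short arithmetic sketch. It assumes a simple linear (ordinary least squares) setting with no other covariates, in which the raw group gap equals the ACI-adjusted gap plus the ACI slope times the group difference in mean ACI score. All numbers and names below are hypothetical and are not estimates from this study, whose models include additional controls.

```python
def raw_gap(adjusted_gap: float, aci_slope: float, aci_gap: float) -> float:
    """Raw group gap implied by an adjusted gap plus the part carried by ACI score."""
    return adjusted_gap + aci_slope * aci_gap


# All values below are hypothetical, chosen only to illustrate the pattern.
ADJUSTED_GAP = 0.30  # women's grade advantage at equal ACI score (grade points)
ACI_SLOPE = 0.40     # grade points per standard deviation of ACI score
ACI_GAP = -0.25      # women's mean ACI score minus men's (standard deviations)

raw = raw_gap(ADJUSTED_GAP, ACI_SLOPE, ACI_GAP)

# Suppression: the gap looks smaller before adjusting for ACI score than after.
is_suppressed = abs(raw) < abs(ADJUSTED_GAP)

print(f"raw gap: {raw:.2f}, adjusted gap: {ADJUSTED_GAP:.2f}, suppression: {is_suppressed}")
```

In this hypothetical case the lower mean ACI score hides part of the grade advantage, so the advantage grows once ACI score is held constant.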
In this study, differences in ACI scores also explained 10% of the gender gap in STEM versus non-STEM degree completion. Thus, it may be important to ensure that women and girls receive equal opportunities to develop algebraic conceptual understanding in mathematics courses. As with course grades, this may be related to marginalized experiences that women and girls have in mathematics classes (e.g., Reinholz et al., 2022), which may negatively impact their opportunities to develop conceptual understanding; or it might be related to patterns of “compliance” versus “risk-taking” in mathematics classrooms, where women and girls have been observed to be more “compliant” (Lubienski & Ganley, 2017; Miller et al., 1996) and to take fewer “risks” in mathematics problem-solving (Lubienski et al., 2021), which might reduce their opportunities to develop conceptual understanding.
Implications
Because the ACI is based on a detailed framework describing specific concepts and conceptions (or ways of thinking about those concepts) that constitute foundational algebraic conceptual understanding (Wladis et al., 2024a, 2024b; Wladis et al., manuscript under review), the ACI framework could be leveraged to inform instruction and curricula. Because ACI scores predict math and STEM outcomes in college, it may make sense to design curricula and instructional activities that focus on the particular concepts and conceptions that the ACI assesses. Thus, ACI sample items and details of the ACI framework could be an important tool for instructors, course and curriculum designers, and researchers who study undergraduate STEM education.
The fact that ACI scores predict STEM degree outcomes separately and significantly from course grade suggests that the ACI measures something that is distinct from assessments currently used in college mathematics courses but nonetheless important to STEM degree completion. This suggests that knowledge assessed by the ACI represents important algebra knowledge that is complementary to, rather than overlapping with, the knowledge measured by current course assessments. Because this conceptual knowledge appears not to be widely assessed in current college math courses, it may be that it is also not widely taught in the formal and enacted mathematics curriculum (Faulkner & Cook, 2006; Jennings & Jonathan, 2014; Sambell & McDowell, 1998). This might help to explain why a mismatch has been identified between the algebra skills taught in the classroom and those perceived by workers as relevant on the job (Douglas & Attewell, 2017; Handel, 2016), even in STEM careers (Walkington et al., 2025). Research suggests that college graduates, even in STEM fields like engineering, often find more general skills like learning “how to think” (Moss-Pech, 2025, pp. 117, 120), “problem-solving” (Moss-Pech, 2025, p. 120), or using the “scientific method” (Moss-Pech, 2025, p. 120) that they obtain from their college courses more relevant than narrower technical skills that they learn in their classes; thus, it is possible that algebraic conceptual understanding (which focuses on “problem-solving” and reasoning) may be particularly important in the STEM workforce. More research is needed to explore this possibility.
Relatedly, this study provides evidence that could be leveraged in ongoing developmental mathematics and college mathematics course sequence reform. These reforms have typically focused on redirecting students away from algebra into other courses (e.g., Ganga & Mazzariello, 2018; Logue et al., 2019), on compressing developmental and credit-bearing courses into a single combined course (Buckles et al., 2019; Kosiewicz et al., 2016; Merkin, 2023), or on other interventions that typically keep the curriculum intact but provide students with alternate pacing or additional supports (Kalamkarian et al., 2015; Park et al., 2018; Twigg, 2011; The Century Foundation, 2016). These reforms may help with some of the barriers that students experience on their way to a degree, for example by allowing them to complete the degree in fewer terms when they are able to be successful in accelerated or compressed courses. However, these reforms have not yet critically considered what kinds of algebraic knowledge students acquire in these courses, which instruction in these courses is providing students with rich opportunities to develop algebraic conceptual understanding, or whether these courses are fulfilling their original intention of providing access to STEM careers (Wladis, Bjorkman, et al., 2023; Wladis, Makowski, et al., 2023). Evidence from this study suggests that developmental mathematics and college math sequence reform should consider what kinds of opportunities students are getting in these classes to develop algebraic conceptual understanding, and should consider including instructional activities that attend specifically to the concepts and conceptions on which the ACI was built.
Further, the mediation analysis results for race/ethnicity and gender in this research show that access to opportunities to develop algebraic conceptual understanding predicts differential math course and STEM degree outcomes and is therefore a critical equity issue. Research that investigates the relationship between equitable learning opportunities in the classroom and subsequent development or improvement of conceptual understanding could be an important area for future research. In the meantime, instructional approaches that focus more explicitly on teaching algebraic conceptual understanding may have the potential to improve both college STEM outcomes and STEM equity, as long as they are implemented in such a way that every student has sufficient opportunity to engage in rich sensemaking activities that support the development of algebraic conceptual understanding.
Conclusion
The results of this study suggest that scores on the ACI predict math course grades, successful completion of core math courses typically required for STEM majors, and STEM versus non-STEM degree attainment. They also explain some differential outcomes in these measures by race/ethnicity and gender. An increase of one standard deviation in ACI score was associated with a one-third to one-half letter grade increase in math course grade for students in elementary/intermediate/college algebra, precalculus, and Calculus I; with a roughly 8–16 percentage point increase in the probability of ever completing core math courses required for most STEM majors, such as elementary/intermediate/college algebra, precalculus, and Calculus I/II; and with a 5–6 percentage point increase in the probability of ever completing a STEM versus non-STEM degree, depending on which courses are being compared and which controls are included. These relationships between ACI score and various future math and STEM outcomes in college provide quantitative evidence to support existing descriptions of the importance of integrating instruction on algebraic conceptual understanding into both the K–12 and higher education mathematics curriculum. This suggests that a more explicit focus on conceptual understanding of foundational classical algebra may be a critical component of preparing students to succeed in college math courses and the degrees that require these courses, across many different majors.
ACI scores also explained significant portions of existing gaps in math course grades, course completion and STEM versus non-STEM completion by race/ethnicity and gender. ACI score accounted for 13%–17% of URM math grade differences, 100% of the STEM versus non-STEM degree completion gap by URM status, and 10% of the degree completion gap by gender. This suggests that access to high quality instruction in algebraic conceptual understanding is a critical equity issue, and that differences in opportunities to develop algebraic conceptual understanding explain a significant proportion of certain college STEM outcome differences by race/ethnicity and gender.
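One common way to read statements such as "ACI score accounted for 13%–17% of a gap" is as the share of the total group difference that disappears once the mediator is held constant. The sketch below shows that calculation under the assumption of a simple difference-of-coefficients approach; the function name and all input values are hypothetical, and the mediation method actually used in this study may differ (particularly for the binary degree-completion outcome).

```python
def proportion_explained(total_gap: float, adjusted_gap: float) -> float:
    """Share of a group gap that is no longer present after adjusting for a mediator."""
    if total_gap == 0:
        raise ValueError("total_gap must be nonzero")
    return (total_gap - adjusted_gap) / total_gap


# Hypothetical grade gap: -0.40 grade points overall, -0.34 at equal ACI score.
partial = proportion_explained(total_gap=-0.40, adjusted_gap=-0.34)

# Hypothetical completion gap that vanishes entirely at equal ACI score.
full = proportion_explained(total_gap=-0.08, adjusted_gap=0.0)

print(f"partially explained: {partial:.0%}")
print(f"fully explained: {full:.0%}")
```

Under this reading, a value of 100% means that no group difference remains among students with the same ACI score, and a negative value corresponds to a suppressor pattern.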
The fact that ACI score was predictive of STEM degree completion in addition to and independently of math course grade suggests that (1) existing course assessment practices are likely not robustly capturing students’ foundational algebraic conceptual understanding; and (2) existing curricula and instruction as currently enacted may not be focusing explicitly on foundational algebraic conceptual understanding. Given the predictive power of ACI scores for college math and STEM outcomes, this suggests that it may be critical to improve curricula and instructional practice so that students in college and college-preparatory mathematics courses have sufficient opportunities to develop robust foundational algebraic conceptual understanding. The framework used to generate the ACI could be used as a starting point for identifying the components of algebraic conceptual understanding that may be worth attending to in instruction.
This study provides some of the first large-scale quantitative evidence that algebraic conceptual understanding can predict college mathematics course outcomes and STEM degree attainment. In addition to informing potential instructional and curricular changes, it points towards a critical need for future research that (1) explores which instructional approaches improve students’ algebraic conceptual understanding; (2) determines which students are provided access to instruction that supports the development of algebraic conceptual understanding; and (3) directly tests whether improved instruction in algebraic conceptual understanding can positively impact STEM degree attainment and math course completion. Better understanding these patterns appears to be critical not just to building a well-qualified STEM workforce, but also an equitable one.
Footnotes
Appendix: Robustness Checks
Incremental Validity Analysis Robustness Check: Model Fit with and without ACI Score
| Model | With ACI Score: AIC | With ACI Score: BIC | Without ACI Score: AIC | Without ACI Score: BIC | diff AIC | diff BIC | t for BIC diff.ᵃ | pᵇ |
|---|---|---|---|---|---|---|---|---|
| Models From Table 4 | | | | | | | | |
| Fixed Effects by Course Sequence Number, Regression Results | | | | | | | | |
| Sequence | 256,489 | 256,622 | 259,065 | 259,190 | 2,575 | 2,568 | 50.7 | |
| Separate Models for Each Course Type, Regression Results | | | | | | | | |
| Intermediate/College Algebra | 65,209 | 65,304 | 66,805 | 66,894 | 1,596 | 1,590 | 39.9 | |
| Precalculus | 70,999 | 71,095 | 72,536 | 72,627 | 1,538 | 1,532 | 39.2 | |
| Calculus I | 27,507 | 27,590 | 27,757 | 27,836 | 251 | 246 | 15.8 | |
| Calculus II | 12,774 | 12,841 | 12,780 | 12,843 | 6 | 2 | 2.1 | .016 |
| Calculus III | 3,057 | 3,109 | 3,055 | 3,104 | −2 | −5 | −2.7 | .997 |
| Differential Equations | 1,215 | 1,248 | 1,483 | 1,514 | 268 | 266 | 16.4 | |
| Mathematics Education | 9,564 | 9,625 | 9,656 | 9,713 | 92 | 88 | 9.5 | |
| Statistics | 76,037 | 76,139 | 76,441 | 76,538 | 404 | 399 | 20.1 | |
| Quantitative Reasoning | 22,827 | 22,903 | 22,921 | 22,992 | 94 | 89 | 9.6 | |
| Math for Health Majors | 15,891 | 15,966 | 15,982 | 16,053 | 91 | 86 | 9.4 | |
| Models From Table 5: Ever Completed | | | | | | | | |
| Elementary Algebra | 6,486 | 6,562 | 6,858 | 6,930 | 372 | 368 | 19.2 | |
| Intermediate/College Algebra | 26,459 | 26,565 | 27,727 | 27,827 | 1,268 | 1,262 | 35.6 | |
| Precalculus | 22,018 | 22,122 | 22,917 | 23,015 | 899 | 893 | 29.9 | |
| Calculus I | 9,508 | 9,600 | 9,876 | 9,963 | 368 | 363 | 19.1 | |
| Calculus II | 3,351 | 3,426 | 3,486 | 3,557 | 136 | 131 | 11.6 | |
| Models From Table 6: Ever Attained STEM Versus Non-STEM Degree | | | | | | | | |
| ACI Score | 2,728 | 2,830 | 2,755 | 2,851 | 27 | 21 | 5.0 | |
Note. ᵃCriteria for t-statistic calculation and significance are taken from Raftery (1995, pp. 139–140). ᵇBolded p values indicate “strong evidence” (Raftery, 1995, p. 139) for added information due to the additional variable (Algebra Concept Inventory [ACI] score in this case), based on differences in Bayesian information criterion (BIC) greater than 10.
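The comparison in this table can be retraced with a few lines of code. The sketch below recomputes the AIC and BIC differences (model without ACI score minus model with ACI score) for a handful of rows copied from the table, and flags those whose BIC difference exceeds the threshold of 10 cited in the note. The function and variable names are illustrative only, and the t and p columns are not reproduced because their exact calculation is not given here.

```python
# (label, AIC with ACI, BIC with ACI, AIC without ACI, BIC without ACI),
# copied from the robustness-check table; labels are shortened for illustration.
ROWS = [
    ("Sequence", 256489, 256622, 259065, 259190),
    ("Calculus I", 27507, 27590, 27757, 27836),
    ("Calculus II", 12774, 12841, 12780, 12843),
    ("Calculus III", 3057, 3109, 3055, 3104),
    ("STEM vs. non-STEM degree", 2728, 2830, 2755, 2851),
]

BIC_THRESHOLD = 10  # BIC difference cited in the table note


def compare(rows, threshold=BIC_THRESHOLD):
    """Return {label: (diff AIC, diff BIC, exceeds threshold)} for each row."""
    results = {}
    for label, aic_with, bic_with, aic_without, bic_without in rows:
        diff_aic = aic_without - aic_with  # positive values favor the model with ACI score
        diff_bic = bic_without - bic_with
        results[label] = (diff_aic, diff_bic, diff_bic > threshold)
    return results


results = compare(ROWS)
for label, (diff_aic, diff_bic, exceeds) in results.items():
    print(f"{label}: diff AIC = {diff_aic}, diff BIC = {diff_bic}, BIC diff > 10: {exceeds}")
```

Because the published entries are rounded, a recomputed difference can differ by 1 from the printed diff columns (for example, the Sequence row's AIC difference). Lower AIC and BIC indicate better fit, so a large positive difference means the model that includes ACI score is preferred.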
Authors’ Note
Geillan Aly is now the owner of Compassionate Math, LLC: Professional Consulting and Development in Math Education, 860-255-8783.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grants from the National Science Foundation (#1760491, 2300725). Opinions reflect those of the authors and do not necessarily reflect those of the granting agency.
Open Practices
1. We note that we aim to describe an existing conceptualization used in the literature and do not endorse it. For example, students may be placed into developmental algebra because of flawed placement procedures, delayed college entry, or low K–12 mathematics instructional quality, all of which can be independent of student ability or interest in algebra, math or STEM.
2. There are likely many knowledge types that could be listed separately in the cognitive impacts box and studied in other research; however, because of the focus of this study we have streamlined the figure to highlight algebraic conceptual understanding versus other types of knowledge. This is not intended to imply that conceptual understanding is unrelated to other knowledge types or that other knowledge types all belong in a single category.
3. “Elementary” algebra refers to a first algebra course in college (typically classified by colleges as “developmental” and offered not-for-credit) that assumes no pre-requisite knowledge of algebra, although students who enroll have almost always already successfully completed Algebra I in high school.
4. Some have used the term “school” algebra, but we prefer the term “classical” algebra for reasons of clarity and equity. “School” algebra is a misnomer outside the K–12 setting. Students in algebra classes in college are adults, who are developmentally different from eighth/ninth graders and have typically already successfully completed Algebra I in secondary school. They may sometimes be asked to engage with similar mathematical objects as Algebra I in secondary school, but in college are often expected to do so in ways qualitatively different from younger students. For a more in-depth discussion of this, see Wladis, Bjorkman, et al. (2023) and Wladis, Makowski, et al. (2023).
5. For the purposes of test security, numbers and answer option order have been altered.
6. This is slightly less than the total number of students who took the ACI during the validation study because IR data was provided before data collection in the validation study was complete, and thus students who took the ACI after IR data was obtained were not included in this analysis.
7. A total of 89.7% of the missing zipcode cases belonged to students with at least one foreign address on file, whereas only 3.4% of students had at least one foreign address on file. Therefore, students with missing zipcode data were significantly more likely to be foreign students.
8. We also considered potential moderating relationships by gender for all regression models reported here; however, interaction terms (for gender × ACI score) were not significant at the
for results of moderation analysis.
9. We use the term “predict” to indicate that the ACI was administered to students before course grades were assigned and before students completed their degree; this is not intended to imply that every assessment that influenced students’ final course grades (and that in turn impacted their degree completion) occurred after the ACI was administered. As noted in the method section, 90% of students took the ACI at least 1 month prior to final exam administration and none took it during the week of final exams; thus, the ACI was taken prior to final examinations and largely prior to various other coursework that determined final course grades. We note that we do not see any differences in overall predictive patterns if we only consider whether ACI scores predict course outcomes in a subsequent term compared to both current and subsequent terms (see, for example, Table A2 in the Appendix).
10. We were unable to include elementary algebra courses in this analysis due to small n, because very few elementary algebra courses had course grades beyond pass/fail designations during the study period.
11. As a robustness check, we also ran this analysis while only including students who were not currently enrolled in the course at the time of taking the ACI (see Table A2 in the Appendix); this substantially reduced sample size, reducing the magnitude of the significance for intermediate/college algebra, but produced even larger coefficients (suggesting a potentially stronger rather than weaker relationship between ACI score and course grade) than those reported in Table 5 and virtually identical results (coefficients, standard errors, and p values) to those for precalculus in Table 5. See Table A2 in the Appendix for details.
Authors
CLAIRE WLADIS is the Director of the CUNY Excellence Through Education Research Group, Professor of Mathematics at BMCC/CUNY and Urban Education at the CUNY Graduate Center. They currently lead several NSF-funded projects, including one to validate an algebra concept inventory and another to investigate factors that impact enrollment in or completion of math majors at two-year colleges.
BENJAMIN SENCINDIVER is an assistant professor at the University of Texas at San Antonio. His research interests include investigating college students’ graphical and covariational thinking and factors that prompt students to change how they think about mathematical ideas.
ALYSE C. HACHEY is Chair of the Teacher Education Department, Director of the Division of Bilingual Education, Early Childhood Education, Literacy and Sociocultural Studies, and Professor-Lead Early Childhood Faculty in the College of Education at The University of Texas at El Paso. Her teaching and research interests focus on early childhood cognition and curriculum development and post-secondary online learning and retention, particularly for college populations often designated at high-risk of dropout.
NILS MYSZKOWSKI is an associate professor at Pace University. His research is focused on the application and improvement of psychometric methods to measure and understand creative, aesthetic and interpersonal skills, especially applied in occupational contexts.
KATHLEEN OFFENHOLLEY is a professor at the Borough of Manhattan Community College at the City University of New York. Her research interests include gaming in mathematics education and students’ algebraic conceptual understanding; she is a former steering committee member of the CUNY Games Network.
JASON SAMUELS is a professor at the Borough of Manhattan Community College at the City University of New York. His research interests include investigating college students’ thinking in calculus and algebra and developing innovative approaches to teaching calculus with an emphasis on quantitative reasoning.
GEILLAN ALY was previously a project manager for the Excellence Through Education Research Group at the City University of New York. She is currently focusing on her role as the founder of Compassionate Math, LLC, where she works to prevent and heal math trauma.
