
Abstract

Mathematics fact fluency is essential for proficiency in advanced topics, such as algebra. However, many students in the United States, including those in elementary and secondary grades, experience mathematics difficulties (MD) and struggle to develop fluency with mathematics facts. We synthesized findings from 35 group-design studies, reporting 178 effect sizes (ESs), conducted between 1975 and June 2024, to evaluate the efficacy of fact fluency interventions and identify key malleable moderators of intervention outcomes. Results from a Robust Variance Estimation (RVE) model revealed an educationally meaningful average ES (g = 0.76), providing evidence of the overall efficacy of fact fluency interventions. However, the prediction interval (−0.60 to 2.12) indicated substantial heterogeneity in treatment effects, warranting further investigation. To explore this variability, we conducted a meta-regression analysis to examine the role of intervention dosage indicators (e.g., frequency) and alignment indicators (e.g., grade level) while accounting for study-level confounders (e.g., publication era). Significant moderators included two dosage indicators (i.e., grouping and total sessions) and two alignment indicators (i.e., operation focus and outcome measures). We discuss these results in relation to limitations, implications for future research, and classroom practice.

Keywords

mathematics difficulties, mathematics fact fluency

Mathematics proficiency is essential for academic success and long-term economic opportunity (Cogan et al., 2018; Wang et al., 2023). In response to growing demands for mathematically literate citizens, initiatives such as the Common Core State Standards for Mathematics (CCSS-M) and the National Council of Teachers of Mathematics (NCTM) have emphasized conceptual understanding alongside procedural fluency (Asempapa & Sturgill, 2017). However, despite these efforts, national assessments reveal persistent underperformance in mathematics across grade levels. For instance, only 41% of fourth graders and 34% of eighth graders scored proficient or above on the 2019 National Assessment of Educational Progress (NAEP), and scores declined further following the COVID-19 pandemic (National Center for Education Statistics [NCES], 2024). These trends are even more troubling for students with disabilities: just 17% of fourth graders and 9% of eighth graders with disabilities reached proficiency in 2019, with most scoring below the basic level. These longstanding disparities underscore the urgent need to strengthen mathematics instruction (Witzel et al., 2024), particularly for students who face persistent challenges in foundational skills.

A lack of mathematics fact fluency is frequently cited as a significant contributor to the difficulty students experience in developing mathematics proficiency (Burns et al., 2010; Codding et al., 2011). Mathematics fact fluency is the ability to quickly and accurately recall foundational calculations, such as 5 + 5 = 10 or 2 × 8 = 16 (Morano et al., 2020; Price et al., 2013). It is a well-established precursor to later mathematical proficiency (Burns et al., 2024; Witzel, 2016). Mathematics fact fluency enables students to apply their knowledge of facts to achieve computational fluency, facilitating accurate execution of operations with multi-digit numbers and procedural steps, such as regrouping (Cason et al., 2019; Geary, 2013). Students who cannot extend their proficiency in mathematics facts to computational fluency may struggle with the complex reasoning skills required as they progress through grade levels (Burns et al., 2010; Cason et al., 2019). Therefore, fostering fact fluency is critical for improving mathematical outcomes, particularly for students facing challenges, such as those with mathematics difficulties (MD).

To enhance the development and application of students’ mathematics fact fluency, it is essential to equip practitioners with evidence-based insights into not just which interventions are efficacious but also how to intensify them to meet student needs (Burns et al., 2010; Codding et al., 2022). We define interventions as structured instructional and assessment practices delivered in addition to a student’s general education instruction (Powell et al., 2022). However, standard interventions may not provide sufficient practice or support for students with MD, necessitating greater instructional intensity, especially through features such as increased dosage and closer alignment to students’ curricular and learning needs (Fuchs et al., 2017; Powell et al., 2022).

Students With Mathematics Difficulties

Students with MD are those who experience persistent challenges in mathematics, including students who are formally identified with a specific learning disability (SLD) or those at risk for one (Swanson et al., 2018). Students with a formal SLD diagnosis in mathematics often have Individualized Education Program (IEP) goals outlining specific mathematics goals and services. These students typically exhibit difficulty with various academic tasks, such as those related to mathematical calculations or mathematical reasoning (Individuals with Disabilities Education Act [IDEA], 2004).

Students at risk for SLD display similar challenges to those with an SLD in mathematics, including difficulty with mathematics fact fluency (Cirino et al., 2015; Swanson et al., 2013). These students do not have a school-identified SLD but exhibit persistent low performance in mathematics, which is typically identified using predetermined performance criteria on standardized mathematics screening measures, often set by percentile ranks (e.g., at or below the 25th percentile; G. Nelson & Powell, 2018). In addition to screening measures, students can also be classified based on below-average class performance, teacher recommendations, or their limited response to increasingly intensive support in mathematics (Clarke et al., 2020; Jitendra et al., 2018; Lembke et al., 2012). For this study, we use the term MD to refer to students with and without SLD who face challenges in mathematics, regardless of how they were identified. One key area where students with MD often require intensified support is the development of mathematics fact fluency, the ability to quickly and accurately recall basic arithmetic facts (Burns et al., 2010; Codding et al., 2011). Without proficiency in fact fluency, students struggle to engage in more complex mathematical reasoning, resulting in persistent challenges to their overall mathematics achievement (Price et al., 2013; Witzel, 2016).

Mathematics Fact Fluency

Mathematics fact fluency, often called fact fluency or whole-number combinations, refers to a student’s capacity to quickly and accurately retrieve 390 facts: 100 addition facts with single-digit addends, 100 subtraction facts with single-digit subtrahends, 100 multiplication facts with single-digit factors, and 90 division facts with single-digit divisors (Burns et al., 2012; Fuchs et al., 2006; McVancel et al., 2018). Mathematics fact fluency is foundational for achieving success in mathematics (Burns et al., 2010; Powell et al., 2025; Riccomini et al., 2017). This automatic retrieval of mathematical facts without relying on counting strategies or visual aids is critical, as it frees cognitive resources for more complex tasks (Fuchs et al., 2016; VanDerHeyden & Burns, 2005). Students with limited mathematics fact fluency often face challenges in understanding more complex mathematical concepts, which can lead to limited proficiency in rational number operations and algebra (Bailey et al., 2014; Jordan et al., 2017; Witzel, 2016). Throughout this analysis, we refer to mathematics fact fluency as fact fluency, in keeping with common usage of the term.
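The count of 390 facts can be reproduced with a short enumeration. The sketch below is illustrative only: it assumes that subtraction and division facts are the inverses of the single-digit addition and multiplication facts, and that division by zero is excluded. The cited sources may enumerate the facts differently, and the variable names are hypothetical.

```python
# Illustrative enumeration of single-digit facts (assumed scheme, not taken from the cited sources).
digits = range(10)

# Addition: both addends are single digits (0-9).
addition = [(a, b, a + b) for a in digits for b in digits]
# Subtraction: assumed to be the inverses of the addition facts, (a + b) - b = a.
subtraction = [(a + b, b, a) for a in digits for b in digits]
# Multiplication: both factors are single digits (0-9).
multiplication = [(a, b, a * b) for a in digits for b in digits]
# Division: assumed to be the inverses of multiplication, with divisor 0 excluded.
division = [(q * d, d, q) for q in digits for d in range(1, 10)]

counts = {
    "addition": len(addition),
    "subtraction": len(subtraction),
    "multiplication": len(multiplication),
    "division": len(division),
}
total = sum(counts.values())
print(counts, total)
```

Under these assumptions, the only asymmetry is the missing zero divisor, which accounts for 90 rather than 100 division facts.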

In contrast, computational fluency involves accurately executing addition, subtraction, multiplication, and division with multi-digit numbers (Geary, 2011). For students to develop computational fluency, they must possess proficiency in mathematical facts, a conceptual understanding of number relationships, and the ability to perform multistep mathematical procedures (Jordan et al., 2007; Witzel, 2016). For example, for a student to accurately add 25 + 35, they could begin in the ones place and add 5 + 5, relying on the mathematical fact that 5 + 5 = 10. Once they have the sum of 10, students could regroup the 10 ones into 1 ten in the tens place. Finally, students will add 2 + 3 + 1 to calculate the sum of the tens place.
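The regrouping steps in the 25 + 35 example can be traced with a small routine. This is a minimal sketch of the standard column-addition procedure, written only to make the role of fact retrieval at each step visible; the function name and the step format are hypothetical.

```python
def add_with_regrouping(x, y):
    """Add two non-negative integers column by column.

    Returns the sum and a list of steps; each step is
    (digit_of_x, digit_of_y, carried_in, column_sum).
    """
    steps = []
    carry = 0
    place = 1
    total = 0
    while x > 0 or y > 0 or carry > 0:
        dx, dy = x % 10, y % 10
        # Each column relies on a retrieved single-digit fact (plus any regrouped ten).
        column_sum = dx + dy + carry
        steps.append((dx, dy, carry, column_sum))
        total += (column_sum % 10) * place
        carry = column_sum // 10  # regroup 10 units into 1 unit of the next place
        x //= 10
        y //= 10
        place *= 10
    return total, steps


total, steps = add_with_regrouping(25, 35)
print(total)  # 60
for dx, dy, carried, column_sum in steps:
    print(f"{dx} + {dy} + {carried} carried = {column_sum}")
```

For 25 + 35, the ones column uses the fact 5 + 5 = 10, and the tens column then computes 2 + 3 + 1, matching the description above.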

Students With MD and Their Mathematics Fact Fluency

Students with MD face significant challenges developing fact fluency due to several interrelated factors. Students often struggle to retrieve mathematical facts from long-term memory, resulting in slower response times and an overreliance on inefficient strategies, such as finger counting (Geary, 2013). In addition, procedural errors, challenges in automaticity with mathematics facts, and cognitive load issues, such as the strain on working memory when managing multiple steps in a problem or the inability to efficiently retrieve information from long-term memory, further hinder their progress (Mabbott & Bisanz, 2008; Swanson et al., 2013). These students require additional practice to achieve fluency and often continue using backup strategies long after their peers have progressed to more advanced reasoning (Burns et al., 2015; Stickney et al., 2012). Moreover, limited motivation or mathematics anxiety may exacerbate these struggles (Ashcraft & Krause, 2007; Pollack et al., 2021).

Many students with MD also lack access to targeted, effective fact fluency interventions, further hindering their progress (Codding et al., 2011). To address these challenges, researchers have designed various interventions to support the mathematics fact fluency of students with MD, including incremental rehearsal, Cover-Copy-Compare (CCC), and computer-based programs (Abu-Hamour, 2019; Burns et al., 2012, 2019). A systematic evaluation of these interventions across grade levels (K–12) is needed to determine how malleable intervention indicators, such as dosage and alignment, promote instructional intensity and, in turn, impact the fluency outcomes of students with MD.

Literature Review: Summary of Previous Systematic Reviews

Several systematic reviews have examined strategies to support fact fluency development among students with MD, including five meta-analyses and four narrative reviews. The narrative reviews primarily focused on instructional approaches such as CCC (Joseph et al., 2012; Stocker & Kubina, 2016) and technology-mediated interventions (Cozad & Riccomini, 2016; Kiru et al., 2017). While these reviews provide valuable summaries of intervention strategies, they do not aggregate effect sizes (ESs) or systematically evaluate how malleable features of instructional design influence student outcomes.

In contrast, five meta-analyses have exclusively synthesized the effects of fact fluency interventions for students with MD (Burns et al., 2010, 2024; Codding et al., 2011; S. A. Kim et al., 2023; Kleinert et al., 2018). Four of these studies relied exclusively or primarily on single-case design (SCD) studies, which, while helpful in assessing individual responsiveness to intervention, limit generalizability due to small sample sizes, absence of random assignment, and variations in design features that complicate the estimation and comparison of ESs across studies (Kratochwill & Levin, 2010). Moreover, these meta-analyses often lacked formal quality appraisals and did not conduct systematic moderator analyses. Only S. A. Kim et al. (2023) included group-design studies, although they were pooled with SCDs, making it challenging to interpret effects across methodological approaches. To extend this literature, the present meta-analysis focuses exclusively on group-design studies, including randomized controlled trials (RCTs) and quasi-experimental designs (QEDs), incorporates a formal quality appraisal, and systematically examines how indicators of instructional intensity (i.e., dosage and alignment) moderate the effects of interventions. This approach aims to generate practical, evidence-based guidance for intensifying fluency instruction for students with MD.

Conceptual Foundations: Dosage and Alignment as Moderators

Although instructional intensity has gained increased attention in interventions for students with MD, the role of malleable features in shaping that intensity and moderating intervention effects remains underexplored (Myers et al., 2022). Fuchs et al. (2017) identified dosage and alignment as key dimensions of instructional intensity, emphasizing their importance in optimizing the efficacy of interventions. More recently, Myers et al. (2024) investigated how variations in these features influence word problem outcomes among students with MD, highlighting their potential as moderators. However, prior meta-analyses have not systematically analyzed these dimensions within fact fluency interventions, limiting our understanding of how instructional exposure (i.e., dosage) and focus (i.e., alignment) support fluency development for students with MD. Intervention intensity is increasingly seen as a multidimensional construct that extends beyond simple treatment-control comparisons (Fuchs et al., 2017; Vaughn et al., 2011). Identifying specific, adjustable indicators that can be modified to improve fluency outcomes is essential. Dosage and alignment seem to be key mechanisms through which instructional intensity is organized and varied (Fuchs et al., 2017; Myers et al., 2024).

Potential Impact of Dosage Indicators

Dosage refers to the amount of instructional exposure that students receive, which is shaped by several factors, including frequency, duration, total number of sessions, grouping, and setting (Myers et al., 2024). These elements influence how often and how long students engage in structured practice, scaffolding, and feedback. Group size and instructional setting often go hand-in-hand. Smaller groups or one-on-one instruction can provide more individualized support, which may enhance learning outcomes (Powell et al., 2022). However, some studies suggest that classwide interventions can be equally or even more effective, especially when structured practice and consistent scaffolding are embedded (Kleinert et al., 2018). The setting also shapes the intensity of instruction: general education classrooms may limit individualized attention, while intervention settings can support higher-dosage, tailored instruction (Powell et al., 2022). These factors indirectly affect dosage and student outcomes (Myers et al., 2024).

Session frequency, defined as the number of sessions per week, also plays a role (Myers et al., 2022). Higher-frequency interventions are often linked to better outcomes due to more consistent practice (Myers et al., 2022). Still, the optimal balance remains unclear; less frequent but longer sessions may also yield meaningful gains, depending on content and context. Instructional duration, typically measured in total hours, determines how much cumulative exposure students receive. Although longer durations may enhance learning through extended practice, research indicates diminishing returns beyond a certain threshold (Myers et al., 2022). Finding the optimal duration helps maximize engagement and instructional efficiency without unnecessary time costs. Finally, the total number of sessions contributes to overall exposure and retention. More sessions generally support stronger fluency development, but excessive repetition may not provide additional value once core skills are established (Kong et al., 2021; Powell et al., 2022). Evaluating the interaction between frequency, duration, and session count is crucial for understanding how dosage affects the efficacy of interventions (Myers et al., 2024).

Potential Impact of Alignment Indicators

Alignment refers to the extent to which an intervention’s instructional content and structure align with students’ learning needs (Fuchs et al., 2017). Myers et al. (2024) conceptualized alignment in two dimensions: content alignment and student alignment. Content alignment involves the operation and outcome measure focus. The operation focus, defined as whether the intervention targets additive operations (addition/subtraction), multiplicative operations (multiplication/division), or both, can shape efficacy, as students with MD often struggle more with certain operations (S. A. Kim et al., 2023; Myers et al., 2021). Interventions that focus on addition and subtraction may yield stronger effects, particularly at early grade levels, due to their alignment with elementary curricula (Geary, 2013). Outcome measure refers to how intervention success is assessed. Some interventions target fact retrieval (i.e., fact fluency); others aim for broader computational fluency or problem-solving. The alignment between instructional emphasis and assessment type may influence ES estimates (Myers et al., 2024). For example, tasks that require integrating fluency within multistep word problems may show weaker effects due to added cognitive demands beyond fact retrieval alone (Powell et al., 2022).

Student alignment captures whether the intervention is developmentally appropriate. Grade level serves as a proxy here, as younger students (Grades K–3) tend to respond more rapidly to fluency instruction due to greater cognitive flexibility (Bloom et al., 2008). Older students (Grades 4–12) may require more structured or intensive support, and the relative impact of fluency interventions may decrease as curricular demands shift toward higher-level reasoning (Myers et al., 2023). While each dosage and alignment indicator may influence fluency outcomes, prior studies have not systematically examined their unique contributions within a unified analytical framework. By estimating the independent effects of key intensity indicators, such as frequency, duration, operation focus, and grade level, this study provides a nuanced understanding of how specific features of intervention design relate to outcomes. These findings aim to inform the development of more precisely targeted fluency interventions for students with MD across diverse instructional settings.

Potential Study-Level Confounders

The moderating influence of dosage and alignment indicators must be interpreted in conjunction with other study-level characteristics that can shape intervention outcomes (Tipton et al., 2023). Hence, we controlled for 12 study-level confounders: publication era, ethnic composition, gender composition, MD identification method, interventionist, fidelity of implementation reporting, research design, assignment level, control condition, dependent measure type, funding status, and country.

A critical moderator to consider is publication era (Lein et al., 2020; Myers et al., 2022). The release of the NCTM Standards in 2000 prompted a significant shift in mathematics instruction away from rote fluency practice and toward conceptual understanding, reasoning, and problem-solving (Findell et al., 2001; Schoenfeld, 2004). Consequently, curricula developed in the subsequent period often reduced time devoted to systematic fluency building. This broader curricular context may have muted the observed impact of interventions targeting fact retrieval, as students’ baseline fluency was potentially lower. Later reforms, most notably the CCSS-M, reintroduced procedural fluency as a foundational expectation, although embedded within broader reasoning goals (Porter et al., 2011). Therefore, testing publication era as a moderator is essential to isolate the specific effect of the interventions from the historical changes in prevailing curricular emphasis that define the instructional backdrop of the studies.

Other study-level confounders reflect variation in participant characteristics, MD identification methods, implementation fidelity, and study design features. Ethnic and gender composition of samples may influence outcomes, as differences in access, opportunity, and motivation have been linked to mathematics performance (Cheema & Galluzzo, 2013; Fryer & Levitt, 2010). Furthermore, the MD identification approach may influence outcomes. Students with MD may be identified through different procedures, such as percentile cutoffs, standardized test scores, or multiple criteria, which yield samples that differ in baseline responsiveness to intervention (Dennis et al., 2016). Implementation fidelity and research design features may also significantly impact the results. Outcomes may differ depending on who or what delivers the intervention (researchers, teachers, or computer programs) and on whether fidelity is reported (Dennis et al., 2016; Myers et al., 2021, 2022). Methodological rigor also matters, as RCTs provide stronger evidence than QEDs, and the level of assignment (student versus classroom) has been shown to influence intervention effects (Myers et al., 2022; Xin & Jitendra, 1999). Similarly, the control condition may be consequential; comparisons against active treatments typically yield smaller effects than those against business-as-usual instruction (Kroesbergen & Van Luit, 2002).

Finally, dependent measure type, funding status, and country may represent additional study-level confounders. Prior meta-analyses of math outcomes have consistently shown that researcher-developed outcome measures produce larger effects than standardized assessments (Jitendra et al., 2020; Myers et al., 2021). Funding status may also influence study quality and reporting, as sponsored projects may differ in scope or carry risks of bias (Jefferson, 2020; Ou et al., 2024). The variable country controls for cross-national differences, as educational systems vary substantially in curriculum and teacher preparation, potentially shaping both the design and effectiveness of fluency interventions (Myers et al., 2022). Hence, by including these 12 study-level confounders, we aim to reduce omitted variable bias, leading to a more precise estimation of the unique contributions of intervention design to address fact fluency outcomes among students with MD.

Rationale

Prior syntheses have provided valuable insights into the effectiveness of mathematics fact fluency interventions for students with MD, primarily drawing on SCD studies and narrative reviews. These reviews have identified promising practices and suggested generally strong effects. At the same time, inconsistent application of quality standards and limited attention to instructional moderators have left important gaps in the evidence base. The present meta-analysis addresses these gaps by focusing exclusively on group-design studies to enable quantitative synthesis using consistent ES metrics, applying formal quality appraisal based on the Council for Exceptional Children (CEC) standards, and systematically examining how malleable instructional features, specifically dosage and alignment indicators, are related to fluency outcomes.

Furthermore, to ensure that the estimates for these primary moderators (dosage and alignment) are robust, we incorporated a set of study-level confounders into our models. Variables such as publication era, country, and dependent measure type help account for methodological, contextual, and sample-related differences that might otherwise bias the results. These study-level confounders are included solely as statistical controls, not as moderators for substantive interpretation. Our central focus is on dosage and alignment indicators, which provide the clearest insight into how often fluency instruction is delivered, in what form, and for which students (Myers et al., 2024).

Purpose and Research Questions

The purpose of this meta-analysis was to extend the existing literature by evaluating the overall efficacy of mathematics fact fluency interventions for students with MD in group-design studies. In addition to assessing study quality using the CEC quality indicators (QIs), this study systematically examined how malleable instructional features, specifically dosage and alignment, influence intervention effects. By identifying the independent contributions of these features, this meta-analysis aims to provide practitioners with evidence-based recommendations for intensifying fact fluency instruction to support mathematics fluency outcomes for students with MD. Two research questions guide our analysis:

Research Question 1 (RQ1): What is the overall efficacy of mathematics fact fluency interventions for students with MD in Grades K–12, as examined in group-design studies?

Research Question 2 (RQ2): How do malleable intervention indicators related to dosage (e.g., grouping, duration) and alignment (e.g., operation focus, grade level) moderate the efficacy of fact fluency interventions for students with MD, after adjusting for study-level confounders (e.g., country, publication era, and study design)?

Method

We conducted this meta-analysis following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al., 2021) guidelines and the systematic review process outlined by Pigott and Polanin (2020). Our search spanned peer-reviewed (i.e., journal articles) and non-peer-reviewed (i.e., dissertations, theses, reports) studies published from January 1975, marking the passage of Public Law 94-142, the first federal legislation mandating special education services in the United States, through June 2024. This start date marks the beginning of systematic research on interventions for students with academic difficulties, particularly in special education, which has laid the groundwork for much of the current research on MD more broadly. To ensure comprehensive coverage, we used a multimodal search approach that included electronic database searches, forward and backward citation searches of studies identified after completing the full-text screening of electronic records, and a review of references from existing reviews and meta-analyses. First, we conducted electronic database searches in Academic Search Complete, APA PsycInfo, Education Source, and ERIC using Boolean search strings targeting interventions for fact fluency and populations at risk for mathematics difficulties (see Figure 1). These searches yielded 3,159 records. After removing duplicates, we retained 2,696 records for initial title and abstract screening.

Figure 1.

Boolean Search Terms for Electronic Database Searches.

We also conducted forward citation searches of the 33 studies retained after our screening process of the electronic search results (described below) and backward citation searches of their reference lists. These searches yielded 14 additional records, 12 of which were duplicates already retrieved through the electronic search. The remaining two records were unique and advanced to screening, where both met inclusion criteria, bringing the total number of included studies to 35. Screening of references from existing reviews and meta-analyses yielded no additional eligible records. The complete flow of studies through the review process is presented in Figure 2 (i.e., the PRISMA diagram). We describe the screening, coding, and data extraction procedures in the following sections; however, because these stages required independent coder judgment, we first explain how interrater reliability (IRR) was established and calculated.

Figure 2.

Diagram Showing the Search and Retrieval Process.

IRR Calculation

Two graduate research assistants (GRAs) independently completed all phases of screening, study coding, and ES data extraction, resulting in double-coding across the entire review process. Both had prior experience with systematic reviews and received structured training on the coding manual and procedures from a senior member of the research team prior to data collection. IRR was calculated as the number of agreements divided by the total number of ratings (agreements plus disagreements), multiplied by 100. Discrepancies were resolved in consultation with senior team members until full consensus was reached.
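The percent-agreement formula described above can be written out as a short function. The sketch below is illustrative: the function name is hypothetical, and the two worked calls use the agreement counts reported for the title and abstract screening phase (2,696 records, six discrepancies) and the quality appraisal phase (811 agreements out of 840 ratings).

```python
def percent_agreement(agreements, disagreements):
    """Interrater reliability as agreements / (agreements + disagreements) * 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one rating is required")
    return agreements / total * 100


# Title and abstract screening: 2,696 records with six discrepancies.
screening_irr = percent_agreement(2696 - 6, 6)
# Quality appraisal: 811 agreements out of 840 ratings.
quality_irr = percent_agreement(811, 840 - 811)

print(f"{screening_irr:.1f}")  # 99.8
print(f"{quality_irr:.1f}")    # 96.5
```

Percent agreement does not adjust for chance agreement; it is shown here only because it is the index the study reports.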

Screening Procedures

We screened the 2,696 records retrieved via electronic searches using Rayyan, a web-based platform designed to support systematic reviews. Titles and abstracts were evaluated against the inclusion and exclusion criteria outlined in Table 1. Of these, 2,601 records were excluded, and 95 were retained for full-text review. The IRR for this phase was 99.8%, with six discrepancies.

Table 1.

Inclusion and Exclusion Criteria Used in Screening Studies.

Category	Inclusion criteria	Exclusion criteria
Study design	To ensure comparability of effect size calculations, we included only group designs such as Randomized Controlled Trials (RCTs) and Quasi-Experimental Designs (QEDs). This decision allowed us to generate internally consistent estimates and facilitate moderator analyses across studies using similar methodological frameworks.	To maintain methodological consistency in effect size estimation, we excluded studies using single-case design (SCD). Although SCD studies provide valuable evidence and achieve generalizability through replication, their effect size metrics and comparison structures differ substantially from those used in group designs, making direct combination problematic without specialized analytic approaches.
Population	Focused on students with mathematics difficulties (MD) in Grades K–12, as defined in the introduction.	The study did not include students with MD or did not report disaggregated data for this population.
Publication date	Studies were published between January 1975 and June 2024 to ensure they fall within the period following the legal codification of special education in the United States.	The study was published before January 1975 and therefore predates the enactment of U.S. special education law, which limits its relevance to the current educational context.
Language and publication type	The study was published in English, either peer-reviewed or non-peer-reviewed (e.g., dissertations, theses). English-only studies were included to minimize translation errors and ensure consistent interpretation. Studies could be conducted in the United States or any other country.	The study was excluded if it was not published in English, to avoid translation errors and ensure consistency in interpretation.
Intervention focus	The study evaluated the impact of a whole-number mathematics fluency intervention, defined as instruction or practice opportunities provided in addition to a student’s general education instruction.	The study did not include a whole-number fluency intervention.
Dependent measures	The study focused on fact fluency in addition, subtraction, multiplication, or division as the primary dependent measure. Studies that also reported outcomes for computational fluency or word-problem solving were included only if those outcomes were linked to an intervention component explicitly targeting fact fluency. This focus ensures the analysis centers on foundational skills critical for success in more complex mathematical tasks.	Studies were excluded if they reported outcomes unrelated to mathematics (e.g., literacy or social skills) or relied solely on subjective measures such as teacher observations or qualitative assessments, which do not align with the quantitative focus of this meta-analysis.
Intervention timing	The intervention was delivered during or after the school day and was not conducted on the same day as testing.	Studies were excluded if the pre-test, intervention, and post-test all occurred on the same day, as this timing does not allow for an accurate assessment of learning or retention.
Data reporting	The study provided sufficient quantitative data (e.g., means, standard deviations, sample sizes, t- or F-values) to calculate an effect size for the primary outcome.	The study lacked sufficient data to calculate an effect size.

Study Coding and Data Extraction

We used a three-step process to code the 35 included studies: (a) evaluating study quality, (b) extracting study information to identify potential moderators, and (c) extracting data for ES calculations. The coding protocol for the first two phases was developed through an iterative process. The first and second authors collaboratively designed the initial coding manual, which was reviewed and refined in consultation with an additional author until full agreement on the final protocol was reached. For each phase, the two authors who conducted the initial screening independently coded all studies twice.

Phase 1: Study Quality Evaluation

The first phase of the coding process involved conducting a quality appraisal of each study that met our inclusion criteria, following similar procedures to those used in previous systematic reviews on interventions for students with MD (e.g., Dennis et al., 2016). We assessed study quality using a checklist of 24 QIs developed by the CEC for evaluating group-design research, which encompassed multiple dimensions, such as implementation fidelity, study context, psychometric properties of dependent measures, internal validity, and data analysis techniques (Cook et al., 2015). The two screeners independently coded each study and assigned a score of 1 (Met), 0 (Not Met), or “Not Applicable” (NA) for each QI. The NA designation was applied in cases where a specific QI did not logically pertain to a given study (e.g., fidelity monitoring indicators for fully automated, computer-based interventions that did not involve a human instructor). Across the 24 indicators coded for each of the 35 studies (840 total ratings, including NA as a valid code), IRR was 96.5% (811/840) with 100% agreement after discussion.
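The inter-rater reliability figures reported in this section are simple percent agreement, that is, the number of matching codes divided by the number of coded fields. A minimal sketch of that calculation follows; the function name and the example ratings are hypothetical and are not drawn from the study data.

```python
def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same code, as a percentage."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same items.")
    if not coder_a:
        raise ValueError("At least one rating is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical ratings for one study on six indicators (1 = Met, 0 = Not Met, "NA").
coder_a = [1, 1, 0, "NA", 1, 0]
coder_b = [1, 1, 0, "NA", 0, 0]
print(round(percent_agreement(coder_a, coder_b), 1))  # 5 of 6 codes match
```

Because NA is treated as a valid code, two coders who both assign NA to an indicator count as agreeing. Percent agreement does not correct for chance agreement.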

Previous studies (e.g., Dennis et al., 2016; Myers et al., 2022) have often used summed quality scores as a moderator in meta-analyses, implicitly assuming each indicator contributes equally to the overall quality score. However, researchers have highlighted limitations to this approach, as it may not accurately represent the structure or relative importance of individual items (McNeish & Wolf, 2020). Recognizing that not all QIs contribute equally to study quality, we chose not to aggregate the scores. Although including each QI as a separate predictor in the meta-regression model would have been ideal for explaining heterogeneity in effects, this was not feasible due to the limited number of data points relative to the number of predictors (Tipton et al., 2023). Instead, we describe patterns across the QIs and provide a descriptive summary of our assessment. This structured evaluation approach allowed us to identify recurring strengths and weaknesses in the methodological rigor of the included studies, which informed our interpretation of the meta-analytic results.

Phase 2: Study Information Data Extraction

In the second phase, we extracted study-level information for potential moderators in the meta-analysis, including dosage indicators, alignment indicators, and potential confounders (see Table 2 for a full list and coding criteria). The coding protocol for this phase was developed iteratively: the first and second authors collaboratively drafted the initial coding manual, piloted it on a subset of studies, and refined it in consultation with an additional author until full agreement was reached. After the protocol was finalized, a senior researcher trained both coders so that all definitions and decision rules were clear and consistently applied before double-coding began.

Table 2.

Coding Criteria for Variables Representing Dosage Indicators, Alignment Indicators, and Study-Level Confounders.

Variable	Coding criteria
Dosage
Setting	General Education Classroom = Interventions delivered in regular classroom settings. Non-General Education = Interventions delivered in specialized settings (e.g., resource rooms, special education classes).
Grouping	One-on-one = Interventions implemented with individual students. Small Group = Interventions conducted with small instructional groups (e.g., two to eight students). Large Group = Interventions delivered in groups of more than eight students or intact classrooms.
Total sessions	< 10 sessions = Interventions with fewer than 10 sessions. 10–20 sessions = Interventions with 10 to 20 sessions. 21–29 sessions = Interventions with 21 to 29 sessions. 30 or more sessions = Interventions with 30 or more sessions.
Duration	Fewer than 10 hours = Interventions with less than 10 hours of total instructional time. 10–16 hours = Interventions with between 10 and 16 hours of total instructional time. Over 16 hours = Interventions with more than 16 hours of total instructional time.
Frequency	1 or 2 times weekly = Interventions provided one or two times per week. 3 or 4 times weekly = Interventions provided three or four times per week. Daily = Interventions provided five days per week.
Alignment
Grade level	Lower Elementary = Included students in Grades K–3. Upper Elementary and Secondary = Included students in Grades 4–12.
Operation focus	Additive = Measures including addition or subtraction tasks. Multiplicative = Measures including multiplication or division tasks. Both = Measures that incorporated both additive and multiplicative tasks.
Outcome measure	Fact fluency = Measures that assess students’ mathematics fact fluency (e.g., operations with single-digit whole numbers). Computational fluency = Measures that assess students’ mathematics computational fluency (e.g., operations with multi-digit whole numbers). Word-Problem solving = Measures that assess students’ proficiency in solving word problems involving single- or multi-digit numbers. Other = Measures that assess students’ understanding of concepts such as place value and number line tasks, which do not fit into the other categories but are essential for mathematical reasoning and fluency.
Study-level confounders
MD identification method	Percentile Rank = Students identified based on performance at or below a specific percentile (e.g., 25th and 35th). Multiple Measures = Students identified through a combination of two or more methods (e.g., standardized test scores, percentile rank, teacher reports, district/state LD criteria). Other = Studies using alternative identification criteria.
Ethnicity	> 50% Non-Minority = Studies in which more than 50% of students were ethnic non-minorities (e.g., Whites). > 50% Minority = Studies in which more than 50% of students were ethnic minorities (e.g., African Americans, Hispanics). Equal Distribution = Studies in which the proportion of ethnic minorities and non-minorities was approximately equal.
Gender	Majority Male = Studies in which more than 50% of participants were male. Equal Distribution = Studies in which the proportion of males and females was approximately equal. Data Not Reported = Studies for which gender information was not provided.
Interventionist	Computer = Interventions delivered via computer programs. Teacher = Interventions delivered by general or special education teachers. Researcher = Interventions delivered by researchers or graduate research assistants. Other = Interventions delivered by other individuals, such as volunteers.
Design	RCT = Used random assignment. QED = Used non-random assignment.
Assignment level	Classroom = Intact classrooms assigned to treatment conditions. Student = Individual students assigned to treatment conditions through random assignment or matched pairs.
Nature of control condition	Alternative Treatment = Control groups received a different form of instruction. Business As Usual (BAU) = Control groups received standard classroom instruction.
Country	USA = Studies conducted in the United States. Other = Studies conducted outside the United States.
Publication era	Pre-NCTM Era (Before 2000) = Studies conducted prior to widespread influence of NCTM standards. NCTM Era (2000–2009) = Studies conducted during the period shaped by NCTM standards but before release of CCSS. CCSS-M Era (2010–2024) = Studies conducted after implementation of CCSS-M (formally released in 2010; adoption and implementation varying across states in years that followed).
Implementation fidelity	Reported = Studies for which researchers assessed and reported implementation fidelity. Not Reported = Studies for which researchers did not include information about implementation fidelity.
Dependent measure type	Researcher-made = Measure was developed by researchers for the intervention. Standardized = Measure was a widely used, standardized (commercially available) assessment.
Funding status	Funded = Studies with external financial support through grants or other sources. Not Funded = Studies conducted without external funding.

Note. MD = mathematics difficulties; LD = learning disability; RCT = randomized controlled trial; QED = quasi-experimental design; NCTM = National Council of Teachers of Mathematics; CCSS-M = Common Core State Standards–Mathematics.

After training, the two coders independently extracted study identification information (authors, title, year of publication) as well as dosage indicators (setting, grouping, number of intervention sessions, frequency, and duration) and alignment indicators (operation focus, outcome measure focus, and grade level). In addition, they coded 12 potential study-level confounders representing participant characteristics (e.g., gender and ethnic distribution) and study design features (e.g., assignment level, country, and interventionist). Across the 20 variables coded for each of the 35 studies (700 total ratings), IRR was 93.9% (657/700), with 100% agreement after discussion.

Phase 3: ES Data Extraction

In the third phase, the same two trained coders extracted data necessary for calculating ESs for each eligible outcome. Using Excel spreadsheets, they recorded information such as the name of the dependent measure, the treatment comparison, and the nature of the dependent measure. Pre- and post-test statistics for treatment and control groups were extracted, including the number of participants, mean scores, and standard deviations. When these data were unavailable, the coders recorded other statistical information that could be used to calculate ESs (e.g., t-tests, F-tests, and unstandardized beta coefficients). We note that all studies reported means, standard deviations, and sample sizes at pre- and post-test, with the exception of one study (Kanive et al., 2014), which provided pre–post gain scores. This exception resulted in three fewer coded fields, yielding 1,065 total ratings. Across these coded fields for the 178 effects, we calculated IRR as 98.2% (1,046/1,065), with 19 discrepancies resolved through discussion to arrive at 100% agreement.

Meta-Analytic Procedures

In this section, we describe the meta-analytic procedures, including calculating ESs and their variances, estimating mean effects and heterogeneity, and testing moderators. We also outline the steps taken to evaluate potential publication bias.

ES Calculations

We used Hedges’ g as the primary ES metric, representing the bias-corrected standardized mean difference (SMD) between treatment and control groups. The ESs were first calculated as Cohen’s d and then transformed to Hedges’ g using a small-sample correction to account for bias (Pigott, 2012). ESs were computed as the difference in pre–post change scores between treatment and control groups, divided by the pooled standard deviation (Morris, 2008). For one study that reported only gain scores (Kanive et al., 2014), we used those directly. We used the escalc() function from the metafor R package (Viechtbauer, 2010) to calculate g and its variance.
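The calculation described above can be illustrated with a minimal sketch. It assumes the pooled pretest standard deviation as the standardizer, which is one of the options discussed by Morris (2008); this section does not state which pooled standard deviation was used, so that choice is an assumption. The function name and input values are hypothetical, and the sampling variance is omitted because it additionally requires the pre–post correlation.

```python
import math


def hedges_g_pre_post_control(n_t, n_c, pre_t, post_t, pre_c, post_c, sd_pre_t, sd_pre_c):
    """Bias-corrected standardized difference in pre-post change (treatment minus control)."""
    # Pooled pretest SD (assumption: pretest SDs serve as the standardizer).
    pooled_var = ((n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2) / (n_t + n_c - 2)
    sd_pooled = math.sqrt(pooled_var)
    # Difference in mean change between groups, in pooled-SD units (Cohen's d metric).
    d = ((post_t - pre_t) - (post_c - pre_c)) / sd_pooled
    # Small-sample correction factor that converts d to Hedges' g.
    correction = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    return correction * d


# Hypothetical study: 20 students per group, both groups start at a mean of 10 facts correct.
g = hedges_g_pre_post_control(
    n_t=20, n_c=20, pre_t=10.0, post_t=15.0, pre_c=10.0, post_c=12.0,
    sd_pre_t=4.0, sd_pre_c=4.0,
)
print(round(g, 3))
```

In this example the treatment group gains 5 points and the control group gains 2, so the uncorrected difference is 3/4 = 0.75 standard deviations, and the correction factor shrinks it slightly.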

To aid interpretation of our results, we drew on empirical benchmarks established for educational interventions. Hill et al. (2008) provide context for interpreting ESs in educational research, noting that effects of 0.25 and above represent educationally meaningful impacts in academic interventions and that effects in the 0.25 to 0.40 range are often considered practically significant. We also reference Cohen’s (1988) conventional benchmarks, where effects of 0.20, 0.50, and 0.80 are considered small, medium, and large, respectively. Because our meta-analysis focuses on math fact fluency, a specific, trainable skill that can show rapid improvement with targeted practice, time-based learning interpretations may not be appropriate or meaningful for this domain. Therefore, we interpret ESs based on these established statistical and educational frameworks rather than attempting to translate effects into time-based equivalents.

Meta-Analysis Technique

We used robust variance estimation (RVE) with a small-sample adjustment (Fisher et al., 2023) to estimate overall mean ESs and examine moderators while accounting for dependent ESs nested within studies. This approach mitigates inflated Type I error rates that can arise when studies contribute multiple outcomes (Hedges et al., 2010). The RVE also provides valid inference without requiring normally distributed estimates (Fisher et al., 2023). To assess robustness, we varied the assumed correlation between ESs (ρ) from 0.80 to 0.20, with no meaningful impact on the results. We report degrees of freedom (df) for all estimates, treating results with df ≥ 4 as reliable and interpreting those with df below this threshold with caution due to limited power (Vembye et al., 2023).
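To make the logic of RVE concrete, the sketch below computes a weighted mean ES and a cluster-robust standard error for an intercept-only model, using the correlated-effects weights described by Hedges et al. (2010). It is a simplified illustration and not the robumeta implementation: the between-study variance (tau2) is supplied rather than estimated, no small-sample adjustment is applied, and all names and data are hypothetical.

```python
import math
from collections import defaultdict


def rve_mean_effect(study_ids, effects, variances, tau2):
    """Weighted mean effect and cluster-robust SE for an intercept-only model."""
    # Group effect-size variances by study to get k_j and the mean variance per study.
    by_study = defaultdict(list)
    for sid, v in zip(study_ids, variances):
        by_study[sid].append(v)
    # Correlated-effects weights: every effect in study j gets 1 / (k_j * (mean_v_j + tau2)).
    weights = [
        1.0 / (len(by_study[sid]) * (sum(by_study[sid]) / len(by_study[sid]) + tau2))
        for sid in study_ids
    ]
    total_w = sum(weights)
    mean_effect = sum(w * t for w, t in zip(weights, effects)) / total_w
    # Sandwich variance: sum weighted residuals within each study, then square and add.
    study_sums = defaultdict(float)
    for sid, w, t in zip(study_ids, weights, effects):
        study_sums[sid] += w * (t - mean_effect)
    robust_var = sum(s ** 2 for s in study_sums.values()) / total_w ** 2
    return mean_effect, math.sqrt(robust_var)


# Hypothetical data: study A contributes two effects, study B contributes one.
mean_es, robust_se = rve_mean_effect(
    study_ids=["A", "A", "B"],
    effects=[0.2, 0.4, 0.8],
    variances=[0.04, 0.04, 0.04],
    tau2=0.06,
)
print(round(mean_es, 2), round(robust_se, 3))
```

Dividing each weight by the number of effects in its study keeps a study with many outcomes from dominating the average, and summing residuals within a study before squaring is what allows effects from the same study to be correlated.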

Mean ES Calculation and Heterogeneity Evaluation

We used an intercept-only model (without predictors) to estimate the mean ES, addressing RQ1. To assess heterogeneity, we calculated a 95% prediction interval, computed as the mean ES ± t × √(τ² + SE²) (Borenstein, 2023), which provides the range in which future effects are likely to fall. Consistent with recent methodological recommendations (Borenstein, 2023), we emphasize prediction intervals over traditional heterogeneity statistics (e.g., τ²), as they offer a more practical and policy-relevant interpretation of variability across contexts.
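The prediction interval formula can be sketched in a few lines. The critical t value is passed in as an argument because the Python standard library has no t-distribution quantile function; the function name and all input values are hypothetical and chosen for easy arithmetic, not taken from this meta-analysis.

```python
import math


def prediction_interval(mean_es, se, tau2, t_crit):
    """Return (lower, upper) bounds: mean_es +/- t_crit * sqrt(tau2 + se**2)."""
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return mean_es - half_width, mean_es + half_width


# Hypothetical inputs: mean ES = 0.50, SE = 0.10, tau^2 = 0.09, critical t = 2.0.
lower, upper = prediction_interval(mean_es=0.50, se=0.10, tau2=0.09, t_crit=2.0)
print(round(lower, 2), round(upper, 2))
```

Because τ² sits under the square root alongside SE², the prediction interval is always at least as wide as the corresponding confidence interval, and it stays wide even when the mean is estimated precisely.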

Moderator Analysis

To address RQ2, examining how dosage indicators, alignment indicators, and study-level confounders contribute to heterogeneity in effects, we incorporated these variables into a single meta-regression model using RVE. We used this forced entry approach to reduce the risk of inflated Type I error rates, which are more likely to arise in traditional methods, such as multiple single-variable meta-regressions and subgroup analyses, compared to approaches that model all moderators jointly (Pigott & Polanin, 2020; Tanner-Smith et al., 2016; Tipton et al., 2023). By including all predictors simultaneously, we estimated the effect of each moderator while controlling for the influence of the others. Because all predictors were categorical, we assigned each a reference group, with the estimated β coefficients representing differences between the reference group and other levels of the moderator, controlling for all other variables.
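Reference-group coding of a categorical moderator can be illustrated as follows. Each non-reference level receives its own 0/1 indicator, so its coefficient is read as a difference from the reference level. The helper name and the example codes are hypothetical; statistical software typically builds these indicators automatically.

```python
def dummy_code(values, reference):
    """Return {level: [0/1 indicators]} for every level except the reference."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError("Reference level must appear in the data.")
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }


# Hypothetical grouping codes for five effect sizes.
grouping = ["One-on-one", "Small", "Large", "One-on-one", "Small"]
indicators = dummy_code(grouping, reference="One-on-one")
for level, column in indicators.items():
    print(level, column)
```

Effects coded at the reference level have zeros on every indicator, so in a model with several categorical moderators the intercept corresponds to a study at the reference level of all of them at once.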

We conducted subgroup analyses to complement the meta-regression findings by partitioning the data according to dosage and alignment indicator levels, thereby estimating the mean effects for each category. We used intercept-only models to obtain the mean effect specific to each dosage and alignment moderator, allowing for a more direct interpretation of their impact. This approach clarifies how variations in dosage and alignment indicators influence intervention outcomes. However, examining each moderator independently introduces statistical multiplicity, which increases the risk of a Type I error when conducting multiple tests (Li et al., 2017). Therefore, these estimates should be interpreted as supplementary and exploratory in nature. We performed all analyses using R version 4.4.0 (R Core Team, 2024) with the robumeta package (Fisher et al., 2023).

Publication Bias

We assessed publication bias using a modified Egger’s test (Rodgers & Pustejovsky, 2021), the Trim-and-Fill Method (Duval & Tweedie, 2000), and PET-PEESE (Stanley, 2017). Publication bias occurs when studies with significant results are more likely to be published, potentially skewing the evidence base (Pigott & Polanin, 2020). The modified Egger’s test, incorporating RVE, showed no significant asymmetry (alt = 1.05, p > .05), indicating no evidence of publication bias. The Trim-and-Fill Method revealed no imputed studies, further supporting the absence of publication bias. The PET-PEESE method showed some evidence of potential bias, identifying a significant positive relationship between ES and standard error (SE = 5.58, p < .05), suggesting small-study effects. However, this result diverged from other methods, such as the modified Egger’s test and the Trim-and-Fill Method, which identified no evidence of publication bias. In addition, PET-PEESE may be unreliable due to its susceptibility to data heterogeneity (Stanley, 2017). Based on the overall consistency of findings across multiple methods and our inclusion of gray literature, we concluded that publication bias was unlikely to have occurred in our study.
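For readers unfamiliar with PET-PEESE, the sketch below shows its core regressions in their basic form: PET regresses ESs on their standard errors, and PEESE regresses them on their sampling variances, both with inverse-variance weights. This illustrates the general technique only. It ignores dependence among ESs from the same study, returns point estimates without significance tests, and uses hypothetical names and data.

```python
def weighted_line_fit(x, y, weights):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pet_peese(effects, std_errors):
    """Return ((PET intercept, slope), (PEESE intercept, slope))."""
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pet = weighted_line_fit(std_errors, effects, weights)  # predictor: SE
    peese = weighted_line_fit([se ** 2 for se in std_errors], effects, weights)  # predictor: SE^2
    return pet, peese


# Hypothetical effects that grow with the standard error (a small-study pattern).
(pet_b0, pet_b1), (peese_b0, peese_b1) = pet_peese(
    effects=[0.5, 0.7, 0.9, 1.1], std_errors=[0.1, 0.2, 0.3, 0.4]
)
print(round(pet_b0, 2), round(pet_b1, 2))
```

A positive slope indicates that less precise studies report larger effects, and the intercept estimates the effect expected from a hypothetical study with a standard error of zero.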

Data Screening and Preparation

Before performing our analyses, we screened the data for outliers and missingness. Statistical tests confirmed the presence of extreme values, particularly at the upper end of the distribution (range = 6.24; min = −0.81; max = 5.43). Skewness was 2.32, and kurtosis was 6.98, exceeding the commonly recommended threshold of ±2 (H. Y. Kim, 2013). However, given that mathematics fluency measures often exhibit high variability due to differences in skill acquisition, large effects are expected, especially among students with MD (Kleinert et al., 2018). Other methods for addressing outliers, specifically winsorizing, were not appropriate, as setting the upper limit at g = 1.66 would have excluded educationally meaningful ESs (e.g., g = 1.75), which are considered typical in intervention studies involving students with MD (Hill et al., 2008). Therefore, we retained all effects in the final dataset. There were no missing data.
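The distributional checks described above can be reproduced with moment-based formulas, as in the sketch below. This section does not state which estimator was used, so the simple population-moment versions and the excess-kurtosis convention (normal distribution = 0) are assumptions. The function name and data are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both are 0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical effect sizes with one large positive value in the upper tail.
effects = [-0.2, 0.1, 0.3, 0.4, 0.5, 0.6, 0.8, 3.5]
skew, kurt = skewness_and_excess_kurtosis(effects)
print(round(skew, 2), round(kurt, 2))
```

A single extreme value in the upper tail is enough to push both statistics well above zero, which is the pattern described for the ES distribution here.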

Results

We included 35 studies published between 1990 and 2023, representing a comprehensive set of fact fluency interventions spanning more than three decades. Of these, 19 studies targeted upper elementary to secondary school (Grades 4 to 8), while the remaining 16 examined younger students in lower elementary grades (Grades K to 3). The sample sizes varied substantially: treatment groups ranged from 8 to 416 participants, control groups ranged from 7 to 259 participants, and the total sample sizes ranged from 15 to 675. Table 3 summarizes key study characteristics, organized by the primary moderators of interest: dosage and alignment indicators. Descriptive information for additional study-level variables used as statistical controls to reduce potential confounding is reported in Supplemental Table S3. The complete reference list for included studies is also provided in the Supplemental Materials.

Table 3.

Descriptive Summary of the 35 Studies Included in Analysis of Dosage and Alignment Indicators.

Study	Dosage indicators (n = 5)					Alignment Indicators (n = 3)
Study	Setting	Grouping	Duration (in hours)	Frequency (# times/wk)	Number of sessions	Grade level	Operation focus	Outcome measures
Abu-Hamour (2019)	Non-Gen. Ed	Whole	>16	3–4 X	≥ 30	Grades 4 +	Multi.	FF
Agaliotis & Teli (2016)	Non-Gen. Ed	Small	< 10	1–2 X	10–20	Grades 4 +	Multi.	FF & CF
Baroody et al. (2013)	Non-Gen. Ed	1-on-1	10–16	1–2 X	10–20	Grade K–3	Add.	FF & Other
Bryant et al. (2011)	Non-Gen. Ed	Small	>16	Daily	≥ 30	Grade K–3	Add.	FF
Burns et al. (2012)	Gen. Ed	1-on-1	< 10	3–4 X	≥ 30	Grades 4 +	Both	FF & CF
Burns et al. (2019)	Non-Gen. Ed	1-on-1	< 10	3–4 X	< 10	Grades 4 +	Multi.	FF & Other
Christensen & Gerber (1990)	Non-Gen. Ed	1-on-1	< 10	Daily	10–20	Grades 4 +	Add.	FF
Claussen & Thaut (1997)	Non-Gen. Ed	1-on-1	< 10	1–2 X	< 10	Grades 4 +	Multi.	FF & CF
Codding et al. (2022)	Non-Gen. Ed	Small	10–16	1–2 X	21–29	Grades 4 +	Both	FF & Other
Fuchs et al. (2006)	Gen. Ed	1-on-1	< 10	3–4 X	≥ 30	Grade K–3	Add.	FF
Fuchs et al. (2008)	Non-Gen. Ed	1-on-1	10–16	3–4 X	≥ 30	Grade K–3	Add.	FF & CF
Fuchs et al. (2009)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Add.	FF & Other
Fuchs et al. (2010)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Add.	FF
Fuchs et al. (2013)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Add.	FF & CF
Fuchs et al. (2021)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Add.	FF & Other
Holmes & Dowker (2013)	Non-Gen. Ed	1-on-1	< 10	1–2 X	≥ 30	Grade K–3	Add.	FF
Kanive et al. (2014)	Non-Gen. Ed	1-on-1	< 10	1–2 X	< 10	Grades 4 +	Multi.	FF & CF
Koponen et al. (2018)	Non-Gen. Ed	Small	>16	1–2 X	21–29	Grades 4 +	Add.	FF & Other
Kroesbergen & van Luit (2002)	Non-Gen. Ed	Small	10–16	1–2 X	≥ 30	Grades 4 +	Multi.	FF
McTiernan et al. (2016)	Non-Gen. Ed	Small	10–16	3–4 X	21–29	Grades 4 +	Multi.	FF & CF
Menesses & Gresham (2009)	Non-Gen. Ed	1-on-1	< 10	3–4 X	10–20	Grades 4 +	Both	FF & Other
P. M. Nelson et al. (2013)	Non-Gen. Ed	Small	< 10	Daily	< 10	Grades 4 +	Multi.	FF
Okolo (1992)	Non-Gen. Ed	Small	< 10	1–2 X	< 10	Grades 4 +	Both	FF & CF
Omizo et al. (2006)	Gen. Ed	Whole	< 10	Daily	< 10	Grade K–3	Add.	FF & Other
Pixner et al. (2023)	Non-Gen. Ed	1-on-1	< 10	1–2 X	< 10	Grades 4 +	Multi.	FF
Powell et al. (2009)	Non-Gen. Ed	1-on-1	10–16	3–4 X	≥ 30	Grade K–3	Add.	FF & CF
Powell et al. (2023)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Add.	FF & Other
Re et al. (2020)	Non-Gen. Ed	1-on-1	>16	3–4 X	≥ 30	Grade K–3	Multi.	FF
Reed et al. (2015)	Gen. Ed	Whole	< 10	Daily	10–20	Grades 4 +	Multi.	FF & CF
Salminen et al. (2015)	Non-Gen. Ed	1-on-1	< 10	Daily	10–20	Grade K–3	Add.	FF & Other
Sarrell (2014)	Non-Gen. Ed	1-on-1	< 10	3–4 X	21–29	Grades 4 +	Both	FF
Tournaki (2003)	Non-Gen. Ed	1-on-1	< 10	Daily	< 10	Grade K–3	Add.	FF & CF
van Galen & Reitsma (2010)	Gen. Ed	Whole	< 10	3–4 X	10–20	Grade K–3	Add.	FF & Other
Van Luit & Naglieri (1999)	Non-Gen. Ed	Small	>16	3–4 X	≥ 30	Grades 4 +	Multi.	FF
Woodward (2006)	Gen. Ed	Whole	< 10	Daily	10–20	Grades 4 +	Multi.	FF & CF

Note. Gen. Ed = general education; Non-Gen. Ed = non-general education; Add. = Additive; Multi. = Multiplicative; Both = Additive and Multiplicative; FF = Fact Fluency; CF = Computational Fluency; 1–2 X = once or twice per week; 3–4 X = three to four times per week; Daily = 5 times per week; Whole = whole group; Small = small group; Grades 4 + = upper elementary to secondary grades; Grade K–3 = lower elementary.

Study Quality Summary

Table 4 presents the results of the quality appraisal across the 35 included studies, based on the 24 quality indicators (QIs) outlined by the CEC (Cook et al., 2015). Overall, the studies demonstrated strong methodological rigor, with all meeting core indicators related to the definition of disability or risk status (QI-3), specification of intervention procedures (QI-6), appropriate statistical analyses (QI-23), and reporting of ESs (QI-24). Indicators related to study design, such as group assignment procedures (QI-14) and descriptions of the comparison group (QI-12), were also consistently met. Although participant demographic characteristics (QI-2) were reported in all studies, the level of detail varied considerably. While gender was consistently documented, 16 studies did not provide information on race or ethnicity. This incomplete reporting limits the extent to which findings can be generalized to diverse populations and constrains the ability to examine demographic moderators of intervention effects, a crucial consideration given the known sources of heterogeneity in educational outcomes (Cheema & Galluzzo, 2013).

Table 4.

Adherence to Council for Exceptional Children Group Design Quality Indicators Across the 35 Included Studies.

Quality indicator	Abbreviated description	Studies meeting criteria (#)	Studies meeting criteria (%)
QI-1	Contextual features described	29	83
QI-2	Participant demographics described	35	100
QI-3	Disability/risk status defined	35	100
QI-4	Intervention agent role described	24	69
QI-5*	Interventionist training described	19	66
QI-6	Intervention procedures described	35	100
QI-7	Instructional materials described	35	100
QI-8	Fidelity—adherence assessed	23	66
QI-9	Fidelity—dosage assessed	21	60
QI-10	Fidelity—timing/coverage assessed	22	63
QI-11	Independent variable controlled	33	94
QI-12	Comparison condition described	35	100
QI-13	Comparison group access restricted	34	97
QI-14	Group assignment described/controlled	35	100
QI-15	Low overall attrition	35	100
QI-16	Low/control of differential attrition	35	100
QI-17	Socially important outcomes	35	100
QI-18	DV(s) defined and described	35	100
QI-19	All outcomes reported	35	100
QI-20	Frequency/timing of measures appropriate	35	100
QI-21	Evidence of score reliability	20	57
QI-22	Evidence of validity	34	97
QI-23	Appropriate statistical analysis	35	100
QI-24	Effect sizes reported or calculable	35	100

Note. N = 35. This table summarizes the number and percentage of studies meeting each quality indicator (QI) criterion, based on the quality appraisal tool described in Cook et al. (2015). Supplemental Table S1 reports study-level coding for each indicator. Full descriptions of the QIs are available in the cited source. * QI-5 (Interventionist training) was not applicable to six studies involving computer-implemented interventions. These studies were excluded from the denominator for this indicator, leaving 29 eligible studies.

Indicators related to implementation fidelity were met less frequently. Fidelity of dosage (QI-9) and timing or coverage (QI-10) were reported in only 60% and 63% of studies, respectively, while adherence-related fidelity data (QI-8) appeared in just 66%. In addition, only 57% of studies provided evidence of score reliability (QI-21), highlighting persistent gaps in measurement rigor. Detailed, study-level quality scores are presented in Supplemental Table S1, which offers a comprehensive breakdown of each study’s adherence to QIs. Despite these limitations, the overall high quality of the included studies supports the validity of the meta-analytic findings.

Overall Mean Effect of Fact Fluency Interventions

The intercept-only model using RVE indicated a positive and significant average treatment effect (g = 0.76, 95% CI = [0.46, 1.06], p < .001) across 35 studies contributing 178 ESs. However, heterogeneity was substantial. The 95% prediction interval was wide (PI = −0.60 to 2.12), indicating that the true effect of a new, similar study could plausibly range from moderately negative to highly positive (Borenstein, 2023). This considerable spread suggests that intervention outcomes are meaningfully influenced by malleable factors, such as dosage and implementation alignment, as well as study-level characteristics (e.g., student demographics, research design). These findings underscore the critical importance of investigating these potential moderators via meta-regression analysis.

Meta-Regression Analysis

Tables 5 and 6 present complementary perspectives on the moderator analyses. Table 5 provides descriptive mean ESs for each level of the malleable factors, with significant indicators highlighted for clarity. These estimates are exploratory and should be interpreted cautiously, as they do not account for overlap among moderators and carry an increased risk of Type I error from multiple comparisons. Table 6 reports the results of the meta-regression model, which simultaneously tested malleable intervention factors (i.e., dosage and alignment indicators) alongside study-level confounders. In this model, two dosage indicators (grouping and total sessions) and two alignment indicators (operation focus and outcome measure) emerged as significant moderators. However, their effects were potentially confounded by three study-level variables (MD identification method, ethnicity, and publication era), suggesting that contextual features may have influenced the observed relationships. For the full set of results, including all moderators (both malleable and study-level confounders), see Tables S2A and S2B in the Supplemental Materials.

Table 5.

Descriptive Statistics: Mean Effect of Categories Within Each Dosage and Alignment Indicator.

Moderator	Level	n	k	g	SE	df	p	CI	95% PI
Dosage indicators
Setting	General education classroom	6	23	0.57	0.35	4.9	.166	[−0.34, 1.49]	[−1.40, 2.54]
	Non-general education classroom	29	155	0.80	0.17	27.3	<.001*	[0.46, 1.14]	[−0.57, 2.17]
Grouping	One-on-one	21	103	0.72	0.16	20.4	<.001*	[0.40, 1.04]	[−0.62, 2.06]
	Small	9	53	0.54	0.19	8.8	.020*	[0.11, 0.97]	[−0.48, 1.56]
	Large	5	22	1.35	0.82	4.0	.176	[−0.93, 3.64]	[−7.21, 9.91]
Duration	Fewer than 10 hours	19	56	0.81	0.23	17.6	.002*	[0.33, 1.28]	[−0.91, 2.53]
	10–16 hours	6	54	0.37	0.05	4.8	.001*	[0.24, 0.49]	[−0.38, 1.12]
	Over 16 hours	10	68	0.97	0.33	8.9	.017*	[0.22, 1.72]	[−1.03, 2.97]
Frequency	Daily	8	28	0.71	0.36	7.0	.090	[−0.14, 1.57]	[−2.27, 3.69]
	Once or twice per week	10	62	0.58	0.23	8.5	.033*	[0.06, 1.10]	[−0.60, 1.76]
	3–4 times per week	17	88	0.88	0.22	15.8	.001*	[0.42, 1.34]	[−0.76, 2.52]
Total sessions	< 10 sessions	8	22	1.52	0.44	6.9	.011*	[0.48, 2.57]	[−0.91, 3.95]
	10–20 sessions	8	44	0.32	0.28	6.9	.291	[−0.34, 0.98]	[−1.68, 2.32]
	21–29 sessions	4	20	0.43	0.13	2.8^a	.051	[0.00, 0.86]	[NA, NA]
	30 or more sessions	15	92	0.76	0.21	13.8	.003*	[0.31, 1.22]	[−0.77, 2.29]
Alignment Indicators
Grade level	Lower elementary (Grades K–3)	16	105	0.59	0.14	14.7	.001*	[0.29, 0.89]	[−0.85, 2.03]
	Upper elementary & secondary (Grades 4+)	19	73	0.94	0.26	17.6	.002*	[0.39, 1.49]	[−0.83, 2.71]
Operation focus	Additive	17	115	0.49	0.15	15.7	.005*	[0.17, 0.81]	[−0.89, 1.87]
	Multiplicative	13	49	1.27	0.37	11.9	.005*	[0.47, 2.07]	[−1.38, 3.92]
	Both	5	14	0.69	0.26	4.0	.059	[−0.04, 1.41]	[−0.72, 2.10]
Outcome measure^b	Fact fluency	29	92	0.89	0.18	27.3	<.001*	[0.52, 1.26]	[−0.55, 2.33]
	Computational fluency	14	35	0.70	0.19	13.0	.002*	[0.30, 1.10]	[−0.94, 2.34]
	Word-problem solving	9	29	0.25	0.06	5.0	.011*	[0.09, 0.41]	[0.11, 0.39]
	Other	7	22	0.32	0.10	5.8	.021*	[0.07, 0.56]	[−0.35, 0.99]

Note. n = number of studies; k = number of effect sizes; CI = 95% confidence interval; PI = 95% prediction interval; [NA, NA] = Too few degrees of freedom to obtain prediction intervals. Bolded moderators were statistically significant predictors of effect size in the full meta-regression model (see main text). While subgroup differences are illustrated here, the bolded values indicate significance in the simultaneous meta-regression, which controls for other moderators. All estimates should be interpreted cautiously due to increased risks of Type I errors.

^a Estimate is not reliable (df < 4). ^b Total number of studies for the outcome measures indicator will not sum to 35 because the categories are not mutually exclusive; some studies reported multiple types of outcome measures and were therefore coded in more than one category.

* Significant at p < .05.

Table 6.

Meta-Regression Results: Dosage and Alignment Indicators Only.

Moderator and levels	β	SE	T	df	p	CI
Intercept	3.28	2.21	1.48	6.2	.187	[−2.09, 8.66]
Dosage indicators
Setting (Ref. Gen. Ed. classroom)
Non-General Education	0.52	0.77	0.68	6.0	.522	[−1.36, 2.41]
Grouping (Ref. One-on-One)
Small	−0.89	0.31	−2.88	4.8	.036*	[−1.69, −0.09]
Large	3.02	0.98	3.06	5.9	.023*	[0.60, 5.43]
Duration (Ref. < 10 hours)
10–16 hours	0.41	0.80	0.52	5.2	.627	[−1.62, 2.45]
Over 16 hours	−0.66	0.81	−0.81	6.2	.450	[−2.63, 1.32]
Frequency (Ref. Daily)
Once or twice per week	−0.05	0.50	−0.10	6.2	.922	[−1.27, 1.17]
3–4 times per week	−0.58	0.39	−1.48	5.6	.192	[−1.56, 0.40]
Total sessions (Ref. < 10 sessions)
10–20 sessions	−0.86	0.47	−1.82	6.3	.117	[−2.01, 0.29]
21 to 29 sessions	0.41	1.22	0.33	5.8	.751	[−2.60, 3.42]
30+ sessions	1.83	0.71	2.58	5.4	.046*	[0.04, 3.62]
Alignment Indicators
Grade level (Ref. Lower elementary)
Upper elementary and secondary	−0.22	0.67	−0.32	6.1	.758	[−1.85, 1.42]
Operation focus (Ref. Both)
Additive	−1.90	0.64	−2.95	6.6	.023*	[−3.44, −0.36]
Multiplicative	−0.59	0.58	−1.02	6.0	.349	[−2.02, 0.83]
Outcome measure (Ref. Computational fluency)
Fact fluency	−0.12	0.14	−0.89	10.3	.394	[−0.43, 0.19]
Word-problem solving	−0.66	0.23	−2.90	10.9	.015*	[−1.16, −0.16]
Other	−0.34	0.37	−0.90	4.7	.413	[−1.32, 0.65]

Note. N = 35 studies. Number of effect sizes (k) = 178. Bolded = significant moderator; β = estimated difference between reference group and other levels of the moderator, controlling for effects of other moderators in the model; T = T statistic; CI = 95% confidence interval; Ref. = reference category.

*Significant at p < .05.

Intervention Dosage

Two of the five dosage indicators (grouping and total sessions) were significantly associated with ESs, while the remaining three (frequency, setting, and duration) were not. For grouping, we classified studies based on the size of the instructional groups used in the intervention: one-on-one, small-group, and large-group. Results revealed that interventions delivered in small groups were associated with smaller ESs (β = −0.89, p = .036) than one-on-one instruction, the reference group. In contrast, interventions delivered to large groups were associated with larger ESs than one-on-one instruction (β = 3.02, p = .023).
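
Because both grouping coefficients are expressed relative to the one-on-one reference category, the model-implied difference between any two grouping formats is the difference of their coefficients. The sketch below illustrates this with the Table 6 values; the dictionary and function names are hypothetical, and only point estimates are computed, since a standard error for the small-versus-large contrast would require the coefficient covariance, which is not reported here.

```python
# Coefficients from Table 6, each relative to the one-on-one reference group.
beta = {"one-on-one": 0.0, "small": -0.89, "large": 3.02}


def contrast(level_a, level_b):
    """Model-implied ES difference (level_a minus level_b), other moderators held fixed."""
    return beta[level_a] - beta[level_b]


print(round(contrast("small", "one-on-one"), 2))  # the reported small-group coefficient
print(round(contrast("large", "small"), 2))       # implied large-group vs. small-group gap
```

The second value (3.91) is a point estimate only and should not be read as a tested comparison.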

For total sessions, we compared the number of intervention sessions across four categories: fewer than 10 sessions, 10–20 sessions, 21–29 sessions, and 30 or more sessions. Results indicated that interventions with 30 or more sessions produced higher effects than those with fewer than 10 sessions (β = 1.83, p = .046), the reference category. Estimates for the other comparisons (10–20 sessions and 21–29 sessions) were not significant.

Intervention Alignment

Two alignment indicators, representing alignment with the curriculum (operation focus and outcome measure), influenced intervention effects. However, the student alignment indicator examined (grade level) was not a significant moderator. For operation focus, we grouped studies based on whether interventions targeted addition/subtraction (additive), multiplication/division (multiplicative), or both. Estimates revealed that the only significant difference in effect was for interventions targeting addition and subtraction (β = −1.90, p = .023), which were associated with smaller ESs compared to interventions addressing both additive and multiplicative operations, the reference category. To examine the impact of outcome measures, we grouped studies into four categories: fact fluency, computational fluency, word-problem solving (assessing multi- or single-digit operations embedded in word problems), and “other” (targeting additional skills such as number sense and numeracy). Results indicated that interventions measuring word-problem solving produced significantly smaller effects than those measuring computational fluency, the reference category (β = −0.66, p = .015). None of the remaining comparisons were significant.
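
As a reading aid for Table 6, the T column should be approximately the coefficient divided by its standard error. The following sketch recomputes that ratio for the five significant rows; the variable names are hypothetical, and the 0.05 tolerance is an arbitrary choice that allows for rounding in the published values.

```python
# (label, beta, SE, reported T) for the five significant rows of Table 6.
rows = [
    ("Grouping: small",        -0.89, 0.31, -2.88),
    ("Grouping: large",         3.02, 0.98,  3.06),
    ("Total sessions: 30+",     1.83, 0.71,  2.58),
    ("Operation: additive",    -1.90, 0.64, -2.95),
    ("Outcome: word problems", -0.66, 0.23, -2.90),
]

TOLERANCE = 0.05  # arbitrary allowance for two-decimal rounding

for label, b, se, t_reported in rows:
    t_ratio = b / se  # test statistic implied by the rounded beta and SE
    ok = abs(t_ratio - t_reported) < TOLERANCE
    print(f"{label:<24} beta/SE = {t_ratio:6.2f}  reported T = {t_reported:6.2f}  match: {ok}")
```

Small discrepancies between the ratios and the reported T values are expected because β and SE are rounded to two decimals.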

Discussion

Students’ fact fluency forms a foundation for success in more advanced mathematics, including algebra and fractions (Powell et al., 2025; Witzel, 2016). However, many students with MD find it challenging to develop fluency and need focused, intensive instruction (Burns et al., 2010; Codding et al., 2011). To strengthen instructional decision-making, it is essential to identify which fact fluency interventions are most effective for students with MD and to determine the instructional conditions, such as grouping, frequency, and alignment with students’ grade level and operation focus, that influence their success. This meta-analysis is the first to focus exclusively on group-design studies aimed at improving fact fluency among students with MD. We applied a rigorous methodological framework and statistical controls for 12 study-level confounders to investigate the moderating effects of five dosage and three alignment indicators. This design enabled us to identify conditions associated with stronger intervention effects and to provide evidence that supports more tailored and practical instruction for students with MD.

RQ1: Mean Effect Across Interventions

Our findings indicate that mathematics fact fluency interventions yield substantial and significant benefits for students with MD (g = 0.76). According to Lipsey et al. (2012), this estimate suggests that approximately 78% of students in the treatment group would score above the mean of students in the comparison group (as distinct from other ES interpretations such as distributional overlap), providing compelling evidence of educational impact. Our estimate exceeds conventional benchmarks for educational interventions (Hill et al., 2008) and represents a large effect according to Cohen’s (1988) standards. This magnitude of effect is particularly meaningful for students with MD, who often struggle with persistent difficulties in mathematics despite receiving typical classroom instruction. These findings underscore the practical value of targeted fluency-focused interventions for this population, demonstrating that focused intervention can produce substantial improvements in foundational mathematical skills that are critical for broader mathematical competence. Although our analysis focuses specifically on fact fluency, the average effect aligns with findings from broader mathematics intervention research. For example, Myers et al. (2022) reported an ES of g = 1.01 (g = 0.81 after outlier removal) for word-problem interventions among students with MD, and Stevens et al. (2018) reported comparable impacts across other mathematics domains. These parallels reinforce the robustness and generalizability of our findings. While substantial heterogeneity remains, the overall findings provide robust evidence of the benefits of fluency interventions, supported by the strong methodological quality of the included studies and the appropriate modeling of dependent effects.
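
The 78% figure can be reproduced with a short calculation. The sketch below assumes that scores in both groups are normally distributed with equal variance, in which case the share of treated students scoring above the comparison-group mean is the standard normal cumulative probability at g. The function name is hypothetical.

```python
from statistics import NormalDist


def share_above_comparison_mean(g):
    """Proportion of the treatment group expected to score above the comparison-group
    mean, assuming normal distributions with equal variance."""
    return NormalDist().cdf(g)


print(f"{share_above_comparison_mean(0.76):.1%}")  # prints 77.6%
```

Rounded to a whole percentage, this matches the approximately 78% described above.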

The wide prediction interval (−0.60 to 2.12) indicates substantial variation in effects across studies. This variability likely reflects differences in malleable instructional features, such as group size, frequency, and other dosage and alignment indicators, as well as broader contextual factors. To account for these sources of variability and reduce potential bias, we examined all 20 moderators in our meta-regression model, encompassing dosage indicators, alignment indicators, and study-level confounders. Although study-level confounders were not the central focus of our analysis, one notable finding was that publication era emerged as a significant moderator. Studies conducted prior to the release of the NCTM Standards yielded larger ESs than those from the NCTM era (2000–2009) or the CCSS-M era (2010–2024). However, this difference was significant only when comparing the pre-NCTM era to the CCSS-M era (see Supplemental Tables S2A and S2B). It is important to note that the number of pre-NCTM studies was small (n = 4) relative to the NCTM (n = 22) and CCSS-M (n = 9) eras, which limits the confidence in this finding. Nonetheless, the observed pattern is consistent with major shifts in mathematics education: the NCTM Standards (2000) emphasized conceptual understanding, reasoning, and problem-solving over rote fluency (Schoenfeld, 2004), a shift that often led to reduced time for systematic fluency practice. Later reforms, including the Common Core State Standards, reinstated fluency as a foundational goal but embedded it within broader reasoning objectives. Therefore, accounting for publication era in math intervention research may be necessary to disentangle intervention effects from historical changes in curricular emphasis (Lein et al., 2020; Myers et al., 2022). More research is needed to clarify how such large-scale policy shifts have shaped both opportunities for fluency instruction and the effectiveness of interventions over time.
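
To give a rough sense of what the prediction interval of −0.60 to 2.12 implies, the sketch below recovers its midpoint and half-width and then, under a normal approximation, the standard deviation of effects it would correspond to. Using a normal critical value instead of the t value from the actual model is an assumption made here for simplicity, so the implied standard deviation and the share of effects below zero are illustrative only, not results from the analysis.

```python
from statistics import NormalDist

pi_lower, pi_upper = -0.60, 2.12  # reported 95% prediction interval

midpoint = (pi_lower + pi_upper) / 2    # should recover the mean ES
half_width = (pi_upper - pi_lower) / 2

# Assumption: a normal critical value stands in for the model's t critical value.
z = NormalDist().inv_cdf(0.975)
implied_sd = half_width / z

# Share of effects expected to fall below zero under that approximation.
share_below_zero = NormalDist(mu=midpoint, sigma=implied_sd).cdf(0.0)

print(f"midpoint = {midpoint:.2f}, half-width = {half_width:.2f}")
print(f"implied SD ≈ {implied_sd:.2f}, share below zero ≈ {share_below_zero:.0%}")
```

The midpoint equals the reported mean effect (g = 0.76), and the approximation suggests that a non-trivial minority of comparable interventions could show null or negative effects, which is consistent with the substantial heterogeneity noted above.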

RQ2: Moderating Effect of Dosage and Alignment Indicators

To investigate sources of variation in intervention effects, we conducted a meta-regression focused on malleable instructional features, specifically, dosage and alignment indicators. We also included a comprehensive set of study-level covariates identified in prior research as potential confounders, which helped isolate the effects of the focal moderators and reduce bias due to omitted variables (Tipton et al., 2023). Our analysis revealed that grouping and number of sessions significantly moderated intervention effects among the dosage indicators, while operation focus and outcome measure were significant among the alignment indicators. In addition, three study-level covariates (MD identification method, student ethnicity, and publication era) were associated with variations in ESs. These findings suggest that both modifiable instructional features and broader study characteristics influence the efficacy of fact fluency interventions, with implications for research and practice.

Intervention Dosage

We assessed how intervention effects varied across five indicators related to dosage (Myers et al., 2024), including setting, grouping, duration, frequency, and total sessions. Our analysis revealed significant moderating effects for two: grouping and total sessions.

Grouping

Our analysis identified the grouping format as a significant moderator of intervention effects. Interventions delivered in large-group formats yielded larger effects than those delivered in one-on-one and small-group formats, with the latter producing the smallest effects. This pattern is consistent with findings from Myers et al. (2022), who reported similarly higher effects for large-group formats in a meta-analysis of word-problem interventions, and with work by Barrett and VanDerHeyden (2020), which demonstrates the effectiveness and cost efficiency of classwide math interventions. Although counterintuitive given the emphasis on individualized instruction for students with MD, these findings suggest that group size may interact with other features (e.g., peer dynamics, instructional structure) in ways that warrant further investigation. However, because only five studies in our sample used large-group formats, these results should be interpreted cautiously and replicated in future research.

Total Sessions

Among the three time-related dosage indicators we examined (duration, frequency, and total sessions), only the total number of intervention sessions significantly moderated ESs, although these findings require nuanced interpretation given the complexity revealed in our data. When controlling for other moderators in our meta-regression analysis, interventions implemented for 30 or more sessions produced significantly larger effects than those with fewer than 10 sessions (β = 1.83, p = .046). However, our descriptive statistics revealed a seemingly contradictory pattern, with interventions of fewer than 10 sessions showing the highest raw ES (g = 1.52). This apparent contradiction underscores important complexities in dosage research that align with findings from Codding et al. (2016), who directly examined intervention frequency while holding total dosage constant. Their study found that four-times-weekly sessions (resulting in fewer total sessions) outperformed less frequent but longer interventions, particularly for basic computation skills, supporting the use of distributed over massed practice for simple mathematical tasks.

These patterns suggest that the relationship between session count and effectiveness is not straightforward. Short-duration interventions may appear highly effective because students reached mastery criteria and the intervention was appropriately discontinued, rather than because brief interventions are inherently superior. In addition, the optimal dosage likely varies across students; some may benefit from sustained practice over multiple sessions, while others may require more intensive modeling before engaging in extended practice. Given these complexities, our findings on total session counts should be interpreted with considerable caution. Future research should directly manipulate the total number of sessions to develop more precise dosage recommendations for mathematics fluency interventions for students with MD. Until such evidence accumulates, our findings on total session counts should be regarded as tentative rather than as prescriptive guidance for practice.
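
A raw subgroup mean and a covariate-adjusted coefficient can point in opposite directions whenever session count is unevenly distributed across levels of another moderator. The toy example below uses entirely hypothetical effect sizes, invented for illustration and not drawn from the included studies, to show one generic way such a reversal can arise. It is not a claim about the mechanism operating in these data, and it uses simple stratification rather than the meta-regression model.

```python
from statistics import mean

# HYPOTHETICAL effect sizes, invented for illustration only.
# Each tuple: (session category, grouping format, effect size)
studies = [
    ("short", "large", 2.0), ("short", "large", 2.2), ("short", "large", 2.1),
    ("short", "one-on-one", 0.3),
    ("long", "large", 2.8),
    ("long", "one-on-one", 0.9), ("long", "one-on-one", 1.0), ("long", "one-on-one", 1.1),
]


def mean_es(sessions, grouping=None):
    """Mean ES for a session category, optionally restricted to one grouping format."""
    values = [es for s, fmt, es in studies
              if s == sessions and (grouping is None or fmt == grouping)]
    return mean(values)


# Unadjusted comparison: long minus short, ignoring grouping format.
raw_gap = mean_es("long") - mean_es("short")

# Stratified comparison: long minus short within each grouping format.
within_gaps = {fmt: mean_es("long", fmt) - mean_es("short", fmt)
               for fmt in ("large", "one-on-one")}

print(f"raw gap (long - short): {raw_gap:+.2f}")
for fmt, gap in within_gaps.items():
    print(f"gap within {fmt}: {gap:+.2f}")
```

In this invented data set the short interventions look better overall only because most of them were delivered in the format with the largest effects; within each format, the longer interventions do better.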

Intervention Alignment

We examined three alignment indicators categorized into student alignment (grade level) and content alignment (operation focus and outcome measure; Myers et al., 2024). Meta-regression results demonstrated that the content alignment indicators significantly influenced intervention effects, emphasizing the importance of aligning instructional content with the type of mathematical operation targeted and the outcome being assessed.

Operation Focus

Regarding the operation focus, estimates indicated that interventions addressing both additive (i.e., addition or subtraction) and multiplicative (i.e., multiplication or division) tasks produced significantly larger effects than those focused solely on additive operations. No significant difference emerged between interventions targeting only multiplicative operations and those that addressed both. These findings are consistent with prior work (S. A. Kim et al., 2023) and suggest that interventions integrating multiple operation types may promote broader fluency development. However, due to the small number of studies addressing both additive and multiplicative operations (n = 5, k = 14), these results should be interpreted with caution. More research is needed to clarify whether combining operation types consistently enhances the effectiveness of mathematics fluency interventions for students with MD.

Outcome Measure

For the type of outcome, results showed that measures evaluating students’ word-problem-solving performance yielded significantly smaller effects than those assessing computational fluency. This pattern likely reflects the increased complexity and linguistic demands of word problems, which often require integrating reading comprehension, vocabulary, problem representation, and multistep reasoning, skills that are particularly challenging for students with MD (Benz & Powell, 2020; Cirino et al., 2015; Fuchs et al., 2006). Still, the significant average effect observed for word-problem outcomes (g = 0.25) suggests that fact fluency interventions may support transfer to more complex mathematical applications (Powell et al., 2023).

Estimates indicated no significant difference between fact fluency and computational fluency outcomes. This lack of difference may not be surprising, as both rely on foundational number sense, arithmetic, and procedural strategies (Burns et al., 2010; Geary, 2013; Witzel, 2016) and often incorporate similar instructional approaches such as mental math techniques and number decomposition. A small number of studies (n = 7) assessed outcomes classified as “other” (e.g., number line tasks, place value), and these also did not significantly differ from fluency or computation outcomes. However, given the limited number of studies in this category, further research is needed to clarify how the nature of the outcome assessed influences the observed effects of fluency interventions.

Limitations and Caveats

Our meta-analysis examining fact fluency interventions for students with MD offers valuable insights but is subject to several limitations. First, the limited sample size, comprising 35 studies that contributed 178 ESs, restricted the scope and precision of our moderator analyses. In some cases, sparse data within moderator categories necessitated collapsing levels (e.g., merging multiple outcome measures into an “Other” category), which reduced the specificity of our findings. In addition, the modest sample size prohibited us from testing other potentially important moderators, such as instructional features (e.g., modeling, feedback, or use of representations), which may influence intervention effects (Dennis et al., 2016; Jitendra et al., 2018). While degrees of freedom for malleable moderators in our meta-regression met the threshold for statistical reliability (df ≥ 4), the breadth and granularity of moderator analyses were still constrained by the available data.

Second, it is essential to recognize that RVE generally has lower statistical power compared to traditional meta-analytic approaches. This reduced power may have limited our ability to detect some meaningful differences between moderator categories, particularly for variables that showed non-significant effects (Pigott, 2012). However, given that most of our key moderators achieved adequate degrees of freedom as noted above, this power limitation is less concerning for our primary findings. The power limitations are most relevant when interpreting null findings, where the absence of significant effects could reflect either a true lack of association or insufficient power to detect existing differences.

Third, we restricted the analysis to group-design studies and excluded SCD studies due to conceptual differences in comparison conditions (Ledford & Gast, 2018). While this improved internal consistency, it limits the generalizability of findings across research designs (Kratochwill & Levin, 2010). Fourth, although we made efforts to include gray literature (e.g., dissertations and theses), most included studies were peer-reviewed, which introduces potential publication bias (Pigott & Polanin, 2020). This imbalance may have led to the underrepresentation of null or negative results. Fifth, demographic reporting was inconsistent across studies, particularly for ethnicity, which was not reported in 16 studies. This omission limited our ability to evaluate the efficacy of the intervention across diverse populations. Finally, although we included a broad set of study-level covariates to mitigate omitted variable bias, we did not fully explore or interpret their effects, potentially overlooking meaningful patterns, such as those related to publication era (Lein et al., 2020; Myers et al., 2022).

Implications for Research

Addressing the limitations outlined above is essential for advancing the evidence base on fact fluency interventions for students with MD. First, the limited number of group-design studies restricts the precision and scope of moderator analyses. Expanding the volume of high-quality, group-based experimental and quasi-experimental research is critical for enabling more granular moderator testing, particularly for instructional features such as modeling, feedback, and the use of representations, components that remain underexamined due to current sample size constraints (Dennis et al., 2016; Jitendra et al., 2018). Future studies should be adequately powered to detect moderator effects and explore their interactions with confounding variables, thereby informing how interventions can be more precisely tailored to meet student needs.

Second, future research should incorporate more advanced statistical techniques to address the challenges posed by high heterogeneity and limited sample sizes. Non-parametric, machine learning-based methods such as random forests and MetaForest offer a promising solution because they do not rely on traditional distributional assumptions or large sample sizes (Van Lissa, 2017). These approaches are particularly well-suited for meta-analyses with numerous potential moderators and complex interaction structures, as they can detect nonlinear effects and interactions that may be overlooked by conventional meta-regression. Applying these flexible analytic tools could help uncover more nuanced patterns in intervention effectiveness and support the design of more tailored, data-driven educational strategies.

Third, given the conceptual and methodological differences between group-design studies and SCDs, future research is advised to examine and report findings from each design type separately. Analyzing them independently will enable clearer, context-specific conclusions without conflating fundamentally different comparison structures. Moreover, SCDs often include more detailed demographic information (e.g., ethnicity), which can enhance understanding of intervention effects for specific subgroups of students with MD. Finally, future research should prioritize consistent and transparent reporting of demographic characteristics, particularly ethnicity. As noted by G. Nelson et al. (2023) and reflected in our quality analysis, many studies failed to report participants’ ethnicity, limiting the ability to assess the generalizability of their findings to diverse populations. This omission restricts efforts to examine the potential moderating effects of ethnicity on intervention outcomes, an important consideration given the known sources of heterogeneity in educational achievement. Addressing these gaps will enhance the field’s capacity to evaluate the efficacy of interventions across student subgroups with MD.

Implications for Practice

Despite the limitations outlined above, this meta-analysis provides valuable insights into how intervention dosage and content alignment can be used to enhance math fact fluency instruction for students with MD (Fuchs et al., 2017). The large, positive, and statistically significant mean effect indicates that fact fluency interventions are linked to broader improvements in skills such as multidigit computation and word-problem solving. Effective interventions typically included strategies such as CCC, computer-based programs, incremental rehearsal, and other structured approaches aimed at building fluency. These methods focus on repeated, distributed practice to foster automaticity in foundational operations, a crucial aspect of mathematical development (Geary, 2013). Incorporating such strategies into regular instruction may improve students’ accuracy, speed, and confidence in fact retrieval (Burns et al., 2010; Codding et al., 2022).

Intervention Dosage: Session Quantity and Grouping

Although our regression analysis indicated that interventions with 30 or more sessions produced the largest effects once other moderators were controlled, the descriptive results point to a different pattern: shorter, mastery-based interventions sometimes yielded especially large effects. Importantly, large-group (classwide) formats consistently emerged as the most effective grouping condition across both descriptive and regression analyses. At the same time, one-on-one and small-group formats also produced positive effects, underscoring that multiple grouping structures can be beneficial when fluency instruction is structured and responsive to student needs. In practice, grouping decisions should be responsive to students’ individual learning profiles and behavioral needs (Benz & Powell, 2020), as placing students with substantial behavioral challenges in large groups or extending instruction for too many sessions may be counterproductive. Taken together, these results suggest that teachers may be able to deliver fluency instruction efficiently in classwide settings and, in some cases, over relatively few sessions, provided instruction is mastery-based (Burns et al., 2010; Codding et al., 2016) and student progress is carefully monitored across both academic and behavioral domains (Benz & Powell, 2020). Ultimately, the ideal number of sessions and grouping format should be determined by ongoing progress monitoring, ensuring instruction continues only as long as necessary for students to reach mastery.

Content Intervention Alignment: Operation Focus and Task Type

Our results underscore the importance of content alignment in enhancing fluency instruction. Interventions targeting both additive and multiplicative content yielded larger effects than those focused solely on addition and subtraction. This pattern suggests that addressing a broader range of operations may support greater computational flexibility and facilitate transfer across mathematical tasks, particularly as students advance beyond basic arithmetic. Although relatively few studies incorporated both operation types, the consistency of this effect points to a promising direction for designing fluency instruction that builds across domains.

Differences in effects also emerged based on the outcome assessed. Interventions evaluated with computational fluency measures showed stronger effects than those assessed with word-problem solving, which often involve higher cognitive and linguistic demands and may be particularly challenging for students with MD, especially those with co-occurring reading difficulties (Cirino et al., 2015; Powell et al., 2009). While fact fluency provides an essential foundation, interventions focused solely on fluency may not fully support students in solving more complex applied problems. These findings highlight the importance of aligning instructional goals with the intended learning outcomes. When targeting broader mathematical competencies, fluency-building efforts may need to be paired with more comprehensive supports to ensure that gains in foundational skills extend to more complex mathematical reasoning.

Conclusion

Our findings underscore the importance of intensifying fluency interventions for students with MD by making strategic adjustments to intervention dosage and aligning content (Fuchs et al., 2017). Increasing the number of sessions and optimizing grouping structures enhance instructional intensity by providing more opportunities for distributed practice and retrieval of facts. In addition, aligning intervention content with students’ specific learning needs, particularly by targeting suitable operations and emphasizing computational fluency over complex word problems, may improve efficacy (Myers et al., 2022; Powell et al., 2009). These results suggest that reducing cognitive demands and focusing on foundational fluency skills can strengthen instructional impact. However, findings should be interpreted with caution given sample limitations and the potential influence of unmeasured confounding variables. Further research with larger and more diverse samples is needed to confirm and extend these conclusions.

Supplemental Material

Supplemental material (sj-docx-1-ldx-10.1177_00222194261424914, sj-docx-2-ldx-10.1177_00222194261424914, and sj-docx-3-ldx-10.1177_00222194261424914) for A Meta-Analysis of Mathematics Fact Fluency Interventions for Students With Mathematics Difficulties (MD) by Grace P. Douglas, Jonté A. Myers, Kathleen K. Mason, Sarah R. Powell and Danielle O. Lariviere in Journal of Learning Disabilities

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Supplemental Material

Supplemental material for this article is available online.

References

Abu-Hamour

(2019). Improving basic multiplication fact recall for students with mathematics learning disabilities. Mu’tah Lil-Buhuth wad-Dirasat, Humanities and Social Sciences Series, 34(5), 2–54.

Agaliotis

Teli

(2016). Teaching arithmetic combinations of multiplication and division to students with learning disabilities or mild intellectual disability: The impact of alternative fact grouping and the role of cognitive and learning factors. Journal of Education and Learning, 5(4), 90. https://doi.org/10.5539/jel.v5n4p90

Asempapa

Sturgill

(2017). How did we get here? The path to our current K-12 mathematics education curriculum in the United States. Aurco Journal, 23, 1–10. https://aurco.org/journals/AURCO_Journal_2017/Current%20K%2012%20Mathematics%20Education%20Curriculum%20p1-10.pdf

Ashcraft

M. H.

Krause

J. A.

(2007). Working memory, math performance, and math anxiety. Psychonomic Bulletin & Review, 14(2), 243–248. https://doi.org/10.3758/BF03194059

Bailey

D. H.

Siegler

R. S.

Geary

D. C.

(2014). Early predictors of middle school fraction knowledge. Developmental Science, 17, 775–785. https://doi.org/10.1111/desc.12155

Baroody

A. J.

Eiland

M. D.

Purpura

D. J.

Reid

E. E.

(2013). Can computer-assisted discovery learning foster first graders’ fluency with the most basic addition combinations? American Educational Research Journal, 50(3), 533–573. https://doi.org/10.3102/0002831212473349

Barrett

C. A.

VanDerHeyden

A. M.

(2020). A cost-effectiveness analysis of classwide math intervention. Journal of School Psychology, 80, 54–65. https://doi.org/10.1016/j.jsp.2020.04.002

Benz

S. A.

Powell

S. R.

(2020). The influence of behavior on performance within a word-problem intervention for students with mathematics difficulty. Remedial and Special Education, 42(3), 182–192. https://doi.org/10.1177/0741932520923063

Bloom

H. S.

Hill

C. J.

Black

A. R.

Lipsey

M. W.

(2008). Performance trajectories and performance gaps as achievement effect-size benchmarks for educational interventions. Journal of Educational Effectiveness, 1(4), 289–328. https://doi.org/10.1080/19345740802400072

10.

Borenstein

(2023). How to understand and report heterogeneity in a meta-analysis: The difference between I-squared and prediction intervals. Integrative Medicine Research, 12(4), 1–8. https://doi.org/10.1016/j.imr.2023.101014

11.

Bryant

D. P.

Bryant

B. R.

Roberts

Vaughn

Pfannenstiel

K. H.

Porterfield

Gersten

(2011). Early numeracy intervention program for first-grade students with mathematics difficulties. Exceptional Children, 78(1), 7–23. https://doi.org/10.1177/001440291107800101

12.

Burns

M. K.

Aguilar

L. N.

Young

Preast

J. L.

Taylor

C. N.

Walsh

A. D.

(2019). Comparing the effects of incremental rehearsal and traditional drill on retention of mathematics facts and predicting the effects with memory. School Psychology, 34(5), 521–530. https://doi.org/10.1037/spq0000312

13.

Burns

M. K.

Codding

R. S.

Boice

C. H.

Lukito

(2010). Meta-analysis of acquisition and fluency math interventions with instructional and frustration level skills: Evidence for a skill-by-treatment interaction. School Psychology Review, 39(1), 69–83. https://doi.org/10.1080/02796015.2010.12087791

14. Burns M. K., Duesenberg-Marshall M. D., Romero M. E., Sussman-Dawson K. J., Singell (2024). Meta-analysis of the effect of technology-based mathematical fact practice on mathematics outcomes. Journal of Special Education Technology, 40(3), 332–343. https://doi.org/10.1177/01626434241288199

15. Burns M. K., Kanive, DeGrande (2012). Effect of a computer-delivered math fact intervention as a supplemental intervention for math in third and fourth grades. Remedial and Special Education, 33(3), 184–191. https://doi.org/10.1177/0741932510381652

16. Burns M. K., Ysseldyke, Nelson P. M., Kanive (2015). Number of repetitions required to retain single-digit multiplication math facts for elementary students. School Psychology Quarterly, 30(3), 398–405. https://doi.org/10.1037/spq0000097

17. Cason, Young, Kuehnert (2019). A meta-analysis of the effects of numerical competency development on achievement: Recommendations for mathematics educators. Investigations in Mathematics Learning, 11(2), 134–147. https://doi.org/10.1080/19477503.2018.1425591

18. Cheema J. R., Galluzzo (2013). Analyzing the gender gap in math achievement: Evidence from a large-scale US sample. Research in Education, 90(1), 98–112. https://doi.org/10.7227/RIE.90.1.7

19. Christensen C. A., Gerber M. M. (1990). Effectiveness of computerized drill and practice games in teaching basic math facts. Exceptionality, 1(3), 149–165. https://doi.org/10.1080/09362839009524751

20. Cirino P. T., Fuchs L. S., Elias J. T., Powell S. R., Schumacher R. F. (2015). Cognitive and mathematical profiles for different forms of learning difficulties. Journal of Learning Disabilities, 48(2), 156–175. https://doi.org/10.1177/0022219413494239

21. Clarke, Doabler C. T., Turtura, Smolkowski, Kosty D. B., Sutherland, Kurtz-Nelson, Fien, Baker S. K. (2020). Examining the efficacy of a kindergarten mathematics intervention by group size and initial skill: Implications for practice and policy. The Elementary School Journal, 121(1), 125–153. https://doi.org/10.1086/710041

22. Claussen D. W., Thaut M. H. (1997). Music as a mnemonic device for children with learning disabilities. Canadian Journal of Music Therapy, 5, 55–66.

23. Codding R. S., Burns M. K., Lukito (2011). Meta-analysis of mathematic basic-fact fluency interventions: A component analysis. Learning Disabilities Research & Practice, 26(1), 36–47. https://doi.org/10.1111/j.1540-5826.2010.00323.x

24. Codding R. S., Nelson P. M., Parker D. C., Edmunds, Klaft (2022). Examining the impact of a tutoring program implemented with community support on math proficiency and growth. Journal of School Psychology, 90, 82–93. https://doi.org/10.1016/j.jsp.2021.11.002

25. Codding R. S., VanDerHeyden A. M., Martin R. J., Desai, Allard, Perrault (2016). Manipulating treatment dose: Evaluating the frequency of a small group intervention targeting whole number operations. Learning Disabilities Research & Practice, 31(4), 208–220. https://doi.org/10.1111/ldrp.12120

26. Cogan L. S., Schmidt W. H., Guo (2018). The role that mathematics plays in college- and career-readiness: Evidence from PISA. Journal of Curriculum Studies, 51(4), 530–553. https://doi.org/10.1080/00220272.2018.1533998

27. Cohen (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge. https://doi.org/10.4324/9780203771587

28. Cook B. G., Buysse, Klingner J. K., Landrum T. J., McWilliam R. A., Tankersley, Test D. W. (2015). CEC’s standards for classifying the evidence base of practices in special education. Remedial and Special Education, 36(4), 220–234. https://doi.org/10.1177/0741932514557271

29. Cozad L. E., Riccomini P. J. (2016). Effects of digital-based math fluency interventions on learners with math difficulties: A review of the literature. The Journal of Special Education Apprenticeship, 5(2), 1–19. https://doi.org/10.58729/2167-3454.1053

30. Dennis M. S., Sharp, Chovanes, Thomas, Burns R. M., Custer, Park (2016). A meta-analysis of empirical research on teaching students with mathematics learning difficulties. Learning Disabilities Research & Practice, 31(3), 156–168. https://doi.org/10.1111/ldrp.12107

31. Duval, Tweedie (2000). Trim and fill: A simple funnel-plot–based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463. https://doi.org/10.1111/j.0006-341X.2000.00455.x

32. Findell, Swafford, Kilpatrick (Eds.). (2001). Adding it up: Helping children learn mathematics. National Academies Press.

33. Fisher, Tipton, Zhipeng, Chen (2023). robumeta: Robust variance meta regression (R package version 2.0). https://CRAN.R-project.org/package=robumeta

34. Fryer Jr R. G., Levitt S. D. (2010). An empirical analysis of the gender gap in mathematics. American Economic Journal: Applied Economics, 2(2), 210–240. https://doi.org/10.1257/app.2.2.210

35. Fuchs L. S., Fuchs, Hamlett C. L., Powell S. R., Capizzi A. M., Seethaler P. M. (2006). The effects of computer-assisted instruction on number combination skill in at-risk first graders. Journal of Learning Disabilities, 39(5), 467–475. https://doi.org/10.1177/00222194060390050701

36. Fuchs L. S., Fuchs, Malone A. S. (2017). The taxonomy of intervention intensity. Teaching Exceptional Children, 50(1), 35–43. https://doi.org/10.1177/00400599187581666

37. Fuchs L. S., Geary D. C., Compton D. L., Fuchs, Schatschneider, Hamlett C. L., DeSelms, Seethaler P. M., Wilson, Craddock C. F., Bryant J. D., Luther, Changas (2013). Effects of first-grade number knowledge tutoring with contrasting forms of practice. Journal of Educational Psychology, 105(1), 58–77. https://doi.org/10.1037/a0030127

38. Fuchs L. S., Gilbert J. K., Powell S. R., Cirino P. T., Fuchs, Hamlett C. L., Seethaler P. M., Tolar T. D. (2016). The role of cognitive processes, foundational math skill, and calculation accuracy and fluency in word-problem solving versus prealgebraic knowledge. Developmental Psychology, 52(12), 2085–2098. https://doi.org/10.1037/dev0000227

39. Fuchs L. S., Seethaler P. M., Sterba S. K., Craddock, Fuchs, Compton D. L., Geary D. C., Changas (2021). Closing the word-problem achievement gap in first grade: Schema-based word-problem intervention with embedded language comprehension instruction. Journal of Educational Psychology, 113(1), 86–103. https://doi.org/10.1037/edu0000467

40. Fuchs L. S., Powell S. R., Hamlett C. L., Fuchs, Cirino P. T., Fletcher J. M. (2008). Remediating computational deficits at third grade: A randomized field trial. Journal of Research on Educational Effectiveness, 1(1), 2–32. https://doi.org/10.1080/19345740701692449

41. Fuchs L. S., Powell S. R., Seethaler P. M., Cirino P. T., Fletcher J. M., Fuchs, Hamlett C. L. (2010). The effects of strategic counting instruction, with and without deliberate practice, on number combination skill among students with mathematics difficulties. Learning and Individual Differences, 20(2), 89–100. https://doi.org/10.1016/j.lindif.2009.09.003

42. Fuchs L. S., Powell S. R., Seethaler P. M., Cirino P. T., Fletcher J. M., Fuchs, Hamlett C. L., Zumeta R. O. (2009). Remediating number combination and word problem deficits among students with mathematics difficulties: A randomized control trial. Journal of Educational Psychology, 101(3), 561–576. https://doi.org/10.1037/a0014701

43. Geary D. C. (2011). Cognitive predictors of achievement growth in mathematics: A 5-year longitudinal study. Developmental Psychology, 47(6), 1539–1552. https://doi.org/10.1037/a0025510

44. Geary D. C. (2013). Early foundations for mathematics learning and their relations to learning disabilities. Current Directions in Psychological Science, 22(1), 23–27. https://doi.org/10.1177/0963721412469398

45. Hedges L. V., Tipton, Johnson M. C. (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1(1), 39–65. https://doi.org/10.1002/jrsm.5

46. Hill C. J., Bloom H. S., Black A. R., Lipsey M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172–177. https://doi.org/10.1111/j.1750-8606.2008.00061.x

47. Holmes, Dowker (2013). Catch up numeracy: A targeted intervention for children who are low-attaining in mathematics. Research in Mathematics Education, 15(3), 249–265. https://doi.org/10.1080/14794802.2013.803779

48. Individuals with Disabilities Education Improvement Act of 2004, Pub. L. No. 108-446, 118 Stat. 2647 (2004).

49. Jefferson (2020). Sponsorship bias in clinical trials: Growing menace or dawning realisation? Journal of the Royal Society of Medicine, 113(4), 148–157. https://doi.org/10.1177/0141076820914242

50. Jitendra A. K., Alghamdi, Edmunds, McKevett N. M., Mouanoutoua, Roesslein (2020). The effects of tier 2 mathematics interventions for students with mathematics difficulties: A meta-analysis. Exceptional Children, 87(3), 307–325. https://doi.org/10.1177/0014402920969187

51. Jitendra A. K., Lein A. E., S. H., Alghamdi A. A., Hefte S. B., Mouanoutoua (2018). Mathematical interventions for secondary students with learning disabilities and mathematics difficulties: A meta-analysis. Exceptional Children, 84(2), 177–196. https://doi.org/10.1177/0014402917737467

52. Jordan N. C., Kaplan, Locuniak M. N., Ramineni (2007). Predicting first-grade math achievement from developmental number sense trajectories. Learning Disabilities Research & Practice, 22(1), 36–46. https://doi.org/10.1111/j.1540-5826.2007.00229.x

53. Jordan N. C., Resnick, Rodrigues, Hansen, Dyson (2017). Delaware longitudinal study of fraction learning: Implications for helping children with mathematics difficulties. Journal of Learning Disabilities, 50(6), 621–630. https://doi.org/10.1177/0022219416662

54. Joseph L. M., Konrad, Cates, Vajcner, Eveleigh, Fishley K. M. (2012). A meta-analytic review of the cover-copy-compare and variations of this self-management procedure. Psychology in the Schools, 49(2), 122–136. https://doi.org/10.1002/pits.20622

55. Kanive, Nelson P. M., Burns M. K., Ysseldyke (2014). Comparison of the effects of computer-based practice and conceptual understanding interventions on mathematics fact retention and generalization. The Journal of Educational Research, 107(2), 83–89. https://doi.org/10.1080/00220671.2012.759405

56. Kim H. Y. (2013). Statistical notes for clinical researchers: Assessing normal distribution (2) using skewness and kurtosis. Restorative Dentistry & Endodontics, 38(1), 52–54. https://doi.org/10.5395/rde.2013.38.1.52

57. Kim S. A., Bryant D. P., Bryant B. R., Shin M. W. (2023). A multilevel meta-analysis of whole number computation interventions for students with learning disabilities. Remedial and Special Education, 44(4), 332–347. https://doi.org/10.1177/07419325221117293

58. Kiru E. W., Doabler C. T., Sorrells A. M., Cooc N. A. (2017). A synthesis of technology-mediated mathematics interventions for students with or at risk for mathematics learning disabilities. Journal of Special Education Technology, 33(2), 111–123. https://doi.org/10.1177/0162643417745835

59. Kleinert W. L., Codding R. S., Minami, Gould (2018). A meta-analysis of the taped problems intervention. Journal of Behavioral Education, 27(1), 53–80. https://doi.org/10.1177/0162643417745835

60. Kong J. E., Yan, Serceki, Swanson H. L. (2021). Word-problem-solving interventions for elementary students with learning disabilities: A selective meta-analysis of the literature. Learning Disability Quarterly, 44(4), 248–260. https://doi.org/10.1177/073194872199484

61. Kratochwill T. R., Levin J. R. (2010). Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods, 15(2), 124–144. https://doi.org/10.1037/a0017736

62. Koponen T. K., Sorvo, Dowker, Räikkönen, Viholainen, Aro (2018). Does multi-component strategy training improve calculation fluency among poor performing elementary school children? Frontiers in Psychology, 9, 1187. https://doi.org/10.3389/fpsyg.2018.01187

63. Kroesbergen E. H., van Luit J. E. H. (2002). Teaching multiplication to low math performers: Guided versus structured instruction. Instructional Science, 30(5), 361–378. https://doi.org/10.1023/a:1019880913714

64. Ledford J. R., Gast D. L. (2018). Single case research methodology: Applications in special education and behavioral sciences (3rd ed.). Routledge.

65. Lein A. E., Jitendra A. K., Harwell M. R. (2020). Effectiveness of mathematical word problem solving interventions for students with learning disabilities and/or mathematics difficulties: A meta-analysis. Journal of Educational Psychology, 112(7), 1388–1408. https://doi.org/10.1037/edu0000453

66. Lembke E. S., Hampton, Beyers S. J. (2012). Response to intervention in mathematics: Critical elements. Psychology in the Schools, 49(3), 257–272. https://doi.org/10.1002/pits.21596

67. Taljaard, Van den Heuvel E. R., Levine M. A., Cook D. J., Wells G. A., Devereaux P. J., Thabane (2017). An introduction to multiplicity issues in clinical trials: The what, why, when and how. International Journal of Epidemiology, 46(2), 746–755. https://doi.org/10.1093/ije/dyw320

68. Lipsey M. W., Puzio, Yun, Hebert M. A., Steinka-Fry, Cole M. W., Roberts, Anthony K. S., Busick M. D. (2012). Translating the statistical representation of the effects of education interventions into more readily interpretable forms (NCSER 2013-3000). National Center for Special Education Research, Institute of Education Sciences, US Department of Education.

69. Mabbott D. J., Bisanz (2008). Computational skills, working memory, and conceptual knowledge in older children with mathematics learning disabilities. Journal of Learning Disabilities, 41(1), 15–28. https://doi.org/10.1177/0022219407311003

70. McNeish, Wolf M. G. (2020). Thinking twice about sum scores. Behavior Research Methods, 52(6), 2287–2305. https://doi.org/10.3758/s13428-020-01398-0

71. McTiernan, Holloway, Healy, Hogan (2016). A randomized controlled trial of the morningside math facts curriculum on fluency, stability, endurance and application outcomes. Journal of Behavioral Education, 25(1), 49–68. https://doi.org/10.1007/s10864-015-9227-y

72. McVancel S. M., Missall K. N., Bruhn A. L. (2018). Examining incremental rehearsal: Multiplication fluency with fifth-grade students with math IEP goals. Contemporary School Psychology, 22(3), 220–232. https://doi.org/10.1007/s40688-018-0178-x

73. Menesses K. F., Gresham F. M. (2009). Relative efficacy of reciprocal and nonreciprocal peer tutoring for students at-risk for academic failure. School Psychology Quarterly, 24(4), 266–275. https://doi.org/10.1037/a0018174

74. Morano, Randolph, Markelz A. M., Church (2020). Combining explicit strategy instruction and mastery practice to build arithmetic fact fluency. Teaching Exceptional Children, 53(1), 60–69. https://doi.org/10.1177/0040059920906455

75. Morris S. B. (2008). Estimating effect sizes from pretest-posttest-control group designs. Organizational Research Methods, 11(2), 364–386. https://doi.org/10.1177/1094428106291059

76. Myers J. A., Arsenault T. L., Powell S. R., Witzel B. S., Tanner, Pigott T. D. (2024). Considerations for intensifying word-problem interventions for students with MD: A qualitative umbrella review of relevant meta-analyses. Journal of Learning Disabilities, 58(2), 83–111. https://doi.org/10.1177/00222194241281293

77. Myers J. A., Brownell M. T., Griffin C. C., Hughes E. M., Witzel B. S., Gage N. A., Peyton, Acosta, Wang (2021). Mathematics interventions for adolescents with mathematics difficulties: A meta-analysis. Learning Disabilities Research & Practice, 36(2), 145–166. https://doi.org/10.1111/ldrp.12244

78. Myers J. A., Hughes E. M., Witzel B. S., Anderson R. D., Owens (2023). A meta-analysis of mathematical interventions for increasing the word problem solving performance of upper elementary and secondary students with mathematics difficulties. Journal of Research on Educational Effectiveness, 16(1), 1–35. https://doi.org/10.1080/19345747.2022.208013

79. Myers J. A., Witzel B. S., Powell S. R., Pigott T. D., Xin Y. P., Hughes E. M. (2022). A meta-analysis of mathematics word-problem solving interventions for elementary students who evidence mathematics difficulties. Review of Educational Research, 92(5), 695–742. https://doi.org/10.3102/00346543211070049

80. National Center for Education Statistics. (2024). The nation’s report card: Mathematics performance of students in the US. US Department of Education, Institute of Education Sciences. https://www.nationsreportcard.gov/reports/mathematics/2024/g4_8/?grade=4

81. Nelson, Carter, Boedeker, Knowles, Buckmiller, Eames (2023). A meta-analysis and quality review of mathematics interventions conducted in informal learning environments with caregivers and children. Review of Educational Research, 94(1), 112–152. https://doi.org/10.3102/00346543231156182

82. Nelson, Powell S. R. (2018). A systematic review of longitudinal studies of mathematics difficulty. Journal of Learning Disabilities, 51(6), 523–539. https://doi.org/10.1177/0022219417714773

83. Nelson P. M., Burns M. K., Kanive, Ysseldyke J. E. (2013). Comparison of a math fact rehearsal and a mnemonic strategy approach for improving math fact fluency. Journal of School Psychology, 51(6), 659–667. https://doi.org/10.1016/j.jsp.2013.08.003

84. Okolo C. M. (1992). The effect of computer-assisted instruction format and initial attitude on the arithmetic facts proficiency and continuing motivation of students with learning disabilities. Exceptionality, 3(4), 195–211. https://doi.org/10.1080/09362839209524815

85. Omizo M. M., Cubberly W. E., Cubberly R. D. (2006). Modelling techniques, perceptions of self-efficacy, and arithmetic achievement among learning disabled children. The Exceptional Child, 32(2), 99–105. https://doi.org/10.1080/0156655850320206

86. Zhao, Zuo (2024). Effects of research funding on the academic impact and societal visibility of scientific research. Journal of Informetrics, 18(4), 101592. https://doi.org/10.1016/j.joi.2024.101592

87. Page M. J., McKenzie, Bossuyt, Boutron, Hoffmann, Mulrow C. D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. British Medical Journal, 372, Article 71. https://doi.org/10.1136/bmj.n71

88. Pigott T. D. (2012). Advances in meta-analysis. Springer.

89. Pigott T. D., Polanin J. R. (2020). Methodological guidance paper: High-quality meta-analysis in a systematic review. Review of Educational Research, 90(1), 24–46. https://doi.org/10.3102/0034654319877153

90. Pixner, Moeller, Kraut (2023). Tigro-M: A program for automatization of multiplication facts. Psychology, 14(5), 857–879. https://doi.org/10.4236/psych.2023.145046

91. Pollack, Wilmot, Centanni T. M., Halverson, Frosch, D’Mello A. M., Romeo R. R., Imhof, Capella, Wade, Al Dahhan N. Z., Gabrieli J. D. E., Christodoulou J. A. (2021). Anxiety, motivation, and competence in mathematics and reading for children with and without learning difficulties. Frontiers in Psychology, 12, 1–13. https://doi.org/10.3389/fpsyg.2021.704821

92. Porter, McMaken, Hwang, Yang (2011). Common Core standards: The new U.S. intended curriculum. Educational Researcher, 40(3), 103–116. https://doi.org/10.3102/0013189X11405038

93. Powell S. R., Akther S. S., Yoon N. Y., Berry K. A., Nemcek, Fall, Roberts (2023). The effect of addition and subtraction practice within a word-problem intervention on addition and subtraction outcomes. Learning Disabilities Research & Practice, 38(3), 182–198. https://doi.org/10.1111/ldrp.12319

94. Powell S. R., Barnes M. A., Root, Hughes E. M., Ketterlin-Geller, Nelson, Rojo, Allsopp D. H., Witzel B. S., Myers J. A., Flores M. M., Lembke E. S., Burns M. K., Namkung, Poncy, Ennis R. P., Morin L. L., Arsenault T. L., Doabler C. T., . . . Peltier (2025). The NCTM/CEC position statement on teaching mathematics to students with disabilities: What’s in it and what’s not. Research in Special Education, 2, 1–37. https://doi.org/10.25894/rise.2796

95. Powell S. R., Benz S. A., Mason E. N., Lembke E. S. (2022). How to structure and intensify mathematics intervention. Beyond Behavior, 31(5), 5–15. https://doi.org/10.1177/10742956211072267

96. Powell S. R., Fuchs L. S., Fuchs, Cirino P. T., Fletcher J. M. (2009). Effects of fact retrieval tutoring on third-grade students with math difficulties with and without reading difficulties. Learning Disabilities Research & Practice, 24(1), 1–11. https://doi.org/10.1111/j.1540-5826.2008.01272.x

97. Price G. R., Mazzocco M. M. M., Ansari (2013). Why mental arithmetic counts: Brain activation during single digit arithmetic predicts high school math scores. The Journal of Neuroscience, 33(1), 156–163. https://doi.org/10.1523/JNEUROSCI.2936-12.2013

98. R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

99. A. M., Benavides-Varela, Pedron, De Gennaro M. A., Lucangeli (2020). Response to a specific and digitally supported training at home for students with mathematical difficulties. Frontiers in Psychology, 11, 2039. https://doi.org/10.3389/fpsyg.2020.02039

100. Reed H. C., Gemmink, Broens-Paffen, Kirschner P. A., Jolles (2015). Improving multiplication fact fluency by choosing between competing answers. Research in Mathematics Education, 17(1), 1–19. https://doi.org/10.1080/14794802.2014.962074

101. Riccomini P. J., Stocker J. D., Morano (2017). Implementing an effective mathematics fact fluency practice activity. Teaching Exceptional Children, 49(5), 318–327. https://doi.org/10.1177/0040059916685053

102. Rodgers M. A., Pustejovsky J. E. (2021). Evaluating meta-analytic methods to detect selective reporting in the presence of dependent effect sizes. Psychological Methods, 26(2), 141–160. https://doi.org/10.1037/met0000300

103. Salminen, Koponen, Räsänen, Aro (2015). Preventive support for kindergarteners most at-risk for mathematics difficulties: Computer-assisted intervention. Mathematical Thinking and Learning, 17(4), 273–295. https://doi.org/10.1080/10986065.2015.1083837

104. Schoenfeld A. H. (2004). The math wars. Educational Policy, 18(1), 253–286. https://doi.org/10.1177/0895904803260042

105. Stanley T. D. (2017). Limitations of PET-PEESE and other meta-analysis methods. Social Psychological and Personality Science, 8(5), 581–591. https://doi.org/10.1177/1948550617693062

106. Stevens E. A., Rodgers M. A., Powell S. R. (2018). Mathematics interventions for upper elementary and secondary students: A meta-analysis of research. Remedial and Special Education, 39(6), 327–340. https://doi.org/10.1177/0741932517731887

107. Stickney E. M., Sharp L. B., Kenyon A. S. (2012). Technology-enhanced assessment of math fact automaticity. Assessment for Effective Intervention, 37(2), 84–94. https://doi.org/10.1177/1534508411430321

108. Stocker J. D., Kubina R. M. (2016). Impact of cover, copy, and compare on fluency outcomes for students with disabilities and math deficits: A review of the literature. Preventing School Failure: Alternative Education for Children and Youth, 61(1), 56–68. https://doi.org/10.1080/1045988X.2016.1196643

109. Swanson H. L., Lussier, Orosco (2013). Effects of cognitive strategy interventions and cognitive moderators on word problem solving in children at risk for problem solving difficulties. Learning Disabilities Research & Practice, 28(4), 170–183. https://doi.org/10.1111/ldrp.12019

110. Swanson H. L., Olide A. F., Kong J. E. (2018). Latent class analysis of children with math difficulties and/or math learning disabilities: Are there cognitive differences? Journal of Educational Psychology, 110(7), 931–951. https://doi.org/10.1037/edu0000252

111. Tanner-Smith E. E., Tipton, Polanin J. R. (2016). Handling complex meta-analytic data structures using robust variance estimates: A tutorial in R. Journal of Developmental and Life-Course Criminology, 2, 85–112. https://doi.org/10.1007/s40865-016-0026-5

112. Tipton, Bryan, Murray, McDaniel, Schneider, Yeager D. S. (2023). Why meta-analyses of growth mindset and other interventions should follow best practices for examining heterogeneity: Commentary on Macnamara and Burgoyne (2023) and Burnette et al. (2023). Psychological Bulletin, 149(3–4), 229–241. https://doi.org/10.1037/bul0000384

113. Tournaki (2003). The differential effects of teaching addition through strategy instruction versus drill and practice to students with and without learning disabilities. Journal of Learning Disabilities, 36(5), 449–458. https://doi.org/10.1177/00222194030360050601

114. VanDerHeyden A. M., Burns M. K. (2005). Using curriculum-based assessment to guide elementary mathematics instruction: Effect on individual and group accountability scores. Assessment for Effective Intervention, 30(3), 15–31. https://doi.org/10.1177/073724770503000302

115. van Galen M. S., Reitsma (2010). Learning basic addition facts from choosing between alternative answers. Learning and Instruction, 20(1), 47–60. https://doi.org/10.1016/j.learninstruc.2009.01.004

116. Van Lissa C. J. (2017). MetaForest: Exploring heterogeneity in meta-analysis using random forests. Open Science Framework. https://doi.org/10.17605/OSF.IO/KHJGB

117. Van Luit J. E., Naglieri J. A. (1999). Effectiveness of the master program for teaching special children multiplication and division. Journal of Learning Disabilities, 32(2), 98–107. https://doi.org/10.1177/002221949903200201

118. Vaughn, Wexler, Leroux, Roberts, Denton, Barth, Fletcher (2011). Effects of intensive reading intervention for eighth-grade students with persistently inadequate response to intervention. Journal of Learning Disabilities, 45(6), 515–525. https://doi.org/10.1177/0022219411402692

119. Vembye M. H., Pustejovsky J. E., Pigott T. D. (2023). Power approximations for overall average effects in meta-analysis with dependent effect sizes. Journal of Educational and Behavioral Statistics, 48(1), 70–102. https://doi.org/10.3102/10769986221127379

120. Viechtbauer (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03

121. Wang X. S., Perry L. B., Malpique, Tobias (2023). Factors predicting mathematics achievement in PISA: A systematic review. Large-Scale Assessments in Education, 11(24), 24–43. https://doi.org/10.1186/s40536-023-00174-8

122. Witzel (2016). Bridging the gap between arithmetic and algebra. Council for Exceptional Children.

123. Witzel, Myers, Root, Freeman-Green, Riccomini, Mims (2024). Research should focus on improving mathematics proficiency for students with disabilities. The Journal of Special Education, 57(4), 240–247. https://doi.org/10.1177/00224669231168

124. Woodward (2006). Developing automaticity in multiplication facts: Integrating strategy instruction with timed practice drills. Learning Disability Quarterly, 29(4), 269–289. https://doi.org/10.2307/30035554

125. Xin Y. P., Jitendra A. K. (1999). The effects of instruction in solving mathematical word problems for students with learning problems: A meta-analysis. The Journal of Special Education, 32(4), 207–225. https://doi.org/10.1177/002246699903200402

Supplementary Material

Please find the following supplemental material available below.

