Abstract
Years ago, a groundbreaking review of student data from the 2013–2014 school year indicated that Black students were overrepresented among those experiencing punishment in a variety of contexts. In the intervening decade, new data has emerged, schools have implemented policies to reduce racial disparities, researchers have highlighted new methods of measuring disparities, and pundits have reignited debates about the degree and pervasiveness of disparities. Clarity is needed. Are Black students experiencing more exclusion and punishment than their peers? If so, of what kinds and in what contexts? This article responds by reviewing the most recent federal data, measuring Black overrepresentation across six types of punishment, three comparison groups, 16 subpopulations, and seven types of measurement. We generate 1,581 unique estimates of Black overrepresentation and find evidence that, no matter how you slice it, Black students are overrepresented among those punished. We conclude with policy recommendations to reduce widespread and enduring racial disparities.
Introduction
A consensus is emerging that exclusionary discipline (such as in-school suspensions, out-of-school suspensions, and expulsions) not only fails to deter misbehavior but may encourage defiance and misbehavior; harm academic achievement, school climate, and mental health; and even have criminogenic effects, leading students down a “school to prison pipeline” (Bacher-Hicks et al., 2019; Duarte et al., 2023; Eyllon et al., 2022; Lacoe & Steinberg, 2019; LiCalsi et al., 2021; Pesta, 2021; Shollenberger, 2015; Way, 2011). There is little doubt among developmental psychologists and education researchers that school-based corporal punishment (e.g., spankings) can harm students’ cognitive and emotional development (Gershoff et al., 2018; Gershoff et al., 2019), and criminologists have long warned that school-based referrals to law enforcement and school-based arrests can create and accelerate students’ exposure to the harms of the juvenile justice system (Shollenberger, 2015). Yet despite the documented harms of these forms of exclusion and punishment, schools throughout the country continue to rely on these practices. While exposure to these practices may, itself, be harmful, new evidence suggests that when Black students perceive racial disparities in exclusion and punishment, the
That exclusion and punishment are harmful, and that related disparities can exert a unique psychological harm, both raise the question, Are Black students experiencing more exclusion and punishment than their peers? An influential, albeit dated, report by the Government Accountability Office (GAO; 2018) explored that question by reviewing data from millions of U.S. public school students from the 2013–2014 school year. The report found, nearly a decade ago and using one measure of disparity, that Black students were suspended—and otherwise punished and excluded—far more often than similarly situated White students. However, in the decade since the 2013–2014 school year, and even in the 6 years since the GAO report was published, much has changed.
First, federal, state, and local entities have invested millions of dollars to implement school practices (such as restorative practices) that are designed in part to reduce the use of exclusionary discipline for all students and to reduce racial disparities in exclusion and punishment (Office of Elementary and Secondary Education, 2019). Some research suggests that these investments may have catalyzed substantial increases in the adoption of restorative practices (Darling-Hammond, 2023). However, extant research presents a mixed picture regarding whether these investments have reduced racial disparities in discipline (Darling-Hammond, 2020). It thus remains unclear whether federal investments have yielded reductions in exclusionary discipline disparities.
Second, perhaps in response to these policy investments, exclusionary discipline rates for students generally, and for Black students specifically, have declined (Leung Gagné et al., 2022). As suspension and expulsion rates for students of different backgrounds shift, so too might disparities, unsettling the conclusion that Black students are overrepresented among those disciplined.
Third, new methods of measuring racial disparities in discipline have emerged, with scholars not only applying new techniques to statewide datasets, but arguing that the conclusions one draws about disparities may partially be a function of the measure of disparity one utilizes (Bottiani et al., 2023; Curran, 2020; Girvan et al., 2019; Rodriguez & Welsh, 2022).
Fourth, there have been significant shifts in the use of school resource officers, which may shift racial disparities in police-related outcomes (such as referrals to law enforcement and school-related arrests). While the GAO (2018) report is primarily cited to highlight racial disparities in out-of-school suspensions, it also surfaced racial disparities in school-policing-related measures. However, in the years since the report was published, and the many more years since the data underlying the report were collected, schools have faced alternating waves of pressure to reduce and increase the use of school resource officers (Arango, 2023; Goldstein, 2020). These shifts unsettle our understanding of whether racial disparities in referrals to law enforcement and school-related arrests persist.
Fifth, debates about the degree, pervasiveness, and policy import of racial disparities in discipline have reemerged, with thinktanks and pundits arguing that Black–White disparities might largely be a function of economic or school factors (e.g., Eden, 2019).
And, finally, the U.S. Department of Education has released more recent federal data. Still, peer-reviewed research has only scratched the surface of leveraging this data to explore the degree, pervasiveness, and persistence of racial disparities in discipline, exclusion, and punishment.
Given that new data exists and has not been leveraged, that the policy landscape has shifted and may have shifted rates of exclusion and punishment, and that new methods have been championed that might generate unique conclusions, it appears unclear whether old conclusions are still supported. A fresh and thorough review of the most up-to-date data is warranted, and could provide new and timely clarity. This article thus reviews the most recent federal data on student discipline, exclusion, and punishment, applying a variety of modern metrics to ascertain whether Black students remain overrepresented among those exposed to the harms of exclusion and punishment.
The Uniqueness and Importance of Black Discipline
One may wonder why we focus, here, on the disciplinary experiences of Black students. Students of all backgrounds are excluded from school (GAO, 2018), and research has documented that exposure to exclusionary discipline is harmful for students of all backgrounds (e.g., Bacher-Hicks et al., 2019). Our focus on Black student discipline is rooted in notions of equality and equity. First, prior research has documented that Black students experience uniquely high rates of exposure to exclusionary discipline (GAO, 2018)—rates that are markedly higher than those experienced by White students, and substantially higher than those experienced by other non-White students (e.g., Hispanic students). Notions of equality demand that we ascertain if these prior trends persist.
However, another, perhaps more pressing, reason to renew and maintain attention on Black discipline rates is that research has demonstrated that the beliefs and behaviors of school personnel play a role in Black students being punished more harshly than their peers. In a vignette study, Okonofua and Eberhardt (2015) demonstrated that teachers randomly assigned to review instances of misbehavior by a Black student recommended harsher discipline than teachers randomly assigned to review identical instances of misbehavior by a White student. Notably, this vignette study is one step removed from real-world conditions. However, researchers have found that Black students receive more, and harsher, punishment than non-Black peers even when the students have misbehaved a similar number of times, when they are engaged in the same incident of misbehavior (i.e., in a conflict with one another), when the students have similar prior behavioral histories, and when the students are in schools with similar racial compositions (Barrett et al., 2021; Gregory et al., 2016; Huang & Cornell, 2017; Owens & McLanahan, 2020; Shi & Zhu, 2022). In one study, the authors considered factors that might contribute to Black–White disparities in exclusionary discipline rates. They concluded that differences in behavior account for 9% of the Black–White discipline gap, school sorting accounts for 21%, and differential treatment accounts for 46% (Owens & McLanahan, 2020). Thus, research indicates that Black students are being treated more harshly than non-Black peers. When combined with research demonstrating the harmfulness of both discipline and discipline disparities, this research militates in favor of careful inspection of the persistence, pervasiveness, and degree of racial disparities in scholastic exclusion and punishment.
Evaluating Progress in a Pivotal Policy Moment
The need to ascertain whether racial disparities persist is augmented by the need to evaluate substantial policy shifts designed to reduce racial disparities in exclusion and punishment. Many states, districts, and schools have abandoned zero-tolerance policies (which require that students be disciplined if they engage in certain acts of misbehavior, and which have been blamed for increasing Black–White discipline disparities); many have adopted culturally responsive professional development; many have shifted their disciplinary frameworks toward relational models, such as positive behavioral interventions and supports and restorative practices, that deprioritize punishment and focus on building emotional skills and social bonds; and many have even banned the use of exclusionary discipline for specific acts of misbehavior or for particular grade levels. However, implementation of these policies has been uneven, at best, and some districts and schools have responded to instances of student violence by adopting more, rather than less, punitive practices. It is unclear whether efforts to curb discipline disparities have borne fruit and, if so, in which contexts.
Below, we discuss the data available to explore the persistence and pervasiveness of Black overrepresentation; we explore the dimensions one must consider when analyzing Black overrepresentation; and we present our analyses, which include over 1,000 analytic permutations to evaluate Black overrepresentation. Finally, we discuss the implications of our analyses, which indicate, with consistency, that Black students experience markedly more scholastic punishment and exclusion than their peers.
The Office for Civil Rights Data Collection
Under the auspices of the U.S. Department of Education, the Office for Civil Rights is tasked with, among other things, ensuring that educational entities satisfy their obligations under state and federal laws to provide equitable opportunities to students of different racial backgrounds. A pillar of its strategy is to require that schools periodically provide detailed information about the disciplinary experiences of their students. Data for five school years (2011–2012, 2013–2014, 2015–2016, 2017–2018, and 2020–2021) has been collected and disseminated. In 2018, the GAO reviewed the 2013–2014 data to produce a thorough review of the disciplinary experiences of students stratified by various demographic, social, and school characteristics (GAO, 2018). The resulting report provided a deep and thorough dive into discipline rates for students based on their race, gender, disability status, school poverty level, school grade level, and even based on two-way combinations of these characteristics (providing discipline rates for students who were, for example, “Black and male” and “Hispanic and receiving special education”).
While seminal, the GAO (2018) report has not been replicated using more recent federal data. And while the U.S. Department of Education has disseminated public-use data files for more recent school years, one would need to conduct many data operations to replicate the estimates provided in the GAO report. A significant barrier thus impedes researchers, educators, and policymakers from mapping more recent trends in discipline. Perhaps due to these data obstacles, many papers published as recently as 2023, and in top-tier journals, rely on the 2013–2014 data (e.g., Graves & Wang, 2023; Samimi et al., 2023; Tolliver et al., 2023), and no research team has reproduced the helpful estimates of Black overrepresentation in discipline presented in this earlier, seminal federal report.
Four Dimensions of Measuring Disparity
We believe that reproducing the GAO (2018) report would, in and of itself, represent a contribution to the field. However, we also believe that to truly understand whether and where Black students are overrepresented among those punished and excluded, one must attend to multiple dimensions of disparity measurement. We have identified four dimensions one might consider when ascertaining the existence and degree of overrepresentation: the type of punishment, the comparison group, the subpopulation of interest, and the type of measurement.
Dimension 1: Type of Exclusionary or Punitive Discipline
Perhaps due to research indicating their harmfulness (Bacher-Hicks et al., 2019) or commonness (Darling-Hammond et al., 2023), many analyses of disciplinary rates focus exclusively on out-of-school suspensions (e.g., Graves & Wang, 2023). However, research has demonstrated that a variety of types of school-based punishment and exclusion can cause harm (Yaluma et al., 2022) and may exhibit racial disparities (GAO, 2018). Federal data includes at least six types of punishment or exclusion one might consider:
In-school suspension—when a student is excluded from typical classroom experiences
Out-of-school suspension—when a student is excluded from the school grounds entirely
Expulsion—when a student is permanently excluded from the school
Corporal punishment—when a student is physically assaulted (e.g., spanked) by school personnel
Referral to law enforcement—when a student is referred for criminal processing to law enforcement personnel (including a 911 call or the transference of a student’s file to a juvenile processing institution)
School-related arrest—when a student is arrested by law enforcement on school grounds
In the GAO (2018) review of the 2013–2014 data, Black students were more likely to have experienced each of these types of punishment than White students. However, the degree to which Black students might be deemed “overrepresented” depended on the type of punishment one reviewed. For example, while the percentage of Black students experiencing out-of-school suspensions was 10.5 percentage points higher than the percentage of White students experiencing out-of-school suspensions, the percentage of Black students experiencing
Dimension 2: The Comparison Group
To understand whether Black students are overrepresented among those disciplined, one option is to compare the experiences of Black students to the experiences of the general student body (e.g., GAO, 2018). In this approach, the “comparison group” that is contrasted to Black students is “all students.” However, there are at least two other “comparison group” options that are often utilized.
Myriad research articles reviewing discipline disparities have compared the discipline rate among Black students in a given context to the discipline rate among White students in the same context (e.g., Barrett et al., 2021; Bottiani et al., 2023; Curran, 2020; Girvan et al., 2019; Gregory et al., 2016; Huang & Cornell, 2017; Owens & McLanahan, 2020; Rodriguez & Welsh, 2022; Shi & Zhu, 2022). This may be an extension of early research demonstrating a racial bias whereby Black students were treated more harshly than White students when behavior was held constant (Okonofua & Eberhardt, 2015). Or it may have grown out of theoretical literature suggesting that the way Black students are treated in schools is a historical vestige of the mores that permeated American society during slavery and the Jim Crow era whereby Black people (including Black students) were often punished to subjugate Black people and protect the interests (money, property, and opportunity) of White people (Parker & Stovall, 2004).
More recently, and perhaps in response to the growth in discourse regarding how Black people uniquely experience “anti-Blackness” across a variety of contexts and domains (Curry & Curry, 2018; Dancy et al., 2018; Edwards et al., 2023; Jenkins, 2021; Williams & Mohammed, 2013), researchers (e.g., Azam et al., 2022; McIntosh et al., 2021) have compared the experiences of Black individuals to those of non-Black individuals. In sum, Black students can be, and often are, compared to three different reference groups:
White students
Non-Black students
All students
Dimension 3: The Subpopulation
When ascertaining the degree to which Black students are overrepresented among those disciplined, one can look at the experiences of all students, or one can look within subpopulations of students. For example, focusing on the male student population, one can compare the
Related to spurious association, imagine that students from low-income backgrounds are more likely to be suspended, and that Black students are more likely than White students to come from low-income backgrounds. If this is so, then a Black–White disparity in discipline
Related to double jeopardy, imagine that Black students generally experience more discipline than White students, and that students receiving special education services (SPED students) experience more discipline than non-SPED students. If the structural forces driving higher discipline rates among Black students and among SPED students are unique, then the confluence of these two characteristics (Black and SPED) may result in an even higher discipline rate than emerges among Black students or SPED students. Double jeopardy phenomena may also reflect unique structural vulnerabilities that exist at intersecting axes of marginalization. One can vet the potential for double jeopardy by measuring the discipline rate among Black and SPED students and comparing it to the discipline rate among White and SPED students to see if the former is higher than the latter.
The Civil Rights Data Collection (CRDC) and the Common Core of Data (CCD) allow for the measurement of discipline rates, by racial group, and within a variety of subpopulations, including the following:
All students
Male students
Female students
Special education students
Students in “poor” schools (those where 75% or more of students receive free or reduced-priced lunches)
Students in “semi-poor” schools (those where 50%–75% of students receive free or reduced-priced lunches)
Students in “semi-rich” schools (those where 25%–50% of students receive free or reduced-priced lunches)
Students in “rich” schools (those where 25% or fewer of students receive free or reduced-priced lunches)
Students in traditional public schools
Students in magnet schools
Students in charter schools
Students in alternative schools
Preschool students
Elementary school students
Middle school students
High school students
Dimension 4: The Metric of Disparity
A number of research teams have recently opined on the many ways that one can measure racial disparities in discipline, and have applied various measurement strategies to state-level data. In a recent article, after applying several measurement strategies to data from Maryland public schools, Curran (2020) claimed that the measurement approach employed can have “large practical implications” (p. 385) regarding the policy conclusions one might draw. Our literature scan identified five research articles that discuss various methods of measuring Black overrepresentation in depth (Bottiani et al., 2023; Curran, 2020; GAO, 2018; Girvan et al., 2019; Rodriguez & Welsh, 2022) and which, together, present seven unique means of measuring Black overrepresentation:
1) Risk difference
2) Risk ratio
3) Raw differential representation
4) Difference in standardized risk
5) Disproportionality ratio
6) Disproportionality difference
7) e-Formula
These formulas rely on what can be termed “counts” and “rates.” A “count” is the unduplicated count of individuals experiencing a given phenomenon. So, for example, the “Black Count” is the total population of Black students in a given population or educational context. Relatedly, the “Black suspension count” would indicate the
As noted, we use the letter “O” to indicate the “other” or “comparison group.” Thus, if we were comparing the experiences of Black students to those of White students, then the White count would be designated as OC (“other count”), the White discipline count would be ODC, and the White discipline rate would be ODR. And, if we were comparing the experiences of Black students to those of non-Black students, the non-Black count would be OC (“other count”), the non-Black discipline count would be ODC, and the non-Black discipline rate would be ODR. In both cases, we would have ODR = ODC / OC.
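The count and rate notation above can be sketched in a few lines of Python. This is a minimal illustration only: the class name, field names, and the example counts are hypothetical and are not drawn from the federal data; the only relationship taken from the text is that a discipline rate equals a discipline count divided by a group count (e.g., ODR = ODC / OC).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCounts:
    """Unduplicated counts for one group (hypothetical container)."""

    count: int             # students in the group (BC for Black, OC for "other")
    discipline_count: int  # students in the group who were disciplined (BDC or ODC)

    @property
    def discipline_rate(self) -> float:
        """Discipline count divided by group count (BDR or ODR)."""
        if self.count <= 0:
            raise ValueError("group count must be positive")
        return self.discipline_count / self.count


# Hypothetical counts, for illustration only.
black = GroupCounts(count=1_000, discipline_count=120)
other = GroupCounts(count=4_000, discipline_count=200)

bdr = black.discipline_rate  # BDR = BDC / BC
odr = other.discipline_rate  # ODR = ODC / OC
print(bdr, odr)
```

The same container could hold White, non-Black, or all-student counts, since each comparison group is described by the same two counts.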
With counts and rates for Black and “other” (or comparison) group students as our building blocks, we can understand these seven measures of disparity. In the explanations below, unless otherwise indicated, we will use White students as the comparison group. However, note that non-Black students or all students can be substituted in many cases.
The risk difference (i.e., the “absolute risk difference”) can theoretically range from −1 to 1, with negative values indicating that White students are disciplined more (per capita) than Black students, positive values indicating that Black students are disciplined more (per capita) than White students, and a value of “zero” indicating that Black and White students have identical disciplinary experiences (per capita).
The risk ratio can theoretically range from 0 to infinity, with values between 0 and 1 indicating that White students experience more discipline (per capita) than Black students, a value of 1 indicating that Black and White students have identical disciplinary experiences (per capita), and values above 1 indicating that Black students are disciplined at a higher rate than White students. When it returns values above 1, the risk ratio has the benefit of being interpretable in a multiplicative manner. In other words, a risk ratio of “2” would indicate that “Black students are disciplined at a rate that is two times that of White students.” For this reason, scholars (e.g., Darling-Hammond, 2023) have often favored the risk ratio as a means of conveying inequity in disciplinary experiences.
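As a minimal sketch of the two measures just described, the Python below assumes the conventional definitions, which match the ranges and interpretations given above: the risk difference is the Black discipline rate minus the comparison rate, and the risk ratio is the Black discipline rate divided by the comparison rate. The function names and example rates are hypothetical.

```python
def risk_difference(bdr: float, odr: float) -> float:
    """Black discipline rate minus comparison-group rate (assumed definition)."""
    return bdr - odr


def risk_ratio(bdr: float, odr: float) -> float:
    """Black discipline rate divided by comparison-group rate (assumed definition)."""
    if odr == 0:
        raise ZeroDivisionError("risk ratio is undefined when the comparison rate is 0")
    return bdr / odr


# Hypothetical rates: 12% of Black students and 5% of White students disciplined.
rd = risk_difference(0.12, 0.05)
rr = risk_ratio(0.12, 0.05)
print(round(rd, 2))  # 0.07
print(round(rr, 2))  # 2.4
```

With these hypothetical rates, the risk difference would be read as a gap of 7 percentage points, and the risk ratio as Black students being disciplined at 2.4 times the rate of White students.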
If the size of the Black student population is BC, raw differential representation can theoretically range from negative BC to positive BC. Negative values indicate that White students receive proportionately more discipline than Black students, a value of 0 indicates that Black and White students have similar disciplinary experiences, and positive values indicate that Black students receive proportionately more discipline than White students. When raw differential representation is positive, it can be understood as a measure of
In their article, Girvan et al. (2019) discussed and advocated for a new measure of Black overrepresentation called the difference in standardized risks (i.e., probit d’). The measure is constructed by first standardizing the Black discipline rate relative to an inverse normal distribution so that it is on a z-score distribution, then repeating the same step for the White discipline rate. Finally, one then subtracts the standardized White discipline rate from the standardized Black discipline rate. Negative values indicate White overrepresentation, a value of zero indicates parity, and positive values indicate Black overrepresentation. However, because related parameters have been standardized along a z distribution, interpretation relies on reference to conventions related to Cohen’s d, a measure of standardized effect size whose classic interpretation (see, e.g., Cohen, 1998; Lakens, 2013) can be summarized as follows:
values above 0.2 indicate a “small” effect,
values above 0.5 indicate a “medium” effect, and
values above 0.8 indicate a “large” effect
While these ranges provide helpful benchmarks, it is important to note, as many scholars have warned, that even “small effect sizes can have large consequences” (Chamberlain et al., 2014, p. 102; King et al., 2016, p. 294; Lakens, 2013, p. 3; Prairie et al., 2023, p. 205; Ratcliffe et al., 2019, p. 224).
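The difference in standardized risk can be sketched with Python's standard library, whose `statistics.NormalDist().inv_cdf` provides the inverse of the standard normal cumulative distribution. This is an illustration under stated assumptions rather than Girvan et al.'s (2019) own procedure: the function names and example rates are hypothetical, the benchmark labels simply restate the conventions listed above, and the "below small" label is our own placeholder.

```python
from statistics import NormalDist


def difference_in_standardized_risk(bdr: float, odr: float) -> float:
    """Probit-transform each rate, then subtract the comparison value."""
    if not (0 < bdr < 1 and 0 < odr < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.inv_cdf(bdr) - std_normal.inv_cdf(odr)


def benchmark_label(d: float) -> str:
    """Map the magnitude of d onto the conventional benchmarks."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    if size > 0.2:
        return "small"
    return "below small"  # placeholder label, not from the source


# Hypothetical rates: 12% of Black students and 5% of White students disciplined.
d = difference_in_standardized_risk(0.12, 0.05)
print(round(d, 2), benchmark_label(d))  # 0.47 small
```

Rates of exactly 0 or 1 have no finite probit value, which is why the sketch rejects them; how such cells should be handled in practice is a separate analytic decision.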
As discussed above, one calculates the Black discipline rate by dividing the Black discipline count by the total Black count (BDR = BDC / BC). The Black discipline rate indicates the proportion of Black students who experience discipline. However, some metrics instead have an inverted focus and indicate the proportion of all disciplined individuals who are Black (PDB). To calculate this proportion, one divides the Black discipline count by the total discipline count. Thus, if we use “T” to indicate the “total population” (i.e., all students, including Black students), the proportion of disciplined individuals who are Black equals the Black discipline count divided by the total discipline count, or PDB = BDC / TDC.
In the same way that we can calculate the proportion of the disciplined population who are Black, we can calculate the proportion of the total student population who are Black. Thus, we get that the “proportion of the population who are Black” is equal to the Black count divided by the total count, or PPB = BC / TC.
With these two new building blocks, we can generate two new measures of disparity: the disproportionality ratio and the disproportionality difference.
The discipline disproportionality ratio (Rodriguez & Welsh, 2022) can theoretically range from 0 to infinity. A value of 0 would indicate that the Black discipline rate is 0. Values between 0 and 1 indicate that while
One can calculate the disproportionality difference using the same two quantities (PDB and PPB) that one uses to calculate the related ratio, subtracting PPB from PDB. This measure can range from −1 to 1. With this measure, a negative value indicates that Black students are underrepresented among those disciplined, a value of 0 indicates equality, and a positive value indicates that Black students are overrepresented. In the most extreme case, imagine that all disciplined students are Black (so PDB = 1) and that Black students represent only 1% of the student population (so PPB = 0.01). The disproportionality difference would therefore be 1 – 0.01 = 0.99, which is extremely close to 1 and indicates a very high degree of disparity.
For example, the GAO (2018) used the measure to describe the degree to which Black students were overrepresented among those receiving suspensions in the 2013–2014 school year. The report indicated that about 39% of students who received suspensions were Black (a proportion of 0.39), and that about 16% of the overall student population was Black (a proportion of 0.16). Thus, the report concluded that Black students experienced an “overrepresentation” (or disproportionality difference) of about 23 percentage points (39% – 16%) which, in proportion terms, would be 0.23 (0.39 – 0.16).
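The arithmetic in the two examples above can be checked with a short Python sketch. The disproportionality difference is computed as PDB minus PPB, as in the worked examples; the ratio is assumed here to be PDB divided by PPB, since it is described as using the same two quantities. Function names are hypothetical.

```python
def disproportionality_difference(pdb: float, ppb: float) -> float:
    """Share of disciplined students who are Black minus share of all students who are Black."""
    return pdb - ppb


def disproportionality_ratio(pdb: float, ppb: float) -> float:
    """Assumed definition: PDB divided by PPB."""
    if ppb == 0:
        raise ZeroDivisionError("ratio is undefined when PPB is 0")
    return pdb / ppb


# Figures reported by the GAO (2018) for suspensions in 2013-2014.
gao_difference = disproportionality_difference(0.39, 0.16)
gao_ratio = disproportionality_ratio(0.39, 0.16)

# The extreme hypothetical case: all disciplined students are Black, 1% of students are Black.
extreme_difference = disproportionality_difference(1.0, 0.01)

print(round(gao_difference, 2))      # 0.23
print(round(gao_ratio, 2))           # 2.44
print(round(extreme_difference, 2))  # 0.99
```

Under the assumed ratio definition, the GAO figures would correspond to Black students making up roughly 2.4 times as large a share of suspended students as of the overall student population.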
Bottiani et al. (2023) also provide a novel means of determining Black overrepresentation known as the “e-Formula.” The e-Formula, in essence, compares the proportion of suspended students who are Black to an allowable cutoff that is a function of the proportion of all students who are Black. If the proportion of suspended students who are Black is substantially higher than the proportion of all students who are Black, then the e-formula will return a high value. One can use the e-Formula to calculate an e-Formula score. The e-Formula score indicates
Technically, the e-Formula score can range from negative infinity to positive infinity. Negative scores indicate that Black students are underrepresented among those disciplined; a score of 0 indicates that Black students are neither underrepresented nor overrepresented (i.e., that the proportion of suspended students who are Black is equal to the proportion of students who are Black); positive scores indicate that Black students are overrepresented; and a score above 1 indicates that the level of Black overrepresentation is “over the allowable threshold.” In their paper, Bottiani et al. (2023) depict e-Formula scores as ranging from 0 to 5, suggesting that scores above 5 would represent extreme levels of disparity.
Critically, in Bottiani et al.’s (2023) examples, the total number of disciplined students was quite low (just over 100). In our data, the total number of disciplined students generally is in the thousands and sometimes approaches one million. This is an important detail, as one critical facet of the e-Formula score is that it is determined in part by the total number of disciplined individuals. In cases where the total number of disciplined individuals is quite large, the SE value will be quite small and the e-Formula score will be incredibly large (as the e-Formula score is determined by dividing by the SE value). As such, we believe that when the total number of disciplined individuals grows beyond a certain point (and the resulting e-Formula score is in the hundreds, or even thousands), the e-Formula score ceases to have practical significance. We nonetheless calculate e-Formula scores across potential permutations to demonstrate how this measure would operate in the context of large-scale federal data.
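The sensitivity to the number of disciplined students can be illustrated with a rough Python sketch. The exact formula is not reproduced in the text, so the version below is an assumption that is merely consistent with the description: the score is taken to be (PDB − PPB) divided by an SE term, with SE assumed to be the binomial standard error sqrt(PPB × (1 − PPB) / N), where N is the total number of disciplined students. The function names are hypothetical, and the proportions reuse the GAO (2018) figures purely as example inputs.

```python
import math


def e_formula_se(ppb: float, total_disciplined: int) -> float:
    """Assumed SE term: binomial standard error of PPB at sample size N."""
    if total_disciplined <= 0:
        raise ValueError("total_disciplined must be positive")
    return math.sqrt(ppb * (1 - ppb) / total_disciplined)


def e_formula_score(pdb: float, ppb: float, total_disciplined: int) -> float:
    """Assumed score: gap between PDB and PPB, expressed in SE units."""
    return (pdb - ppb) / e_formula_se(ppb, total_disciplined)


# Same proportions, increasingly large numbers of disciplined students.
scores = {n: e_formula_score(0.39, 0.16, n) for n in (100, 10_000, 1_000_000)}
for n, score in scores.items():
    print(f"N = {n:>9,}: score = {score:,.1f}")
```

Under this assumed form, the score grows with the square root of N: holding the proportions fixed, a 10,000-fold increase in the number of disciplined students multiplies the score by 100, which is how scores in the hundreds or thousands can arise in national data.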
A Final Consideration: Which Year of Data
Above, we have described four key considerations that can be used to organize the many estimates of Black overrepresentation one might generate using more recent federal data. One final consideration is which
Still, the major drawback of our choice to use 2017–2018 data rather than 2020–2021 data is that the latter, while anomalous, is substantially more recent. However, given that current scholarship largely relies on the 2013–2014 data, we believe generating estimates using the 2017–2018 data will increase the recency and relevance of the data available. In addition, the 2017–2018 data would reflect shifts generated by a critical period of policy activity during which, for example, many states banned the use of discipline in certain contexts and required schools to collect and report data regarding discipline disparities, and during which the federal government implemented rethinking school discipline (2014–2019), a package of policies and funding streams designed to reduce racial disparities in discipline.
Methods
Accessing Data and Calculating Counts
We first validated our method of generating discipline counts and rates by leveraging 2013–2014 data and reproducing the estimates presented in the GAO (2018) report. We reproduced the GAO estimates almost exactly. Where estimates diverged, they did so only marginally, likely because the GAO used restricted-use data whereas we leveraged publicly available data, which is modified slightly to ensure researchers cannot identify individual students. We therefore applied our methods to the 2017–2018 and 2020–2021 data.
We next produced counts and rates for Black students, White students, all students, and non-Black students for each of the six types of punishment and for each of the 16 student subpopulations described above. We then produced measures of overrepresentation. In most cases, each permutation (of comparison group by punishment type by subpopulation by measure) is unique. However, certain permutations are mathematically identical to one another (see Supplement in the online version of the journal), and duplicative permutations are excluded. In addition, federal data does not include information about in-school suspensions, referrals to law enforcement, or school-related arrests for preschool students.
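The structure of the permutation grid can be sketched in Python as follows. This enumerates only the full grid and the cells for which federal data exist; it does not attempt to reproduce the removal of mathematically identical permutations, which is documented in the Supplement and is what would reduce the total further. The labels are shorthand for the categories listed earlier, and the variable names are hypothetical.

```python
from itertools import product

PUNISHMENTS = [
    "In-school suspension", "Out-of-school suspension", "Expulsion",
    "Corporal punishment", "Referral to law enforcement", "School-related arrest",
]
COMPARISON_GROUPS = ["White students", "Non-Black students", "All students"]
SUBPOPULATIONS = [
    "All", "Male", "Female", "Special education",
    "Poor schools", "Semi-poor schools", "Semi-rich schools", "Rich schools",
    "Traditional public", "Magnet", "Charter", "Alternative",
    "Preschool", "Elementary", "Middle", "High",
]
MEASURES = [
    "Risk difference", "Risk ratio", "Raw differential representation",
    "Difference in standardized risk", "Disproportionality ratio",
    "Disproportionality difference", "e-Formula",
]

# Punishment types with no federal data for preschool students.
NO_PRESCHOOL_DATA = {
    "In-school suspension", "Referral to law enforcement", "School-related arrest",
}

full_grid = list(product(PUNISHMENTS, COMPARISON_GROUPS, SUBPOPULATIONS, MEASURES))
available = [
    (punishment, group, subpop, measure)
    for punishment, group, subpop, measure in full_grid
    if not (subpop == "Preschool" and punishment in NO_PRESCHOOL_DATA)
]

print(len(full_grid))  # 2016
print(len(available))  # 1953
```

The gap between 1,953 and the number of unique estimates ultimately reported reflects the duplicative permutations that are excluded.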
Calculating Measures of Overrepresentation
We used Microsoft Excel to calculate each of the seven measures of overrepresentation for each combination of punishment type by comparison group by subpopulation. Calculations for six of the seven measures involve simple algebraic functions, so we do not provide detailed instructions for calculating these measures. However, calculating the difference in standardized risk (4) involves the standardization of discipline rates using an inverse normal distribution. Consistent with Girvan et al.’s (2019) guidance, there are many approaches one can take to achieve this standardization, and we used Microsoft Excel’s NORM.INV function. To validate our use of the function, we first reproduced an example provided in Girvan et al.’s paper before applying the same procedure to our data.
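As a rough illustration of the standardization step, the following Python sketch converts each group's discipline rate to a z-score with the inverse of the normal cumulative distribution and then takes the difference. It assumes a standard normal distribution (mean 0, standard deviation 1), which is an assumption made here rather than a parameter choice reported above. The function names and the example rates are hypothetical and are not drawn from the federal data or from Girvan et al. (2019).

```python
from statistics import NormalDist


def standardized_risk(rate: float) -> float:
    """Map a discipline rate (strictly between 0 and 1) to a z-score.

    Assumption: a standard normal (mean 0, sd 1) is the reference distribution.
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(rate)


def difference_in_standardized_risk(target_rate: float, comparison_rate: float) -> float:
    """Difference between the two groups' standardized (z-scored) rates."""
    return standardized_risk(target_rate) - standardized_risk(comparison_rate)


# Hypothetical rates: 50% of the target group and 25% of the comparison group disciplined.
example = difference_in_standardized_risk(0.50, 0.25)
print(round(example, 3))
```

Under these assumptions, a positive value indicates that the target group's standardized rate exceeds the comparison group's, and a value of 0 indicates identical rates. Rates of exactly 0 or 1 have no finite z-score, so the sketch rejects them rather than guessing how they should be handled.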
Results
Are Black Students Disciplined and Punished More Than Their Peers?
As noted above, we measured Black overrepresentation across six types of punishment, three comparison groups, 16 populations/subpopulations, and seven types of measurement. Reviewing 2017–2018 data, we generated a total of 1,581 unique estimates of Black overrepresentation. And reviewing each of these estimates against the benchmarks described above (e.g., “a risk difference greater than 0 indicates overrepresentation”), we find evidence that Black students are overrepresented among those disciplined in 99% of our estimates (1,564/1,581 analytic permutations). 1
Among all the permutations we reviewed, the only ones that
Do Different Measures of Overrepresentation Indicate Different Findings?
We see the same pattern of Black overrepresentation across each of our seven measures of overrepresentation (see Supplement in the online version of the journal). To provide a general sense of the pattern of findings that pervades each measure, we present a visual representation of our estimates of the difference in standardized risk across comparison groups, punishment types, and subpopulations. Because the measure is standardized, one can use the difference in standardized risk to compare estimates across varied contexts and visually ascertain where Black overrepresentation is more severe. As seen in Figure 1, Black students are overrepresented among those experiencing various forms of punishment in varied contexts and across varied subpopulations, and this finding holds whether one compares Black students to White students, non-Black students, or all students.

Figure 1. Black overrepresentation among those disciplined as difference in standardized risk estimates across six types of punishment, three comparison groups, and 16 subpopulations.
While the overall conclusion one draws (that Black students are overrepresented among those excluded and punished) does not vary depending on the measure of overrepresentation one selects, the nuances regarding where overrepresentation appears most severe depend on the measure selected. Specifically, when one selects a measure rooted in differences (e.g., the risk difference), disparities will look more severe among subpopulations that have a higher general rate of discipline and exclusion. For example, because boys generally have a higher discipline rate than girls, the “risk difference” comparing discipline rates for Black students and comparison group members will generally look more extreme for boys than for girls. The same is not necessarily true when one selects a measure rooted in ratios (e.g., the risk ratio). For these measures, proportional (rather than absolute) differences between rates drive the size of the disparities observed. As a result, when reviewed as a risk ratio, the racial disparity in discipline among girls appears more concerning than the disparity among boys.
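The contrast between difference-based and ratio-based measures can be sketched with a small Python example. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are likewise illustrative and do not describe the spreadsheet formulas used for the estimates reported here.

```python
def risk(disciplined: int, enrolled: int) -> float:
    """Share of enrolled students who were disciplined."""
    return disciplined / enrolled


def risk_difference(target: float, comparison: float) -> float:
    """Absolute gap between two rates."""
    return target - comparison


def risk_ratio(target: float, comparison: float) -> float:
    """Proportional gap between two rates."""
    return target / comparison


# Hypothetical (disciplined, enrolled) counts for each subgroup.
counts = {
    "boys": {"Black": (50, 100), "White": (25, 100)},
    "girls": {"Black": (20, 100), "White": (5, 100)},
}


def summarize(table: dict) -> dict:
    """Compute both measures for every subgroup in the table."""
    summary = {}
    for subgroup, groups in table.items():
        black_rate = risk(*groups["Black"])
        white_rate = risk(*groups["White"])
        summary[subgroup] = {
            "risk_difference": risk_difference(black_rate, white_rate),
            "risk_ratio": risk_ratio(black_rate, white_rate),
        }
    return summary


results = summarize(counts)
for subgroup, measures in results.items():
    print(subgroup, round(measures["risk_difference"], 2), round(measures["risk_ratio"], 2))
```

With these hypothetical counts, the risk difference is larger for boys (0.25 versus 0.15), while the risk ratio is larger for girls (4 versus 2), so the two measures rank the same subgroups in opposite order.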
How Much More Are Black Students Punished: The Risk Ratio Perspective
While the difference in standardized risk provides an elegant means of comparing measures of overrepresentation across myriad permutations, it lacks a straightforward real-world interpretation. Thus, in Figure 2, we also provide estimates of the Black–White risk ratio for each subpopulation and punishment type. Using this measure, we see that, relative to White students, Black students were 3.6 times more likely to have been suspended out of school, 2.5 times more likely to have been suspended in school, 3.4 times more likely to have been expelled, 2.4 times more likely to have been referred to law enforcement, 2.9 times more likely to have experienced a school-based arrest, and 2.3 times more likely to have been corporally punished. Disparities emerge as early as preschool, where, for example, Black students were 2.8 times more likely to have been suspended out of school and 2.4 times more likely to have been expelled. They persist in early grades, with Black elementary school students being 5.0 times more likely to have been suspended out of school, and 4.9 times more likely to have experienced corporal punishment. Particularly jarring disparities emerge in alternative schools, where Black students were 3.1 times more likely to have experienced a school-based arrest, and

Figure 2. Black overrepresentation among those disciplined as Black–White risk ratio estimates across six types of punishment and 16 subpopulations.
Which Kinds of Discipline and Punishment Exhibit the Largest Disparities?
Black students are more severely overrepresented among those experiencing certain kinds of discipline and punishment—specifically, out-of-school suspensions and in-school suspensions, and, in certain cases, corporal punishment (e.g., in alternative schools and when reviewing ratio-based measures like the risk ratio). However, Black students are clearly, albeit less severely, overrepresented when considering other kinds of discipline and punishment as well (referrals to law enforcement, expulsions, and school-related arrests).
Which Subpopulations See the Largest Disparities?
Where overrepresentation is the most severe, it tends to involve one of three subpopulations: students in wealthy schools, students in semi-wealthy schools, and students in alternative schools. In wealthy and semi-wealthy schools, we see the largest disparities in out-of-school suspensions, in-school suspensions, and expulsion rates. Meanwhile, in alternative schools, we see the most severe disparities in corporal punishment rates. Critically, however, there is no subpopulation where we do
How Do Different Comparison Groups Impact Measured Overrepresentation?
Consistently, we see that estimates that rely on White students or non-Black students as the comparison group suggest a larger degree of overrepresentation than estimates that rely on all students as the comparison group. This is not surprising when one considers that Black students are included within “all” students, so comparing Black students to all students (including Black students) pulls the comparison rate toward the Black rate and partially washes out the unique experiences of Black students.
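A brief Python sketch shows why the choice of comparison group matters. The counts are hypothetical, and the variable names are illustrative only. Because the "all students" group pools Black and non-Black students, its rate sits between the two groups' rates, which shrinks the measured gap.

```python
def risk(disciplined: int, enrolled: int) -> float:
    """Share of enrolled students who were disciplined."""
    return disciplined / enrolled


# Hypothetical (disciplined, enrolled) counts.
black = (40, 100)
non_black = (60, 300)

# "All students" pools both groups, so it includes Black students.
all_students = (black[0] + non_black[0], black[1] + non_black[1])

black_rate = risk(*black)
non_black_rate = risk(*non_black)
all_rate = risk(*all_students)

# Risk ratios against each comparison group.
rr_vs_non_black = black_rate / non_black_rate
rr_vs_all = black_rate / all_rate

print(round(rr_vs_non_black, 2), round(rr_vs_all, 2))
```

In this hypothetical school, the Black rate is twice the non-Black rate but only 1.6 times the rate for all students, even though the underlying counts are identical.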
Future Research on Multiple Instances and Multiple Forms of Punishment
It is important to note that the measures of exclusion and punishment available in federal data are presented in terms of the
Raw Data for Future Research
We conclude our review of our results by presenting raw counts and rates for Black, White, non-Black, and all students across each of the punishments and subpopulations described above (see Table 1). We present these data both to allow others to reproduce our estimates and to help researchers, media personnel, and policymakers access more recent, and more nuanced, data regarding the disciplinary experiences of Black students.
Table 1. Counts and Rates of Disciplinary Experiences for Black, White, Non-Black, and All Students, by Punishment Type and Subpopulation
Discussion
No matter how you slice it, Black students are overrepresented among those punished and excluded. One arrives at this conclusion whether they compare Black students to White students, to non-Black students, or to all students; whether they look at in-school suspensions, out-of-school suspensions, expulsions, corporal punishment, referrals to law enforcement, or school-related arrests; whether they look at all students, male students, female students, SPED students, students in wealthy schools, students in poor schools, students in traditional public schools, students in magnet schools, students in charter schools, students in alternative schools, students in preschools, students in elementary schools, students in middle schools, or students in high schools; and whether they calculate the risk difference, risk ratio, raw differential representation, difference in standardized risk, disproportionality ratio, disproportionality difference, or e-Formula. Disparities persist. Disparities are widespread. And disparities are pronounced.
These data are not simply mathematical abstractions, but distillations of real human experiences—of students, families, and communities. One aptly named measure that helps contextualize these human experiences is raw differential representation—a measure that indicates
The Manner of Measurement
A core finding is that regardless of the mode of measurement, Black students are clearly overrepresented among those excluded and punished. However, while each measurement approach indicates that racial disparities exist within
Imagine, for example, a principal in a school with 100 Black boys, 100 Black girls, 100 White boys, and 100 White girls; and imagine that, among these students, the numbers who are disciplined are 50 Black boys, 20 Black girls, 25 White boys, and 5 White girls. In this case, the risk difference would be 0.25 for boys and 0.15 for girls. This might suggest that there is a larger “discipline disparity” problem among boys. And indeed, one could argue that this is so given that Black boys represent a huge share of the total disciplined population in the school (as they are 25% of the student population, but 50% of the disciplined population). However, consider how the conclusions drawn might differ had the principal focused on the risk ratio. In that case, the measure of overrepresentation would be a risk ratio of 2 for boys and of 4 for girls, suggesting a larger “discipline disparity” problem among girls. And again, a case could be made that this is so, particularly considering that for every one White girl who is disciplined, one would expect
From a policy perspective, one major implication of this research is that racial disparities in exclusion and punishment, which pervade all levels and types of punishment, are woven into the fabric of the Black student experience. This finding lends weight to the arguments of many scholars who have depicted federal educational policies as treating Black students as if they are culturally deficient or intellectually inferior (e.g., Love, 2023). Given historical roots and enduring trends, efforts to shift our K–12 paradigm and ensure that disparities (and their attendant harms) are not a fundamental aspect of the lives of Black youth may require concerted federal investments.
Federal Data, Federal Policies
It is worth noting that there have been prior federal initiatives designed to actively reduce racial disparities. In 2014, after federal data demonstrated stark discipline disparities, then–Secretary of Education Arne Duncan unveiled an ambitious initiative called “Rethinking School Discipline” that was designed to close the discipline gap. The plan combined carrots (millions of dollars of grant funding to implement alternatives to exclusionary discipline such as restorative practices) and sticks (threats to investigate and potentially withhold Title I funding from schools with large Black–White disparities in discipline) to catalyze widespread shifts in school practices. In the school years following the introduction of Rethinking School Discipline, research suggests that schools increased their use of restorative practices (Darling-Hammond, 2023) and that racial disparities in discipline declined (Leung Gagné et al., 2022).
However, in 2018, then–Secretary of Education Betsy DeVos rescinded Rethinking School Discipline, and in the intervening years, many school districts have embraced more punitive policies that may encourage disparities to grow (Arango, 2023). In light of these trends, some have called on the Department of Education to reinstate Rethinking School Discipline or promulgate an updated policy package designed to reduce racial disparities in exclusion and punishment (Losen & Martinez, 2020). It is, of course, beyond the scope of this research to determine the precise policy prescription that can combat persistent racial disparities in discipline and punishment. However, we believe this research (which uses federal data to document a federally pervasive problem) indicates that a
State and Local Approaches
Thinking beyond federal policy, it is worth noting that many educational institutions (from schools to state education departments) have invested substantial time, effort, and resources toward reducing racial disparities in discipline, and many of these efforts have been carefully researched. We conclude by highlighting research-backed approaches to reducing racial disparities. One such practice comes from Okonofua, Goyer, et al. (2022), who recently reported on findings from a longitudinal field experiment testing whether providing teachers with professional development in the form of a brief “empathic mindset” intervention might reduce racial disparities in discipline. Their intervention reduced Black–White disparities in discipline (measured as risk differences) from 10.6 percentage points to 5.9 percentage points—a 45% reduction. Moreover, reductions persisted in the subsequent school year, suggesting that student exposure to an empathic teacher had enduring effects. Okonofua has written in other research articles (Okonofua & Eberhardt, 2015; Okonofua, Harris, et al., 2022; Okonofua et al., 2020) that the empathic-mindset intervention is not designed to
Researchers have also found that for Black students, quasi-random assignment to schools with
Theorists and policymakers have argued that a key strategy for reducing racial disparities in discipline is the school-wide implementation of alternatives to exclusionary discipline. One common alternative is Positive Behavioral Interventions and Supports (PBIS), which is an expansive framework of professional development, monitoring, and evaluation designed to help educators (a) teach students social and emotional skills to improve students’ interpersonal behavior and improve school climates; (b) create paradigms that celebrate good behavior to incentivize youth to make good decisions; and (c) leverage evidence-based, developmentally appropriate, educational, and nonpunitive responses to misbehavior. While research has consistently found that PBIS implementation relates to lower discipline rates, generally, some research has suggested that implementing PBIS in its most common form is unlikely to reduce racial disparities (Barclay et al., 2022). However, a recent school-level randomized controlled trial found that a disparity-conscious version of PBIS successfully reduced racial disparities (McIntosh et al., 2021). Researchers attributed the success of their intervention to the inclusion of a program called “ReACT” which seeks to achieve “Racial equity through Assessing data for vulnerable decision points, Culturally responsive behavior strategies, and Teaching about implicit bias and how to neutralize it” (McIntosh et al., 2021, p. 434). Taken together, PBIS may, when implemented in a race-conscious manner, provide a pathway for reducing racial disparities in exclusion and punishment.
Another alternative to exclusionary discipline that is often described as a potential pathway to reducing racial disparities in discipline is restorative practices (RP). RP includes two categories of practices: (a) community-building activities designed to proactively improve school relationships so that misbehavior and misunderstandings are less common; and (b) harm repair activities designed to repair relationships and help misbehaving students develop contrition, empathy, and intrinsic motivation to avoid misbehavior in the future. While school-level studies of the impact of restorative practices on exclusionary discipline disparities have provided mixed results (Darling-Hammond, 2020), recent evidence suggests that when students see
Pathways to Equity
Taken together, research provides clues regarding how schools might forge pathways toward steady reductions in racial disparities in exclusion and punishment. These pathways might leverage a mix of teacher professional development in equity and empathy, educator workforce diversity, equity-oriented PBIS, and expansive RP. However, whatever shape these pathways could take, what is clear is that there is an urgent need to begin building them, so that we can combat the persistent, pervasive, and pernicious disparities documented in the present research. We hope that educational leaders will explore means of incentivizing, guiding, and supporting educators to create schools where students of all backgrounds experience equity, dignity, and opportunity.
Supplemental Material
Supplemental material, sj-docx-1-ero-10.1177_23328584241293411 for No Matter How You Slice It, Black Students Are Punished More: The Persistence and Pervasiveness of Discipline Disparities by Sean Darling-Hammond and Eric Ho in AERA Open
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Notes
Authors
SEAN DARLING-HAMMOND is an assistant professor at the University of California, Berkeley School of Public Health, 2121 Berkeley Way, Berkeley, CA 94704; e-mail:
ERIC HO is a lecturer in the University of California, Los Angeles Department of Education, Moore Hall, 457 Portola Plaza, Los Angeles, CA 90095; e-mail:
References