Abstract
How scholars name different racial groups has powerful salience for understanding what researchers study. We explored how education researchers used racial terminology in recently published high-profile peer-reviewed studies. Our sample included 1,427 original empirical studies published in the nonreview AERA journals from 2009 to 2019. We found that two thirds of articles used at least one racial category term, with an increase from about half to almost three quarters of published studies between 2009 and 2019. Other trends include the increasing popularity of the term Black, the emergence of gender-expansive terms such as Latinx, the popularity of the term Hispanic in quantitative studies, and the paucity of studies with terms connoting missing race data or including terms describing Indigenous and multiracial peoples.
Discussions referencing specific racial groups are ubiquitous in education research. 1 These descriptions are often superficial, such as when a quantitative study includes race indicators as covariates, or tangential, as is the case where race is mentioned but not critically explored (Garcia & Mayorga, 2018; James, 2001; Ladson-Billings, 2012; Tabron & Thomas, 2023). However, the ways in which we name and describe different racial groups have powerful salience for understanding what researchers believe and what they study (Gillborn et al., 2018; Ladson-Billings, 2012; Ma et al., 2007; Turner et al., 2024).
Racialization, the application of racial meaning to a “relationship, social practice, or group” in the United States, is a complex process that, while socially constructed, can result in tangible consequences (Omi & Winant, 2014, p. 111). Race is a sociopolitical and historical categorization structure that can differ across cultures and over time (e.g., Omi & Winant, 2014; Roth et al., 2023; Viano & Baker, 2020). As noted in Howell and Emerson (2017), “no one can identify ‘true’ racial classifications, because they do not exist” (p. 15). These shifts in racial classification systems should not be considered natural or value neutral. The variations in sets of racial categories have often sprung from the need to maintain White supremacy, leading, within the United States, to enslavement, the stealing of land and resources from Indigenous peoples, Jim Crow segregation, explicit barring of Asian people from immigration and citizenship, and eugenics, among other policy decisions (Omi & Winant, 2014; Saperstein & Penner, 2012). Incorporating racial group status into research without deeper engagement with the reasoning behind the categories thus can reinforce harms and may lead to research that is used to oppress groups of people. For instance, as described by O’Connor, Lewis, and Mueller (2007), a variable indicating Black racial identification is often included with a long list of covariates in statistical models with little theorization or engagement with prior literature to explain why the researchers think it useful to include the information. In doing so, identifying or being perceived as Black is reduced to a category whose membership is mediated through what are seen as sociocultural deficiencies while also obfuscating the process of racialization that sorts groups of people into the category of being Black (Ladson-Billings, 2012; O’Connor et al., 2007; Omi & Winant, 2014; Zuberi & Bonilla-Silva, 2008).
Consequently, the decisions about how to name and describe race in education research have powerful salience for interpretation of the findings.
Even though racial data are expected measures in most analytic frameworks (e.g., “race without racism”; Harper, 2012), some areas of the education research community have little critical engagement with understanding contemporary usages of these terms in our research studies (Denton & Deane, 2010; Johnston-Guerrero, 2017). Better understanding of how education researchers use language to describe racial groups is paramount to broader efforts to create an inclusive education research community (Galvez & Muñoz, 2020; Salinas, 2020) as well as to communicate education research more effectively.
Although uncommon in education, researchers in other fields (e.g., biomedicine and demography) regularly examine how racial data are collected, categorized, and used (e.g., Caulfield et al., 2009; Rachul et al., 2011; Shanawani et al., 2006; Stevens et al., 2015). For example, Lee (2009) reviewed National Cancer Institute–supported publications for their terminology related to race, finding that these studies commonly invoked racial terminology but rarely described definitions or operationalization. The education research community similarly could benefit from an explicit examination of how race is being categorized/discussed in scholarship because this, in turn, shapes how future researchers employ racial categorization and frame the implications of their findings.
Unlike prior work that critically analyzes how education research often avoids recognizing racism as the cause of racial gaps (Harper, 2012; Kohli et al., 2017), this kind of exploration of the literature seeks to understand what specific language education researchers are using to describe racial categories. How researchers write about racial categories is likely heterogeneous within fields for a variety of reasons, including the expectations of journal editors, common practices of reporting by method, and trends in how others are writing about racial categories. First, both Harper (2012) and Kohli et al. (2017) recommend that journals play a role in reinforcing requirements or norms around the discussion of racial categories, supporting the hypothesis that the journal publishing the article might partially determine how racial categories are described (Lee, 2009; Ma et al., 2007). Second, racial categorizations are dynamic, with categories changing based on shifting social understandings about the boundaries between groups and fluctuating expectations around what is considered appropriate terminology (Denton & Deane, 2010; Ma et al., 2007; Williams, 1999). For example, the term Asian American was created in the 1960s as a political resistance tactic (e.g., Ishizuka, 2018), Irish people were not originally considered White on immigration to the United States (e.g., Brodkin, 1998), and, this year, the U.S. Office of Management and Budget (OMB) announced Middle Eastern and North African as a new racial category (previously included under White) for the standard racial categories for the federal government (Wang, 2024). One might therefore expect to see some shifts in language over time, although such shifts might fail to spread fieldwide.
Third, the differences in how qualitative and quantitative researchers theorize and analyze racial differences are well documented, with substantial criticism of the ways in which quantitative research often reinforces racial hierarchies instead of supporting justice-focused efforts to eliminate racial inequality (Kohli et al., 2017; O’Connor et al., 2007; Turner et al., 2024; Zuberi, 2001). These critiques particularly highlight the erasure of student groups with smaller population sizes, such as Indigenous students (Shotton et al., 2012). To our knowledge, these observations have not yet been paired with an analysis of how qualitative or quantitative studies operationalize racial categories. Exploring racial category usage across all three of these complicating factors would allow for a heightened understanding of how educational research has recently conceptualized these categories and potential reasons why these categorizations vary.
In this study we explore how educational researchers have used racial terminology in published peer-reviewed studies between 2009 and 2019. We focus on original research published in journals from the American Educational Research Association (AERA, https://www.aera.net/Publications/Journals) because these journals attract a wide variety of articles of different orientations and publish articles at the forefront of education research. Our work addresses the following research questions:
1. To what extent are there trends over time in the rate of inclusion of racial categories in published education research overall, within journal, and by research method?
2. What terminology does published education research use to describe racial categories? Does that terminology differ by journal, research method, or over time?
By systematically exploring these published articles, we seek to make the following contributions: (1) quantify how often researchers acknowledge or omit race in education research, (2) provide a baseline understanding of which racial categories education researchers use and their frequency, (3) investigate differences among journals, (4) explore trends over time for how language on racial groupings is changing, and (5) understand differences across methodologies. In doing so, we seek to spark a conversation about how we, as a community of education researchers, can represent racial groups in our research in ways that are both representative of our participants and reflective of broader efforts toward antiracism and decolonization.
Racial Categorization/Classification
Classifying and categorizing are part of the human social experience (Bowker & Star, 2000). Humans draw lines and classify most everything in our social world—we divide land into towns, counties, states, and nations in similar fashion to how we categorize people by language, ancestry, religion, beliefs, and other characteristics. Categories are social phenomena, laden with the contextual, political, and social understandings of the people who create and use them. Consequently, categories play a central role in our lives. They are often treated as static when, in fact, categorization is quite dynamic. Race categories in the United States, and around the world, are impacted by histories of settler colonialism, enslavement, xenophobia, and systemic anti-Black, anti-Indigenous, and anti-immigrant legacies (Mills, 1997; Omi & Winant, 2014; Saucier & Woods, 2018; Zuberi & Bonilla-Silva, 2008). These significant social forces shape how race categories are constructed and have real, material impacts on peoples’ lives—their access to healthcare, schooling, employment, and housing (Smedley & Smedley, 2005).
Increasingly, race scholars in sociology and political science are recognizing race as having multiple dimensions (Morning, 2009; Rockquemore et al., 2009; Roth, 2018; Sen & Wasow, 2016). A person has a racial identity—a rich tapestry of being and experiencing. A person has one or more racial categories with which they self-identify when confronted with a survey that prescribes discrete options (Rockquemore et al., 2009; Johnston et al., 2014). A person is racially appraised (or identified by observers) when they enter a classroom, walk down a street, or are seen at a hospital (López et al., 2018; Telles, 2014). A person’s racial identity, racial category, and “street race” may all align or may differ significantly depending on the context (López et al., 2018). In many cases, there is fluidity among and between these dimensions of race over the life course and depending on the context.
In education scholarship, researchers employ racial categories as shorthand to describe a rich, complex web of racializing experiences. Although these categories help to highlight group-level processes (places where systems of oppression are faced), the categories themselves hold power. Education researchers who use race categories without reflection risk reifying notions of innate racial difference. For instance, when studying the so-called Black–White test score gap, it can be easy for categorization to suggest that the gap is produced by differences innate to Black and White students rather than differences produced via systemic racism in and outside the schoolhouse (Ladson-Billings, 2006, 2007; O’Connor et al., 2007; Welner & Carter, 2013). Aligning with this concern, scholars have found evidence that using the framing of achievement gaps causes an increase in the negative perception of Black students among members of the general public and K–12 teachers (Quinn, 2020; Quinn & Desruisseaux, 2022). Researchers too often measure the effects of racism and attribute them to racial difference—reifying race and racism (Sewell, 2016; Zuberi, 2001). In summary, racial categories are created by people, either imposed or claimed, and are related to political and material interests. Racial categories have tangible impacts on peoples’ lives, including the kinds of resources and opportunities that are available to them.
Other Fields’ Consideration of Racial Categorization
Although uncommon in education, systematic and critical reviews of which racial categories are being used by researchers are common in other fields. Demography, epidemiology, media studies, political science, and criminology have regularly examined how racial data are collected, categorized, and used in research (e.g., Covington, 1995; Garcia, 2017; Gomez & Glaser, 2006; Hahn et al., 1996; Kanakamedala & Haga, 2012; Kelley et al., 1996; Megyesi et al., 2011; Shrikant & Sambaraju, 2021). We will describe two examples from other fields to illustrate this point.
From epidemiology, Ma et al. (2007) systematically reviewed every research article published in the Annals of Internal Medicine, the Journal of the American Medical Association, Lancet, and the New England Journal of Medicine between 1999 and 2003 (n = 1,152) for the included racial categories. They found that researchers used 16 terms for White, 13 terms for Black, 16 terms for Asian, and 11 terms related to Hispanic ancestry (Ma et al., 2007). After describing their results by journal and year, Ma et al. connected their systematic review of racial categories to larger issues of policy and practice in epidemiology. They wrote that to meet the goals set forth by the National Institute of Health (2005) and the American College of Physicians (Groman & Ginsburg, 2004), reflection and consistency across studies about how racial categories are used were a necessary “initial step to closing the health care gap” (Ma et al., 2007, p. 577).
A second illustrative example comes from demography. Stevens et al. (2015) analyzed the historical censuses of the United States, Canada, and Australia to examine how the racial categories have shifted over time. These three nations were selected because they all share a similar history of having Indigenous inhabitants, settler colonialism, and huge waves of global immigration in the 20th century. The authors identified three trends shared across these nations. First, new categories and new groupings of categories are added over time and reflect immigration patterns. For example, in the United States, the category of “Hispanic” was introduced in 1980, Canada began to use the concept of “visible minority” in 1990, and Australia began collecting parents’ birthplace as well as birthplace of the individual. Second, European ethnic groups (e.g., Italian and German) were once used as different racial categories. As racial difference among White people of direct European descent became less socially salient, a single White category became more common (although in Canada and Australia, the national signifiers “Canadian” and “Australian” are increasingly popular).
Third, each nation moved through several permutations of how to classify and categorize people with mixed-race heritage. In the first century of each census, the enumerator (door-to-door census taker) would visually observe, assess, and record a single race of each person. Later, government agencies provided decision rules for how to classify the race of the individual. In the United States, the race of the father was used in 1970 for a mixed-race person, but then the race of the mother was deemed a better indicator in 1980. Prior to the 1970s in Australia, the government asked fractions to be provided (e.g., “one half Aboriginal, one half Chinese”; Stevens et al., 2015, p. 25). Between 1990 and 2000, the census categories for the three countries changed to accommodate the selection of two or more races. Stevens et al. (2015) concluded that these three countries have different conceptions of race measurement, but they all share common confusion because of the volatility and complexity of racial identification.
Scholars in fields outside of education value these previous analyses of how race categories vary because where race boundaries are drawn shapes how people understand their social realities and also shapes the lessons scholars can take away from the research. We use these insights to inform our analysis of published education research, as described in the following section.
This Study
This review of prior research has argued that it would be shortsighted for education researchers to take racial categories for granted. Doing so makes invisible power and political/social mores, leads to conceptual and methodologic confusion, and, in some cases, can lead to faulty conclusions that can misinform policy/practice. This review of prior work reiterates that categories reflect the political and social interests of people and that, historically, race categories have been used to turn difference into a way to exclude and justify the oppression of people whose origins are not White/European.
In addition to drawing from theories and prior work on categorization/classification, we ground our study in the sociology of knowledge, which focuses on studying the daily, taken-for-granted, shared assumptions and rules about social life (e.g., Garfinkel, 1967). Sociologists of knowledge argue that the deepest insights about the social world can be observed in studying what others may view as mundane, such as the social norms, routines, and beliefs guiding daily life (Berger & Luckmann, 1967; Boutros, 2024; Garfinkel, 1967; Go, 2020; Kuhn, 1970; Merton, 1972; Morning, 2011; Riley et al., 2021; Roberts, 2011; Swidler & Arditi, 1994; Zuberi & Bonilla-Silva, 2008). Scholars can examine people’s lived experiences and motivations that gird daily life or the behaviors that result from those views. We focus on the latter in the topic motivating this study: the race categories researchers routinely use. As Omi and Winant (2014) highlight, “race operates as a ‘common-sense’ concept, a basic component of social cognition, identity, and socialization. . . . Race seems obvious and in some ways superficial” (p. 4). Therefore, the sociology of knowledge offers a valuable opportunity to investigate the racial classification system used by education scholars.
In line with this theoretical tradition, we are denaturalizing the categories by examining them closely and carefully not as objective reality but as decisions that people make and remake until they shape our social reality. What categories are being used in education research? By researchers using which methodologies? How has this changed over time? These questions may seem basic, but, given our theoretical frame, troubling the basic is where the deep social insights reside. Indeed, our research questions are foundational because the social process of categorization precedes all other research processes (Hirschman et al., 2016).
Given that race categories are not “natural” but rather the products of social and political decisions, we argue that researchers must pay attention to which categories are used and how they change, approaching these categories as political and social creations rather than “facts of biology and/or fate” (Gillborn et al., 2018, p. 172). Of note, when examining classification through this lens, it is also imperative to consider who has the power and authority to determine which categories are appropriate in a given time and place because this can play a central role in the creation of categories and their impact (e.g., Collins, 2015; Morning, 2009, 2011; Roberts, 2011; Swidler & Arditi, 1994). Because there is no “true” set of racial categories for any society, we believe that the role played by key stakeholders (e.g., the OMB, which determines which racial categories will be used by the federal government, and editors of academic journals) must be part of how we make meaning of the use of racial categories in this study. In contemporary education research, most empirical analyses use fixed categories for race without much consideration or reflection. Although education journals have not engaged in this kind of examination of which race categories are used, we demonstrate that this need not be the case, as other fields carefully examine how scholars use racial categories in research. This study aims to bring this kind of systematic analysis and careful thinking about racial categories to education research.
Research Methods
We provide an overview of our data-collection and analysis methods here. We include a positionality statement in supplementary material for this article (see Supplement A in the online version of the journal).
Data Collection
A member of the research team compiled a list of all potential research articles published from 2009 to 2019 in the AERA journals that do not exclusively publish reviews: AERA Open, American Educational Research Journal (AERJ), Educational Evaluation and Policy Analysis (EEPA), Educational Researcher (ER), and Journal of Educational and Behavioral Statistics (JEBS). 2 We include original empirical research to allow the authors of those articles more control over the racial categorization, since review articles likely would use the language of the articles they were reviewing. This search resulted in 1,623 articles. 3
A different research team member created a coding frame focused on (1) whether the article included original empirical research, (2) key facts about the article (e.g., keywords, first author’s academic affiliation), (3) research methodology, (4) racial categories included for Asian/Pacific Islander, Black, Native American, Latinx, White, two or more races, and missing/unknown, and (5) whether the article studied U.S.-based populations (see Supplement B in the online version of the journal for the full coding frame, which included additional items not analyzed in this study). For the racial categories, we began with a list of categories based on the U.S. Census during the analytical timeframe (2009–2019). As an example, for the Asian/Pacific Islander racial category, the form prompted: “Race/ethnicity category(ies) for Asian and Pacific Islander (used anywhere in the paper, check all that apply): (1) Asian, (2) Asian American, (3) Native Hawaiian, (4) Pacific Islander, (5) N/A, (6) Other (Free response).” Coders selected all options that applied to each article with original empirical research.
To begin, all authors coded five randomly selected articles and met to reach consensus on all items and revise the coding frame based on the initial coding. 4 Once the coding frame was finalized, the four authors split the remaining articles and coded them separately. Two authors completed their coding with the aid of trained research assistants. Once the research team completed coding the articles, a research team member created a 10% random sample from the list of articles assigned to each of the four authors (132 articles). A trained research assistant who had not conducted any of the original coding then double-coded all the articles in the random sample, and we compared the two sets of coding to assess interrater reliability. The coding had an overall reliability of 93% (the primary and secondary coders had identical responses for 93% of their codes) with reliability by coder ranging from 84% to 93%. Because the 84% reliability was an outlier (the next lowest was 91%), two members of the research team completed a second round of coding for all the articles originally coded by the member with the lowest reliability rating. The first author then reconciled the original and second rounds of coding (either by retaining codes that matched across both coders or reviewing the article herself and making a final decision).
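As a rough illustration of the reliability measure described above (the share of codes on which the primary and secondary coders gave identical responses), the following minimal Python sketch computes simple percent agreement. The function name and the example codes are hypothetical and are not drawn from the study's data or analysis files.

```python
def percent_agreement(primary, secondary):
    """Return the share of items on which two coders gave identical codes."""
    if len(primary) != len(secondary):
        raise ValueError("both coders must code the same items")
    if not primary:
        raise ValueError("no codes to compare")
    matches = sum(a == b for a, b in zip(primary, secondary))
    return matches / len(primary)


# Hypothetical codes for five items (illustrative only, not study data).
primary_codes = ["Asian", "Black", "N/A", "Latinx", "White"]
secondary_codes = ["Asian", "Black", "N/A", "Hispanic", "White"]

agreement = percent_agreement(primary_codes, secondary_codes)
print(f"{agreement:.0%}")  # 4 of the 5 codes match
```

Percent agreement of this kind does not adjust for agreement expected by chance; it is shown here only because it matches the definition given in the text.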
Once we removed articles that did not include original empirical research, our final dataset included 1,427 articles. Most articles (1,267; 88%) analyzed U.S. domestic data. The number of articles published per year increased from 85 in 2009 to 221 in 2019 (an increase that includes the addition of AERA Open in 2015). The overwhelming majority of empirical articles used some type of quantitative method (~80%). Finally, AERJ published the most articles during the analytical time period (456), although AERA Open published a large number relative to its short history, with 224 articles since it began publishing in 2015 (ER published 200, JEBS 262, and EEPA 285).
Analysis Methods
With the final analytic set of articles coded, we created measures of the different racial categories. Creating these categories was an iterative process that involved the first author creating measures based on categories in the coding frame as well as the text written as free response by individual coders. The first author would then present these categories, along with the full list of all raw codes, to the research team to discuss refinement of the analytic categories. Based on this iterative process, completed three different times, we created a final set of analytic categories that we outline in Table 1. Beyond the categories based on the U.S. Census, we included options such as “non-Hispanic” (e.g., “non-Hispanic Black students”), gender-expansive terms for Latinx (e.g., “Latino/a” and “Latino/a/x”), and ethnic group, nationality, or region for Asian or Pacific Islanders (e.g., “Vietnamese,” “Filipino,” and “Chinese”). 5 We created binary variables that equal 1 if the article mentions a term within each category. 6
Table 1. Analytical Categories
Note. For region or country within the Asian or Pacific Islander category, we included the following options: Vietnamese, Filipino, Chinese, Japanese, Korean, Malay, Indian, East, South, Hmong, Aleutian, Lao, Maori, Nepalese, Bengali, Pakistani, Sri Lankan, Cambodian, Guamanian, Thai, Indonesian, Fijian, Marshallese, Polynesian, Tahitian, Samoan, and Desi. For gender-expansive terms within the Latinx category, we included the following options: Latino/a, Latino(a), Latinas(os), Latino/a/x, Latinx, Latina/o, Latinas/Latinos, Latinos/as, and Latinos/Latinas. For decline-to-report terms within the missing category, we included the following options: decline, respond (e.g., did not respond or not responding), prefer (e.g., prefer not to say), indicate (e.g., not indicated), not report (e.g., not reported), and not provided. We determined the options based on the coded responses for the articles included in the sample.
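The construction of binary variables that equal 1 when an article mentions any term within a category can be sketched as follows. This is a minimal Python illustration, not the authors' procedure: the category keys and function name are hypothetical, the gender-expansive list is copied from the note above, and the region or country list is only a subset of the options named there.

```python
# Term lists taken from the table note; the dictionary keys are hypothetical labels.
CATEGORY_TERMS = {
    # Subset of the region or country options listed in the note.
    "asian_region_or_country": ["Vietnamese", "Filipino", "Chinese", "Hmong"],
    # Gender-expansive Latinx terms listed in the note.
    "latinx_gender_expansive": [
        "Latino/a", "Latino(a)", "Latinas(os)", "Latino/a/x", "Latinx",
        "Latina/o", "Latinas/Latinos", "Latinos/as", "Latinos/Latinas",
    ],
}


def binary_indicators(coded_terms, category_terms=CATEGORY_TERMS):
    """Map the terms coded for one article to 0/1 indicators per category."""
    coded = {term.lower() for term in coded_terms}
    return {
        category: int(any(term.lower() in coded for term in terms))
        for category, terms in category_terms.items()
    }


# Hypothetical coded responses for two articles (illustrative only).
article_a = binary_indicators(["Latinx", "Black", "Hmong"])
article_b = binary_indicators(["Hispanic", "White"])
print(article_a)
print(article_b)
```

Matching on whole coded terms, rather than searching article text, mirrors the fact that the indicators were built from coders' responses; any automated text search would need separate validation.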
We descriptively investigated the analytic set of articles using summary statistics focused on how frequently articles used terms for different racial categories. 7 In addition to the racial categories, we included indicators of the journal outlet (AERA Open, AERJ, EEPA, ER, or JEBS), year published, and research method (i.e., qualitative, quantitative, or mixed). We then explored how the frequency of different racial categorizations differed by journal, year published, and method. We conducted all the analyses focused on specific racial categories solely on the articles that at least partially used data from the United States. We imposed this sample restriction because the sociopolitical context for race and ethnicity varies across countries (Marquardt & Herrera, 2015; Stevens et al., 2015).
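The descriptive comparison described above (restricting to articles with U.S. data, then computing the percentage of articles using a category within each journal, year, or method) can be sketched in Python as follows. The record fields, function name, and toy records are all hypothetical assumptions for illustration and do not reproduce the study's data.

```python
from collections import defaultdict


def share_by_group(articles, group_key, indicator_key):
    """Percentage of articles with indicator == 1 within each subgroup."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for article in articles:
        group = article[group_key]
        totals[group] += 1
        hits[group] += article[indicator_key]
    return {group: 100 * hits[group] / totals[group] for group in totals}


# Hypothetical article records (illustrative only, not study data).
articles = [
    {"journal": "AERJ", "year": 2009, "method": "quantitative", "us_data": True, "any_black_term": 1},
    {"journal": "AERJ", "year": 2019, "method": "qualitative", "us_data": True, "any_black_term": 1},
    {"journal": "EEPA", "year": 2019, "method": "quantitative", "us_data": True, "any_black_term": 0},
    {"journal": "JEBS", "year": 2015, "method": "quantitative", "us_data": False, "any_black_term": 1},
    {"journal": "ER", "year": 2019, "method": "qualitative", "us_data": True, "any_black_term": 0},
]

# Apply the U.S.-data restriction before any category-specific summary.
us_articles = [a for a in articles if a["us_data"]]

by_journal = share_by_group(us_articles, "journal", "any_black_term")
by_method = share_by_group(us_articles, "method", "any_black_term")
print(by_journal)
print(by_method)
```

The same helper could be called with "year" as the grouping key to trace trends over time.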
Findings
In this section we address both research questions while examining the use of any racial categories and then specific terms.
Any Racial Category
We first review the trends in published education research using any racial category. Approximately two thirds of all the empirical articles published from 2009 through 2019 in AERA journals included at least one racial category. Figure 1 shows that a larger share of more recent articles used any racial category, from 53% in 2009 to 73% in 2019. Still, there was significant variation when we examined article characteristics and use of racial categories. A little more than 80% of the articles published by AERJ and EEPA used any racial category, with AERA Open following closely (75%). These journals published noticeably more articles that used any racial category compared with ER (63%) and JEBS (13%).

Figure 1. Percentage of Articles that Used at Least One Racial Category Over Time
Turning to temporal trends by outlet, Figure 2 is a heat plot that uses a color gradient to visually show the percentage of articles that used at least one racial category over time by journal outlet. The 0% to 10% range is shown in yellow, with the color gradient transitioning to green tones after 10%, followed by a transition to blue and then purple tones at 50% and above. The darkest purple tone indicates that 100% of articles in that journal in that timeframe included racial categories. In AERA Open, which began publishing in 2015, consistently more than 50% of its articles used a racial category. This was similar for both AERJ and EEPA. We can visually observe the increase in the proportion of ER articles using racial categories over time with the cells between 2009 and 2011 being more of a yellow-greenish hue and the articles between 2016 and 2019 having a purple hue. In contrast, JEBS decreased its share of articles that use racial categories, with cells between 2009 and 2013 having a greenish hue compared with articles after 2014 generally being yellow. In 2009, 18% of JEBS articles used any racial category, and this decreased to 8% by 2019. Although all research methodologies had a majority of articles that used a racial category, quantitative articles had a smaller share than qualitative or mixed-methods articles (64% vs 75% and 71%). There is little evidence of a temporal trend across the different methods.

Figure 2. Percentage of Articles that Used at Least One Racial Category Over Time by Journal
Specific Racial Categories
Now that we have provided an overview of any racial category use, we turn to the specific categories we created for different racial groups looking solely at domestic data. Figure 3 shows the share of articles in each year that mentioned any of the terms in each racial category (as described in Table 1). Overall, use of the different racial categories has increased over time. The largest share of articles mentioned terms for White, Black, and Latinx categories and then the Asian or Pacific Islander category, with Indigenous and two or more races categories following. Terms denoting race missing were the smallest percentage every year.

Figure 3. Percentage of Articles that Mentioned at Least One Term in Each Racial Category
Turning to the individual categories, we expanded on the categories listed in Table 1 by examining data on the percentage of articles using the terms in each row within columns (e.g., the proportion of articles using each term for Asian or Pacific Islander) overall, by year, by journal, and by method in Tables 2 through 8 with one table for each column of Table 1. We structure our discussion of the results by racial category alphabetically using our selected term representing each broader category. Table 2 shows the share of articles that mentioned any terms in the Asian or Pacific Islander category. Each row presents percentages for each subsample, starting with the overall total, and then each publication year, the journal outlets, and research method. Interpreting one estimate as an example, we found that ~43% of articles used any Asian or Pacific Islander term, with the majority term being “Asian” (42%) as compared with “Asian American” (8%), “Native Hawaiian” (4%), “Pacific Islander” (11%), “non-Hispanic” (e.g., “non-Hispanic Asian,” <1%), and ethnic group/nationality/region (3%). As shown in Figure 3, the use of any Asian or Pacific Islander term increased over time from 35% of articles including any Asian/Pacific Islander term in 2009 to 50% of articles in 2019. Although AERJ had the largest share of articles using any terms for the Asian category (58%), AERA Open had the largest share of articles using “Native Hawaiian” and “Pacific Islander” (8% and 18%, respectively). Quantitative research had the largest share of articles using “Asian,” “Native Hawaiian,” “Pacific Islander,” and “non-Hispanic,” whereas qualitative research was more likely to use the terms “Asian American” and ethnic group/nationality/region-specific terminology.
Table 2. Term Use for the Asian or Pacific Islander Category
Table 3. Term Use for the Black Category
Table 4. Term Use for the Indigenous Category
Table 5. Term Use for the Latinx Category
Table 6. Term Use for the Race-Missing Category
Table 7. Term Use for the Two-or-More-Races Category
Table 8. Term Use for the White Category
Note (Tables 2–8). Each column reports the percentage of articles that used the respective column’s term(s) for each row’s subsample. “Mixed” under “Method” refers to mixed-methods research.
Table 3 shows term use for the Black category. We found that ~65% of all articles used any term for Black, with a general increase in use over time from 55% in 2009 to 73% in 2019. The largest share of articles used the term “Black” compared with “African American” and “non-Hispanic” (51% vs. 36% and 2%, respectively). The increase in the mention of any term for the Black category was driven by the term “Black,” which appeared in only 36% of articles in 2009 and rose to 63% in 2019, whereas use of “African American” declined from a high of 43% of articles in 2010 to a low of 29% in 2018. Turning to journal outlets, we found that AERJ generally had the highest percentage of articles that mentioned any term for the Black category. Although articles published in AERA Open and EEPA were more likely to use the term “Black,” articles published in AERJ were more likely to use the term “African American.” While quantitative research had the smallest percentage of articles using any Black category term (64% compared with 69% of qualitative and 68% of mixed methods), those articles had the highest percentage for “non-Hispanic” (2%). Articles using quantitative research methods also had the largest share of articles that used the term “Black” (nearly 52% vs. almost 48% for qualitative and 46% for mixed methods).
We explore the Indigenous category in Table 4. About a fifth of publications used any term for Indigenous, with the most popular term being “Native American” (11%), closely followed by “American Indian” (10%), with “Alaskan Native” (5%), “Indigenous” (0.3%), and “non-Hispanic” (0.2%) being less popular. Unlike the use of terms for Black and Asian, we did not observe a consistent increase in terms for Indigenous, with the peak percentage of articles including an Indigenous category occurring in 2010 (26%) and fluctuating between 15% and 25% thereafter. There were similar inconsistencies by term with the use of “Native American” and “American Indian” fluctuating up and down by year, although the use of “Alaskan Native” and “Indigenous” increased over time. Similar to the other racial categories, although AERJ had the largest share of articles using any term, AERA Open had the largest percentages for “American Indian” and “Alaskan Native.” Quantitative research methods again had the highest percentage for “non-Hispanic” as well as “American Indian” and “Alaskan Native,” and only qualitative articles used the term “Indigenous.”
Turning to the Latinx category, Table 5 presents the trends. Although 60% of articles used any term for Latinx, the highest percentage of articles used “Hispanic” followed by “Latino” (49% and 23%, respectively). While use of each term generally (although inconsistently) increased over time, “Latinx” and the gender-expansive category had an exceptional increase in representation in 2019 (from 3% in 2018 to 17% in 2019 for Latinx and 4% in 2018 to 20% in 2019 for any gender-expansive term). Unlike the prior racial categories, AERA Open published the largest share of articles mentioning any Latinx term. Although articles published in AERJ typically mentioned the previous categories at the highest rates, in the Latinx category, AERJ had the highest rate only for “Latino.” Continuing this reversal of earlier patterns, quantitative articles had the highest share using any term for the Latinx category. This reversal was overwhelmingly driven by use of the term “Hispanic,” whereas qualitative and mixed-methods articles were more likely to use the other terms.
Table 6 includes the usage of terms for the race-missing category. The most consistent trend was that terms for the race-missing category were infrequently used, appearing in only 2% of articles. The share of articles mentioning a race-missing category increased over time from 0% to 4%–5% (depending on the year). Articles published in AERA Open were the most likely to include a race-missing category, although the highest percentage for any individual term is still less than 2%. We observed few differences between qualitative and quantitative articles, with mixed-methods articles being the most likely to include a race-missing category.
We present the trends for the two-or-more-races category in Table 7. Although 14% of articles included any two-or-more-races category, the most popular term was “multiracial” at 9%. Unlike most of the other categories, use of any term for the two-or-more-races category consistently increased from 2014 (whereas the other categories generally increased but did have periods of decrease). Similar to the Latinx category, AERA Open articles had the largest share of terms for the two-or-more-races category. Qualitative articles were most likely to mention any two-or-more-races category, although qualitative and quantitative articles had similar proportions using the term “multiracial.”
The results for the final category, White, are presented in Table 8. Overall, 64% of articles used any White term, with the majority term being “White” (61%) as compared with “Caucasian” (6%), “non-Hispanic” (5%), and “European” (2%). As shown in Figure 3, the use of any White term increased over time, with AERJ and AERA Open publishing the largest shares of articles with any White term (81% and 77%, respectively). This increase was concentrated among the terms “White” and “non-Hispanic” with no consistent changes in the use of “Caucasian” and “European.” Similar to the overall use of any racial term, qualitative and mixed-methods research used White terms more frequently than quantitative research. Similar to some of the other racial categories, although most of the White category terms were used less frequently in quantitative research, “non-Hispanic” was used more frequently in quantitative and mixed-methods research than in qualitative research (5% for both compared with 2%).
Discussion
In this study we conducted an analysis of how education researchers used racial terminology in published peer-reviewed studies appearing in five AERA journals between 2009 and 2019. Based on our findings (described earlier), we now highlight the complexities and overall contributions of the study. We organize the discussion around trends over time, less common terminology, how our findings relate to the U.S. Census, and differences across journals. Throughout the discussion, we rely on our understanding of the construction of racial categorization to explain how our observations on researchers’ use of racial terminology could be driven by the larger national sociopolitical context.
This discussion of results has several inherent limitations to consider when interpreting our findings. First, we can only observe what authors wrote in the final publication. In other words, we do not observe their data or their original manuscript prior to the revision process. We do not know whether the language we observe is a result of decisions during data collection that might not have been within the control of the authors. It is also possible that racial category language is influenced by the revision process such that the authors might have preferred different terminology than what was published. Second, our findings could be partially a reflection of journal word limits. AERJ accepting the longest manuscripts could mean more space for including discussion of racial categories even if they are not a core aspect of the analysis. In journals such as ER that accept shorter articles, these terms might only appear in supplementary material (which we did not analyze). We did not specifically measure the density or quality of discussion or use of racial category terminology, so the higher rate of racial category usage in AERJ could reflect room for longer tables that include covariates. Other journals could plausibly include a higher density of articles that authentically engage in discussions on racial categorization, which would not have been captured in our coding framework.
Trends in Categorical Usage Over Time
There is much to learn from the trends in usage of particular racial categories. We found an overall increase over time in term usage across all categories. All categories followed a similar upward trend, with fluctuations over time. However, the two-or-more-races category uniquely showed a consistent upward trajectory after 2013. Because the U.S. Department of Education did not mandate until 2010–2011 that institutional data collection meet OMB Directive 15 guidelines for allowing students to report two or more races (Renn, 2009), this distinct trend may reflect the lag between obtaining these data (or even data from the 2010 U.S. Census) and publishing research that uses them. Moreover, the trend likely also reflects a steady increase in the representation of and consciousness around multiracial people across education (Harris, 2016; Howard, 2018), even with continued debate about the utility of such grouping for civil rights laws (Hernández, 2018). One additional insight is that “multiracial” was the most used term within two or more races, which proves interesting given that naming practices are widely contested for this group. For example, some multiracial people identify as mixed, biracial, or, more specifically, “Blasian” (i.e., Black and Asian) or “Mexipina” (i.e., Mexican and Filipina).
Our findings also documented a trend in various categories used to describe Black populations, with a steady increase in the specific term “Black” while “African American” decreased over the analytic time period. This trend aligns with increasing consciousness around racial injustice and solidarity among Black peoples across the diaspora, especially given that recent immigrants from the African continent and Caribbean, for instance, might see themselves as Black but not African American (Fries-Britt et al., 2014).
Similarly, the gender-expansive term “Latinx” first appeared in AERA journals in 2016, when only 2.05% of articles included it. By 2019, “Latinx” appeared in 17.1% of articles and in every journal except JEBS, alongside a substantial increase in the usage of other gender-neutral terms such as “Latina/o.” Our findings did not suggest that these gender-expansive terms are necessarily replacing other terms representing Hispanic/Latino groups because “Hispanic” also increased and “Latino” held steady across the 10 years.
Relatedly, we found a general upward trend in the use of any White category, from 53.3% in 2009 to 73.6% in 2019. Although this matches the general increasing trend in all categories, we highlight the importance of naming White/Whiteness as a racial category instead of deeming it the default or norm (Sue, 2004).
Lower-Frequency Terminology
We highlight the significantly smaller number of articles in our dataset that engaged with the Asian or Pacific Islander, Indigenous, two-or-more-races, and race-missing categories. This lack could be reflective of demographic representation, a function of the prominence of the Black–White binary, or researchers’ notions of the necessary sample size for sufficient statistical power in quantitative analysis (even though the necessary sample size to uphold statistical assumptions is actually fairly low compared with the samples in the quantitative research we reviewed). The model minority myth that Asian Americans (and Pacific Islanders as a result of collapsing groups together) are widely successful (Jang, 2018) may contribute to a larger narrative that research is not necessarily needed to support this community (compared with others). Moreover, Indigenous population sizes often have relegated them to being merely an asterisk in research studies noting that their small sample size precludes them from being separated out in the analysis (Shotton et al., 2012), further emphasizing the colonial erasure of Indigenous peoples in education research.
Our findings showed that the percentage of articles engaging the race-missing category is incredibly small. This low engagement could be because, within the K–12 education administrative data landscape, there are no missing-race or unknown students (Ford, 2019). Ford (2019) explained how school administrators must assign a racial category (through observer identification) to students if they do not provide a racial self-identification. Yet, within the higher education sector, race-unknown students are often clustered at the most and least selective institutions (Ford et al., 2021, 2022). More research is needed to understand the individual-level motivations and incentives for opting out as well as the organizational practices of collecting/reporting data. For instance, Renn (2004) found a pattern of multiracial identity termed “extraracial,” where students were opting out of racial categorization in an attempt to deconstruct race and exist beyond racial categories. However, other research suggests that the race-unknown category for college students is largely White students (e.g., Ford & Holland, 2020), which may signify a desire to distance oneself from Whiteness or lack of knowledge that “White” is a racial identity.
Although the term is not extremely popular, appearing in about 5% of published educational research, continued use of the term “Caucasian” is troubling given the research highlighting the term’s problematic history rooted in White supremacy. Mukhopadhyay (2018) traced the origins of this term to 18th-century Europeans desiring to classify peoples in an emerging “racial science.” Johann Blumenbach popularized the term, which cast Europeans as having origins in the Caucasus mountains, because he saw the light-skinned people of this region as the most beautiful and ideal type of humans (in “God’s image”). Blumenbach attributed value and character to these groupings, which emboldened the racial hierarchy with “Caucasians” on top and all others denigrated.
In the United States, key legal battles around which peoples could hold citizenship eventually led to “Caucasian” being not just a sociopolitical category but also a legal one with much consequence. The early 1920s U.S. Supreme Court cases Ozawa v. United States and United States v. Thind demonstrate the variability of justifications used to police and solidify the boundaries of Whiteness. While Japanese-born Takao Ozawa was denied naturalization because, despite his white skin color, his race was not deemed “Caucasian,” the court later ruled against Bhagat Singh Thind, holding that despite his “Caucasian” or Aryan origins, his brown skin meant that he was not White (Haney López, 1997). Though the term “Caucasian” has been popularized as a polite or scientific term (Saini, 2019), it is likely that people who continue to conduct research using this term have little to no idea of its racist history. However, much more consciousness has been raised about the term, and calls have been made to discontinue its usage in alignment with a broadening body of critical Whiteness studies (Matias & Boucher, 2021) that spotlights power dynamics associated with Whiteness. More attention is needed among education researchers to expose this term’s racist pseudoscience origins to spur discontinuation of its use.
Following Census Categories
We found that “non-Hispanic” was used more often as a qualifier when describing White categories (4.5%) and, to a lesser extent, Black categories (2.1%). Although this practice seems like an accurate reflection of the specific issues related to the two-part Hispanic ethnicity versus race question used by the U.S. Census, it also, in some ways, mimics the policing of the boundaries of Whiteness and who counts as White, especially because we observed less use of the qualifier “non-Hispanic” with Black populations even though Afro-Latinx people warrant further nuance (Dache et al., 2019). Moreover, the minuscule usage of “non-Hispanic” with various Asian or Pacific Islander or Indigenous categories needs more attention. This dichotomy could lead to erasure of those who are both Hispanic and Asian, Pacific Islander, or Indigenous. For instance, many Latin American countries have had long histories of immigration from Asian countries (e.g., Japanese in Peru and Filipinos in Mexico; Hu-DeHart & López, 2008). How can our current understandings of racial categories better capture this population? Moreover, how do the current racial categories reveal a more extensive history as well as the long-standing effects of colonization in education?
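The erasure described above can be illustrated with a small Python sketch. It assumes, purely for illustration, a collapsing rule in which a “yes” on the Hispanic ethnicity question overrides any response to the race question. The `collapse` function, its labels, and the sample respondents are hypothetical and are not drawn from the study or from any specific dataset.

```python
def collapse(hispanic, races):
    """Collapse a two-part (ethnicity, race) response into a single label.

    Assumed rule (for illustration only): Hispanic ethnicity takes
    precedence over any race response; otherwise, more than one race
    becomes a single combined category.
    """
    if hispanic:
        return "Hispanic"
    if len(races) > 1:
        return "Two or more races"
    if len(races) == 1:
        return "non-Hispanic " + next(iter(races))
    return "Race missing"

# Hypothetical respondents: (answered yes to Hispanic ethnicity, races chosen)
respondents = [
    (False, {"White"}),
    (False, {"Black"}),
    (True, {"Black"}),            # an Afro-Latinx respondent
    (True, {"Asian"}),            # a respondent who is Hispanic and Asian
    (False, {"Asian", "White"}),
    (False, set()),
]

labels = [collapse(hispanic, races) for hispanic, races in respondents]
```

Under this assumed rule, the Afro-Latinx respondent and the respondent who is both Hispanic and Asian receive the same single label, so their race responses no longer appear in the collapsed data.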
The finding that “non-Hispanic” is used primarily by quantitative and mixed-methods researchers across all racial groups is also important to note. This is most likely because of alignment with how the U.S. Census collects its survey data given the separated Hispanic ethnicity versus race questions. For qualitative research, this modifier of “non-Hispanic” may not be needed (and appears to be used less often), illuminating the intricacies of racial category terms and the variability in usage methodologically. For instance, “Hispanic” is used more in quantitative (52.22%) than qualitative (37.44%) studies, whereas “Latinx” is used more in qualitative (8.72%) than quantitative (2.37%) studies. As mentioned earlier, the specific term “Latinx” is becoming more widely used across education and in popular culture (Salinas, 2020), though there has also been some recent pushback. Given that the origins of the term “Latinx” are in Latin America, people in the United States may not as easily recognize or accept it. For instance, Pew researchers found that although 23% of surveyed Hispanics had heard of the term, only 3% used it to describe themselves (Noe-Bustamante et al., 2020). Moreover, there have been critiques that “Latinx” is unpronounceable in Spanish or is an elite term used by academics, despite its origins being from community activists and many users recognizing the “x” as relating more to their Indigenous roots (Salinas, 2020). Our study showed the large increase in both “Latinx” and gender-expansive terms in 2019, demonstrating that researchers likely desire to recognize more expansive notions of gender but do not find “Hispanic” to be a good alternative (Viano & Baker, 2020).
It will be interesting to see how this trend continues or whether “Latinx” might be replaced with Salinas’ (2020) recommendation of “Latin*” or “Latine” (an option reflecting Spanish grammar conventions; Slemp, 2020) as more inclusive and disruptive alternatives, especially as identities become further negotiated.
Overall, the wide variability in usage of different categories suggests that education researchers are not actually strictly following the U.S. Census categories. Instead, it is more likely that they are sticking closely to the categories used on various surveys, some of which align with the Census while others may not. Yet researchers have the ability to change the categories they use to better align with the lived experiences of minoritized people. For example, the Census does not use “Caucasian,” yet this term continues to be used in education research.
Journal Differences: Why Is JEBS an Outlier?
Our study found differences across journals in usage of different racial terms despite all being education research journals within the same professional association (AERA). AERA Open published the largest share of articles mentioning any Latinx term and specifically “Latinx.” AERA Open articles also had the largest share of terms for the two-or-more-races category as well as the specific terms “American Indian,” “Alaska Native,” “Native Hawaiian,” “Pacific Islander,” and “multiracial.” Given the historically fast review process of AERA Open, it may allow unique opportunities to more quickly adapt to changes in terminology. AERJ generally had the most racial term use except for the Latinx, two-or-more-races, and race-missing groups. Future studies could explore potential reasons for these differences, such as editorial board demographics or the editor statements that are published in the journals.
By not naming racial categories, education research tends to take a color-evasive approach (Annamma et al., 2017). This perspective might explain why JEBS had low rates of using racial categories. However, this approach perpetuates the false idea that the field of statistics is objective and neutral (Gillborn et al., 2018; Zuberi, 2001) when, in fact, statistics was born out of the 19th-century eugenics movement (Saini, 2019; Zuberi, 2001; Zuberi & Bonilla-Silva, 2008). Social science research has a troubling history of using statistics to reinforce racial hierarchy, with many core tools of educational psychology and statistics (e.g., IQ tests), created by eugenicists, still in use today (Zuberi & Bonilla-Silva, 2008). JEBS is uniquely positioned to help guide the field in ways to incorporate race and racism into statistical models. As long as educational and behavioral statistics ignore this problematic history by maintaining a false stance of neutrality, so too will continue the commodification and, in this case, erasure of racial categories.
Conclusion
This study unveils the usage of racial categorization in high-profile published education research and, in doing so, uncovers several key findings related to changes in language usage, differences across journal/methodology, and growth areas for future published research. To the extent that the lack of discussion of racial category terminology in education research has led to problems such as the continued use of “Caucasian” or the low rates of using terms related to the two-or-more-races, Indigenous, and missing-race categories, this study opens the door for more careful reflection on what is included and excluded in our empirical writing. Future research on racial categorization in published education research could delve deeper into the density of discussion on racial categorization and the research interests/affiliations of writers, include a wider set of journals, or use content analysis to explore the reasoning given for the naming and inclusion of certain racial groups. We are encouraged by many of the trends we observed in this study related to higher usage of racial categories. Still, without future research, it remains unclear whether shifts in language use are happening concurrent with shifts in deep engagement with the constructs of race and racism. Understanding how racialization occurs within published education research hopefully can lead to better fieldwide norms on how to approach conducting and publishing research, which might then create better research evidence that can be used more easily by practitioners and policy actors.
It is difficult for us to provide concrete recommendations for the entire field of education based on this study, which is part of a broader research project examining the use of racial categories in education research (e.g., Viano et al., 2024). Certain recommendations we could provide might be more useful for qualitative research versus quantitative research, and adding even more nuance, there are also differences within methodologic traditions (e.g., quantitative research that uses secondary data versus data that are collected by the research team). Therefore, our primary recommendation, beyond what we share in Viano et al. (2024), is that scholars who are interested in including race or racism as part of their research should ensure that they have educated themselves on the process of racialization (we have included several references throughout this paper that are useful starting points for this exploration). It would be useful for scholars to explore the rich empirical and theoretical research on race and racism within the societies they study. Of note, critical scholars have been doing deeply thoughtful research incorporating race and racism for a significant period of time and have provided overviews of research trends and ways to incorporate deeper introspection into the research process, often based on methodological design. At a bare minimum, scholars should have a reason for the inclusion of racial categories in their research as well as a reason for the set of categories used in their work (beyond “this is how the data arrived from the data provider”).
We know that questioning the use of various categorical terms does not necessarily connect with the material realities of racial disparities in education. Categories provide opportunities for pride, community building, solidarity, and coalitions across minoritized racial groups. However, we simultaneously acknowledge how these racial categories are rooted in White supremacy and an oppressive legacy, thus perpetuating systems of power in education research. Researchers must apply stewardship in research design decision points regarding identity category and term usage.
Supplemental Material
Supplemental material, sj-docx-1-ero-10.1177_23328584241293681 for Racial Category Usage in Education Research: Examining the Publications from AERA Journals by Dominique J. Baker, Karly S. Ford, Samantha Viano and Marc P. Johnston-Guerrero in AERA Open
Footnotes
Declaration of Conflicting Interests
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Notes
Authors
DOMINIQUE J. BAKER is an associate professor of education and public policy at the University of Delaware. Her research focuses on the way that education policy affects and shapes the access and success of minoritized students in higher education.
KARLY S. FORD is an associate professor in the Departments of Sociology and Education Policy Studies at Pennsylvania State University. Her research focuses on the sociology of higher education and the politics of collecting data on socially constructed categories such as race, gender, and immigration status.
SAMANTHA VIANO is an associate professor of education at George Mason University. Her research focuses on evaluating K–12 education policies and assessing school contexts that predominantly affect minoritized student populations and their teachers, with particular focus on school safety and security, school improvement, and high school graduation, in addition to research methods for studying racial equity.
MARC P. JOHNSTON-GUERRERO is an associate dean and professor of higher education in the Morgridge College of Education at the University of Denver. His research focuses on racial dynamics and multiraciality in higher education and student affairs.
References
