Abstract
The scale of anti-racist research employing independent measures for skin tone (e.g., classification algorithms and colorimeters) is growing. However, recent studies reveal that non-epidermic traits (e.g., hair, lips, gender) contribute to the
Introduction
Racial hierarchies continue to map onto class structures virtually all over the globe (Bonilla-Silva, 2015; Clealand, 2017; De Micheli, 2018; Dixon and Telles, 2017; Keith and Herring, 1991), and Mexico is no exception (Arceo-Gomez and Campos-Vazquez, 2014; Campos-Vazquez and Medina-Cortina, 2019; Moreno, 2008, 2010; Sue, 2020).
Biologically baseless, the social construction of “race” persists pervasively. Through racial schemas, people assign social meaning to certain physical markers –e.g., skin tone, lip shape, hair texture (Bonilla-Silva, 2015; Haslanger, 2012; Sen and Wasow, 2016; Wade, 2012). These racial schemas are complex social phenomena. Like most other invented social categories, race is unstable, slippery and mutable (Bonilla-Silva, 2015: 1360; Omi and Winant, 2014: 4). Often, factors such as nationality, culture and birthplace are also invoked to explain the concept (Chandra, 2006; Lancaster, 1991; Loveman, 1997; Marrow, 2003; Nutini, 1997).
Reaching a better understanding of the configuration and dynamics of racial schemas is particularly important in a region with documented “racial fluidity” (De Micheli, 2018: 53, 2021: 6; Irizarry et al., 2023; Marrow, 2003). In Latin America, for decades, race and skin color were treated as overlapping –even interchangeable and indistinct– concepts (Telles, 2004: 79, 218, 2012: 1166). In Brazil, for instance, most people singled out skin color to justify their racial self-identification (Nogueira, 1955) and, to this day, electoral verification commissions use phenotypic criteria to validate the racial self-identification of candidates (Moraes Silva et al., 2024: 7). Moreover, some studies in the region use skin color as a proxy for race (Rejón, 2024; Telles et al., 2015; Trejo and Altamirano, 2016) and researchers in other Latin American countries concur that “race” mostly refers to skin color and that these two terms could be used interchangeably (Htun, 2015: 165; Mitchell-Walthour, 2017: 1; Saldívar, 2014: 90).
However, some scholars question the uncritical “interchangeability” of the terms. According to critics, equating the two is a mistake for at least a couple of reasons. On the one hand, studying race via skin color is to pay attention to only one of the many markers in the phenomenon of race (Monk, 2021: 80). On the other hand, because of the potential interdependence of the elements within the racial schema, “skin color” is difficult to measure and, often, researchers are not investigating what they think they are (Monk, 2016: 415).
Addressing these questions, recent studies have started to test the assumptions and examine whether skin color is indeed the best predictor for racial identification (Monk, 2016; Telles and Paschel, 2014). In Mexico, the most notable example –if not the only one– is the Project on Ethnoracial Discrimination (PRODER), a large-n sample survey that explores the interaction between different racial identifications (i.e., race, ethnic group) and skin color measurements with socioeconomic outcomes. The differing findings across the various ways of accounting for racial and skin tone identification reveal that the Mexican racial schema is more complex than what the “interchangeability” assumption suggests.
These findings raise the question: what are the particular configurations and dynamics of the Mexican racial schema? This issue is especially relevant given the recent increase in the number of studies using “objective” or “independent” skin tone measurements (i.e., classification algorithms and colorimeters) to analyse various aspects of social life, such as sociological outcomes, electoral success and political representation (Campos-Vazquez and Rivas-Herrera, 2021; Dixon and Telles, 2017; Rejón, 2024; Rejón Piña and Ma, 2023). The extent to which these tools are appropriate and helpful “replacements” for human-coding hinges on their capacity to approximate the results of a human discrimination process that comprises much more than a simple assessment of skin tone. If the independent skin tone measurements of these tools diverge significantly from the “skin tone” that humans identify under the influence of other factors in the racial schema, then the machine-coded variables are less appropriate methods for anti-racist research, as they fail to sufficiently approximate the “skin tone” detected in actual human interaction. Conversely, if the assessments of algorithms and humans are similar, then the validity of these –still imperfect, but helpful– tools to approach the study of colorism and racism is confirmed.
To shed light on the applicability of these tools in relation to the configuration and functioning of the Mexican racial schema, this article examines one sample from two different “skin color” angles: one that independently measures skin tone and one in which humans subjectively assess complexion. Methodologically isolating skin color from other social cues (via an automated algorithm) and comparing this measure with one susceptible to other (admittedly not all) biases (human-coding) helps reveal the extent to which these automated tools are a helpful and appropriate method to facilitate social research on this topic.
To achieve its aims, the article is structured as follows. First, I outline the theoretical framework within which this analysis is situated; I explain how –through racial schemas– race is put into action for social categorization purposes in everyday life and how people are racialized through an undeniably corporeal and ocular dimension. In the following section, I describe the sociohistorical context of the “race issue” in Mexico, reviewing how the current notions of race are rooted in the myth of mestizaje, promoted by elites in the 20th Century. In “Methods and data”, I describe the novel dataset of 3000 portraits I use to compare two different measures of skin color. The first was gathered using CASCo –an automated algorithm that focuses exclusively on the color of the face area of the photo– and the second is a measure of “impressionistic” race, produced by independent human coders. In “Findings and discussion”, I present and analyze the results of the study. I find little inter-coder agreement about exact estimates but substantial –and statistically significant– agreement between the general direction of the two skin color variables. These findings confirm that CASCo –and, arguably, similar tools– provides measurements that do not substantially diverge from people’s perception of “skin tone”. I conclude the article with a brief discussion of what these findings entail for the applicability of these tools in social research, arguing that –given their accessibility and satisfactory approximations– they are not only convenient but valid methods to reasonably substitute for human coders.
Theoretical framework
Races are invented categories, but they are (socially) real because they are reenacted in everyday life (Bonilla-Silva, 2015: 1360). They do not exist as biological facts, but are real social forces (Atkin, 2012; Mallon, 2006).
While socially constructed, race has a material foundation (Bonilla-Silva, 2015: 1360) and there is a crucial corporeal dimension to it. According to Omi and Winant (2014: 4), race is “ocular in an irreducible way”. Bodies are “racialized” –read and interpreted– via symbolic meanings and associations, and race continues to identify groups and individuals via somatic markings (e.g., skin tone, lip shape, hair texture) that still signal social status and cause different forms of interaction (Haslanger, 2012: 10). Depending on context, these markers can be cultural, phenotypical, or both. Therefore, some explain race as a “bundle of sticks” or a “racial schema” made of many factors such as societal values, cultural traits and physical attributes (Sen and Wasow, 2016: 506; Wade, 2012).
Note that conceptualizing race as a socially constructed bundle of sticks makes it easy to understand as a category of analysis, but it does not say much about how race is operationalized as a category of practice. What is the particular configuration of racial schemas? Are some of their elements more prominent than others? The answers to these questions are not obvious. Perhaps it is because of this complexity that the recommended standard to measure and operationalize race is self-identification (CEPAL, 2020; Del Popolo and Schkolnik, 2012).
For sociohistorical reasons, however, in some contexts many people do not identify as members of a racialized group even if they are racialized (Del Popolo et al., 2009: 63; Sue, 2013; Vaughn, 2013), which complicates the struggle for racial justice. Trying to overcome this obstacle, some scholars resort to the use of skin color as a proxy for race (Rejón, 2024; Telles et al., 2015; Trejo and Altamirano, 2016), sometimes arguing that complexion is the best predictor of racial identification (Banton, 2012; Telles and Paschel, 2014) or even that race and skin color are overlapping –even interchangeable and indistinct– concepts (Htun, 2015: 165; Mitchell-Walthour, 2017: 1; Saldívar, 2014: 90; Telles, 2004: 79, 2012: 1166).
However, this approach has been challenged recently. Critics argue that studying race via skin color is to pay attention to only one of the many markers in the phenomenon of race (Monk, 2021: 80). Furthermore, scholars have criticized the way in which “skin color” is measured, arguing that often researchers are not investigating what they think they are (Monk, 2016: 415). It turns out that measuring skin color is much more complex than social scientists initially thought.
Researchers have measured skin color employing different methods. Using verbal scales that represent a color spectrum (i.e., ranging from “dark” to “light” or “black” to “white”) is probably the most common of all. However, despite its popularity, the approach has been criticized for the way in which these cues skew and influence the answers to the questions asked by imposing rigid categories, even if these are deemed appropriate for prompting racial identity (Harris et al., 1993). Regardless of who is entering the data –interviewer or respondent–, this method also ignores the fact that people are not neutral and detached recorders of physical traits. For instance, research demonstrates that people have limited ability to accurately differentiate physical features of others (Hill, 2002) or even their own (Monk, 2015). Furthermore, skin color self-identification often differs from observers’ classification (Campbell and Troyer, 2007; Perreira and Telles, 2014) and racial self-identifications are volatile (Golash-Boza and Darity, 2008: 904).
Other studies have used exogenous indicators for skin color –color palettes– as benchmarks for different skin tones with the purpose of helping interviewers classify respondents more accurately (Dixon and Telles, 2017; Rejón Piña and Ma, 2023: 170). However, this approach is still prone to other distortions of skin color (Harris, 1964). For example, in the United States, men in professional attire are more likely to be categorized as “white”, while people of low socioeconomic levels are more likely to be categorized as “black” (Garcia and Abascal, 2016: 423).
Researchers have also used photo elicitation –presenting images to participants and asking them to classify the photos according to skin tone– to measure skin color (Candelario, 2007; Roth, 2012; Sorokowski et al., 2013). However, research shows that all the elements of a portrait –not just skin tone– affect people’s perception (Hill, 2002). For instance, Garcia and Abascal (2016) demonstrated that skin color ratings were affected by the presence of a racially distinctive name.
Especially relevant to this article is the phenomenon of “transcoloration” –where perceived color changes– that takes place when people classify themselves (or others) into skin tone categories. As mentioned before, when people “assess” skin color, their judgment is not purely based on complexion but is influenced by other phenotypical and cultural elements (Garcia and Abascal, 2016; Harris, 1964; Hill, 2002; Omi and Winant, 2014: 2). In the literature, this is sometimes described as “money whitening”, because higher social status, wealth and connections usually lead to whiter classifications (Bonilla-Silva and Glover, 2004: 154; Golash-Boza, 2010; Roth et al., 2022). This transcoloration points to the relevance of other elements of the racial schema. That is, the interplay between the elements is such that the presence or absence of some shapes the way others are perceived, including skin color.
Recent research has started to study the place of skin color in racial schemas, particularly whether complexion is in fact the best predictor for racial identification (Monk, 2016; Telles and Paschel, 2014). The most relevant example of this kind of study in Mexico is the Project on Ethnoracial Discrimination (PRODER). Employing data from a large sample survey (n = ∼7000), researchers studied the interaction between different racial identifications (i.e., race, ethnic group) and skin color measurements with socioeconomic outcomes. They have found that –beyond skin color– the importance of other phenotypical traits in the racialization process has been underestimated (Krozer and Gómez, 2023; Solís and Güémez, 2020; Solís and Reyes Martínez, 2023). Some of their findings align with those of other studies; for instance, people are highly sensitive to even slight differences in hair texture, stature, the shape of the face, type of nose and gender (Bonilla-Silva, 2009; Bonilla-Silva and Glover, 2004; Feliciano, 2016; Solís et al., 2019, 2023, 2023). Furthermore, some of their results suggest that the effect of skin color differs from that of other features, sometimes only in intensity but at other times also in direction (Reyes-Martínez et al., 2023; Solís et al., 2023).
Overall, these findings reveal that the Mexican racial schema is more complex than what the race-skin tone “interchangeability” assumption suggests. What are its particular configurations and dynamics? This “racialization” process is still understudied –particularly in contexts where observers categorize others based solely on appearance– and we do not have a good understanding of which phenotypic characteristics are used to assign race (Feliciano, 2016: 391). Scholars have long pointed out that not all physical markers are racialized –at least not in the same way (Wade, 1993); and few studies have examined the role of phenotypic characteristics in shaping how individuals racially classify others (Feliciano, 2016: 395). This is especially true in Latin America (Golash-Boza and Bonilla-Silva, 2013).
To address this question, I take a measurement (an automated algorithm) that isolates skin color from other social cues and compare it with one susceptible to other (admittedly not all) biases: human-coding. In the next section, I unpack some of the particularities of the Latin American and Mexican contexts, to then develop the hypotheses of the study.
Context
The categories of racial identification in Latin America are complex and porous (Bonilla-Silva and Glover, 2004: 151), to the point that some scholars have described the Latin American racial system as “fluid” (De Micheli, 2018: 53, 2021: 6; Irizarry et al., 2023; Marrow, 2003; Telles and Lim, 1998). In this fluid system, with the right combination of status and behavior, people can “escape” the stigma of certain racial identifications and “whiten” themselves (Degler, 1971; Schwartzman, 2007; Telles and Flores, 2013).
Historically, the ambition for whiteness was so entrenched that it fuelled the development of eugenics in many Latin American countries, including Mexico (Suárez y López Guazo, 2005). The goal to “whiten” was not only personal (Delgado et al., 2017; Hunter, 2011) but national, and it was embedded in public policy (Saldaña-Tejeda, 2022). However, by the 20th Century, this project to phenotypically whiten the population had failed in most Latin American nations; and a cultural whitening project arose in its place (Bonilla-Silva, 2009; Telles, 2014).
In countries like Mexico, progressive elites promoted the idea that society was color-blind and racially unified (Knight, 1990) through the nation-building narrative of
Scholars argue that, because of this myth (and the public policies that followed), the notions of race and the practices of racism are often disconnected in Mexico; the former are removed from popular consciousness, but the latter persist (Friedlander, 2006; Moreno, 2008: 285). In Mexico, racism is not violent but “sells itself as attractive and appealing, like an obstacle anyone could overcome” (Iturriaga, 2015: 109). With the right combination of education, professional success and wealth, people can reach Mexican “whiteness” (Navarrete, 2016), which is not only a skin color but a form of cultural capital (Richards et al., 2023): physical, social, economic and cultural privilege (Vargas, 2015). The study of whiteness in Mexico is still developing (Ceron-Anaya et al., 2023), but several recent qualitative studies reveal that both phenotype and cultural habits are its constitutive elements (Cerón Anaya, 2023).
Moreover, the public discourse denying the existence of multiple ethno-racial communities, along with social stigma, caused these groups to lose legal and political recognition and eventually become “invisible” (Iturralde, 2018). In fact, data on racial self-identification was not collected in Latin America until very recently (CEPAL, 2020).
Amidst these circumstances, and aware of the prominence of skin tone in the racial schema, scholars in the region resorted to the use of skin color as a proxy for race (Rejón, 2024; Trejo and Altamirano, 2016). In fact, a burgeoning body of literature has found skin color-based discrimination in virtually all spheres of Mexican society –interpersonal relationships (Moreno, 2008, 2010; Sue, 2020), education (Rejón, 2023), the job market (Arceo-Gomez and Campos-Vazquez, 2014), and social mobility (Campos-Vazquez and Medina-Cortina, 2019).
However, as flagged before, recent studies have started to examine this question in more detail, paying attention not only to skin tone but to other elements in the racial schema. These works reveal that the importance of other phenotypical traits in the racialization process has been underestimated (Krozer and Gómez, 2023; Solís and Güémez, 2020; Solís and Reyes Martínez, 2023; Telles and Flores, 2013).
We now know that some
This paper aims to contribute such an examination of the racialization process, comparing two different skin color variables for the same subjects, which were gathered through very different methods. A full description of the dataset is presented in the following section. For now, consider that one of the skin color variables is the result of an automated algorithm that focuses exclusively on the color of the face area and the other was produced by independent human coders. These two measurements allow us to observe the impact that other factors (e.g., hair color and texture, eyes, lip shape, gender, jewelry, attire) have on skin color assessment and whether they significantly contribute to “transcoloration” (whitening or darkening) of the measurement. Call these other elements,
The extent to which these
Methods and data
Data
Analyzing the dynamics of the interaction among the different elements of the Mexican racial schema requires either looking at an existing data set from different angles or collecting a data set for this specific purpose. Here, I examine two different skin color variables produced for the same subjects. Both variables are processed from 3000 studio portraits of legislators in the Mexican Chamber of Deputies. The portraits were taken against a white background with similar light, contrast and zoom conditions. These portraits, published on the official website of the Chamber, showcase various characteristics and markers that could affect skin color classification: complexion, hair color and texture, lip shape, and gender. To a lesser extent, the photos also display some class markers, such as clothing and jewelry. Figure 1 presents a panel of sample photos for reference, but the full albums are available as supplementary material (Rejon, 2025).
Figure 1. Panel of the portraits in the sample.
Some might challenge the representativeness of the sample on the grounds that some studies suggest the pool of legislators is “whiter” than the average population (Campos-Vazquez and Rivas-Herrera, 2021), but other research argues the opposite (Lawson et al., 2010; Rejón, 2024). At worst, the question of representativeness is empirical, and it remains unanswered; by no means does it affect the validity of the sample.
Admittedly, however, there are some limitations in using these portraits to conduct the skin tone measurements. First, photos “flatten” some facial features and do not include all elements of the racial schema (e.g., stature, accent, size). This is undeniably true. Like any other method in the social sciences, this one has disadvantages, including the fact that it cannot totally recreate the conditions of actual human interaction. Still, its advantages outweigh the limitations: it exposes participants to enough elements of the racial schema to allow a comparison with independent and automated measurements. Secondly, the skin tone of the face is susceptible to modifications (e.g., make-up). While true, I find this limitation to be less relevant. In fact, the creators of classification algorithms for skin color have anticipated and responded to “the make-up objection”. They conceded that make-up could distort the measurements but argued that the same is true for other methods too (e.g., interviewers) and that, ultimately, if subjects usually present themselves wearing make-up, then measuring that –rather than actual skin tone– would better explain their experiences (Rejón and Ma, 2023). After all, skin color “as seen by others” is the important factor when examining racial inequality and discrimination (Telles, 2012: 1167).
Methods
The first skin tone measurement was obtained using the Classification Algorithm for Skin Color (CASCo), introduced by Rejón and Ma (2023). CASCo is a Python library that detects the skin color of the face area of a given portrait and assigns it to one of the categories in the PERLA color palette (Telles, 2014). CASCo maps the face area on the HSV color space rather than RGB, which makes its measurements more accurate. In simpler terms, CASCo is a modern technological tool that identifies skin color alone, ignoring all other factors. It is an independent and automatic measure that focuses on skin color
CASCo’s measurements are innovative and valuable precisely because they overcome the shortcomings of other methods to measure skin color such as interviewers’ bias and money whitening, as reviewed in previous sections of the paper. At the same time, however, this independent measure effectively isolates skin color from
The second skin tone measurement was coded by humans and conducted via an online (Qualtrics) survey that was distributed through social media by RacismoMX, an anti-racist organization in Mexico. This sampling process via activist organizations is a common feature in academic research (Brinkerhoff, 2014; Brinkerhoff et al., 2019). The survey included one eligibility question: “In total, how many years have you lived in Mexico?” and respondents with fewer than 10 years of lived experience in Mexico were excluded from the sample. To eligible participants, the survey asked: “From the following color palette, select the skin color of the face of the person in the portrait”; then, it showed them one portrait per page along with the colors in the PERLA palette as options. Each participant was given 50 photos to classify (10 for each legislature in the sample). Asking explicitly about color (rather than “race”) and giving respondents the PERLA palette as options entails that all variation between their scores and CASCo’s is due to non-epidermic factors.
In total, the survey had 367 responses, but not all participants classified the 50 photos (i.e., they left the survey before completing it). Therefore, the legislatures shown first (the oldest ones) had more coders than the recent ones. All classifications were preserved, to permit older photos (of lower quality) to have a larger number of coders. Overall, each portrait was classified by 3–7 human coders. For each photo, the exact number of coders and their scores are available as supplementary material. The assessments of these human coders were averaged and rounded to the closest integer, to produce a categorical variable comparable with that of CASCo.
I anticipate some objections regarding the suitability of using the PERLA color palette, particularly in the Mexican context. The tool has been subjected to criticism on the grounds that it presents significant “technical issues” that prevent it from appropriately capturing the main skin tone variations of people in Mexico (COLMEX, 2023). However, while these criticisms might be substantiated, they have not been published or significantly outlined yet. For these reasons, I have not been able to corroborate them. In turn, the PERLA palette has been widely used in social research in different settings, including Mexico (Campos-Vazquez and Medina-Cortina, 2019; Rejón, 2023; Telles et al., 2015); therefore, employing it allows for comparative studies and facilitates the communication of the results. For these reasons, here I stick to the PERLA color palette.
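The aggregation step described above can be sketched as follows. This is a minimal illustration in Python, not the replication code (which is provided as an R Markdown file). The function name `consensus_category`, the coding of palette categories as integers from 1 to 11, the example scores, and the choice to round exact halves upward are all illustrative assumptions that the text does not specify.

```python
import math
from statistics import mean


def consensus_category(scores, lowest=1, highest=11):
    """Average the human-coded palette scores for one portrait and round
    the mean to the closest integer category.

    Assumptions (not taken from the paper): categories are integers in
    [lowest, highest], and exact halves (e.g. 5.5) are rounded upward.
    """
    if not scores:
        raise ValueError("a portrait needs at least one human score")
    average = mean(scores)
    # floor(x + 0.5) rounds halves upward; Python's built-in round()
    # would instead round halves to the nearest even integer.
    rounded = math.floor(average + 0.5)
    # Keep the result inside the palette's range.
    return max(lowest, min(highest, rounded))


# Hypothetical scores from three to seven coders per portrait.
portrait_scores = {
    "portrait_a": [3, 4, 4],
    "portrait_b": [5, 6, 5, 6],
    "portrait_c": [2, 3, 3, 4, 5, 5, 6],
}
human_variable = {
    name: consensus_category(scores)
    for name, scores in portrait_scores.items()
}
```

The rounding rule only matters for portraits whose mean falls exactly halfway between two categories; whichever rule is chosen should be applied consistently across the sample.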
In sum, I examined skin color in this sizeable sample from different angles: one that independently measures skin tone exclusively and one in which humans subjectively assess “skin color”. Comparing the results of these two methods gives us an indication of the extent to which
Inter-rater reliability computations.
Most of these measurements vary from 0 to 1 (with some ranging from −1 to 1 to account for perfect disagreement), but the “accepted” levels of agreement are conventional and vary by discipline. For instance, clinical diagnoses usually require high kappa values but the social sciences accept lower rates.
In this paper, I follow the scale introduced by Landis and Koch (1977), widely used in social research. According to this scale, the strength of agreement can be interpreted as follows: values < 0 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect. Following Krippendorff (2004: 241), I am cautious when interpreting the findings: discarding data with agreement measures < 0.667, drawing only tentative conclusions from data ≥ 0.667 but < 0.800, and relying on data ≥ 0.800.
All replication data and code are included in an R Markdown file as supplementary material to this paper.
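The two interpretive conventions just described can be written down as simple lookup rules. The sketch below is only an illustration in Python: the function names are hypothetical, and treating each upper bound as inclusive (so that, for example, 0.20 counts as “slight”) is an assumption, since the published bands leave small gaps between 0.20 and 0.21 and so on.

```python
def landis_koch_label(value):
    """Return the Landis and Koch (1977) strength-of-agreement label.

    Upper bounds are treated as inclusive (an assumption made here to
    close the gaps between the published bands).
    """
    if value < 0:
        return "poor"
    bands = (
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    )
    for upper_bound, label in bands:
        if value <= upper_bound:
            return label
    raise ValueError("agreement coefficients cannot exceed 1")


def krippendorff_reading(value):
    """Apply the cautious reading attributed to Krippendorff (2004: 241)."""
    if value < 0.667:
        return "discard"
    if value < 0.800:
        return "tentative"
    return "rely"


# Example with a few coefficient values of the kind reported in this paper.
for coefficient in (0.015, 0.375, 0.727, 0.934):
    print(coefficient, landis_koch_label(coefficient),
          krippendorff_reading(coefficient))
```

Keeping the two rules separate makes it explicit that a coefficient can be “substantial” on the first scale while still supporting only tentative conclusions on the second.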
Findings and discussion
Inter-rater reliability between CASCo and human coders.
The Cohen’s κ is 0.015, with a significant
Lin’s Concordance Correlation Coefficient is 0.375, with a confidence interval of (0.352, 0.397), which again suggests a fair level of agreement between CASCo and human coders.
The only slightly different result among this first set of IRR measures is Spearman’s ρ, which, at 0.491, suggests a significant and moderate positive correlation between the two skin color variables. This result, however, should be interpreted with caution because ties in the data may affect the accuracy of the rank correlation coefficient.
In general, this set of IRR measures supports the idea that
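To see why an exact-match coefficient and a concordance coefficient can tell different stories about the same pair of variables, consider the toy sketch below. It is written in plain Python rather than the R used for the replication files, the data are invented for illustration, and the function names are hypothetical. It implements the textbook definitions of unweighted Cohen’s κ and Lin’s CCC (with 1/n variances), which are not necessarily the exact estimators used for the results reported here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    # Observed agreement: share of items given exactly the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    if len(x) != len(y) or not x:
        raise ValueError("both variables must have the same non-zero length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Invented example: the human score is always one shade away from the
# algorithm's score, so the two never match exactly but move together.
algorithm_scores = [2, 3, 4, 5, 6, 7]
human_scores = [3, 4, 5, 6, 7, 8]

kappa = cohen_kappa(algorithm_scores, human_scores)
ccc = lin_ccc(algorithm_scores, human_scores)
print(f"Cohen's kappa: {kappa:.3f}")
print(f"Lin's CCC:     {ccc:.3f}")
```

In this invented case κ falls below zero because no pair of scores coincides, while the CCC is roughly 0.85 because the two series differ only by a constant shift. The example is deliberately extreme and says nothing about the magnitude of the coefficients in the actual sample.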
Fleiss’ Kappa and Gwet’s AC2 between CASCo and human coders.
The Fleiss’ κ value of 0.934 associated with ordinal weights also suggests a high and statistically significant level of agreement among raters. When considering quadratic weights, κ is a bit higher (0.959), also indicating a high level of agreement among raters. Similarly, using ordinal weights, the Gwet’s AC2 value of 0.727 with
In sum, while simplistic tests suggest a lack of agreement between CASCo and human coders, a more detailed analysis –one that takes into account the characteristics of the data– indicates that there is a high level of agreement between both variables. In other words, there is little inter-coder agreement about
As reviewed in previous sections, measurements of skin color in the United States are subject to distortions triggered by somatic and class markers such as hair color and texture, lip shape, make-up style, jewelry and clothing. These markers impact people’s perception of affluence and professional success which, in turn, leads subjects to be perceived as whiter (Harris, 1964). Similarly, people perceived to be of low socioeconomic levels based on these markers are more likely to be categorized as “black” (Garcia and Abascal, 2016: 423).
In Mexico, recent studies demonstrate that the importance of phenotypical traits –other than skin color– has been underestimated (Krozer and Gómez, 2023; Solís and Güémez, 2020; Solís and Reyes Martínez, 2023; Telles and Flores, 2013), revealing that people are sensitive to even slight differences in hair texture, stature, the shape of the face, type of nose and gender (Solís et al., 2019, 2023, 2023). The magnitude of the effect of non-epidermic features is critical, as it has the potential to confirm or undermine the significance of studies examining the correlations between independent measures of skin color and variables such as electoral success, political representation or socioeconomic outcomes.
This empirical data suggests that in Mexico, subjectively perceived skin color (while not identical) is not very different from actual skin color. These findings are important because they confirm qualitative research arguing that skin tone is a good proxy for race in Mexico (Rejón, 2024; Saldívar, 2014: 90) and possibly Latin America (Telles et al., 2015). If assumptions and prejudices are attached to racial schemas (Sen and Wasow, 2016: 506; Wade, 2012) which are triggered by physical markers (Haslanger, 2012: 307), the fact that independent and subjective measures of skin color are not significantly different in Mexico indicates that independently measuring skin tone is, in fact, a legitimate approach to study racism.
Conclusion
Racism continues to be pervasive around the world. People still deploy notions of race to classify and assign value to others. They do so employing racial schemas to assign social meaning to certain physical markers. A growing body of literature is devoted to examining the configuration and dynamics of these racial schemas. In Mexico, these investigations reveal that the importance of non-epidermic traits has been underestimated. Given that studies employing independent measures for skin tone (i.e., classification algorithms and colorimeters) are increasingly frequent, the magnitude of this effect of non-epidermic features is critical; it has the potential to challenge the appropriateness of these tools and, consequently, debunk the significance of said research corpus.
Using a novel dataset of 3000 portraits, here I compared two different measures of skin color. One was processed with an automated and independent algorithm that focuses exclusively on the color of the face area of the photo. The other one is a measure of “impressionistic” race, produced by human coders. In other words, the first variable effectively isolates skin color from
The findings reveal little inter-coder agreement about exact estimates but substantial –and statistically significant– agreement between the general direction of the two skin color variables, which suggests that while non-epidermic features slightly alter people’s perception of skin tone, they do not cause major divergence from independent skin color assessments.
These findings have significant implications for anti-racist research in Mexico. If, as these empirical data suggest, subjectively perceived and actual skin color are not very different, then quantitative research employing independent skin tone measures (e.g., classification algorithms) is likely to find valid correlations. Given that these methods usually facilitate research that is otherwise not feasible due to limitations on human or financial resources, the findings of this article are encouraging as they entail that these accessible research tools can indeed contribute valuable scholarship.
At the same time, these results confirm that differences between the measurements do exist. This entails that, while these technology-based independent skin color measures are good approximations to human-coding, they should be accompanied by human-coded variables whenever possible. The combined use of these two types of variables would buttress the analyses and account for any divergence with human appreciations. In any case, these findings confirm the benefits of these tools, including the fact that they showcase how human codification diverges from pigment tracing.
Note that these findings say nothing about the influence of
Another interesting potential implication of these findings has to do with the way in which “racial” discrimination is framed in the Mexican case. The findings suggest that skin color assessment is not significantly distorted by other
Footnotes
Declaration of conflicting interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data availability statement
The data that support the findings of this study are openly available at https://doi.org/10.26188/24873330 (Rejon, 2025).
