Abstract
Data-driven public health policies were widely implemented to mitigate the uneven impact of COVID-19. In the United States, evidence-based interventions are often employed in “racial equity” initiatives to provide calculable representations of racial disparities. However, disparities in working or living conditions, germane to public health but outside the conventional scope of epidemiology, are seldom measured or addressed. What is the effect of defining racial equity with quantitative health outcomes? Drawing on qualitative analysis of 175 interviews with experts and residents in Chicago during the emergence of COVID-19, we find that these policies link the distribution of public resources to effective participation in state projects of data generation. Bringing together theories of quantification and biosocial citizenship, we argue that a form of data citizenship has emerged where public resources are allocated based on quantitative metrics and the variations they depict. Data citizenship is characterized by at least two mechanisms for governing with statistics. Data fixes produce better numbers through technical adjustments in data collection or analysis based on expert assumptions or expectations. Data drag delays distribution of public relief until numbers are compiled to demonstrate and specify needs or deservingness. This paper challenges the use of racial statistics as a salve for structural racism and illustrates how statistical data can exacerbate racial disparities by promising equity.
Introduction
Data and science have been a critical tool in our public health response to COVID-19 from day one, and we will continue to rely on them to move resources where they are needed most. (Office of the Mayor, April 6, 2020, Press Release)
Government officials across the United States (US), like former Mayor of Chicago, Lori Lightfoot, have framed data collection and expert analysis as a solution to social problems – a trend exemplified and fueled by the COVID-19 pandemic. In this quote, Lightfoot frames the generation, management, and visualization of data as an answer to the structural inequities laid bare by COVID-19. This press conference was held because Chicago Department of Public Health (CDPH) officials had released the first reliable race and ethnicity data for COVID-19, finding that 70 of the first 100 deaths attributable to COVID-19 in Chicago were among Black residents although Black residents are approximately 30% of the city's population (Eldeib et al., 2020). This was a critical moment for leaders in Chicago who might have hoped that the effects of structural racism would not be so stark or severe.
Generating data to inform racial equity initiatives was a call to action across the US when COVID-19 emerged. Early in 2020, candidates for the presidential election called for better local and state reporting of race and ethnicity data to the Centers for Disease Control and Prevention (Morrison, 2020). Academics suggested that better data and analysis were necessary for addressing health disparities emerging in the pandemic (e.g., Laurencin and McClinton, 2020; Naudé and Vinuesa, 2021). Calls for more data came from community-based organizations as well. In Chicago, the Latino Policy Forum demanded better statistics on race and ethnicity in academic journals and helped to generate some of these data (Del Rios et al., 2021). This advocacy corresponds with a trend toward treating statistical inclusion as a matter of racial justice among community organizers (Rodríguez-Muñiz, 2021; Thomson, 2016), and among health scientists (Bliss, 2012; Epstein, 2007). And yet, critical scholars have questioned whether the collection of attuned racial statistics serves to redress racial inequities at all (Benjamin, 2019; Hatch, 2022). Hatch (2022), for example, argues that when it became clear the highest case rates and deaths from COVID-19 were among Black and Latinx communities, the Trump administration withdrew federal public health infrastructure and left it up to states and municipalities to determine and enforce public health mitigation. Furthermore, activist groups like Data for Black Lives argued for greater community authority over the collection and interpretation of data in the US when COVID-19 rates began to climb, critiquing the tendency for big data to be controlled by corporate or state actors (Data for Black Lives, 2020).
Following this critical work, we use the case of Chicago and the local data-driven response to COVID-19 to challenge the singular reliance on gathering racial statistics to redress racial inequity. We argue that expert analysis of statistical data contributed to racial disparities by facilitating the calculated distribution of public resources in ways that were just sufficient to show measurable improvements. After all, racial inequities in housing, employment, and health were demonstrably exacerbated in 2020 and 2021 in Chicago (Decoteau et al., 2021). These statistics also naturalized structural racism by dislocating disparate health outcomes from their historical roots and communities’ ongoing experiences of precarity. By addressing how centralized data came to dominate the policies the city of Chicago employed to mitigate the uneven impact of the pandemic, we theorize how the substitution of numbers for the lived experiences of racial inequity was attractive and feasible for experts who were committed to racial equity.
The case of COVID-19 in Chicago exemplifies how certain qualities of quantification (like standardization and abstraction) translate the social world into more manageable terms and shows how this process requires a degree of essentialization and exclusion. Quantified data are frequently used to minimize bias and constrain individual judgment to guard against uncertainty and mitigate inequities (Fourcade, 2021; Hirschman and Bosk, 2020; Porter, 1995). However, numbers also effectively shape the very subjects they measure through processes of classification, modeling, and ranking (Espeland and Sauder, 2009; Fourcade, 2016). Population statistics, for their part, require the reification of racial difference by standardizing racial classifications (Bowker and Star, 1999; Zuberi, 2001). Demographers and epidemiologists often presume that quantitative metrics are an accurate representation of structural racism as a lived experience (Benjamin, 2019; Laster Pirtle, 2020; Rodríguez-Muñiz, 2021). In contemporary algorithmic culture, Phan and Wark (2021) suggest that enumerating racial difference has become a new racial formation, wherein race is epiphenomenal to classifications in data. In our case, ZIP codes (US postal codes) were used to impute race into epidemiological models. ZIP codes are a proxy for racial classifications because historic practices of redlining and ongoing disinvestment have produced racially segregated neighborhoods. The effects of structural racism are used to approximate resident demographics in space, and these effects are reified as an individual attribute.
This paper builds an account of racial statistics that underscores how, when leveraged to drive public policy, quantitative data entails a form of biosocial citizenship wherein state actors use decontextualized identity markers to adjudicate rights (Petryna, 2004; Decoteau, 2013). By examining the City of Chicago's COVID-19 “racial equity” policies, we illustrate a core tension between the quantitative abstraction of racial disparities and the concrete lived experiences of structural racism. We argue that rights and public resources are increasingly contingent on continual participation and representation in the data collection and analysis projects that undergird data-driven policy solutions. We term this dynamic data citizenship. In Chicago, data citizenship required the enrollment of residents and community-based organizations in the vast public health data collection infrastructure set up during the pandemic. This practice ignored the structural character of racial inequity, substituting measurement for material redistribution in two ways. First, data fixes created better numbers without attending to the structural disparities that such data represents and reifies. Public health experts were motivated to gather these data to address racial disparities but relied on variables that reify historical racism, like residential segregation by ZIP code. Second, data drag delayed the distribution of resources until quantifiable evidence of need was available. Because testing and vaccines were deemed scarce resources by city officials in Chicago (Decoteau and Garrett, 2022), neighborhoods’ statistical vulnerability was measured, hierarchically ranked, and used to determine priorities for resource distribution. Under data citizenship, communities are enlisted in data collection but the terms of analysis and response are centralized in public offices. The production of data, then, becomes its own justification and equity is an ever-retreating horizon. 
Improving numbers is attractive because it is more straightforward than addressing the effects of structural racism, but it is more straightforward because it substitutes technical exercises for material redistribution.
In what follows, we review existing literatures on quantification and biosocial citizenship, and then we provide some background on the specific racial equity measures taken in Chicago. After laying this groundwork, we present our findings to support our theory of data citizenship.
Theories of enumeration and citizenship
Scott (1998) suggests that a primary means by which states render legible their populations for the purposes of providing social welfare and combatting epidemics is through “state simplifications” – metrics and classifications that abstract and standardize knowledge to manage populations and optimize their health. The collection of quantitative metrics necessitates infrastructures that make data available to be collected and analyzed. Yet, Murphy (2017) argues that data collection and statistical models often do little to address the lived conditions of structural inequity. She finds that the construction of the “population” through quantitative data is an artifact of coercive social policy that deflects responsibility for poor living conditions away from state and non-state authorities by attributing statistical trends to individual-level variables (Murphy, 2017: 137). We find a similar trend in the case of the COVID-19 response in Chicago, and we provide a detailed account of the mechanics of how data-driven solutions toggle between population-level and individual-level analysis.
In the case of the COVID-19 response in Chicago, experts and policymakers struggled against this tendency under the conviction that population-level data could solve social problems without blaming specific groups or behaviors. We show that the ability to bring together massive amounts of data from different sources, instead, further distanced quantitative analysis from communities’ grounded experiences. This practice also unmoored the data from its broader social and political contexts. As such, “racial equity” was pursued in the ether of data figuring rather than in the everyday realities of neighborhoods.
Quantification
Modern governments rely upon quantification in various ways to make complex phenomena visible and actionable. Although demographic or statistical data is often presented as politically neutral or faithful to ontological realities in public policy (Porter, 1995), numbers and population statistics carry tremendous political weight and cannot be separated from political motivations, social imaginaries, or statistical techniques (Espeland and Stevens, 2008: 412; Rodríguez-Muñiz, 2021: 7). Fourcade (2016) has persuasively argued that quantification, in fact, almost always entails ordinalization: subjects are ranked as more or less worthy of consideration by default on numerical scales. For precisely these reasons, epidemiologists and public health experts rely extensively on numbers to guide decisions (Reubi, 2018). Statistical approaches to social problems are often seen as more valid and objective – a recourse toward correcting biases (Hirschman and Bosk, 2020). Yet, numerical operations convey facts while erasing the processes that yield their production (Hansen, 2015). Numbers work to commensurate different categories of objects or knowledge but flatten or exclude in doing so (Espeland and Stevens, 2008; Scheel, 2020). Numbers make unquantifiable components of health invisible, turn complex social behaviors into unilinear variables, exclude certain kinds of knowledge, and disguise their own political meaning and authority (Adams, 2016: 34). Quantified metrics have the dual quality of appearing to represent transparent differences while obscuring the political origins that make their production possible.
In this vein, the “sociology of pandemics” has found that the very construction of a health crisis and the interpretation of its outcomes is largely dependent on the specific institutional and expert steps taken to respond to it (Dingwall et al., 2013). Although epidemiological models are used extensively to predict outbreaks and convey a kind of rational pathway for political action, they remain riddled with uncertainties (Mansnerus, 2013). Yet, political contestations over numbers are often black-boxed so that modeling can remain an effective governing strategy (Mansnerus, 2013; Merry, 2016). Debates can be minimized by using bundles of individual variables that are statistically associated with health outcomes as stand-ins for social determinants of health – a process made possible by the availability of huge digital datasets (Rowe, 2021). In our case, health equity frameworks obscure how racism functions because structural causes are replaced by aggregate measurements.
Statistics, even when well intentioned, often naturalize and normalize numerical differences between racial groups. Racial disparities that are represented numerically without context can support explanations that reify racial stereotypes about behavior, leading to stigmatization and neglect (Chowkwanyun and Reed, 2020; Zuberi, 2001). Demography tends to assume that counts of people correlate to real lived experiences or political power, but advocating for particular methods of counting also shapes quantified outcomes (Rodríguez-Muñiz, 2021: 202). As such, while demography is a useful tool, the numbers produced will always reflect the aims and interests of the experts and stakeholders using them. Anthony Hatch challenges the reliance on racial health disparities data, arguing that, [T]he assumption that collecting data on racial health disparities in the COVID-19 pandemic will lead to the reduction or elimination of those disparities [is an] assumption that keeps scientists in an endless search for more and more refined measurements of racism's harms, while the political and economic systems that comprise the fundamental causes of those harms are given a pass until all the data are counted. (Hatch, 2022: 2)
In the case of public health and epidemiology in Chicago, while experts are aware of the contingency of their data, it is their very ability to produce and manipulate these data that makes them attractive. Improving numbers turns out to be much more straightforward than addressing structural racism because quantification can yield fixable numbers while complex uncertainties associated with addressing structural racism are bracketed.
Citizenship
Multiple theorists document how biosocial identities (like having a chronic illness or being disabled) become the basis for distributing limited public interventions when material infrastructure is weak (Petryna, 2004; Street, 2015). Therefore, citizenship rights are often made available only to those who can prove their identity via diagnosis or test and perform the attendant social roles or eligibility requirements to the satisfaction of experts (Decoteau, 2013; Giordano, 2014; Nguyen, 2010; Sweet, 2021). Research on survey methods and statistics highlights how representation of groups in data can displace individualized judgments of deservingness. Igo argues that survey technologies have been integral to the very making of the US public over the twentieth century, constituting a form of “statistical citizenship” (2007: 297). The relative positioning and distinction of groups, statistically, is a basis for political claims – either for inclusion or for special recognition. We draw on these theories to develop a theory of data citizenship, wherein only populations captured by data collection efforts and deemed “most” vulnerable at a particular moment are eligible for specific resources like testing or vaccines. Where statistical citizenship, for instance, emphasizes how policies of inclusion or exclusion are made possible by statistical figures, we show how the process of collecting and analyzing data to create these figures is also a mechanism of inclusion and exclusion. As with other forms of biosocial citizenship, fitting the criteria for recognition requires the fulfillment of normative standards that contractualize and destabilize citizenship (Nguyen, 2010; Decoteau, 2013). Neighborhood organizations and individual residents had to enroll in the city's data infrastructure and provide the right epidemiological data to remain eligible for city funds, for instance.
Therefore, vulnerable residents were only legible to the city via population or census tract statistics, and not via descriptive conditions of their lives or neighborhoods that fell outside the scope of COVID-19 epidemiology.
In her work on ordinal citizenship, Fourcade (2021) argues that digitalization has altered the basis upon which citizenship is granted. While these processes build on the contractualization and individualization of citizenship that accompanied the adoption of neoliberal economics, where subjects are tasked with constantly working on themselves to achieve self-responsibility and citizenship allocations (Rose, 1999), ordinal citizenship also brings new challenges. Digitalization was introduced to mitigate inequities associated with more traditional welfare-state regimes because it conveys a sense of objectivity to guard against bias (Fourcade, 2021: 158). Yet, Fourcade explains: [A]lgorithmically managed social policies typically require intrusive pre-qualifying information, obligate claimants to frequent checks into the system, and are monitored by opaque fraud-detecting systems that inevitably end up targeting the most vulnerable … [D]igital citizenship … dwells in ordinality. (Fourcade, 2021: 161–162)
In our case of data citizenship, it matters how data is visualized, interpreted, and used by experts to allocate resources. We illustrate how being counted is not simply a matter of political clout or market opportunities. Rather, to be counted, subjects often must be able and willing to meet bureaucratic or medical requirements of need and deservingness, and to offer these data to public agencies. Further, experts must be able to effectively fix and manipulate the data gathered from communities. Government programs provided support and resources in response to the ebbs and flows of quantitative data, making active participation in the generation and circulation of data a primary basis for resource distribution. Community organizations and residents, however, had little input into what kinds of data would count or what kinds of resources they might receive. While testing and vaccination resources followed data relatively swiftly, welfare like rent relief required residents to individually demonstrate their eligibility. By training their gaze on quantified metrics, policymakers could congratulate themselves on their “racial equity” approach, although these policies felt suspiciously like disinvestment to many Black and Latinx residents of Chicago.
The case of Chicago and COVID-19
Calls for better data on the disproportionate racial and ethnic impact of COVID-19 were common in the early months of the pandemic, but racial statistics lagged woefully. On June 4, 2020, responding to criticisms about incomplete racial statistics, the Trump administration issued new reporting requirements for state and local public health departments. However, by September 16, 2020, 43% of cases were still missing racial data (Krieger et al., 2020). In February 2021 (6 weeks after vaccine rollout officially began in the US), racial statistics on vaccine uptake were missing in 46% of cases reported to the Centers for Disease Control and Prevention (Krieger et al., 2021). In contrast, Chicago was one of the first cities in the US to report news about the racial disparities in COVID-19 infections and deaths. Chicago is a “majority-minority” city where Black, Latinx, and white residents each make up approximately one third of the population (Hendricks et al., 2017). Chicago also has a long history of racial segregation – so much so that it is the classic site for analyzing urban racialized inequality in the US (e.g., Wilson, 2012). Chicago remains one of the most segregated cities in the US and is almost as segregated today as it was 50 years ago (Hendricks et al., 2017). For these reasons, addressing racial inequity is a major political priority for many leaders in the city.
When news surfaced in April 2020 that deaths from COVID-19 were concentrated in Black communities and infections were concentrated in Latinx communities, Mayor Lightfoot initiated the Racial Equity Rapid Response Team (RERRT), the city's hallmark racial equity initiative. Leaders in her administration chose three predominantly Black neighborhoods on the south and west sides of the city for targeted intervention and, soon after, three predominantly Latinx neighborhoods as well. The initiative brought together city officials, CDPH epidemiologists, hospital administrators, Federally Qualified Health Center (FQHC) providers, and leaders from at least one community organization in each neighborhood. Together, these RERRT team members designed testing and contact tracing efforts, held community education events, and organized relief efforts to alleviate vulnerability in the represented neighborhoods. Epidemiologists at CDPH also created the COVID Community Vulnerability Index (CCVI), which merges social vulnerability matrices from the American Community Survey with COVID-19 positivity, hospitalization, and death rates. In the early months of 2021, the CCVI was used to launch Protect Chicago Plus, a campaign that prioritized the 15 most vulnerable communities (out of 77 total) in Chicago for vaccine promotion and rollout. This program was launched, in part, in response to initial vaccine uptake data that showed white people were overrepresented in Phase 1 of vaccine distribution in Illinois (Lourgos, 2021).
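As a rough illustration of how a composite index of this kind can be used to rank areas, the sketch below standardizes a few indicators, averages them into a single score, and selects the highest-scoring areas. It is not the CDPH formula, which is not specified here: the indicator names, the equal weighting, the area labels, and all values are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to mean 0 and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_areas(indicators, top_n):
    """Return the top_n area names by composite score.

    indicators: {area: {indicator_name: value}}, where higher means more vulnerable.
    The composite is an equally weighted mean of z-scores (an assumed weighting).
    """
    areas = sorted(indicators)
    names = sorted(next(iter(indicators.values())))
    columns = {n: zscores([indicators[a][n] for a in areas]) for n in names}
    scores = {a: mean(columns[n][i] for n in names) for i, a in enumerate(areas)}
    return sorted(areas, key=lambda a: scores[a], reverse=True)[:top_n]


# Invented values for four hypothetical areas (not real Chicago data).
toy = {
    "area_1": {"svi": 0.9, "positivity": 0.30, "death_rate": 2.0},
    "area_2": {"svi": 0.2, "positivity": 0.05, "death_rate": 0.3},
    "area_3": {"svi": 0.7, "positivity": 0.25, "death_rate": 1.5},
    "area_4": {"svi": 0.4, "positivity": 0.10, "death_rate": 0.6},
}
prioritized = rank_areas(toy, top_n=2)
print(prioritized)
```

Under a cutoff of this kind, an area just below the threshold drops out of the prioritized list whatever its absolute level of need, because only its position relative to other areas is considered.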
The case of contemporary public health initiatives in Chicago and their reliance on data-driven solutions is important precisely because experts who are engaged in this work overwhelmingly agree that racial equity is a goal. Studying Chicago permits us to isolate the dynamics associated with data-driven solutions to structural racism in a relatively insulated and progressive political environment. Furthermore, it represents an instance where quantification and the rollout of policies are closely linked, allowing this study to theorize both the work of statistical analysis and its effects in real time.
Method
Because this study seeks to understand the reasoning behind the use of statistics to achieve racial equity and its impact on affected communities, we employed a qualitative methodology. Specifically, this study draws on 65 qualitative in-depth interviews with experts in Chicago and Illinois, and on 110 interviews with residents of three neighborhoods in Chicago that were heavily impacted by COVID-19. We conducted a critical policy study to uncover how state interventions are orchestrated to “define situations, classify people, and control access to resources” (Dubois, 2014: 38). Qualitative interviews with decisionmakers and experts at various levels in the policy response to COVID-19 provide narratives that situate and explain both how big decisions are made and how they are rolled out by street-level bureaucrats (Dubois, 2009) in communities. Interviews with neighborhood residents served to test the assertions made by experts and policymakers about the efficacy of their work on the ground. For this reason, resident interviews specifically sampled people who were experiencing hardships in 2020 and 2021.
Expert and policymaker interviewees were purposively sampled. We interviewed experts who were directly involved in the state's and city's racial equity responses – which included city and state officials, public health experts and epidemiologists, hospital administrators and clinic staff, and community organizers. We also interviewed other key health and mental health providers, as well as local elected officials, in the communities where the study was situated. Expert interviewees were recruited based on their key positions within government agencies, health care organizations, and community-based organizations (CBOs). For this reason, interviewees were not sampled based on race and ethnicity, gender identity, or other demographic characteristics, and these data were not systematically collected. In general, expert interviewees reflected the demographic profile of Chicago.
Interviews with residents were localized in the Albany Park, Austin, and Little Village neighborhoods of Chicago. We sampled these neighborhoods because they represent key racial and class demographics in the city and were differentially prioritized by the city during the emergence of COVID-19. Albany Park is a racially and socioeconomically diverse neighborhood on the North Side that was not prioritized by Chicago's COVID-19 policies and programs, although several communities in Albany Park faced severe impacts from the pandemic. Austin is a predominantly Black neighborhood on the West Side where the legacies of systemic racism have long been recognized. Policymakers and government officials recognized Austin as a site of social vulnerability early on, and it has been the focus of policies to address these vulnerabilities. Little Village is a predominantly Latinx neighborhood on the Southwest Side, where many residents have been considered “essential workers.” Little Village was targeted by the city for additional resources and mobilization as well. Across all neighborhoods, 49% of our interviewees identified as Black, 28% identified as Latinx, and 16% identified as white, meaning that Black and Latinx residents who faced some of the greatest hardships due to COVID-19 were effectively represented in our sample.
Analysis of interview data followed a flexible coding scheme that progressed from the generation of themes to the identification of specific quotes and examples (Deterding and Waters, 2021). An initial reading identified the importance of theories of quantification to understand the city's focus on generating the right data, and then identified theories of biosocial citizenship to consider the normative dimensions of public health programs driven by the production of data. A second reading generated codes that refined our engagement with these theories. A third and final reading of the data identified the specific examples that are assembled as evidence here.
Findings
Data fixes
Public health and policy experts working on the data-driven response to COVID-19 in Chicago realized early on that they needed to generate accurate and efficient numbers. Thus, experts adopted methodological approaches for cleaning data and adapting their analysis to ensure they had the right numbers. These adaptations are what we call data fixes, extending Benjamin's (2019) analysis of race and algorithmic knowledge and Phan and Wark's (2021) assertion that racial formations are increasingly data formations. Data fixes fuel the illusion that generating data can reduce racial inequity by changing public health facts overnight. In this case, data was fixed during analysis to ensure that the right variables were present in the dataset for it to meaningfully inform policies. Specifically, COVID-19 data in Chicago and Illinois had to be fixed to include race and ethnicity in 2020 because health care providers and medical laboratories were neither accustomed to nor trained in reporting this data. Fixes occurred at the local and state level because the national disease surveillance infrastructure proved unable to effectively respond to demands for race and ethnicity data (Krieger et al., 2020; Krieger et al., 2021). However, CDPH and the Illinois Department of Public Health (IDPH) took different approaches to fixing their data to include race and ethnicity, using different proxy variables to impute these data. These fixes were only possible because the effects of structural racism could be reliably inferred using pre-existing statistical variables.
Public health agencies struggled to get reliable race and ethnicity data reported to them during the first wave of infections in Chicago and Illinois. This occurred because infectious disease reporting in the state had been migrating to a system where test results and sample information would be automatically reported to public health agencies from labs in the event of positive results. This protocol eliminated the need for individual technicians to actively report data but decreased accountability for the quality of data being reported. A lead epidemiologist at CDPH explained how the templates used for gathering data often proved insufficient to ensure all the data public health experts needed was being collected: Information that, maybe, these laboratories might not find as important, like race and ethnicity, tended to be missing in the reports that were submitted into the state system. We have a big focus at the Chicago Department of Public Health on health equity. Race ethnicity is something we are very interested in knowing about COVID-19 cases … (CDPH Epidemiologist, March 2, 2021)
CDPH and IDPH each independently developed their own strategies for fixing the first waves of COVID-19 reporting data to include race and ethnicity. At CDPH, epidemiologists partnered with university-based data scientists to impute race and ethnicity to COVID-19 case data using names and ZIP codes. This approach yielded remarkably accurate results, as another CDPH epidemiologist shared: Essentially, they built us an app that we could use internally to run our surveillance data through to predict somebody's race/ethnicity based on their first and last name as well as the ZIP code that they lived in … We ended up implementing their algorithm and were able to start reporting race/ethnicity with imputed data and improve the rate from 40 percent missing down to about 7 percent missing. (CDPH Epidemiologist, March 9, 2021)
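The algorithm behind the app is not described in the interview. To make the general logic of proxy-based imputation concrete, the following is a minimal sketch of one common approach, a naive Bayesian combination of a name-based table and a place-based table. It is a stand-in technique rather than the method CDPH used; it uses the surname only, which is a simplification; and the function name, placeholder labels, and every probability below are hypothetical.

```python
def impute_group(surname, zip_code, p_group_given_surname, p_group_given_zip, p_group):
    """Return (most likely group, posterior dict), or (None, {}) if a lookup is missing.

    Combines two conditional tables under a naive independence assumption:
    P(group | surname, zip) is proportional to
    P(group | surname) * P(group | zip) / P(group).
    """
    by_name = p_group_given_surname.get(surname)
    by_zip = p_group_given_zip.get(zip_code)
    if by_name is None or by_zip is None:
        return None, {}
    raw = {g: by_name[g] * by_zip[g] / p_group[g] for g in p_group}
    total = sum(raw.values())
    posterior = {g: v / total for g, v in raw.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical lookup tables with placeholder labels; every number is invented.
P_GROUP = {"group_a": 0.5, "group_b": 0.5}
P_BY_SURNAME = {"SURNAME_X": {"group_a": 0.6, "group_b": 0.4}}
P_BY_ZIP = {"ZIP_1": {"group_a": 0.9, "group_b": 0.1}}

label, posterior = impute_group("SURNAME_X", "ZIP_1", P_BY_SURNAME, P_BY_ZIP, P_GROUP)
print(label, round(posterior[label], 3))
```

In this toy case the ZIP-level table shifts the estimate far more than the surname does, which is only possible when places are strongly sorted by group. A record whose surname or ZIP code is absent from the tables stays unclassified.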
At IDPH, rather than designing a novel algorithm to impute race and ethnicity to COVID-19 data, public health experts matched the identifiers between COVID-19 data and hospital discharge data in the state. A state epidemiologist noted that many of the people testing for COVID-19 had visited a hospital recently, explaining that, We’ve been able to use hospital discharge data to help us with quite a lot. We match that, first, with our testing data … because testing data had terrible race and ethnicity reporting … But guess what? People coming in for testing, quite a lot of them, had been in a hospital in the last three years. We could take the race and ethnicity data from the hospital discharge dataset and improve the race and ethnicity reporting of testing. (IDPH Epidemiologist, April 13, 2021)
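The matching step described in this account can be sketched as a simple record linkage that backfills a missing field from a second dataset. This is an illustration under stated assumptions, not IDPH's procedure: the shared `person_id` key, the field names, and the toy records are hypothetical, and real linkage would typically rely on several identifiers and handle imperfect matches.

```python
def backfill_race_ethnicity(test_records, discharge_records):
    """Fill missing race/ethnicity on test records from matched discharge records.

    Records are dicts; 'person_id' is a hypothetical shared identifier.
    Returns (new list of test records, share still missing afterwards).
    """
    known = {
        r["person_id"]: r["race_ethnicity"]
        for r in discharge_records
        if r.get("race_ethnicity")
    }
    filled = []
    for rec in test_records:
        rec = dict(rec)  # copy so the input list is not mutated
        if not rec.get("race_ethnicity"):
            rec["race_ethnicity"] = known.get(rec["person_id"])
        filled.append(rec)
    missing = sum(1 for r in filled if not r["race_ethnicity"])
    share = missing / len(filled) if filled else 0.0
    return filled, share


# Invented toy records with placeholder labels.
tests = [
    {"person_id": 1, "race_ethnicity": None},
    {"person_id": 2, "race_ethnicity": "group_a"},
    {"person_id": 3, "race_ethnicity": None},
    {"person_id": 4, "race_ethnicity": None},
]
discharges = [
    {"person_id": 1, "race_ethnicity": "group_b"},
    {"person_id": 4, "race_ethnicity": ""},
]
filled, share_missing = backfill_race_ethnicity(tests, discharges)
print(share_missing)
```

Only people who already appear in the second dataset with a recorded value can be filled in, so the completeness of the result depends on who had prior contact with hospitals.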
The fact that CDPH and IDPH were each able to identify different sources of data to fix the absence of race and ethnicity markers on individual COVID-19 cases is a testament to how the effects of structural racism can literally be counted on. This case offers a clear linkage between racial inequity and the ability to collect reliable race and ethnicity data at all. The legacies of entrenched racial inequity in the US, Illinois, and Chicago are what made it possible to account for race and ethnicity in the first wave of COVID-19 data. This confirms exactly what critics (Hatch, 2022; Rodríguez-Muñiz, 2016; Zuberi, 2001) have argued: that racial statistics can become a tautology where the expectation of disparities permits the generation of data that confirms disparities without attending to the policies that cause these disparities in the first place. In the next section, we show that data fixes require time, labor, and money that can delay the actual distribution of public resources. Quantitative statistical analysis proved to be a poor tool for addressing housing insecurity, food insecurity, and health threats because it incentivized fixing and generating data that were disconnected from residents’ need for public relief and services. In fact, fixing the data gave experts the illusion that they were contributing to racial equity by improving their measures, even when they relied on persistent racial inequities to complete their datasets.
Data drag
Although public health experts and policymakers suspected that racial inequities in health would shape COVID-19 infection and death rates, it was only after the first numbers on race and ethnicity showed vast disparities in deaths and infections that these leaders jumped into action and formed the RERRT in April 2020. This followed a month of growing outbreaks and community spread throughout March 2020. One leader who was involved in RERRT from its inception described being confronted with the numbers showing the disproportionate effect of COVID-19 on Black residents, remembering that, The Racial Equity Rapid Response team got started—it was really born out of the first time we were looking at data that really displayed a significant racial disparity in COVID infection rates … What we saw was that the rates in the African American community were just skyrocketing as compared to other racial demographics. We had, internally, this moment of like, ‘What is going on? What is happening?’ (City Official, April 27, 2021)
Experts across government agencies with jurisdiction in Chicago explained, time and again, how important it was for them to see disparities in the data before acting on them. In fact, the data fixes identified above are also an example of how concerned experts were with producing complete data rather than acting on known disparities, although they used these known inequities to fix the numbers. One epidemiologist who had been involved in RERRT reflected on why he felt that experts continually prioritized evidence over intervention, saying that, I’m a fan of saying what gets measured is what gets done … Probably our first-year bachelor students would’ve predicted which communities got impacted. So, why do you have to measure it? I don’t know. I guess partly because I’m an epidemiologist. I do think people want to see the data… (Epidemiologist, April 27, 2021)
COVID-19 testing data, and the way it was interpreted to direct resources in Chicago, provides a useful example of drag at the broad level of policy interventions. One CDPH official explained how, at a certain point, they employed mobile testing in the city precisely because stubbornly high positivity rates could be interpreted as a lack of testing capacity when the overall surges in cases had already peaked: Initially, we started with permanent—well, semi-permanent—testing sites at schools and parks. Then, over the summer [of 2020], we just—we had these stubbornly high percent positivities, which you know can be—it's an indication of case rate when you’re going up. Once you have gotten to the peak [of a surge in cases] a little bit, a percent positivity is more of an indication of lack of testing capacity. (CDPH Official, March 29, 2021)
The second form of data drag, the delaying of resource distribution based on the requirement for individual applicants or beneficiaries to demonstrate need, is exemplified by the slow rollout of rental relief programs and delays in unemployment funds. The need for emergency rental assistance in Chicago was obvious to experts in city agencies as soon as it became clear that people were losing work in April 2020, but distribution was slow. The last round of rental relief, the 2021 Illinois Rental Repayment Program, accepted applications in two submission windows that spanned May, June, and into July 2021, but payouts to landlords and tenants were not complete until December of that year (Illinois Housing Development Authority, 2021). Obviously, rent was due several times between when the applications were submitted and when payments were disbursed. Time was needed to review applications and work with applicants because requirements included proof of identity, proof of address, evidence of income, and evidence of past-due rent. One undocumented Chicago resident explained why he did not even apply: They wanted to talk with the landlord so they can send him the money. The problem is that the homeowner then doesn’t want to provide the information that they ask for on the form because they have to give their tax number or something… (Albany Park Resident, April 13, 2021)
These programs were slow and difficult, in large part, because policies were designed to minimize fraud and ensure that programs were generating “good” statistics. Rental assistance programs were an obvious need and early priority for public officials we interviewed, but the bureaucratic roll-out of these programs was often slow and onerous for residents. One expert with the Department of Housing in Chicago explained how challenging the documentation requirements could be: We now have $80 million this round [in 2021] … Now, there's all sorts of reasons that it might not resolve the problem: we may still have more applications than we have funds to provide for, or there are situations where a tenant had arrears and then left the unit. We are not allowed to fund those applications … Then there's also just the simple fact of all the required documentation … Those present real barriers for people to get their income documentation and lease documentation and everything else together. (Department of Housing Official, May 4, 2021)
As these last examples show, data drag is a feature of means-tested welfare allocation, but it is not reducible to this example. Data drag is a more generalizable account of how the cadence of state action can be tied to the availability of detailed data or documentation, when individuals or communities have presented themselves as good data points and uncertainty is minimized. Data drag occurs not only in the distribution of resources to communities or individuals but in the rollout of policy responses as well, like the case of the RERRT above.
Data citizenship
The mechanisms of data fixes and data drag highlight how data-driven decision-making and resource distribution prioritize getting or maintaining “good” data over addressing core vulnerabilities of communities. In this section, we argue that reliance on data to drive policy decisions in the COVID-19 response in Chicago also constructed boundaries between those communities that received the full support of government and those that did not. Consequently, continual participation in the generation of good data to represent collective (or individual) need has become a prerequisite for full inclusion in government policies and programs. And policymakers, for their part, understood the generation and stewardship of data to be a function of government. Even though public health experts and city officials often acknowledged that the preponderance of data showing systemic racial disparities in Chicago could lead to apathy or inaction, they often asserted that the production of better data on an increasingly “granular level” (City Official, April 27, 2021) would ensure continued action. Producing and managing data became central to individual and collective recognition, rights, protections, and resource distribution: data citizenship. These data transited between individual-level observations like vaccination status, geographic indices like the CCVI, and population-level statistics like positivity rates. What is important is that all these data had to originate with individual Chicago residents or communities who were connected to government initiatives.
Importing austerity logics into their COVID-19 response, the City of Chicago did not infuse COVID-19 supports in all the communities that needed them (Decoteau and Garrett, 2022). City officials prioritized the availability of certain kinds of technocratic resources such as testing, contact tracing, and vaccines. Food, cash assistance, or childcare supports were mostly provided by private organizations. Therefore, getting resources from the city during the pandemic required living in a community that CDPH deemed most in need at that moment in time, and meant receiving one of the few resources offered by public programs. If individual needs or community statistics did not fit the designated parameters, the vulnerability of these individuals and communities was not legible to public officials. Even with resources, like vaccines, which were necessary and desired, the use of data to triage resources made their availability contingent and fickle.
Chicago data from vaccination campaigns in early 2021 embarrassed public officials who had been touting their racial equity approach. The first phases of vaccine distribution prioritized health care workers and elderly people, so vaccinations disproportionately went to predominantly white neighborhoods because white people are overrepresented among health care workers and the elderly in Chicago (Lourgos, 2021). Decisionmakers made strong efforts to incorporate data-driven and racial equity approaches to correct this new disparity, by targeting the 15 neighborhoods that ranked highest in the CCVI with early support to increase vaccine uptake: We took a pretty big gamble on saying we’re gonna get all of our vaccine that we have, and we’re gonna push most of it to these neighborhoods. Even with that, we’re still barely equal in terms of vaccine uptake in the city. It's about 50% Black and Latinx. Whereas to be equal for population, it’d have to be 60%. Whereas to have to be equal for [COVID-19] burden, it’d have to be 75%. (CDPH Official, March 29, 2021)
The use of CCVI rankings to produce a more equitable vaccine rollout was also controversial because it limited vaccine priority to only 15 (out of 77) neighborhoods, making the worthiness of neighborhoods for resources dependent on public health metrics and census data. For instance, although community leaders involved in RERRT represented communities that had been hardest hit by the pandemic at various points in 2020, not all of the same neighborhoods were ranked among the most vulnerable when the CCVI ranking was finalized and vaccines were rolled out in early 2021. South Shore, a predominantly Black neighborhood, was one such community. Some of the earliest deaths and outbreaks in Chicago occurred in the South Shore neighborhood, but it was not among the top 15 CCVI neighborhoods. One community leader in South Shore explained how the reliance on data for ranking community needs failed to serve their community, remembering that, [A]ccording to the city's COVID Vulnerability Index, South Shore was [originally] a high-needs community. And because we were doing such a good job with bringing that number down, we worked our way into a medium-needs community. So, while on paper that looks good, the sense of urgency is the same in the neighborhood. You know, the residents don’t know that. You know, they just know that, ‘hey, we need to have access, we want to get vaccinated.’ (Community Leader, November 9, 2021)
Getting funds and supports for community-based work from the city was premised on collaborating with CDPH to document needs and demonstrate outcomes in ways that often conflicted with community members’ sense of need. Some community leaders opted not to be a part of the city's initiatives: It was these big, large, multimillion dollar nonprofits that were at the table [with the city] and they don’t represent mutual aid groups … If the plan was working, and the nonprofit would be doing the work, it wouldn’t have been people fending for themselves too. At some point, the money is not trickling down. (Community Leader, December 16, 2021)
Under data citizenship, resources and recognition are tied to participation in the generation of evidence, and priority is given to communities with the right data. This was a parsimonious, transparent, and efficient way to organize government interventions from the standpoint of city officials in Chicago. But there were always material needs that were not being fully addressed in this model because it was only designed to produce measurable improvements. Many Chicago residents we interviewed expressed frustration that quantitative data were being used to speak for their experiences and needs when they were perfectly capable of speaking for themselves, as one resident responded when asked what Mayor Lightfoot could do for them: There's so many stories, so many, many people that didn’t have enough resources, that had a bad time during COVID, that lost their jobs. She has to sit down with them and know, so she can understand … we’re just not numbers, we’re humans. (Little Village Resident, November 23, 2021) [E]verything looks good on paper. It looks great on paper, in labs, and on graphs, and things like that. But come out and deal with individuals who have been greatly impacted [and] talk to them in person… That's all on paper. (Austin Resident, January 27, 2022)
Conclusions
Data-driven policies have become increasingly common in the United States. The case of COVID-19 in Chicago provides an important illustration of how these policies work in action, especially when aiming to address racial inequities. Experts were preoccupied with numbers to the extent that producing better numbers became their primary motivation, even when it was unclear if these numbers corresponded with real community needs. In fact, analytic fixes of COVID-19 testing data relied on well-known features of structural racism like residential segregation and uneven health care access. The dependence on specified evidence also slowed the response when it was determined that numbers merited correction or were incomplete. For residents who were not represented by the right numbers, needed supports and interventions were retracted, denied, or delayed. We argue that this constituted a regime of data citizenship characterized by expert data fixes and endemic data drag. Numbers became a mediator between policies and individual residents. By abstracting individual experiences into statistical variables, data citizenship led to the prioritization of certain lives over others based on how they were connected to the systems of data collection, analysis, and visualization.
Certain core features of quantification have been utilized by policymakers in Chicago to illustrate their success at achieving racial equity during COVID-19 through numbers. First, experts abstracted data from people’s lived experiences – concentrating on fixing testing and vaccine numbers, rather than addressing core vulnerabilities like working conditions and housing insecurity. Second, experts reified and essentialized racial statistics by relying on markers of structural racism (zip codes, hospital admissions) to fix early testing data that failed to reliably report on race and ethnicity. Third, experts presumed data were an objective and unbiased metric by which to make decisions on resource distribution, but they also presumed scarce resources and then used numbers to dictate which neighborhoods to infuse with resources at which point in time. The ability of numbers to manage uncertainty and constrain bias is one reason why statistical analysis is attractive for policymaking. Our point here is that there are cases, like policy approaches to racial equity, where engaging the uncertainties and biases that characterize lived experiences might be important.
Our analysis of data citizenship also shows that giving priority to quantitative analysis over qualitative experiences is a problem when it becomes routine, because vague and mechanistic causal relationships are substituted for the experiences of people and their communities. Quantitative health outcomes are attractive and useful measures of racial inequity because they are more easily measured and their numbers more easily fixed, but whether these fixes have lasting structural impacts is difficult to say. While data-driven policies are parsimonious responses to social problems, they may not be deep or lasting solutions when driven by the goals of policymakers and experts rather than community groups. Our findings show how important it is to build public systems that allow for community leadership in the very conceptualization of policy interventions rather than community engagement once the interventions are being rolled out.
Footnotes
Acknowledgements
This project was made possible by funding from the Institute for Research on Race and Public Policy, the Institute for Policy and Civic Engagement, and the Center for Clinical and Translational Services at the University of Illinois at Chicago. We would like to thank the experts who gave us interviews in the midst of attending to their many duties as COVID-19 emerged. We are likewise indebted to the residents who shared their experiences of vulnerability with us. We would also like to thank Cynthia Brito and Fructoso M. Basaldua, Jr. for their collaboration on data collection and preliminary analysis.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
Policy and Social Engagement Fellowship, Institute for Research on Race and Public Policy, University of Illinois at Chicago
Civic Engagement Research Fund, Institute for Policy and Civic Engagement, University of Illinois at Chicago
Pilot Grant, Center for Clinical and Translational Studies, University of Illinois at Chicago – Award no. 2020-05
