Abstract
The article proposes the theoretical category of data arenas as a relational field for strategic actors in diverse areas of the contentious politics of data (Beraldo and Milan, 2019). The paper argues that the conceptualization of data activism needs to be related to the immediate data arena in which the action takes place, in order to select the interactive opportunities and threats for emerging data-driven repertoires of action. To fully work through the relational dynamics of data activism, it is necessary to move from a conceptualization of data infrastructure to the notion of data arenas as an ‘open-ended bundle of rules and resources that allows certain kinds of interaction to proceed’ (Jasper, 2006: 141). Using the case of environmental data activism, I highlight four key dimensions to study: (a) strategic use of data as capital that differentiates and positions actors, as well as influences their further choices; (b) practices of defining the boundaries of the problem on which the arena focuses and outlining the pool of actors who participate in the process of solving it; (c) sets of relationships among the outlined pool of actors which represent opportunities and threats for the actors, related to the position they occupy within an arena; and (d) power as the ability to control and shape an arena. The data arena approach sheds new light on data activism as a relational practice, combining the latest developments in research on data contexts and the political situatedness of data with the emerging field of research on data activism.
Keywords
The rise in the strategic use of data by grassroots actors within the increasingly datafied environment spotlights the category of data activism. Data activism is both a new conceptualization in which datafication technologies are used in contentious politics (Tilly and Tarrow, 2015) and an emerging field of research in social movement studies (Beraldo and Milan, 2019; Gutierrez, 2018; Gutierrez and Milan, 2018; Mattoni, 2017; Schrock, 2016).
This article proposes the theoretical data arena concept in response to the growing necessity of studying the contentious politics of data. In this realm, various grassroots initiatives attempt to influence or hijack the dominant processes of datafication by contesting existing power relations and narratives, as well as by re-appropriating data infrastructures for purposes distinct from those intended (Beraldo and Milan, 2019). I will elaborate that the arena, as a relational field of strategic action, offers the possibility of considering data activism in the various contexts in which it occurs. This will be shown in the case of the environmental data arena. Environmental data arenas are interactive spaces that develop in areas affected by biodiversity depletion or environmental pollution. For example, a concerned activist addressing air pollution may use data as a tool to express the problem. This opens doors to various human and non-human actors such as technology, institutions and researchers who have different standards and practices on what data is adequate and what it indicates. As a result, she interacts with the social world which affects her actions, goal setting, dilemmas faced and consequences of her actions.
The data arena concept has several advantages that stem from the potential of various field theories (Bourdieu, 2022; Fligstein and McAdam, 2012; Jasper, 2006), but the primary benefit is its potential to capture the relational dynamics of data activism in a contextual realm. A data arena is defined as a space of visible contentious or cooperative interactions in which various actors see each other and monitor each other's actions (Jasper, 2015), interpret their relationship with others as well as identify emerging rules and use resources that shape legitimate action in the arena (Bourdieu, 2020; Fligstein and McAdam, 2012: 9). Data is a relational object that requires production, transformation and use – its potential lies in its interaction with other actors and contexts (Douglas-Jones et al., 2021; Lash and Lury, 2007; Pink, 2022). Data is a tangled object that lacks clear delineation between its core and surroundings (Latour, 2004: 24). However, within arenas, data objects are perceived and transformed into specific capital for exerting relational influence (Muniesa, 2020: 58). Data is treated as a good that is worthy of being sought after in an arena (Bourdieu, 1977: 178), it mediates relations between actors and is differently valued by them (Jasper, 2012: 20), it is a means of constructing meaning and influencing the perception of a given arena-specific problem (Bourdieu, 2022: 115), and, consequently, it affects the positioning of a given actor in terms of data use and opportunities. The legitimacy of action is based on the use of data as specific capital that has multiple properties and values for actors.
To fully work through the relational dynamics, it is necessary to move from a conceptualization of data infrastructure as a site to the notion of a data arena as an ‘open-ended bundle of rules and resources that allow certain kinds of interactions to proceed’ (Jasper, 2006: 141). From this perspective, we can capture issues of power and control, the multiplicity of relationships and stakes in the game, the definition of the boundary and adaptation to data arenas in the process of strategic action.
To consolidate the concept of data arena, I will firstly refer to the varying working definitions of data infrastructure that are employed in theoretical and empirical studies on data activism. In the case of data activists, the immediate socio-technological environment of their action is data infrastructure, understood as a bundle of heterogeneous elements, such as standards, technological objects, and procedures (Slota and Bowker, 2017). Secondly, I will turn to the relational and contextual concepts of data and formulate a new concept of the data arena. As I will argue, the shift from the data infrastructure as a site to a data arena takes us from observable interactions regarding particular infrastructure to inferred relationships between actors occupying specific positions in the arena (Bourdieu, 2020) in relation to data as capital that is a principal goal sought and an instrument of political struggle to impose a legitimate perception of the particular problem (Bourdieu, 2022: 115). Moreover, it includes actors’ ability to create shared meanings and senses (Fligstein and McAdam, 2012: 11), recognize the terrain and stakes in the game and strategically adapt to these conditions (Jasper, 2015). Thirdly, I will present an application of the key dimensions of data arenas using existing research on environmental data activism.
The relationality of infrastructure
The contentious politics of data is related to the diverse types of data infrastructure, either through contention, hijacking or engagement with a particular infrastructure (Beraldo and Milan, 2019). The infrastructure concept is grounded theoretically in the STS paradigm, referring to a nexus of diverse and relationally combined material/technological, cultural and political dimensions (Larkin, 2013; Slota and Bowker, 2017). Infrastructure appears solely as a relational property, representing the interdependence of a system composed of non-human actors with the organized practices of various users (Star, 1999: 380).
In the data activism literature, we can encounter three overlapping ways of defining data infrastructure: as (a) an activator, (b) a materiality and (c) a site.
Firstly, data infrastructure is an activator of strategic action, which in practical terms means the utilization of infrastructure as an enabling method (Gutierrez, 2018: 4) or boundary conditions for new forms of action in the contentious politics of data (Beraldo and Milan, 2019: 3). Secondly, data infrastructure takes an objective form that is external to the strategic action. This materiality is understood as the digital ‘installed base’ for data infrastructure and the technological ability to assemble and shape our everyday life (Pellegrino et al., 2019: 92–96). In this sense, materiality is conceptualized as the digital platforms and other devices which act as mediators of strategic action, e.g. by means of ‘datafied’ emotions and technologically mediated relations (Milan, 2015) or by creating temporary ‘data publics’ (Milan, 2018) that are politically situated (Larkin, 2013: 333). Thirdly, data infrastructures are perceived as sites. A site is a place of interacting forces and a set of facts out there that need to be embraced by strategic actors (Yaneva and Mommersteeg, 2019). Data infrastructures are spaces that ‘embed manifold opportunities for engagement and subversion’ (Gutierrez and Milan, 2019: 4) or modes of participation within them (Gutiérrez and Milan, 2018; Schrock, 2016). This approach shows how data activists interact with other actors who have different strategic goals and possess diverse resources, with or without the ability to control or shape the infrastructure.
The question of data infrastructure as a relational site is crucial in the theoretical deepening of the data activism category. As stated by scholars, the concept of data activism needs to be deepened in regard to power–citizen interactions (Gutierrez, 2018: 143), as well as empirically situated in distinct political fields (Bigo et al., 2019), e.g. the environmental field (Mattoni, 2017). However, paying attention to observable interactions is only the first step to grasping the relational dynamics of data activism (Bourdieu, 2020: 252–253), as the controversies around data and infrastructures reveal far more in terms of actors’ strategic perspective, their ability to adapt and their data culture. The compositional approach to data infrastructure addresses issues of coexistence, functioning and agency of the entire complex system, recognizing the diversity of actors in terms of their goals and values (see Luque-Ayala and Marvin, 2020). However, the process of composition always takes place in a terrain informed by power relations where conflicting interests and perspectives are at stake, including the ‘war of positions’ aiming at the transformation of existing power relations (Mouffe, 2013: 141–143). Issues of power or positioning have been considered to some extent through the example of agonistic data practices (Crooks and Currie, 2021) that introduce the benefits and risks of using data in a relational struggle for visibility, the concept of ‘just-good-enough data’ as alternative ways of creating and interpreting datasets that can be mobilized to create different forms of evidence (Gabrys et al., 2016) and ‘data valences’ as varying subject positions and orientations towards data that mediate what can and should be done with it (Fiore-Gartland and Neff, 2015; Kitchin, 2022).
The data arena's focus on action, positioning and power relations enriches the comprehension of data activism in different realms. This includes recognizing opportunities and threats for activists within the arena, as well as examining their various relationships with other actors. The data arena encompasses compositional issues of infrastructure as a method and installed base whilst also addressing the political situatedness and differentiated interactions. At the same time, however, it goes further, emphasizing how arenas as sites of interactions are shaped by the stakes of the game and the strategic action with various sets of opportunities and threats imposed by other actors, as well as how the boundaries of the arena are collectively defined (who's in and who's out) and how the power over key data resources influences the relational positioning of actors, mainly data activists in their relational contexts.
The relationality of data
The data arena should be considered in light of other relational conceptualizations of data that have recently been gaining prominence in the literature. The data arena is characterized by a series of relationships and interactions that stabilize the arena, its rules and the ability of individual or collective actors to access power and control within it. In this section, I will examine existing contextual approaches in data studies and, subsequently, I will formulate a concept of the data arena (Bourdieu, 2020; Fligstein and McAdam, 2012; Jasper, 2015).
There is a growing volume of theoretical proposals describing the role of the relational context in data studies (Airoldi, 2022; Loukissas, 2019; Mattoni, 2020; Nost and Goldstein, 2021). Most of these relate to critical data studies that seek to break down the universal understanding of datafication (Kitchin and Lauriault, 2018). The anti-universalist approach aims to re-situate practices and imaginaries around data within a specific context of action. It breaks with the perception of datafication as an infinite space of freedom (Thatcher and Dalton, 2021) for economically rational, profit-enhancing actors who are not constrained by their immediate local environment (Loukissas, 2019: 10), and it re-establishes the connection between data practices and the lived experiences of actors, situating them within a ‘communicative ecology’ (Mattoni, 2020: 4) through which we gain insight into diverse data cultures (Kitchin, 2022). As noted by Loukissas (2019: 10), the practice of looking at the local conditions of data can constitute a form of resistance to digital universalism and the threat of erasure that it poses to myriad data cultures. The anti-universalist approach re-opens the context of data in which there are power asymmetries, as well as differentiated opportunities and threats for multiple actors.
There are several theoretical perspectives that directly attempt to formulate the meaning of context and outline the empirical field for the study of this issue. One of the founding concepts of data context was the data assemblage (Kitchin and Lauriault, 2018), which reframed the composition of ontologically diverse technical and social elements that constitute complex systems concerned with the production, management and translation of data (Kitchin, 2022: 23). As some authors point out (van Schalkwyk et al., 2016), however, a purely descriptive representation of assemblage composition does not fully capture the relational dynamics of power and resistance. I will examine four anti-universalist data approaches that emphasize power dynamics and relationships between human and non-human entities, including infrastructure as action sites.
First is the concept of data settings (Loukissas, 2019; Loukissas and Ntabathia, 2021). A data setting is situated and conditioned by the relationship between humans and technology and by the hybrid of creator and audience (Loukissas, 2019: 164). It highlights the role of place attachment, which is understood as a mix of values, local knowledge about data collection practices and ways of encoding places as subjects (Loukissas, 2019: 31). Locality is defined both as the spatial and social boundaries that influence situated knowledge production processes, as well as its meaning and uses (Loukissas, 2019: 15–19). To consider data settings is to emphasize the complexity of relations behind the specific composition and representation of data, which are impacted by the differentials of power that are at play in a very specific environment (D’Ignazio and Klein, 2020: 162–171). The study of data settings uncovers the embeddedness of data in a place and the power relationships that exist within it – whether we are talking about a single institution, a region or international spaces. Moreover, the use of data raises a high risk of epistemic violence associated with the presence of strangers in data settings (D’Ignazio and Klein, 2020: 147), which could influence the process of data meaning-making and defining the boundaries of who does or does not participate in the pluralist arena of defining them. In this sense, attachment to place emphasizes the relatively stable and less permeable boundaries of data settings. But this is not always the case – boundaries are open and contested through disputes conducted by various actors from different positions (Bourdieu, 2021: 9).
Airoldi (2022) highlights that situatedness applies to datafied contexts in which both humans and machine learning systems operate, e.g. in the context of social media or digital platforms. Airoldi coined the term machine habitus, which is ‘the set of cultural dispositions and propensities encoded in a machine learning system through data-driven socialization processes’ (2022: 113). Machine habitus is determined by two data contexts: firstly, the global context constituting the preliminary training of machine learning systems by primary producers (scientists and software developers), and secondly, the local context that represents datafied expressions of socially recognizable individuals, places and communities (Airoldi, 2022: 56–57). Emphasizing the importance of data contexts introduces structural opportunities and determines a technologically mediated relationship between the actors and their datafied environment which is influenced by both the construction of the tools and the sets of local relationships. It draws attention to the issue of strategic uses of data that are based on locally defined goals and furthermore emphasizes the importance of defining the boundaries of a specific arena that contain its knowledge regimes, rules and pool of actors. However, the data context framework does not introduce the very process of recognizing individual elements of reality as being valuable enough to be subjected to datafication – which is extremely important in terms of introducing data as a form of capital that provides the means of expressing dominant or subordinate visions of reality (Bourdieu, 2022: 115).
Some researchers point to the political situatedness of the data itself (Edwards, 2010; Gabrys, 2016), which is influenced by four primary infrastructure factors, as introduced in the political ecology of data approach by Nost and Goldstein (2021: 4–5). The first factor is the political economy of data production, understood as socially differentiated access to, and control over, resources. The second factor is the financing of data production and the rise of data as a modern currency of environmental expertise. The third factor concerns the technology–nature relationship, in which data echoes the natures of various actors because the data is rooted in different perspectives and modes of governance. The fourth factor again emphasizes the contextual aspect in terms of the influence of historical factors and scale – that is, sets of historically situated knowledge regimes that determine the value and meaning of data. This political ecology of data approach draws attention to the current and past relationships that shape the field of action and outlines sets of opportunities and threats for the effective use of data. From this perspective, the political ecology of data determines the possible ways of interacting with, and hijacking, infrastructure for one's own strategic goals.
Moving from a focus on data and its spatial, technological and political location to the actor's perspective, Mattoni (2020: 9–11) proposes the application of situational analysis (Clarke et al., 2018) to the relational exploration of data-enabled activism. Mattoni details three general types of maps to explore and define a situation: (a) a map of human and non-human actors as elements of action, (b) a relational map that defines the relationships among specified groups of actors and (c) a map of social worlds. In the context of data activism research, this third type of map, which reveals the arena of engagement, interaction and negotiation between different actors, is particularly relevant and necessary to deepen the issue of context for data activism as a social movement actor. In particular, the literature on contentious politics studies has highlighted the influence of social context on social movement activities in numerous ways, conceptualizing it as an environment (McAdam et al., 2001), arena (Jasper, 2006) or field (Fligstein and McAdam, 2012), to name a few.
The above approaches consider various themes of data arenas, highlighting issues of cultural, technological and political positioning, as well as providing a perspective of the social world that actors co-create to consciously navigate, make decisions and act strategically within them. To fully comprehend contextuality, it is essential to incorporate concepts from social movement studies that address relational issues. The perspective on relations focuses on ‘political opportunity structures’ (Meyer, 2004) and describes the degree of political system openness in relation to collective claims. This view suggests that an arena's character is dynamic as it impacts and interacts with claimants’ tactics (Hadden, 2015). Contentious politics involves diverse actors, such as challengers who establish connections with polity members, government agents and other unorganized groups like public opinion (McAdam et al., 2001). These interactions are guided by culturally encoded repertoires of contention (McAdam et al., 2001: 16), whilst specific arenas consist of rules and resources that facilitate certain interactions where there is something at stake (Jasper, 2015: 14). A specific arena can be defined as a social order where actors adhere to socially constructed rules and interact based on shared understandings of its purposes, power dynamics and legitimate actions (Fligstein and McAdam, 2012: 9–11).
As stated before, approaching infrastructure as an interactional site is a primary step in the analysis, allowing us to empirically grasp the basic differentiation of actors and their approaches to data. However, from the observable interactions, we can not only infer knowledge about the social world in which an actor moves and the multiplicity of relationships that co-create it but also highlight how the actor moves and what decisions the actor makes within the arena: pursuing goals, defining the boundaries of the arena and addressing issues of power within it. To achieve this, there needs to be a greater focus on the instability/openness of boundaries that are contested in a definition process, which affects how certain issues are framed, who is given a voice or whose voice is taken away and on what symbolic or material basis. Moreover, attention should be paid to the data itself as a social good that mediates relationships and constitutes the position of the actor, as well as being the building block of the vision that the actor wants to introduce in the search for recognition of the problems and issues that the data address. Consequently, we can capture the relational dynamics involved in an actor's position and action to improve that position through alliances and contestations of the current order (even on a small scale), as well as the actor's potential symbolic power to define rules, the meaning of data and a vision of the problem.
To capture data arenas, the following dimensions should be considered, as presented in Table 1.
Table 1. Contextual data approaches and dimensions of data arenas.
A data arena is a space constructed through interactions between actors, focusing on solving a case-specific problem using data as a symbolically meaningful resource. In this sense, the key is to look at the visible interactions in each place, whose boundaries are defined and contested by the actors participating in those interactions and which presents specific opportunities and threats regarding the datafication process. The data arena shows how actors, based on access to and control over data, position themselves in relation to others, trying to influence them and introduce their own definition of the problem. Thus, the proposed approach focuses more on the acts of composing and creating contingent orders of datafication that are stimulated by relational and power dynamics, rather than the composition of the system itself and its agency, as is the case with relational approaches to data infrastructures.
The four dimensions above constitute a relational perspective that will be applied to the analysis of environmental data activism in the next section of the article. First, it is important to consider the strategic goals of action, which in the field of data activism boil down to objects, phenomena and processes that are subject to datafication and thus constitute a valuable domain of contention using data as capital. Second, the definition of the arena boundary comes down to assigning cultural meanings to data settings, defining the rules of the game within the arena and determining who are the audiences for our actions in the arena. Third, together with the cultural definition of the arena and the strategic goals of the actors, we gain insight into relational dynamics – that is, sets of interactions between different actors (individual and collective) and the opportunities and threats to which actors respond and adapt using defined means of action or repertoires. Fourth, relational dynamics are also shaped by positioning oneself in the arena – both as a dominant actor and as a challenger in terms of control and access to material infrastructure, financial resources and the granting and gaining of legitimacy within culturally defined standards, knowledge regimes and data meanings. The data arena perspective enables empirical inquiry into strategic action related to datafication objects, power dynamics and control over data resources within an arena. This includes examining relationships that influence the decision-making of actors such as data activists and their organizations. In the following section, I will discuss how certain dimensions impact grassroots effectiveness in environmental data activism based on existing cases.
Environmental data arenas
The environmental data infrastructure facilitates climate and environmental awareness through visualization, modelling, prediction and governance (Edwards, 2010; Gabrys, 2016). It enables a supra-individual view of climate change or pollution whilst depicting nature as an object under public scrutiny (Lövbrand et al., 2009). The hegemonic role of data in shaping our understanding of the environment influences how we address issues such as pollution and climate change (Machen and Nost, 2021; Mol, 2008).
Conversely, data is becoming a politically relevant good and resource for influencing environmental policy, and the realm of environmental datafication is becoming increasingly populated by various actors, technologies and standards (Bakker and Ritts, 2022). The growing importance of data in environmental policy is a prerequisite for the emergence of diverse environmental data arenas as spaces of contentious and cooperative interactions over emerging environmental problems which bring together all individual and collective actors participating in the emergence, definition and resolution of these problems (Duyvendak and Fillieule, 2015: 306). Both data and actors are involved in problematization, which is the process of identifying potential issues that can be addressed through public policy. Problematization involves neither representing pre-existing objects nor creating non-existent ones (Duyvendak and Fillieule, 2015: 307), but it assumes the establishment of specific publics that are affected by a particular issue (Marres, 2015). In the latter sense, data arenas are spaces of visible interactions in which various actors see each other and monitor each other's actions, interpret their relationship with others (e.g. by defining who has power and why) as well as identify emerging rules and use resources that shape legitimate action in the arena (Fligstein and McAdam, 2012: 9; Bourdieu, 2020). The legitimacy of action is based on the use of data as capital that has multiple properties within data arenas. Data as capital is treated as a good that is worthy of being sought after in an arena (Bourdieu, 1977: 178), it mediates relations between actors and is valued by them (Jasper, 2012: 20), it is a means of constructing meaning and influencing the perception of a given arena-specific problem (Bourdieu, 2022: 115), and, consequently, it affects the positioning of a given actor, depending on the use of data, access to data and strategic action.
The relevance of data in this arena is both a condition for policy implementation and an opportunity for the participation of groups without power, such as grassroots initiatives and advocacy groups. Environmental data arenas are composed of incumbents who wield power and have a strong influence on defining the environmental problem, as well as of challengers who have less influence, seek legitimacy in the arena and await new opportunities to challenge or influence dominating perspectives (Fligstein and McAdam, 2012). Environmental data arenas not only serve to legitimize the dominant points of view that are based on data but also provide opportunities to legitimize grassroots actors, such as social movements whose goal is to win social and political recognition of the problem definition they represent.
The following sections of the paper will present how to use the data arena approach, detailing its four dimensions: (a) strategic use of data, (b) boundary definition, (c) relational opportunities/threats and (d) power in the arena.
Strategic use of data
Environmental data arenas are composed of disparate actors who focus on solving an environmental problem, sharing the understanding that data are the primary means of perceiving, communicating and managing that problem. All actors enter an environmental data arena to pursue their diverse strategic goals, in line with their agenda and values, which are often contradictory and conflicting. Hence, data is considered an arena-specific capital (Bourdieu, 1977) or a meaningful resource that is recognized as a significant social good that has different meanings. Firstly, data is recognized as an effective resource that produces evidence and facts about environmental issues (Gabrys, 2019). Secondly, data mediates the relations between actors as a specific language of engagement (Gutierrez and Milan, 2019: 8) that articulates values and compels others to collaborate by defining situations and appealing to the beliefs, values and interests of diverse groups (Fligstein and McAdam, 2012: 50–51). Thirdly, data is used for establishing relationships in the environment and articulating narratives about reality, giving it meaning and significance and enabling action towards it (Pellegrino et al., 2019: 106). Fourthly, data as capital in an arena produces differentiation effects (Bourdieu, 2021: 16), given that it is unequally distributed among actors due to their varying analytical, scientific and technological capabilities (Gabrys and Pritchard, 2022).
The above issues are most apparent when we take the perspective of an actor who enters a specific environmental data arena as a challenger (Gabrys, 2016, 2019; Gabrys et al., 2016), for example, citizen sensing initiatives entering the air pollution data arena in northeastern Pennsylvania (Gabrys and Pritchard, 2022). In the arena, the dominant actor was the Pennsylvania Department of Environmental Protection, which managed the measurement infrastructure by providing data according to the principles of the nationwide and standardized Air Quality Index, but it was focused more on urban pollution than on natural gas extraction pollution in rural areas (Gabrys and Pritchard, 2022: 108–109). Faced with these monitoring coverage constraints, residents of these rural areas have engaged in collecting their own data in the form of various projects and initiatives to produce evidence using predominantly low-cost sensors. This data allowed local communities to reframe their immediate environment in terms of the types of pollution and indicate those actors responsible for it, as well as to enter discussions and negotiations with regulators (e.g. influence one agency to deepen local monitoring). On the other hand, data collected by citizens was perceived by regulators as not comparable to the standardized and high-quality data they use to make decisions (Gabrys and Pritchard, 2022: 112). As stated by Gabrys et al. (2016), data produced by citizens are ‘just good enough’ to create different accounts for engaging with environmental problems and to produce different data stories of pollution, as well as to initiate conversations with environmental regulators. But data is often not enough to constitute legal claims and change the course of public datafication of environmental problems, and it is ‘a relatively open question as to what the uses and effects of data gathered through citizen sensing technologies might be’ (Gabrys and Pritchard, 2022: 106).
From the arena perspective, grassroots data is a form of capital that is an entry fee to the environmental data arena, which allows one to participate in discussion on a given environmental problem, as well as to articulate and introduce the perspective of a particular group. However, ‘just good enough data’ for one actor is valued differently by other actors, especially those dominating in the arena (regulators and scientists), which affects the effectiveness of data within this space. Data is unequally distributed among actors within the arena in terms of its quality and what actors find valuable in solving a given environmental problem. Since data within the arena has a differentiating effect on actors, the issue of data effectiveness should be considered in the light of the positions they hold and the choices they face in realizing their own strategic goals and vision of a problem.
Boundary definition
Another distinguishing feature of the arena is boundary definition, understood as the establishment of an order in which the various actors are situated. The boundaries of the arena – that is, the definitions of what is a problem and who has the right to speak and participate in its solution – are contested and negotiated by actors participating within it (Bourdieu, 2021; Jasper et al., 2022). Data has a transformative potential in its ability to shape perceived realities (Renzi and Langlois, 2015: 202) or to create stories about a given order, pointing out dependencies and injustices, as well as introducing arena newcomers as potentially relevant social groups (Lassinantti et al., 2019) for a given problem. In the latter sense, data is used by various initiatives addressing environmental data justice (Longdon, 2020), e.g. in counter-mapping practices of Canadian indigenous communities (Kidd, 2019).
The practice of mapping is political in nature and is primarily the domain of state actors, who hold a monopoly on describing their subordinate territory; yet the same techniques are resorted to by groups without power, such as indigenous communities, who counter-map the reality around them by creating a set of representations, shedding light on discrimination and establishing new social categories (Kidd, 2019: 955). Examples of counter-mapping include the indigenous peoples’ initiatives in specific provinces in Canada, in which local communities mobilized against the politics of extractivism understood as the construction of pipelines and mineral, oil and gas extraction (Kidd, 2019). The alternative maps acted as data sets situated in a specific setting. Firstly, they served to make visible the long socio-cultural history of indigenous peoples and to situate them within a well-defined territory. Secondly, they were meant to represent the impact of capitalist and colonial extraction on their living conditions, serving at least as evidence in lawsuits and exerting political pressure (Kidd, 2019). The goal was to change the perspective of looking at the territory – instead of the government and corporate narrative regarding the profits of extraction, attention was given to the aspect of inhabitation, the sociocultural significance of the territory as well as the introduction of previously excluded groups (Kidd, 2019: 966). Thus, there has been a shift in the boundaries of the arena as a relational space of contention – new voices and new actors have been introduced, and they have become legitimized as relevant social groups within this environmental data arena.
The distinguishing characteristic of the arena is that its boundaries are called into question by both challenging and dominating groups. The peculiarity of the dominant actors is that they are keen to maintain stable boundaries and establish barriers to entry (Bourdieu, 2021: 11), so that environmental issues remain exclusively the domain of technical administration rather than political disputes. Hence, within the environmental data arenas, activists and local communities are not only using data of varying quality to enter the arena and start a discussion, but at the same time are engaged in unsealing the boundaries of the problem, introducing the social groups they represent, their values, perspectives and social imaginaries.
Relational opportunities and threats
Actors within environmental data arenas hold certain positions, orienting themselves to a specific definition of the problem and the quality of the data used, as well as defining the boundaries of the arena according to their values. Actors perceive others in the arena from their position, but they also face specific opportunities and threats regarding various relations with other actors. From the perspective of data activists as challengers, we can identify some typical relational opportunities and threats, as well as the basic groups of actors that pose dilemmas in their strategic action. However, it is worth noting that the pool of actors and relationships is specific to each environmental data arena.
The basic dilemma of a grassroots actor is to maintain its data-driven agenda whilst gaining acceptance of the data standard by other actors in the arena. An example of the above dilemma is the smog dispute in Poland, where the dominant and decision-making actors have high-quality air pollution data that meet EU standards, whilst the social and market side (companies that sell sensors and form the main measurement network) produces lower quality data, which is nevertheless based on a commercial-citizen measurement infrastructure that is much more dense and extensive, complementing the blind spots left by public institutions (Wróblewski and Goszczyński, 2020). This means that the anti-smog movement can use commercial sensors to articulate the problem, but the role of its data ends with community awareness, whilst the influence on public policy can only be achieved through traditional social movement repertoires like petitions and protests. The bottom–up infrastructure allows for the creation of shared meanings; however, it remains vulnerable to other actors in the arena, including public institutions and private companies, that accuse citizen initiatives of methodological flaws and poor-quality evidence in an attempt to de-legitimize their allegations and demands (Kimura, 2021).
Data processing serves both internal purposes – including mobilizing organizational participants, involving them in new ways of doing things as well as creating a shared and objectified framework for experiencing a given problem – and external purposes, i.e. communicating problems to the public and policymakers. At the same time, however, the new technological possibilities associated with datafication may become the subject of dispute around standards for the quality of evidence (Lampland and Star, 2009). This may also force organizations to seek partners among dominant actors, especially scientists as a group that in a practical sense will influence the procedures of data collection and analysis and in a political sense will play a role in legitimizing the demands of initiatives (Kimura, 2021). From the perspective of an actor within the arena, it is crucial to react to the standard – whether by accepting it and playing by the dominant rules, negating it and rejecting it or using data as an intermediary tool, e.g. to build the identity of a movement or a civic initiative.
Within the typical environmental data arena, we can identify three primary categories of potential allies for data activists. The first comprises data journalists, who seek themes and explanations for their own stories using publicly available or hacktivist data (Gutierrez, 2018: 36–43). The second group consists of academics, who either combine their academic practice with direct engagement in data activism (possessing avant-garde analytical skills, see Gutierrez and Milan, 2019) or constitute an external partner that stimulates, verifies and, in the most general sense, legitimizes the process of data collection and analysis. Activists in these collaborations gain recognition and scientific legitimacy or receive technical support to produce tools (Gabrys, 2016: 148). The third group is made up of commercial actors participating in the arena, who seek to use data for profit intensification and the creation of new markets for their services (Dauvergne, 2020). In this sense, private companies adapt to emerging needs and explore and stimulate the development of new markets by providing products in the form of measurement tools and applications and creating their measurement networks (Wróblewski and Goszczyński, 2020).
The above three types of potential allies appear to varying degrees in environmental data arenas, whilst hybrid and complex actors – such as technology-oriented movements or social enterprises (Hess, 2005; McInerney, 2014) – proliferate at the intersection of these relationships. Nevertheless, the most complex relationships are between activists and public institutions, which in environmental policy can act as partners in cooperation, objects of demand, sources of threat or, most often, internal governance units, stabilizing the functioning of the whole arena and representing the dominant perspective on the issue (Fligstein and McAdam, 2012: 13–14). The visibility of other actors and specific opportunities is dependent on the position the actor holds. The realization of a strategic goal – such as the implementation of a particular solution – leads through relational interdependencies that affect the legitimacy of the actor, its narrative and the data it has produced or uses. These relational interdependencies reveal power differentials within the arena.
Power
Along with recognizing relational dependencies, actors are also discovering and making visible issues of symbolic and material power in environmental data arenas, orienting themselves to actors who hold dominant positions in the arena. The actors who manage the infrastructure have a significant advantage in the data arena over the others, which leads to the relational asymmetry between the state, corporations, academia and civil society (Domínguez and Gordo López, 2019). This asymmetry is expressed in the ability to influence the rules of the arena, whilst it also ensures the dominance of certain actors in terms of data resources and analytical techniques. It affects the possibilities of setting data handling standards and has a hegemonizing effect on objectifying public policy goals and creating conditions of experience (Pellegrino et al., 2019).
Activists move in the field of power and are therefore forced to refer to the current standards of data production and the information derived from it. The actions of public institutions are, on the one hand, a source of political opportunity for grassroots actors in pursuing their strategic goals; on the other hand, the actions of these institutions may also pose a political threat to them. In both variants, activists recognize and adapt to conditions in the arena, adjusting their actions and entering cooperative or antagonistic relationships with public institutions.
A dominant actor influences the arena primarily through its decision-making power to increase or decrease the openness of public data, as happened with the release of two types of environmental data generated by the satellite systems of the Brazilian Space Research Unit (Rajão and Jarke, 2018). The first data made public was aggregated, responding to public concerns about deforestation in the Amazon region – with each annual summary, activists adjusted their anti-deforestation efforts by referring to publicly defined parameters and thereby putting public pressure on the Brazilian government to expand the level of protection for specific parts of the Amazon rainforest (Rajão and Jarke, 2018: 323–324). Public data also provided an advantage in the international politics of the Brazilian government, which was able to remain transparent about its policies towards the ‘lungs of the earth’. However, the opening up of access to aggregated data encouraged activists to further press for access to non-aggregated, and therefore more detailed, data (Rajão and Jarke, 2018: 324). The level of transparency – identified by the degree of data aggregation, the timing of its publication and its public availability – influences the campaigns and action practices of activists in the environmental arena, who not only adapt to the conditions in the field but also negotiate them with public institutions.
A negative example of a dominant actor's action is the intention to close the arena by limiting data collection, whether stemming from climate change denialism or austerity policies. An example of a government threat was President Donald Trump's policy towards defunding and partially dismantling measurement infrastructure. The federal government at the time rolled back a ban on neurotoxic pesticides, proposed repealing the previous president's Clean Power Plan and planned cuts to important environmental programs (Walker et al., 2018). The institutionalized climate change denialism of Trump's presidency was characterized not only by criticism of the scientific consensus but also by action to shut down important elements of the public measurement system, including NASA satellite missions that focused exclusively on climate change (Edwards, 2019). In response to threats to the integrity, accessibility and continuity of federal environmental policy, most notably access to reliable information, the Environmental Data & Governance Initiative (EDGI) (Vera et al., 2019), a network comprising environmental activists, government agency staff, journalists and data analytics specialists, among others, was formed. The initiative focused on acquiring and rescuing publicly available data as a response to President Trump's anti-environmental actions (Vera et al., 2018), archiving tens of thousands of public domain databases. In the initial and reactive stages of the initiative, activists did not consider either the usefulness of the data or why public institutions in the USA had collected it so far (Walker et al., 2018); however, over time, based on the resources they obtained, they began to support public institutions and create tools to improve climate and environmental policy with the information they possessed (Walker et al., 2018).
Data activists within arenas recognize dominant actors and their importance, responding to their actions, positioning themselves in relation to them or attempting to transcend their limitations. From the perspective of the arena, we can see the dynamic power relationships – that is, the influence of the dominant actors, but also the ways in which the challengers try to adapt to achieve their goals. In a general sense, the arena is a space of visible relationships that actors respond to in order to achieve their own strategic goals – however, actors are differentiated by their positions and thus have differential influence in shaping reality through data. This poses a major dilemma for challengers, who must look for specific opportunities – such as temporary weakness or openness on the part of dominant actors – or gain legitimacy by collaborating with academic, commercial and social actors that have greater measurement capabilities, produce data of higher standard-compliant quality, etc.
Towards the data arena
In conclusion, this paper has argued that the conceptualization of data activism needs to be related to the data arena in which the action takes place, in order to select the interactive opportunities and threats for emerging data-driven repertoires of action.
Using the three definitions of data infrastructure that prevail in the data activism literature, and specifically addressing the issue of infrastructure as a site of interaction, I moved from empirically observable interactions within data infrastructures to issues of the arena. The arena considers the relationships between positions within it, the actors’ ability to recognize the terrain and rules of the arena and their strategic action, which is geared towards achieving their own goals within the defined boundaries and conditions of the game. More specifically, I highlighted four key dimensions of the data arena as shown in Table 2: (a) the strategic use of data as a form of arena-specific capital that differentiates and positions actors, as well as influences their further choices; (b) the practices of defining an arena boundary to shape narratives about environmental and climate challenges, as well as to include or exclude other actors from the arena; (c) relational dynamics and the sets of opportunities and threats that result, allowing actors to not only make alliances but also change positions within the arena; and (d) issues of power and control over the arena.
Table 2. Dimensions of data arenas.
The data arena is a space of visible contentious and cooperative interactions. It emerges with a problem that brings together diverse actors who share the belief that using data is the main means of conceptualizing, knowing and acting on the problem. Accordingly, data becomes an arena-specific form of capital that is used in strategic action, that mediates interactions and that is valued by other actors according to its quality. Each actor enters the data arena with a particular goal and uses the available rules and resources to achieve it. The data arena is perceived by an actor not only as a predefined and foundational plateau but also as a site that can be redefined, for example, by introducing a new boundary definition or by reconfiguring existing relationships and sources of control over data infrastructures – as was illustrated in the examples of environmental data activism initiatives.
Activists sense and seize the opportunities and threats related to specific data arenas, considering the roles of dominant actors, their strategies of action and broad infrastructural changes, to proceed with the action and seek new or empty spaces for pursuing their own goals. Data activists interact with the arena to hijack, supplement or contest the rules or purposes of the game. Tracing the interactions between data activists and the data infrastructure not only enables us to situate the contentious politics of data in a specific data arena (e.g. environmental) but also provides a selection of diverse forms of interaction within the arena, through which we can determine the relevance of data-driven repertoires of action and thus systematically map the opportunities and threats for data activists.
This theoretical perspective sheds new light on data activism as a relational practice, combining the latest developments in research on data contexts and the political situatedness of data with the emerging field of research on data activism. This new field of research addresses a new type of social movement actor, which is spilling over from the data politics context into other areas, such as environmental, anti-corruption, humanitarian and anti-racist fields. In a theoretical sense, the focus on the data arena as a relational site contributes to deepening our understanding of the strategic nature of data activism – how activists choose technological tools, how they target the objects of claims, as well as how they consider the opportunities and threats arising from their activities. These theoretical findings provide the following insights for future empirical research: on the one hand, the need to study the contentious and cooperative relationship of data activists with other actors, including incumbents in the data arena, and, on the other, the need to map the arena-specific forms of data activism in different areas of politics. Based on previous research on environmental data activism, it has been possible to formulate the basic dimensions of the data arena; however, many things remain unstudied, such as issues of certification and legitimacy in the arena or the transformations of capital used by diverse actors within the arena, including social actors that seek legitimacy within it. Nevertheless, the data arena concept draws attention to the fluid and culturally defined game order, which, taking forms of data as capital into account, can allow for the empirical study of the consequences of data activism.
Acknowledgements
I would like to express my gratitude to Renata Włoch, Natalia Juchniewicz and three anonymous reviewers for their valuable comments.
Conflict of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the Narodowe Centrum Nauki (grant number 2021/41/N/HS6/01110).
