Abstract
This introduction to the special issue on data and agency argues that datafication should not only be understood as the process of collecting and analysing data about Internet users, but also as feeding such data back to users, enabling them to orient themselves in the world. It is important that debates about data power recognise that data is also generated, collected and analysed by alternative actors, enhancing rather than undermining the agency of the public. Developing this argument, we first make clear why and how the question of agency should be central to our engagement with data. Subsequently, we discuss how this question has been operationalized in the five contributions to this special issue, which empirically open up the study of alternative forms of datafication. Building on these contributions, we conclude that as data acquire new power, it is vital to explore the space for citizen agency in relation to data structures and to examine the practices of data work, as well as the people involved in these practices.
Introduction
It has been well established, in the pages of this journal and elsewhere, that the advent of Big Data brings with it new and opaque regimes of population management, control, discrimination and exclusion. Numerous insightful critics have made this case, including Andrejevic (2013), Beer and Burrows (2013), boyd and Crawford (2012), Gillespie (2014), Hearn (2010), Turow (2012) and Van Dijck (2013), to name only a few. The expansion of data mining practices and the recent activities of the National Security Agency (NSA) in the US and Government Communications Headquarters (GCHQ) in the UK, as well as major social media corporations themselves, quite rightly give rise to critical claims about systematic surveillance, privacy invasion and inequality (Lyon, 2014; Van Dijck, 2014).
But these troubling consequences are not the whole story of our datafied times. At the same time as big business and big government embrace the capacities of
Writing specifically about one particularly expansive source of data, social media, two of us, Van Dijck and Poell (2013), have argued that ‘all kinds of actors – in education, politics, arts, entertainment, and so forth’, as well as police, law enforcers and activists, are increasingly required to act within what we define as ‘social media logic’. Such logic, we argue, is constituted by the norms, strategies, mechanisms and economies that underpin the incorporation of social media activities into an ever-broader range of fields, and one such mechanism is datafication. Given the ubiquity of social media and its underpinning mechanism of datafication, we need to be attentive to the diverse engagements with data, especially within key fields of public space. To fully comprehend how data and datafication in their contemporary formation affect public life and democratic politics, we need to carefully interrogate how they sustain, undermine and transform vital public values.
In the launch edition of this journal, Couldry and Powell (2014) made a similar argument about the need to ground studies of Big Data, datafication, data mining and analytics in real-world, everyday practices and contexts. They argue that the focus in much critical debate on the power of the algorithm – they give the work of Lash (2007) as an example – leaves little room to explore the small-scale actors who are making organisational adjustments to accommodate the rise of data’s power. In contrast to highlighting algorithmic power, they suggest that these actors deserve to be examined, alongside ‘the variable ways in which power and participation are constructed and enacted’ (Couldry and Powell, 2014: 1) in data practices. This is precisely what this special issue sets out to do. We first make clear why and how the question of agency should be central to our engagement with data. Subsequently, we discuss how this question has been operationalized in the various articles, which empirically open up the study of alternative forms of datafication.
Understanding agency
Thinking about agency is fundamental to thinking about the distribution of data power. And yet, in the context of datafication, questions about agency have been overshadowed by a focus on oppressive techno-commercial strategies like data mining. It is for this reason that Couldry and Powell (2014) call for more attention to agency than theories of algorithmic power, or data power, have thus far made possible. But how might we think about agency in its relationships with data? Agency is a core concept in studies that seek to explore how cultures and societies are made, and how they might be made fairer and more equal. Agency is frequently opposed to ‘structures’ in debates about which has primacy. Structuralist theorists argue that structures not only determine, but serve to restrict and oppress already-disadvantaged groups in society. Marx’s assertion that people are able to make history, or act with agency, but that they do so in conditions not of their own making (1852), guides contemporary Marxist critics of capitalist structures which incorporate processes of data mining and profiling (for example Fuchs, 2011, 2014; Hearn, 2010, 2013). Some of the authors discussed above arguably fall into this category. In contrast, others have stressed the capacity of individual human agents to make and shape their worlds. Still others have highlighted the dialectic relationship between structure and agency: structures shape and constrain human agency, but human agents act against, as well as within, them (Giddens, 1984).
Toynbee offers a useful summary of the ways in which critical realism understands interrelationships between structure and agency, drawing on the work of Roy Bhaskar (1979). Within this framework, people are not seen only as components or effects of structure(s). Rather, they reproduce and occasionally transform society; they do not simply create it, as social structure is always already made. Bhaskar develops ‘the transformational model of social activity’, or TMSA, as a way of making sense of the relationship between people and society. Toynbee sums this up as follows:
Society consists in relations between people, and as such is dependent on their activities which reproduce or (less often) transform society. From the other side, human practice depends on society; there can be no meaningful action without social structure. Crucially, this dependency on structure imposes limits on what people can do while never fully determining actions. In other words we have some autonomy as agents. (Toynbee, 2007)
The questions at the heart of this special issue reflect these tensions between structure and agency, control and resistance. The five contributions aim to consider the extent of the dominance of the structures of datafication, the possibility of agency, and the spaces in between. At the same time, the contributions seek to combine critical perspectives on datafication with the perspectives of actors within data mining practices. The aim is to enrich our understanding of data and datafication, by bringing together structural analyses with recognition of individual agency in the context of these structures.
But what kinds of agency are implied within this framing? For some writers, agency is necessarily a reflexive practice. Couldry, for example, defines agency as ‘the longer processes of action based on reflection, giving an account of what one has done, even more basically, making sense of the world Social analysis must take into account the meaning that the social world has for the individual based on how the person understands and responds to their lived experience. The way people construe their social existence helps them formulate their plans and intentions. They make choices about the direction in which their lives should go on the basis of their experience. As such, persons are ‘intentional’, self-reflective and capable of making some difference in the world.
Five contributions on data and agency
The first paper, ‘Datafication and Empowerment: How the open data movement re-articulates notions of democracy, participation and journalism’, focuses on the case of the Open Knowledge Foundation in Germany. In the paper, Stefan Baack draws on interviews and content analysis to argue that by applying the practices and values of open source culture, open data activists develop particular rationalities in relation to datafication that are supportive of the agency of publics and of themselves as activists. There are three parts to this process, suggests Baack. First, activists conceive of ‘raw data’ in the same way that the open source movement conceives of ‘source code’: both are prerequisites for the production of knowledge. Conceiving of raw data as source code, activists share the former in the same way that the open source movement shares the latter, aiming to break the interpretative monopoly of governments and allow publics to produce their own interpretations of public data. The second way in which open data activists reproduce open source practices to enable the agency of datafied publics is to apply the open source model to political participation. Open data activists believe that applying the open source, ‘bazaar’ model of participation to politics will lead to more politically active and engaged citizens and communities. Finally, open data activists recognise the important role that intermediaries play in making data accessible, working with journalists to encourage them to adopt this intermediary role as well as acting as intermediaries themselves, for example by developing civic technologies to do the translational work of intermediaries.
Baack highlights how open data movements represent an intriguing coming together of the two contradictory tendencies that are at the heart of this special issue – that is, the problems and potential of datafication. On the one hand, he notes, open data movements depend on datafication for their existence, and all the troubling consequences that this phenomenon brings with it. On the other hand, they also depend on the democratic practices and values of open source culture, including advocacy of transparent and collaborative forms of governance and the right to access and distribute knowledge. Baack explores what this unusual convergence reveals about the relationship between data and agency, with the former tendency, datafication, arguably suppressing the possibility of public agency in relation to data and the latter tendency, open source practices, arguably enabling it. He concludes that datafication supports, rather than undermines, the agency of data activists, as all of the strategies and tactics he discusses connect datafication with open source culture in ways that enable and support the agency of the kinds of actors that Couldry and Powell insist need our attention. Thus ‘datafication’, the ubiquitous quantification of social life (Van Dijck, 2014), does not necessarily lead to centralized control and surveillance which are often associated with Big Data rationalities (boyd and Crawford, 2012) that threaten to disconnect phenomenology and political economy (Couldry and Powell, 2014: 4). Activists can develop alternative rationalities and imaginations around datafication that do not undermine but support the agency of actors outside big government and big business.
In the next paper, ‘Forensic devices for activism: Metadata tracking and public proof’, Lonneke van der Velden retains the focus on activism that Baack introduces to the collection. The paper focuses on a mobile phone project called InformaCam, an application that mobilises the tracking capacity of mobile devices to produce evidence in the context of human rights activism. Van der Velden argues that InformaCam turns a surveillance problem, that of mobile device tracking, into a method for the production of public proof, in a way that is sensitive to some of the issues that arise when human rights activists and organisations use mobile devices for the purposes of their activism. These issues include: the ways in which mobile devices can be easily tracked; the importance of verification in a context in which digital material is vulnerable to manipulation; and the volume of images and video that are captured and the subsequent need to sort and evaluate this volume. In this context, citizen journalists and human rights organizations are faced with the question of how to investigate and prove the truth of an event by using digital technologies without being traced themselves.
InformaCam addresses these concerns. Developed by The Guardian Project, it is a prototype application that deals with metadata, such as GPS data or the device number, embedded in the make-up of a file. When images or videos are posted online, potentially identifying metadata is posted along with them. InformaCam allows users to remove those metadata. The application also makes a second version of the image which has evidential value: in this version, contextual metadata is not obscured but captured, encrypted and stored, so that when images are assembled together, the annotated data proves useful for event analyses. Van der Velden argues that InformaCam can therefore be understood as a ‘forensic device’ – understood by Van der Velden, following Weizman et al. (2010), as a device for the ‘production of public proof’ – through its arrangement and re-arrangement of metadata, legal requirements and code. InformaCam thus constitutes a way of thinking about surveillance risks in which surveillance becomes not just something to be ‘informed of’, but a phenomenon that can be hacked and repurposed for specific ends, she suggests. In these ways, argues Van der Velden, her interrogation of InformaCam can be seen as a response to Couldry and Powell’s (2014: 1) insistence that ‘emerging cultures of data collection deserve to be examined in a way that foregrounds the agency and reflexivity of individual actors as well as the variable ways in which power and participation are enacted’.
In the third paper, ‘Hacking the social life of Big Data’, Jennifer Pybus, Tobias Blanke and Mark Coté pursue the notion of hacking as a form of agency in times of datafication, with a focus on what they call ‘big social data’ (Coté, 2014; Manovich, 2011), or data produced through communicative practices online and on mobile devices. Reporting on their project ‘Our Data Ourselves’, the authors explore what data-making possibilities exist for young users of smartphone devices that would enable them to be agents in relation to their own big social data. The project involved bringing together members of a youth hacker group, Young Rewired State, with the project team to work on the volumes of social data that young people regularly produce on mobile phones. The project team produced an application called MobileMiner, which allowed the young participants not only to see the extent of data produced and shared, but also to access these data and to consider what might be done with them to augment their agency as individuals and also as a collective. In this way, Our Data Ourselves constituted a preliminary investigation into the agentic possibilities that are opened up when young people are given access to their own data, data which they are usually ‘structurally precluded from accessing’.
In the paper, Pybus et al. contrast the notion of datafication – which frames citizens as primarily passive generators of data – with the much more active notion of data-making, which they describe as ‘a strategic mode of agency that can arise if the subjects of datafication are given tools to both understand and work with the data that they produce’. With this conceptual framing and the action research that they undertook, they seek to ‘critically leverage the spirit of Jenkins’ “participatory culture” (2008) into the realm of Big Data’. In practical terms, this resulted in various creative responses from participants, once they had access to their own social data and an understanding of the extent and frequency of their tracking by data controllers. In theoretical terms, the authors argue that their experiments point towards the need to develop forms of data literacy as a constitutive component of data agency. Data literacy, they argue, extends media literacy to incorporate understandings of the material conditions of the proprietary control of personal data. Incorporating, for example, ‘privacy literacies, information literacies, code literacies, algorithmic literacies, database literacies’, it is through the development of data literacy that citizens can act with agency in the face of data power.
The next paper, ‘Heuristics of the Algorithm: Big Data, user interpretation and institutional translation’, by Goran Bolin and Jonas Andersson Schwarz, moves away from the focus of the previous three papers on specific empirical examples of data/agency relations, to consider some of the tendencies that the authors have observed during diverse research (with media users and media producers) that relate to the consequences for media producers and users of what the authors define as ‘the principles of algorithmic surveillance technologies’. The authors argue that much of what is popularly attributed to Big Data is in fact attributable to the historical statistical administration of society. Given this, the ‘ontological shift’ that some proponents of Big Data claim is upon us is, in fact, not quite so fundamental. What’s more, argue Bolin and Schwarz, ‘the incursion of Big Data as a heuristic is unevenly distributed’: there is both lag and institutional resistance as, in the everyday practice of media management, the impulse to adopt ‘inferential, relational Big Data heuristics’ meets the need for this understanding to be translated back into more familiar categories.
Bolin and Schwarz begin their paper by making a distinction between two key types of descriptive statistics – what they describe as ‘a statistics of discrete data points’ and ‘a statistics of interconnected data points’. Since what is seized upon in data mining operations is a statistics of pure relation, the latter mode of statistical imagination becomes central when analysing the ontologies of the audience generated in database economies. They then discuss the ubiquitous tracking of data and the parallel ways in which conceptions of the media user have shifted during the same post-war period. This is followed by a discussion of the avoidance strategies of media users and translation practices of media industries that occur in relation to Big Data as heuristics and as myth. The authors conclude the paper by proposing an adjustment of the myth of Big Data, based on the ways in which both professional and non-professional media users relate to Big Data in daily life. They argue that among media users and professionals in the media industries, a felt need to ‘translate back’ algorithmically produced statistics into ‘traditional’, often intuitive, social parameters can be observed. In other words, users’ agency may serve to straighten out obscure relationships between data streams.
The final paper, ‘Known or Knowing Publics? Social media data mining and the question of public agency’ by one of us, Helen Kennedy, and Giles Moss, considers the conditions required to enable a relationship between the public and data in which publics have greater agency than has generally been the case to date. The paper emerges from empirical research with public sector organisations (Kennedy et al., 2015; Moss et al., 2015) but does not report on that research. Rather, like Bolin and Schwarz’s paper, it focuses on the principles that the empirical research has unveiled as necessary for good data/agency relations. And like Pybus et al.’s paper, this one also focuses on social data, because social media have been viewed as crucial sites where publics emerge and because, paradoxically, although a wide range of public actors are technically able to access social data, publics generally do not intervene or interact in this process. Drawing on growing calls for alternative data regimes and practices, Kennedy and Moss argue that to enable this different relationship between publics and their data, data mining and analytics need to be democratised in three ways. First, to address concerns about the potential negative effects of data mining on the public, data arrangements need to be subject to greater public supervision and regulation. Secondly, to address the danger of new, data-driven digital divides emerging, these arrangements must be available and accessible to the public so they can be used in varied ways. Thirdly, given the contribution that data and data mining increasingly make to how publics and public issues are represented, uses of data mining in ways that enable members of the public to understand each other, reflect on matters of shared concern, and decide how to act collectively as publics, are also essential. 
This final condition, argue Kennedy and Moss, enables publics to constitute themselves as more reflexive and active agents than datafication traditionally allows.
Kennedy and Moss argue that together, these three ways of democratising data mining point us towards ways in which
Future directions
Agency, in the context of Big Data, is a complex and multifaceted concept. It encompasses many kinds of users and many kinds of data contexts. Data subjects may be citizens or consumers, professionals or amateurs, conscious hackers or unwitting bystanders as data streams increasingly direct our everyday lives. Each of the five papers in this special issue addresses agency as a techno-cultural construct, and a number of them point to a very specific category of users – that is, activists or hackers – who could be understood as ‘conscious’ or ‘resisting’ agents. This is a very important user category, one which is in explicit dialogue with the new technological developments we focus on in this special issue, and that plays an important role in terms of data and agency.
However, as data acquire new power, it is important that we understand
Besides obtaining more empirical knowledge of user agency in relation to (control over) data streams, we also need more understanding of data work, the people involved in it and their processes. Some scholars have attended to the work of the data scientist (for example Gehl, 2014, 2015; MacKenzie, 2013), but there are many more roles involved in the process of producing data than this. Data cleaners, algorithm writers, data visualisers and designers of the interfaces of systems that gather and output data are just a few of them. Studying these workers will help us to better understand the entanglements of data, power and agency. Often, digital workers are held responsible for the systems that they help to produce, as if they were all-powerful (for example, Adam and Kreps, 2006 on web designers and Munson, 2014 on the designers of recommendation systems). But power does not operate in simplistic ways and the location of power in data-making processes is complex. At the same time, we might expect that data workers create spaces in which to exercise some agency in their work, like the other actors discussed in this special issue. This is why we need more understanding, through studies of data workers, of how data and their representations come into being.
Such empirical inquiries should open up new avenues to think critically and creatively about ways in which datafication can be repurposed and redirected to enhance rather than undermine citizenship. For this shift to take place, it is crucial to understand data and datafication not only in terms of power and domination, but also in terms of agency. We hope that this special issue contributes to developing a new approach and vocabulary through which datafication can be put to emancipatory ends.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
