Abstract
This paper draws together empirical findings from our study of hackathons in the UK with literature on big data through three interconnected frameworks: data as discourse, data as datalogical and data as materiality. We suggest not only that hackathons resonate with the wider socio-technical and political constructions of (big) data that are currently enacted in policy, education and the corporate sector (to name a few), but also that an investigation of hackathons reveals the extent to which ‘data’ operates as a powerful discursive tool; how the discourses (and politics) of data mask and reveal a series of tropes pertaining to data; that the politics of data are routinely and simultaneously obscured and claimed with serious implications for expertise and knowledge; and that ultimately, and for the vast majority of hackathons we have attended, the discursive and material constructions of data serve to underpin rather than challenge existing power relations and politics.
Introduction
This paper draws together empirical findings from our study of hackathons in the UK with literature on big data through three interconnected frameworks: data as discourse, data as datalogical and data as materiality. We suggest not only that hackathons resonate with the wider socio-technical and political constructions of (big) data that are currently enacted in policy, education and the corporate sector (to name a few), but also that an investigation of hackathons reveals the extent to which ‘data’ operates as a powerful discursive tool; how the discourses (and politics) of data mask and reveal a series of tropes pertaining to data; that the politics of data are routinely and simultaneously obscured and claimed with serious implications for expertise and knowledge; and that ultimately, and for the vast majority of hackathons we have attended, the discursive and material constructions of data serve to underpin rather than challenge existing power relations and politics.
At the same time, (big) data is already, as David Beer has noted, ‘an established presence in our everyday cultural lives’ (2015: 2) and this means that the material and embodied configurations of data are already normative and quotidian. It also suggests that hackathons can be investigated as a ‘datalogical structure’ (Clough et al., n.d., 2015) in terms of the events themselves and as particular prototypes pertaining to how innovation and creativity occur – as increasingly data driven and automated. In relation to hackathon events, these normative and quotidian configurations also highlight a real disjuncture between the way data is discursively constructed, and the way data is variously and unevenly operationalized in the material and embodied processes of the events themselves.
Inhabiting the hack
Hackathons are established methods and ‘spaces of innovation’ in the corporate sector (see Gómez Cruz and Thornham, 2016) but are also emerging in the public and third sectors across the globe (see also Irani, 2015; Leckart, 2012; Marlow, 2013; Townsend, 2013 for histories of the hackathons). They have been widely discussed and critiqued within academia and industry, and our aim for this paper is to draw on our experiences in order to critically discuss what we perceive as the underlying and connective thread across all the hackathon events we researched: the discursive, technological and material significance of
For our research, the hackathons we attended were delineated along three axes: (1) Those within the arts and creative sectors, where programmers, coders, hackers, digital artists used hack and ‘wreck’ methods to explore digital objects as new ways of ‘creative’ and political disruption (
We attended just over 20 hackathons in the UK between 2013 and 2016 as participant observers, but we also worked with hackathon organizers and digital artists as they organized and ran hackathon events in order to understand the processes behind the events and their position within a wider spectrum of temporality (beyond the discrete event itself). The particular methodological approach we used in our fieldwork was drawn from Pink et al.’s (2015) concept of ‘digital ethnography’, which is situated in the politics and principles of reflexivity, participation and observation but is also attuned to the (digital) mediatory elements of digital culture (2015: 3). Our aim was to understand the event from the perspectives of those involved, and reflexively consider our own involvement, with a central question around the issue of
In what follows, we discuss three interwoven strands that emerged from our research: data as discourse, data as datalogical and data as materiality. These three configurations enable us to elucidate the multiple and oscillatory alignments of data. For example, on the one hand hackathons are only possible
Data as discourse
As suggested above, ‘data’ is a discursive trope with particular affordances for the events themselves and written into the language of the events through an explicit reference to the digital and to particular cultural moments and politics of digital culture. The second issue is that these constructions of data also obscure a very real discriminatory politics in which non-valuable data (‘raw’, ‘dirty’, ‘waste’ data) is not simply negated, it is actively unsought, dismissed, disappeared. This means there is a politics at the heart of these events that remains unacknowledgeable through the discourse of data. At the same time, because data is constructed as a malleable, benign tool for participants, identifying the discriminatory politics that are mobilized is a difficult task and one that resonates with the existing debates within big data about obscured data aggregation, value and waste (see boyd and Crawford, 2012; Clough et al., 2015; Kitchin, 2014b).
These politics can be discussed in a number of ways, but for the purposes of this article, we will focus on two interrelated issues: the first seeks to interrogate how that discursive construct actively glosses over material and lived inequalities such as age, gender, class and ethnicity that are also particular metrics used in the formation of the event (in terms of the demographics who are invited). The second looks at the kinds of data that become unavailable – the ‘waste’ data; data that cannot be ‘cleaned’ – and so continually fails to frame this data in an operational capacity: not only can ‘waste’ data
For the hack events that were more consciously politically orientated – such as those organized by MadLab or the ODI, the women only hack event, or those run by the NHS that sought intervention into a system of care – participants were carefully invited along key demographic lines. They were chosen because of their particular value: they were women, they represented a certain set of disciplines, they were users of an existing system that needed changing, they had particular expertise or they were the ultimate users of the new system that was sought. To a certain extent then, it is worth noting that the demographic organization of a hack event is also, in a similar fashion to ‘clean data’, written along a value system where key identity signifiers are aggregated and afforded particular value. In saying this, we are not drawing a direct correlation between clean data and participant demographics. Instead we are simply noting that the processes critiqued in relation to big data (see boyd and Crawford, 2012) and data analytics (Han et al., 2011; Kitchin, 2014b; Miller, 2010) which note the processes of aggregation in the formation of data and along the data points and ‘variables that have the most utility’ (Kitchin, 2014b: 101) are more widespread practices than we might at first acknowledge. Similarly just as the power relations behind these processes of aggregation are sometimes obscure, so hack organizers write out their own intervention in their (later) narratives of the socio-technical outcomes of the event and the specific contribution that these chosen demographics contribute
The second point to make here is that many of these demographics could neither find themselves in, nor identify with, the datasets they were invited to use by the organizers. We worked with obese families who could not locate the complexities of their obesity within the datasets they were offered (the datasets pulled together income, class and age for example, but not other health and mental health issues, family relations, debt or education). We worked with not in education, employment or training (NEET) groups who appeared as unemployed and female and in Leeds, but not (also) as being a primary carer or as having a mental or physical disability, for example. In other words, the datasets were always incomplete, yet were presented as ‘clean’ data to the participants. The issue here is not about the completeness of a dataset; the issue is that certain datasets were valued over others within the parameters of the hackathon (presumably for time, ease, cost, availability, aims of organizers, etc.). At the same time the rationale for valuing certain datasets over others was masked through a discourse of ‘cleanliness’, which worked to construct those datasets as benign. The participants had to decide whether to actively seek different datasets thus taking up valuable time from the event or to use the available datasets. If they did the latter, the elements that mattered to those demographics (such as being a primary carer or giving a percentage of income support routinely to another family member) remained obscured in whatever they built, or had to be retrospectively and consciously added, while the discourse of ‘clean’ data remained relatively unchallenged.
Of course, participants have different stakes in the hackathons and indeed in the data they use, and they have their own politics and aims: the datasets were in no way evenly taken up, but neither were they benign in any way. As many researchers have noted, data is uneven and wrought along political and power relations: data ‘depend on hierarchy’ (Gitelman and Jackson, 2013: 8); they are ‘correlated’ in particular ways to construct particular value (Mayer-Schönberger and Cukier, 2013: 70). Yet through the discourse of cleanliness, malleability and openness, the data was simultaneously constructed as the bedrock for and of innovation or creativity (as fuel, a building block, a terrain). We heard this kind of comment at nearly every hackathon we attended: ‘This is non-raw data, and it’s clean’ (Ian Holt, Ordnance Survey, Digital Shoreditch, 2014); ‘The data is clean, it’s there for you, it’s ready to use’ (Leeds Data Mill Hack, 2013); ‘I have a lot of really, really clean data for you. So it’s really easy to use’ (Bruce Darling, UP London Hackathon, 2013). Seen here, data was constructed as a central tool for the events, and as better because it is ‘ready’ to use, speedier and can be automatically and interoperably (computationally) aggregated.
The data was also constructed as open and transparent – and available for anyone to use: ‘programming is easy, it’s just a matter of ‘ifs’ and where to put them’ (Leeds Hack, 2014), or ‘all I need is the data, once I have that, it’s easy’ (UP London, 2013). By comparison, ‘raw’ data required interpretation, sorting, translating: they required
During many of the hackathons we attended, participants came up not only against data they could not identify with, then, but they also came up against inaccessible datasets such as land register data and social housing data. The women only hack event (Newcastle, 2015) for example, had an implicit gender politics that had to be forged through the human and material. There was also a concerted effort to ‘find’ data relating to women in the North East, but the data was unclean, patchy, too ‘raw’ to use. Instead, the participants ended up using digital archive material or university material that obscured class, age or geographic location, for example. For the subsequent prototypes, this meant that the politics of the data had to be retrospectively and actively imposed through the material and human interventions. In nearly every instance, participants did one of three things: they turned to familiar data and APIs (familiar through their own use – weather data, travel data) or to social media data (Twitter or scraping tools that drew on social media data such as Klout or Kred) or they used the APIs made available by organizations at the hackathon (Ordnance Survey data, mapping data). For us, this further contributes to a self-fulfilling cycle that perpetually reinforces the notion of data as a priori, self-evident and truthful not least because in these instances, then, it was ‘clean’ data that shaped what they built, and whatever the original human/machine intention, the data that was not clean did not register either in the process of creativity or the final project.
These issues also reconfigure ‘waste’ or ‘non-representational’ data (Clough et al., n.d.; Thrift, 2007) in new ways. Indeed, even if we recognize that in the hackathons, data was generative and conditioned the possibilities for the activities and prototypes (Clough et al., n.d.: 14; Suchman, 2007), we also have to recognize they also created ‘silences’ (Bowker, 2005: 11–12) as well as the ‘absences of relations’ (Kitchin, 2014b: 22, see also Vis, 2013) as we discuss below.
Data as datalogical
Many of the hackathon processes described above are built on a wider, long-standing and normative construction of data. Indeed as suggested above, the central underpinning element that enables data to be valued and operationalized in the ways described above relates to the long-term construction of data as ‘transparent’, ‘self-evident’, ‘the fundamental stuff of truth itself’ (Gitelman and Jackson, 2013: 2). In this section, we briefly elucidate some of these issues in order to suggest that this explains some of the power relations and processes described above and their wider iteration beyond data ‘itself’. We also suggest that, particularly with the advent of big data, the notion of self-legitimating datalogical structures is gaining traction with a number of implications not only for the subsequent valuing of the human and material but also for wider issues such as expertise and knowledge which are (re)located into the data processes. The irony, of course, is that the conception of datalogical systems as self-legitimating requires a huge amount of complex (and ironically, external) discursive, human
There are a number of accounts (see also boyd and Crawford, 2012; Clough et al., n.d.; Gitelman and Jackson, 2013; Manovich, 2001) that seek to understand the long-term social, political and technological trend towards a valuing of a system over an individual, and the subsequent consequences for contemporary constructions of (big) data as the foundation
Following this, the central issue for the purposes of this article is how this long-term valuing and construction of data is operationalized into and conditions subsequent value systems within the hackathon events. As we discussed above, such discursive constructions of data have material consequences in terms of what becomes possible to build, to imagine or to utilize. But this discursive construction of data exacerbates a second critique levelled at hackathons around their politics. Irani has noted, for example, that hackathons are exceptionally good at masking a neoliberal and entrepreneurial politics in a language of innovation and creativity with similar consequences to those noted above (in relation to data) in terms of the socio-technical and material outcomes of hackathon events (2015: 2–3). As she argues, this results not only in the active enforcement of a kind of citizenship as entrepreneurialism, but also that this is imagined and constructed as a positive and productive force for social change (Irani, 2015: 2–3). For the purposes of this article, then, such considerations remind us that data operates within and alongside a range of other powerful material and discursive signifiers that similarly ‘condition’ and shape the event; they also remind us that data is becoming enmeshed in a wider discourse not only of creativity and innovation but also in the elision of these concepts with that of entrepreneurialism. All of these concepts work to disappear or negate an overt politics under the guise of the hackathon event and its premise of
The second issue, which perhaps draws a more direct parallel with contemporary concerns around data, relates to how the construction of data configures the human or material elements within the hackathon. Indeed, if we consider the way the human and material are conceived by organizers of events, this becomes clearer. Hackathon events are carefully structured with delineated challenges, discussions, scenarios and research and build times written into them (see also Gómez Cruz and Thornham, 2016; Leckart, 2012). Irani describes the design of (particularly corporate) hackathons as wrought with the politics of speed and vision (2015: 19) and this means that, although there may be an overt discussion of technological or material ‘play’ or even ‘failure’ (when ideas don’t work), in reality and because of the temporal conditions of the events (with strict and rapid times that are conducive to the notion of
Indeed, datalogical structures, by comparison with humans, are constructed as dynamically adaptive – they are both in a constantly fluid state
The final point to make here is in relation to the discursive and material process of automation, then, particularly in relation to knowledge and expertise. Indeed, we suggest that the construction of data not only as ‘a priori’ (Drucker, 2011: 1) to information, bias, value, but also as the underpinning element of self-legitimating datalogical structure, conditions knowledge and expertise in particular ways. If we add wider constructions of smart technology to the conceptions of data already discussed in this article, we not only have ‘clean’ and ‘truthful’ data, but we also have clean and truthful data
Data as materiality
The final consideration of this article relates to the concept of data as materiality, which we understand in two key ways. The first is through the transformation of data into a tangible and literal material prototype through embodied socio-technical processes, which elucidate a range of tensions around what we call mundane and ritualized practices and the discourses of data already discussed above. This offers a more ontological and embodied account of data reconfigured into the specific parameters of a hackathon, as well as raising further critical questions for concepts like creativity, innovation and automation. The second way we understand data as materiality relates to a more traditional concept of technology as ‘disciplining’ bodies (Foucault, 1991, 1997) following scholars such as Cheney-Lippold (2011) and Nafus and Sherman (2014). Foucault’s work explicitly connects metrics and measurement of populations with disciplining and measuring bodies (1997), which correlates if not elides (for the purposes of this article) participant demographics with socio-technical capital (see also Graeber, 2015). In the context of the hackathon with its temporal and spatial frameworks and clear delineation of activities organized into specific events, this notion of disciplining is a clear condition of the events themselves that work to both enable and close down certain embodied interactions and interventions. These conditions, in turn, are centrally framed and conditioned by
On the one hand, data is transformed into a tangible and literal prototype through established and familiar practices that are enacted in the hackathon event
There are a number of ways we could consider this. One interpretation relates to how we conceive ‘innovation’ and ‘creativity’ as something more in keeping with what Shaun Moores has called ‘unreflective, taken-for-granted’ (digital) corporeal movement and process (2014: 202) where routine and everyday embodied actions occur in specific places and with specific objects (see also Ingold, 2013; Pink, 2012). Seen here, innovation and creativity are more about our mundane and ritualized practices or relationships with technological objects and processes that are enacted across a variety of contexts (see also Kember and Zylinska, 2012: 120–122), than, for example, novel and negotiated socio-technical methods (Balsamo, 2011). To a certain extent then, we might suggest that ‘innovation’ is better conceptualized as a
our hackathons have all been about treating technology as something that can be dissembled and reimagined not as black boxes that can’t be accessed… hackathons for us are about humans reclaiming technology as human, taking it apart, claiming it for yourself. (Uncanny Valley Hack, 2015)
A second interpretation of the data as materiality relates to how we conceive processes of ‘innovation’ and ‘creativity’ (and even ‘imagination’ see Gómez Cruz and Thornham, 2016) as primarily data driven (through both the framework of the hackathon and through the methods and kit brought to the event by participants) and what this means for embodied and human subjectivities. Indeed, one argument we could make within the specific temporal and spatial frameworks of the hackathons relates to the ways embodied participation could be understood as a particular form of lived and embodied socio-technical action (even agency) that disrupts the disciplining power of the data through embodied mediation. Indeed, some participants actively used kits as disruptive methods in order to highlight the power relations, assumptions and politics at work within the hackathon. Their participation was in and of itself a disruption to the processes, materialities, discourses and ideologies of the events (see author). In this sense, we could argue that the kits brought to events are disruptive digital–human negotiations that intervene into the process of the hackathon: they are familiar, routine and lived as well as digital, technological and datalogical.
These arguments draw on a range of literature within a longer trajectory of (feminist) STS that investigates digital embodiment through concepts like ‘intra-action’ – entangled in and ‘intra’ to the digital, technological and discursive (Barad, 2009); digital data–human ‘assemblages’ (Haraway, 2015; Lupton, 2016: 4), or the ‘body multiple’ as a site to explore lively data that is ‘ingested’ or ‘emitted’ (Mol, 2002, 2008: 402). All of these concepts think through embodied human subjectivity as data in relation to agency and power. Indeed in relation to the hackathon, they offer a powerful framework for enabling embodied data–human agency whereby the data is corporeally and discursively reconfigured through embodied action. But as Deborah Lupton has noted, these actions ultimately generate more data that circulates in a wider ‘digital data economy’ (2016: 4) beyond the initial corporeal and lived processes. This is not to undermine the potential of data–human agency but to note its temporal and spatial contextuality. In the case of the hackathon, it seems that ontological expertise, human everydayness and human–technological processes of design are actually carefully and doubly framed and contextualized by not only the rigid infrastructure of the event itself (and therefore only afforded bounded and embodied agency within the temporal and spatial frameworks of the event), but also by the various conceptions of data discussed in this article (data as discourse, data as datalogical, data as materiality) that continually and ultimately try to (re)position data as
Conclusion
This article has explored data, discourse and materiality through the specific lens of the hackathon in order to interrogate the socio-technical and political configurations they engender. What we have demonstrated is that data, as self-fulfilling prophecies and self-legitimating structures, is discursively, operationally and materially constructed as both the basis for, and condition of, creative and innovative practices such as those claimed within the discourses of the hackathon. Perhaps more importantly in the context of this article, these discursive, operational and material constructions of data obscure and mask the enormous effort surrounding data that is necessary to position it
At the same time, and as suggested at the start of this article, data and datalogical structures already have an established presence in our lives. But what this article has also attempted to demonstrate is that the claims made in the relatively new field of big data – around metrification, decision making, around infrastructure and processes of organization – have a long-term trajectory in terms of wider value systems and discursive claims. This means that we not only have to recognize events like hackathons within the discursive, ideological and material parameters of (big) data, but also that hackathons reveal that many of those parameters pre-exist and condition the contemporary discourses of big data as well. This highlights to us that the politics of big data have a long-term and normative base beyond datalogical structures, and that the turn towards big data has revealed these politics in new ways that similarly make demands of us (see also Gitelman, 2006; Gitelman and Jackson, 2013).
This last issue raises questions around the kinds of demands that can be made around big data, not least when human interventions are normatively positioned as always already inferior. And our hope is that in recognizing data as discursive, as datalogical and as materiality we can open up a different space where new demands of the digital can be made. But in relation to this, we also have to ask about our own complicity. Indeed, Caroline Bassett has recently critiqued our ‘post-digital desire to accept the presence of…technology as a given, but also to put it aside’ in the explanation of power (2016: 23). Her comment locates data as already figured within discursive and material power relations (and vice versa)
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was conducted as part of a Digital Economy Community and Cultures Network+ EP/K003585/1.
