Abstract
Automatic aggregation of large-scale data is increasingly conceived as central to the production of ecological knowledge. This article examines the implications of employing automation techniques and ‘data-driven analysis’ in long-term biodiversity monitoring. What are the pathways and paradoxes in the possible public acceptance of automated data-sets as a trustworthy source for use in the global protection and regulation of biodiversity? This article suggests that the precautionary discourse aids top-down measures for the public acceptability of the use of such techniques. Automated biodiversity monitoring offers distinctive advantages for furthering precautionary goals: a faster, more cost-effective and less messy way of collecting data, at a large scale over long periods of time. However, it contradicts other values implied by precaution, for instance through the opacity and reification of the construction of risk. How do specific forms of data-making relate to specific forms of risk governance, and what implications does this have for understanding appropriate ways of political representation in governance? Can the paradoxes attendant on introducing a form of construction of data help us understand the nature of the exercise of governmental power?
This article is part of the special theme on Data Associations. To see a full list of all articles in this special theme, please click here: http://journals.sagepub.com/page/bds/collections/data-associations.
Increasingly, automation techniques are employed to collect, organize, validate and distribute data to make knowledge claims in the environmental sciences. Such knowledge is used for environmental regulation, including through long-term ecological monitoring. This article seeks to unpack the legitimizing pathways through which the disparate data collected through these techniques come to be seen as complete, coherent and consistent global data. This cohesion enables such data-sets to become the basis for subsequent environmental risk regulation, global standard-making, as well as global environmental law and policy. The article seeks to understand the trajectories through which these techniques gain trust and wide acceptance, as well as the paradoxes that law encounters when using ‘global data’ in environmental regulation and policy making. The specific focus here is automation and ‘data-driven’ analysis in long-term ecological monitoring and biodiversity studies, and further, their possible role as a basis for global biodiversity regulation, including the implementation of international legislation like CITES (Convention on International Trade in Endangered Species of Wild Fauna and Flora, 1973). 1 The article discusses the existing tension between traditional-risk regulation and precautionary approaches as paradigms of regulation, focusing on the ways in which the use of such techniques could impact this tension. Further, it focuses on the important issue of scale, and how claims depicting seemingly disparate data forged through automated models as consistent and ‘global’ are rendered publicly acceptable. What are the pathways and paradoxes in the possible public acceptance of automated global data as a trustworthy source for use in the regulation and protection of biodiversity?
Large-scale aggregation of immense data-sets is posed by its advocates as ushering in profound shifts in knowledge production. The oft-quoted assertion of Anderson aptly summarizes these ambitions for ‘Big Data’:

… a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. (Anderson, 2008)

For other advocates, such data are:

… all the more valuable to the extent that it allows for specific patterns to be found and new correlations to be made between different datasets, so as to eventually deduce or infer new information, as well as to potentially predict behaviours or assess the likelihood for a certain event to occur. (De Filippi, 2014: 1)

And, as Richards and King caution:

… the potential for social change means that we are now at a critical moment; big data uses today will be sticky and will settle both default norms and public notions of what is no “big deal” regarding big data predictions for years to come. (Richards and King, 2014: 394)
As elaborated in later sections, automation and data-aggregation techniques are progressively sought by scientists for use in biodiversity studies, including in long-term ecological monitoring. Cast as a fourth paradigm of science, data-intensive statistical explorations and data mining, which apply a broad variety of algorithms to determine the best or a composite model of explanation, are increasingly used to locate and assess risk. This composite model attempts to use correlation as a mode of explanation, and is radically different from traditional approaches that seek causal relationships in risk-analysis, often characterised as the scientific method (Hey et al., 2009; Lehning et al., 2009). Scientific studies that currently employ automated data-driven methods typically traverse a large scale (discussed in sections ‘Representation in contemporary environmental risk regulation’ and ‘Pathways and paradoxes in the putative data-turn’), whether in the territorial space they cover or the long spans of time they factor in. These algorithmic approaches make the knowledge produced through them easily extendable to support regulatory measures attendant on traditional-risk regulation (Kitchin, 2014). Here, the nature of the global regulation that seeks to protect and regulate the sustainable use of biodiversity brings the issue of scale into central focus. Global regulation for the protection of biodiversity, including endangered species or specific ecosystems like wetlands that support biodiversity, brings forth the need for the collection and analysis of ecological data pertaining to large regions or even a global scale. Factors like the vulnerability of specific species need to be ascertained at the regional if not global level, including population counts of species that migrate large distances or are affected by factors that cannot be limited to the sub-regional.
For instance, a number of international treaties delineate global principles for the protection of biodiversity, and provide a framework to gather evidence for the effective regulation and implementation of these principles; CITES and the CBD (Convention on Biological Diversity, 1992) are prominent examples. Automation and data-aggregation techniques may then be seen as crucial to generating global data that can guide this regulation of biodiversity. Data generated through such techniques, and constructed as ‘global-data’, would need regulatory incorporation in ways that are acceptable not only to the relevant scientific communities, but also to other regulatory communities and interested publics.
Both the epistemic bases and the normative foundations of the quest to protect the environment (and public health) through traditional-risk regulation have been subject to intense public challenge, often described as a tenuous relationship between ‘fragile science and anxious politics’ (Beck, 1996; Golan, 2010). It is important to place the emergent ambition to use automated data-driven analysis in risk regulation within this context of intense public contestation about knowledge and governance. As elaborated later, public contestations about effective and legitimate regulatory measures to protect the environment generate tensions between the overlapping (but conceptually distinct) paradigms of traditional-risk regulation and precaution. Increasingly, precautionary approaches have been advocated as a measure to combat the various limitations of traditional-risk regulation, viz. its representational lacunae, reification of risk and epistemic gaps in expert knowledge, and to ensure public participation in risk-analysis and risk-management that would also increase public trust (see further in section ‘Precautionary-risk and the data turn in biodiversity monitoring’). Seeking to supplement or supplant the traditional-risk architecture, advocates of the precautionary principle assert its relevance in framing and carrying out scientific studies (risk-analysis), as well as in how these findings are processed within attendant public decision-making (risk-management). Whether a possible turn towards automated data-driven science would change existing forms of risk governance is also related to the continuing tension between the two aforementioned paradigms of risk governance (see further in section ‘Precautionary-risk and the data turn in biodiversity monitoring’). This tension includes concerns about the reification of expert knowledge in traditional-risk regulation, which the precautionary principle is meant to ameliorate and democratize.
Given the possibility of automation techniques becoming the default norm, how would one approach and understand the pathways through which the use of these techniques in scientific studies and regulation is rendered trustworthy and acceptable? Would the employment of automated data-aggregation techniques amplify concerns about the reification of risk, and how would this affect concerns about the fostering of public trust? Precautionary approaches are also viewed as a way to bring back the trust and engagement in regulation that was lost under traditional-risk paradigms. The trust required for the employment of data-sets generated through automated data-aggregation techniques in public regulation cannot merely be located in scientific debates and technocratic communities. Hence, an important question is what paradoxes and pathways are visible in the generation of trust towards incorporating such techniques within a precautionary paradigm of environmental regulation. Can the paradoxical relationship between the claims made in introducing a form of construction of data and the attendant top-down measures for making it acceptable for use in risk regulation help us understand the nature of the exercise of governmental power?
Basing environmental risk regulation on algorithmically constructed data-sets can have significant impacts on the way reality is represented, as well as on how law recognises, represents and protects interests, knowledges and bodies. This article seeks to highlight pathways in the process of making such data-sets global, and to examine how such techniques may be at odds with the spirit of the precautionary principle. It also suggests that, paradoxically, it is the heuristics of the precautionary discourse that facilitate the acceptance of these techniques. Toward this aim, the next section introduces the reader to prominent ways in which data-aggregations are increasingly used in the environmental sciences. It recalls the top-down measures that were necessary for disparate data to become accepted as complete and consistent global-data in an earlier instance of automation in weather-studies. The use of automated objective models of analysis to supplant existing expert-subjective analysis in weather-studies has important similarities to the putative data-turn in the environmental sciences in general, and long-term biodiversity monitoring in particular (elaborated in later sections). This second section also highlights the issue of scale – both in terms of large quantities of real-time data and in terms of coverage of large territories at the regional and global scale – as important in debates about public trust and the acceptability of such techniques in regulation. Section ‘Precautionary-risk and the data turn in biodiversity monitoring’ brings into focus the representational claim implicit in traditional-risk regulation and the role of expert techniques of data-generation in representations of risk, and further draws attention to the de-legitimation of the representational claim in traditional-risk approaches due to concerns about the reification of risk in technical risk-analysis.
This section also discusses the rise of precautionary approaches as accompanying the problematization of the reification of risk in traditional-risk regulation. It then describes a specific regulatory intervention for the protection of biodiversity through CITES, where precautionary-risk paradigms have been employed, pointing to it as a possible site where automated data-driven risk studies may be used. Section ‘Pathways and paradoxes in the putative data-turn’ focuses further on the use of data-driven ecological monitoring, bringing into focus two relevant features of precaution: long-term environmental monitoring, and wider public participation in risk-assessment. It explores the prospect of this apparent fit between the heuristics of precautionary approaches and automated data-driven monitoring as a factor in the building of trust and public acceptance among certain epistemic communities, including ‘lay’ groups that are already involved in biodiversity conservation. By following scientific debates among contemporary ecologists, which recall similar debates in weather-studies in the 1980s, the section also dwells on the attendant epistemic contestations to foreground the challenges that surround the possible acceptance of these techniques to generate ‘global data’ for environmental regulation and policy. It remains to be seen what top-down approaches may facilitate the creation of public trust in these techniques, and what paradoxes they will pose for law and its representational claims in risk regulation. The aim here is not to enquire which specific modes of public reason law ought to mobilise within this realm, but to unpack the complex mix of continuities and discontinuities that arise from the use of automated data-driven studies in a precautionary-risk paradigm.
By focusing on this complex interplay, the article seeks to draw attention to the ostensible pathways and paradoxes in the production of trust for the use of these techniques in existing global biodiversity regulation. How do specific forms of data relate to specific forms of risk governance, and what implications does this have for understanding appropriate ways of political representation in regulation and governance – the rationalities through which claims of public order and good governance are asserted?
Automated aggregation in environmental sciences and recalling constructions of global weather data
The use of automated data-driven studies as the basis of policy-making, and its implementation in regulatory structures, can conceivably become an important focal point in global environmental risk regulation in general, and in biodiversity regulation in particular. As the call to use automated data-aggregation techniques in the environmental sciences gains strength (see for instance: Gleeson and Greenwood, 2015; Jones et al., 2006; Kelling et al., 2009; Madin et al., 2007, 2008; Soranno and Schimel, 2013; Williams et al., 2006), it becomes important to scrutinise these with reference to an earlier attempt in weather forecasting that succeeded at such aggregation, the constellation of institutional practices and other changes that made the use of such techniques possible, and the attendant contestations. This section also seeks to point the reader to the prominent sites where automation in such large-scale data-aggregation is used in ecological research and its impact on policy-making and regulation, as well as the parallels offered by automatic data-aggregation in weather forecasting.
Four Vs (volume, velocity, variety, and veracity) are often cited as important facets in understanding the relevance of large-scale data-aggregation to knowledge-making, policy-making and regulation. We are reminded, repeatedly, that it is not merely the large amounts of data that are important, but also the multiplicities of sources and contents, and the growing speed of the production of such data. This multiplicity helps keep claims regarding the veracity and integrity of the data intact, which gives credence to the particular claims emerging from the gleaned data (Kitchin, 2013). Kitchin emphasizes the ambitions of exhaustiveness in striving to capture entire populations or systems within these four Vs, with a claim to be ‘fine grained in resolution and uniquely indexical in identification’ (Kitchin, 2014). Against the backdrop of the recognition of the inherent complexities in the biological sciences ‘typical of macrosystems research’, the data-turn is increasingly visible in research in the ecological and environmental sciences, where large-scale data is increasingly seen as ‘required to study large, complicated, and highly variable objects’, involving ‘analysis or prediction across vast geographic areas or time periods’ (Soranno and Schimel, 2013).
Mirroring the call that ‘ecologists need big data and big ecology’ (Soranno and Schimel, 2013), knowledge is being created in a variety of sites where large-scale data associations have the potential to be the driving force for both scientific research and regulation in the environmental realm (Lehning et al., 2009). Densely deployed wireless sensor networks enable the collection of data from disparate sources for archiving, processing, analysis, integration and assimilation in remote laboratories. The Mountain Research Institute (MRI), for instance, has sought the ‘assembling and homogenizing of existing, disparate data sets into a global set of observations’ over the last decade. This, it is asserted, ‘will certainly provide new insight into how mountain socioecological systems function, how climate warming affects high-elevation environments, and how mountain communities exercise agency and via what forms of governance’. However, such enthusiasm is ostensibly tempered by concerns about veracity and accessibility:

none of these questions are just about what variable to measure. As the mountain community embraces the era of medium-sized data, it will be necessary to identify new ways to ensure data quality, manage global data sets, and create repositories that are accessible and that allow researchers to understand global change in mountains in new and better ways. (Gleeson and Greenwood, 2015: 87)
The use of these techniques by the MRI to develop global data-sets arguably points towards a wider emergent role in the implementation of global environmental law and regulation, including biodiversity regulation. There is a perceptible shift in the calls for the use of such techniques for biodiversity monitoring among practitioners: ‘Rapidly emerging new technologies from drones to airborne laser scanning and new satellite sensors providing imagery with very high resolution (VHR) open a whole new world of opportunities for monitoring the state of biodiversity and ecosystems at low cost’ (Vihervaara et al., 2017; see also: Aide et al., 2013; Bush et al., 2017). Even if the regulatory use of these emergent techniques may currently be uneven, the potential for their employment is already visible in the regulation of international trade in endangered species of wild fauna and flora through CITES, a key international treaty that seeks to protect biodiversity. The levels of protection accorded in CITES to specific species, expressed in its three appendices, are linked to the threat perception and vulnerability of these species. While the level of protection provided in the appendices can only be changed at the periodic Conference of the Parties, such decisions are strongly influenced by the technical advice of the IUCN (International Union for Conservation of Nature, 2012). Proposals for changes in the level of protection at the Conference of the Parties are strongly guided and assisted by IUCN data, given its substantive and formal advisory role in assisting the Secretariat and State parties to implement the Convention. The technical advice that the IUCN provides here is based on the scientific information and categorization that it arrives at through biodiversity studies, including long-term ecological monitoring at regional and global scales.
As elaborated in the fourth section, it is conceivable that the automation and data-aggregation techniques used by the MRI could also be employed in IUCN studies, to aid the attendant technologies of camera-traps and satellite-collaring that IUCN studies already employ to collect data about specific endangered species. For instance, the snow leopard was reclassified from the ‘endangered’ category to the ‘vulnerable’ category in the IUCN red list in September 2017, a reclassification that estimated global numbers by modelling data collected through camera-traps and satellite-collaring (McCarthy et al., 2017).
Long before the ‘internet of things’ became popular and fashionable, the structures for the use of large-scale data-aggregations – understood as ‘global-data’ – were already being put in place in other knowledge realms like weather-studies. Paul Edwards (2010) offers a rich ethnography of the institutional and epistemic shifts behind the making of today’s weather forecasting, including large-scale data-aggregations in weather forecasting and climatology. By focusing on the mixture of institutional practices and new technologies introduced as part of superpower military agendas and cold-war strategies, he unearths a constellation of changes that made both the making of ‘global data’ and ‘making data global’ possible. An important part of the making of ‘global data’ here was the introduction of new technologies to generate upper-air data. The use of technologies like pilot balloons, rawinsondes, radiosondes and radiometers, systems to fly weather reconnaissance aircraft on a daily basis and, later, weather satellites, the introduction of real-time surveillance through computers, radars and satellites, and a centralized command and control system were instrumental in collecting the upper-air data that contributed to the creation of data-sets that were called global-data.
While simply assembling global-data in one place required a monumental effort, Edwards reminds us how this gathering of numbers can only be the beginning of such a global project, since ‘methodological skepticism is the foundation of science, so creating trust is not an easy task’ (Edwards, 2010: 252). If one is to trust these data as global, other top-down approaches are necessary for building complete and consistent global data-sets with a view to making the ‘data global’. He identifies the various modes of standardization, ex post facto, through systems of institutional, epistemic and methodological interventions. Such a project sought to build complete, coherent and consistent global data-sets from incomplete, inconsistent and heterogeneous sources through measures like automatic data-processing, hitherto considered anathema in weather-analysis. Objective numerical weather prediction through automatic data-processing fundamentally changed existing field-analysis, hitherto seen as an interpretive process involving a shifting combination of mathematics, graphical techniques and pattern recognition, since human interpretation through heuristic principles and experience-based intuition was widely considered foundational in forecasting. ‘(W)ith ever more sophisticated interpolation algorithms and better methods for adjudicating differences between incoming data and the first guess field, objective analysis became a modeling process in its own right’ (Edwards, 2010: 257–258). The steady availability of ‘real-time’ data generated by new instruments at a large scale and its automatic integration, facilitated by the exponential increase in the computational power of electronic computers, together with attendant changes in institutional practices, made large-scale automatic approaches the only viable model.
The large amounts of data that were generated and collated from various sources across the world as ‘global’ involved ‘carrying data-rich regions into data-poor ones’ (Edwards, 2010: 269), as well as interpolating the collated data from some points into the fourth dimension of time, through the move from simple expert-interpolation to automatic data-assimilation, marking a ‘profound transformation in meteorology’s models of data’ (Edwards, 2010: 270).
The advent of such data-driven models in the institutional practices of meteorology and weather-studies, and the supplanting of subjective expert-analysis by objective automated-analysis, have important similarities to the current-day ambitions of the data-turn in the environmental sciences, elaborated further in a subsequent section. This shift from ‘weather forecasters [using]… an intuitive scientific hermeneutic – an interpretation of data by skilled human beings – to an objective product of computer models’ was heavily contested. An important instance was the response of Tor Bergeron, an influential Swedish scientist, for whom

(weather) analysis was, for a Bergen School connoisseur, not only a method to determine the ‘initial state,’ but also a process whereby the forecasters could familiarize themselves with the weather, creating an inner picture of the synoptic situation… “one should not accept the present strict distinction made between ‘subjective’ and ‘objective’ methods… In fact, all such methods have a subjective and an objective part, and our endeavor is continually to advance the limit of the objective part as far as possible… Support from intuition will be necessary”. (Edwards, 2010: 259) [emphasis supplied]
Automatic aggregation of large-scale disparate data is increasingly conceived as central to the production of ecological knowledge, as the basis for policy-making and its implementation in regulatory structures, prominently through global environmental risk regulation (Gleeson and Greenwood, 2015: 87–88). Much of such regulation is based on technical risk-assessments informed by fields like ecological studies, the environmental sciences and epidemiology, where similar logics of automation and data-driven knowledge production are increasingly advocated as the next big ‘paradigm’. The slow shift in the discipline of ecology towards statistics over the last six decades has reached a stage where various ecologists characterize their field as a ‘synthetic discipline benefitting from open access to data from the earth, life, and social sciences’ (Reichman et al., 2011: 703). The crucial factor of global scale is quite clear in the ambitions for the purported data-turn in the ecological sciences. This mirrors the shift in weather-studies through the 1960s to the 1990s in the making of the ‘vast machine’ alluded to above. While ‘(t)echnological challenges exist… due to dispersed and heterogeneous nature of these data’, ecologists advocated the ‘standardisation of methods and development of robust metadata’, the development of large-scale data warehouses, increased data access, and techniques to ensure the ‘reproducibility of analyses’ from these large-scale data. These factors, they argued, are key to ‘the transformation of the synthetic discipline of ecology into one of the greatest scientific revolutions in the present century… through dealing with the vast volume and heterogeneity of ecological data’.
In order to accelerate the advance of ecological understanding and its application to critical environmental concerns, it is said that ‘we must move to the next level of information management by providing revolutionary new data-management applications, promoting their adoption, and hastening the emergence of communities of practice’ (Reichman et al., 2011: 705). The ambition here is to access ecologically pertinent data from a ‘diverse array of data and information’ and integrate this into ‘the chain of information from the gene to the biosphere’, so as to ‘significantly enhance our understanding of the natural world’, ‘develop robust analysis’ and promote ‘wise management strategies for natural resources’ (Jones et al., 2006). The attempt, then, is to make the data global – to automatically access, integrate and synthesize data to reveal important patterns that can generate broad generalities, which can be used as the basis of public decision-making through risk regulation. The distinct possibility of characterizing broad generalities at a global scale seems to be an ambition that plays a key role in the debate about the acceptability of such automation techniques among these specific scientific communities, an aspect further elaborated in the fourth section. But then how would such automated techniques muster trust among expert and other regulatory communities, so as to facilitate the use of the data thus gathered as global-data for regulation? It is important to recognize that the attendant contestations about these techniques would raise different concerns of trust, and different expectations about public acceptability, among different epistemic communities like ecological scientists, risk assessors and other regulators, ‘lay volunteer’ groups, civil society groups, and other ‘general’ publics.
Before we examine how these concerns play out within the putative shifts in environmental sciences, it is important to underline the nature of contemporary risk discourses and institutions which comprise the regulatory ecosystem in which this data-turn is sought to be implemented.
Representation in contemporary environmental risk regulation
Issues about trust and public acceptance in the use of automation techniques and data-driven studies in global biodiversity regulation are intricately related to tensions between existing paradigms of traditional-risk and precautionary-risk. Traditional-risk is understood as an objective reified phenomenon that can be accessed through expert investigation and assessment, attained through the calculation of the probability that particular adverse events (identified as hazards) may occur. Emerging from disciplines like engineering, economics, psychology and epidemiology, this kind of quantification and calculation of probabilities has become the backbone of contemporary risk regulation. Traditionally, risk regulation is bifurcated as (technical) risk-assessment and (political) risk-management (US National Research Council, 1983, 1996). This bifurcation leads to the production of an apparently transcendent quantitative idiom of technical risk-factors through the ‘simplification of multivalent complexities to simple parameters of likelihood and magnitude, and subsequent aggregation across highly diverse dimensions, contexts and etiologies’ (Stirling, 2008). Characterised as objective, risk-analysis is based on the statistical analysis of ‘objective risk factors’, and gives the regulatory framework a spectral quality, one that is supposedly trans-cultural and post-normative.
Serious challenges to the legitimacy and reliability of traditional-risk regulation, and to the reification within it, were amplified through the 1980s. Vociferous criticisms of the fallacy of the fact-norm divide embodied in the bifurcation of risk regulation into risk-assessment and risk-management problematized the assumptions of representation implicit in traditional-risk regulation (EU Expert Group, 2007). Catastrophes like Chernobyl, serious institutionalised regulatory lapses such as the BSE incident, and incessant controversies like the introduction of GMOs in agriculture amplified existing public concerns about the inadequacies of risk regulation as a tool to protect public health and the environment. The competing and contradictory claims about the definition of risk in every concrete context of risk-conflict signalled the erosion of the scientific establishment’s monopoly on truth-making in risk-analysis (Beck, 1992). Various critical efforts to unpack contemporary scientific expertise, and its use in regulation, have problematized the assumption that regulation ought to be based on a reified idea of risk revealed through expert-analysis. Significantly, the withering claims to disinterestedness and objectivity of the scientific enterprise were highlighted to implicate the position of expert advice in regulation (Ziman, 1996). Further, the blurring of scientific disciplines and a commonplace mixing of the traditional categories of scientific, technological and industrial enterprises challenged the unmediated use of technical advice, especially in the face of fading distinctions between doing science and doing business (Hagendijk, 2004). Thus scientists in risk-assessment enter the public arena as ‘experts who are part of a complex rhetoric and political system, as opposed to as experts on scientific truths, as truth speaking to power in a traditional picture’.
Most techno-scientific controversies are seen as ‘wicked problems’, and by now it is a cliché to quote the wickedness of post-normal science where ‘facts are uncertain, values in dispute, stakes high and decision urgent’ (Funtowicz and Ravetz, 1993, 1999; Kastenhofer, 2011; Ravetz, 1986).
The uncritical delegation by governments, through risk discourses, of the selection of potential hazards worthy of analysis to expert techno-scientific communities has thus encountered serious public challenge, since such selections are central to assumptions about social value. The representational claim in this delegation of risk-assessment to expert bodies has therefore met with serious public reservations.
Precautionary positions became popular across the globe in the context of this de-legitimation of the representational claim in risk. The role of the precautionary principle is crucial to understanding the continuities and shifts in contemporary environmental regulation surrounding the data-turn in the ecological sciences. Norms of humility, public participation, and ecological monitoring over long periods of time and large regions – norms attendant on the precautionary principle – were put forward as ways to overcome these challenges to reified expert-risk as a technique of governance (EU Expert Group, 2007; Jasanoff, 2003). Liberal accounts of sovereignty and citizenship, based on the important justification that the modern state is expected to protect human health, the environment and vulnerable groups, underwent a distinctive shift in the modes of their implementation, departing from an earlier basis in the principle of protection and moving instead towards precautionary approaches. The principle of protection assumes that modern science can regularly demonstrate causality between undesirable consequences and their causes. In contrast, various versions of the precautionary principle encourage positive regulatory action notwithstanding scientific incertitude, given the increasing difficulty in establishing scientific consensus on causality about grave and irreversible harm. The precautionary principle also helped bring into focus the potential of public participation in ascertaining the appropriate level of risk that different governments might deem acceptable within their polities (Foster, 2008), given the cultural specificities of risk among different societies (Douglas, 1992).
Despite controversies about the acceptability and application of the principle in international law – starkly visible in the transatlantic regulatory disagreements over GMOs, asbestos and beef hormones – precaution had become a ubiquitous feature in the stated ambitions of risk regimes across the world by the turn of the century (Applegate, 2002; The World Commission on the Ethics of Scientific Knowledge and Technology, 2005; de Sadeleer, 2006). Hence, to understand how knowledge production driven by large-scale automated aggregation of disparate data impacts contemporary risk regulation, it is important to place it within the trajectories of precautionary discourse.
Precautionary approaches in risk regulation, notwithstanding the broad divergences among their various avatars, were expected to alleviate fundamental problems with traditional-risk regulation. This included significant shifts in the role of experts in regulatory decision-making and the potential for deliberative practices outside the confines of formal democratic institutions. Possibilities of wider public participation during framing in risk-analysis are offered, through precautionary discourses, as radical improvements to the reification of risk in traditional environmental regulation. The public identification of appropriate data for assessment and the long-term monitoring of these parameters are seen as important to address scientific ambiguity and ignorance within the precautionary rubric (EU Expert Group, 2007; Harremoës, 2002), as well as to invoke public participation in ascertaining the appropriate level of risk a society is willing to take (Foster, 2008; Thayyil, 2014). Notwithstanding such promise, the legal implementation of precautionary doctrines has invariably encountered the search for a workable trigger to invoke precaution, with the assumption that normal situations continue to warrant traditional-risk regulation. However, the trigger for the extraordinary situations in which the precautionary principle is to be invoked continues to be controlled by the techno-scientific spaces that drive traditional-risk frames (Peel, 2007; Thayyil, 2014). This has led to a fundamental confusion: the very reasons for the call for precaution, viz. scientific ignorance and the control of risk regulation by scientific institutions currently dominated by industry and technocracy, are negated when the invocation of the principle itself is controlled by those very spaces. Nevertheless, precaution has continued to offer a legitimizing palliative to the fundamental representational problem within traditional-risk frames.
It has also stimulated more careful and reflexive methods in risk-analysis. For instance the IUCN, which has a significant role in the implementation of CITES, requires its scientific studies to adhere to the precautionary principle (IUCN, 2012: 23).
The tension between the two paradigms vis-a-vis fundamental differences about risk among social groups is relevant to a discussion of the use of automated data-driven studies in regulation. An important aspect of this tension relates to the reification of expert findings, regardless of issues of epistemic incertitude (like scientific ambiguities and ignorance, and information gaps), and ignoring the right of the political community to decide the level of risk appropriate for each society. Precautionary approaches aspire to have other public knowledge shape environmental regulation and mandate public participation in risk-analysis and risk-management. The ways in which data-driven studies may be reified during their possible use in biodiversity regulation may need further thought, given the general effect of visualisation techniques and the concentration of automation platforms within a few corporations. These factors can have a definite impact on their acceptance among advocates of a strong precautionary principle, who see public participation in risk-analysis and risk-management as important pillars of precautionary-risk regulation. Significant academic collaborations that offer sympathetic but critical reviews of the implementation of the principle identify wider public participation in risk-assessment and long-term environmental monitoring across time and space as crucial contributions. The implicit logic is to provide a more acceptable tool of decision-making to deal with situations of scientific ignorance, and the inability to establish the causality of identified hazards, by providing for long-term ecological-monitoring across large areas and public participation in risk-assessment (Fisher et al., 2006; Harremoës, 2002). The principle seeks public participation also to foster public trust and engagement in risk-analysis and risk-management.
It is amidst the unfolding of this rationality of trust-building around precautionary-risk – a response to public contestations regarding the limits of traditional-risk regulation in predominantly relying on expert advice – that the implementation of automated data-aggregation technologies may become central. Precautionary-risk paradigms mandate a more public way of creating the data that guide risk regulation, and acknowledge the contentious nature of the collation, analysis and use of relevant information, given the distinct possibility of scientific gaps, ambiguity and ignorance. The possible transformation of biodiversity studies into a synthetic (data-driven) discipline, in manners not too dissimilar from the transformational changes in weather studies mentioned earlier, may unfold along particular trajectories tied to the implementation of precautionary-risk by governments. Here, the ability to represent the safety concerns of different constituencies and political groups becomes important for generating public trust in automated data-driven biodiversity studies. It is then crucial to understand how shifts in the data-driven environmental sciences may be implemented through precautionary-risk, and how these shifts respond to the aforementioned issues of representation.
Precautionary-risk and the data turn in biodiversity monitoring
Attempts at building complete, coherent and consistent global data-sets from incomplete and inconsistent environmental-data appear key to understanding possible shifts in governmental practices around precautionary-risk today. It becomes important to anticipate the different kinds of measures that seek public acceptance for the use of these techniques among different epistemic communities: ecological scientists, risk assessors and other regulators, ‘lay volunteer’ groups, civil society groups, and other ‘general’ publics. Shifts in the modes of standardization in the ecological sciences sought through large-scale automated data-aggregation speak to the promise of precautionary approaches at two important sites: public participation and long-term environmental monitoring. Important preliminary steps in this data-turn in the ecological sciences include the generation of large amounts of data that are heterogeneous in content and disparate in source, securing their veracity, and ensuring access for relevant scientific and regulatory publics.
Citizen science projects are often viewed as ideal for generating such disparate data across a number of disciplinary areas within ecology, and may appear well-suited to the ambition in precautionary approaches to incorporate public participation in risk-assessment. Beyond the generation and collection of ecological data, citizen contributions have been used to aid the vetting of ecological information in some projects – prominently in fields like ornithology, paleontology, astronomy and the atmospheric sciences, fields with long records of volunteer involvement. The Cornell Lab of Ornithology, for instance, has operated several citizen science projects of various sizes, engaging thousands of individuals in the collection and submission of bird observations. These projects have addressed questions like how breeding success is affected by environmental change, how emerging infectious diseases spread through wild animal populations, how acid rain affects bird populations, and how bird populations change in distribution over time and space (Bonney et al., 2009: 977). Engaging citizens in collecting scientific information about biodiversity is seen as particularly helpful in pursuing questions that have a large spatial or temporal scope, as in long-term monitoring: ‘(w)here studying large-scale patterns in nature requires a vast amount of data to be collected across an array of locations and habitats over spans of years or decades’ (Bhattacharjee, 2005). However, this collection of data by ‘lay-publics’ for such long-term monitoring may well be forged within a meta-frame of knowledge already set by the scientific establishment, with the data that citizens provide being used in ways akin to the use of a native informant’s narratives by a colonial ethnographer.
Nevertheless, the potential for building trust and public acceptability of these techniques for ‘lay’ groups who demonstrate an active interest in biodiversity protection, like conservation enthusiasts and birders, may be significant.
Long-term monitoring, be it through citizen partnership or otherwise, is the second site where the stated ambitions of precautionary doctrines and the data-turn in ecology appear to converge. Through the 20th century, ecology transformed into an ‘integrative collaborative’ field, from its earlier avatar of ‘small-scale, short-term observations and experiments conducted by individuals to include large-scale, long-term, multi-disciplinary projects that integrate diverse data-sets using sophisticated analytical approaches’. Some ecologists identify three attendant technological challenges for this transformation, viz. data dispersion, heterogeneity and provenance, given that ‘ecosystems and habitats vary across the globe, and data are collected at thousand of locations [sic]’ (Reichman et al., 2011: 703). As Hernandez et al. observe:

(W)ith [the] advent of recent technological and computational advances, scientists are using increasing numbers of in situ environmental sensors, model simulations, crowd sourcing tasks, and embedded networked systems that enable environmental studies to incorporate various spatio-temporal scales and to produce unprecedented amounts of data. (Hernandez et al., 2012: 1067)

Such advances allow, in Reichman et al.’s words, for the use of:

computers to collate data from the Web without human intervention, enabling new types of synthetic data studies at much larger scales…useful for representing the semantics of ecological observations and for building tools that directly support synthesis through precise data search and automated data integration. (Reichman et al., 2011: 704)
It is not apparent whether the data-sets generated through these automated data-aggregation techniques will be accepted as global environmental data, even within specific scientific communities. Nichols et al. bring out the difficulties in the drive towards automated approaches by reminding us that data-driven hypothesis selection cannot be seen as an alternative to expert or knowledge-driven hypothesis selection, but rather as complementary at best. While both can produce useful hypotheses, they argue, in a cost–benefit analysis of each approach:

there is nothing to suggest that the former can really or even likely to be more useful in generating hypothesis. If anything we could think that hypotheses and prior knowledge are even more important when complex systems are studied…[F]aced with large numbers of patterns and hypotheses, it would seem that we should use any sort of prior knowledge that might help us sift through them and discard those that are unlikely to be plausible or useful…to distinguish random noise from the output of complex models. (Nichols et al., 2012: 498)

They add:

(w)e do not deny the possibility of important patterns and hypotheses that do not emerge from prior knowledge, but we do not view methods that focus on their discovery as an efficient approach to increase existing knowledge. Large temporal and spatial scope is certainly challenging, but no more so for knowledge-driven scientific approaches (which are, incidentally, frequently conducted by teams rather than by single individuals) than for data-driven approaches. (Nichols et al., 2012)
Much like in weather studies, visualisation is seen as an essential and integral part of the data-turn in the environmental sciences (Kelling et al., 2009: 617). These techniques foster highly interactive functionalities made available through web-enabled desktop applications. Regulatory communities generally appear to prefer visualisation as a medium, given its immense potential for representational impact among target groups, as well as its effectiveness as a means of communication with the ‘general public’ (Kaplan, 2016: 54–57). The contours of the aforementioned scientific controversies reveal a privileging of expert-centred or subjective analysis by some scientists, and of automated hypothesis selection by others, similar to the contestations in meteorology. Whether automated hypothesis-selection overcomes these contestations and becomes the default option in biodiversity monitoring is uncertain. However, it is remarkable that two major sites, public participation and long-term monitoring, overlap with key points in the precautionary discourse and the justification for the use of automated data-aggregation in ecology. At the same time, the source of reification appears to move from an expert-dominated site in traditional-risk regulation to automated programs. The precautionary principle posed a serious challenge to expert reification of risk in fundamental ways. Nevertheless, the reification continues in automated models, even as data-intensive evidence-based ecological monitoring promises a better platform for citizen-participation, multi-site/disciplinary collaboration and long-term monitoring: key professed sites of improvement amongst the advocates of the precautionary principle. It is therefore conceivable that the terrain for the wide acceptance of data-sets thus generated as global data will be made through these two planks of the precautionary discourse.
However, whether such discourse would provide sufficient traction to generate trust and acceptance among the various communities identified here requires further engagement.
Pathways and paradoxes in the putative data-turn
Various biologists have asserted the distinct advantages of a faster and neater way of collecting large-scale data over long periods through automated monitoring. Given rapid habitat loss, they call for an urgent scaling-up of monitoring through these techniques to build more homogenous, reliable and permanent data-sets, which can efficiently guide biodiversity regulation. Strong calls from state regulators and international environmental agencies for the incorporation of these techniques in monitoring and implementation in biodiversity regulation are still scarce. However, this may merely be a matter of course, given that bodies like the IUCN could be a bridge for such incorporation. Nevertheless, such an introduction into biodiversity regulation may be complicated by the earlier-mentioned tensions between the various paradigms of risk regulation. This (concluding) section discusses the possible pathways for such techniques to be publically accepted as appropriate for regulation, and how competing concerns from various sections about trust (regarding the use of these technologies in regulation) may unfold as paradoxes. The objective here is not to advocate ways to resolve such paradoxes, but to suggest that the trajectories of the implementation of these techniques in regulation may well be shaped by the manners in which the attendant paradoxes are approached.
Issues of the appropriate representation of reality in risk regulation, and concerns regarding the legitimate protection of interests, bodies and the environment, come together sharply here. Whether the automatic aggregation of large-scale disparate data can be recognised as global environmental data, and used for global regulation, is a question beset by fundamental differences regarding the meaning of the precautionary principle. Given that precaution is often asserted as a central norm in risk regulation for tackling expert reification and scientific incertitude, it is paradoxical that the ‘expert-lay hierarchies’ in traditional-risk frames that precautionary discourses helped reveal continue to fit, even if uneasily, within an expert-automation cleavage in the data-turn. Through the use of automated data-driven studies, the site of the reification of risk could be shifting in precautionary-risk regulation, from expert spaces to the technological platforms that conduct these studies, even while precautionary approaches seek to supplant reification in traditional-risk approaches. Be that as it may, the heuristics and solutions offered through precautionary discourses also facilitate the incorporation of the data-turn in ecological monitoring, and yet such incorporation further reifies expert risk. How would these shifts change the delegitimation of expert-risk, including its promise of representation and the attacks on the reification of risk as an abstract object to be accessed through expert mediation? The visualisation techniques inherent in the data-turn in the environmental sciences may only hinder challenges to the reification of risk in regulation.
The opacity of collection methods and the implicit agenda-setting by the groups that control these technological platforms – inter alia through visualization techniques, the move to automate the generation of hypotheses, automated selection from among these hypotheses, and the effect of data-turns on subsequent monitoring – may all make precautionary-risk regulations more opaque and technocratic. Even as the use of these techniques could be made more acceptable through the heuristic possibilities of the precautionary discourse, that same discourse renders their implementation paradoxical.
The urgency of creating cheaper and reliable data-sets to alleviate the global crisis of habitat loss has palpable traction amongst specific publics who have demonstrated an active interest in biodiversity conservation. Given that some such ‘lay’ groups have had years of involvement in related technological platforms (as with eBird), and the apparent fit these technologies promise with the precautionary principle (long-term monitoring and public participation), the chances of acceptance of these technologies among them may be very high. Whether the existing divide between those who control the data-architecture and others may become a factor here is uncertain. Public efforts like ARBIMON (the ‘Automated Remote Biodiversity Monitoring Network’), which seek seriously to bridge data-architecture divides (Aide et al., 2013), may have an impact on public acceptance of, and trust in, such use.
Other paradoxes beset the construction of disparate data as global environmental data through automated techniques, if public trust is to be fostered through precautionary emphases. What information is considered relevant in risk regulation is a normative and political question, and various kinds of existing social divides are key to its selection. The role of experts and their normative standpoints in such selection is already under critical scrutiny, leading to calls for adequate representation through public participation, often through the rubric of the precautionary principle. However, the normativity of the data-set on which the risk-analysis draws, including the techniques of automated data-gathering, may be submerged in an ever-increasing abstraction through newer techniques of bioinformatics (Jones et al., 2006). This may contradict the stated emphasis on transparency, openness and public participation in precautionary approaches.
Further, a fourth paradox besets the claim to a transcendental idiom of risk-assessment reached through large-scale algorithmic data aggregation. While risk regulators claim that their analysis is based on accurate and reliable data, algorithmic experts embrace ‘messiness as a virtue’, and are seen to express a surprising nonchalance about the precision or provenance of data (Mayer-Schonberger and Cukier, 2013: 32–33). Accuracy and scientific veracity are seen as great normative resources for traditional-risk regulation. How these resources can coexist with ‘messiness as a virtue’ requires further top-down approaches, as such methodological questions may not be resolved through scientific deliberation. There are examples where such concerns have been overcome through the sheer volume of use of these techniques by a significant section of practitioners, similar to the experience of automation in weather-modelling. Calls, nudges and encouragement for such practices from regulators can have such effects.
Much like in the data-turn in weather-modelling (where the data images made ‘global’ looked transparent and accessible while the underlying models themselves were opaque), the opacity in the data-turn in risk regulation can take institutional practices away from precautionary and participatory discourses, rendering contingent social choices as natural. This production of an opaque technocratic idiom – one that claims to be empirically accurate and real while claiming to transform existing processes to make them more precautionary and participatory – reveals important paradoxes. These, then, may hinder the viability of public acceptance of such data-sets as global environmental data, and as the basis for attendant global regulation, standards or policy-making. This raises the question of what other top-down norms may aid the representation of large-scale disparate data, automatically collected from both data-rich and data-poor regions, as complete, coherent and consistent global environmental data.
The paradoxes identified here are significant enough to constrain the breadth of possible acceptance of these techniques for use in biodiversity regulation. Nonetheless, the public acceptability of the use of these techniques in regulation may well be contingent on how these issues are navigated towards fostering active trust among the different relevant publics.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
