Abstract
The paper explores the role of measurement in securing sustainable and just environmental governance. Examining New Zealand's ambitious initiative to monitor and improve its freshwaters, I identify four ‘limits’ to realising the promise of measurement: scarce resources, ontological ambiguity, epistemological narrowing, and decision-making logics. Expounding these limits helps to identify the costs of, and alternatives to, current visions of science-driven environmental governance reform. By reckoning with these limits rather than ignoring them, a new modus operandi for environmental science can be composed that is both more practically ambitious and less vulnerable to failure.
Introduction
The promise of measurement inspires and propels environmental science. Through advances in science and technology, every day it is possible to measure the environment in greater resolution, with greater accuracy, at greater scale and lower cost than ever before. Ecosystems from groundwater to forest cover can be measured over large areas from space (Curtis et al., 2018; Richey et al., 2015); LiDAR technology can precisely map three-dimensional riverine topography (Piégay et al., 2020); new analytical methods using environmental DNA from water samples can indicate what lives in a freshwater system at large spatial and temporal scales (Thomsen & Willerslev, 2015); citizen observations can illuminate important variability in ecosystems previously invisible to environmental managers (Truchy et al., 2023). The frontiers of environmental measurement are multiple and expansive, promising a fuller account of the natural world and bringing more of it into the domain of the visible – and actionable.
The promise of measurement begins with the will to know, but also includes the desire to act. The saying in environmental management that ‘you can’t manage what you don’t measure’ invokes measurement as an essential precursor and enforcer of environmental management (Pine and Liboiron, 2015). This promise carries the weight of common sense, and yet, despite decades of investment into environmental measurement, many things we care about and have measured for decades – global carbon dioxide concentrations, biodiversity, soil carbon, forest area, marine fisheries – are heading in worrying directions (IPBES, 2019; Ripple et al., 2017). This track record should give us pause: if the promise of measurement is not being realised, why is this so? Is it that the policy frameworks for environmental measurements to drive regulation have yet to be found? If measurements are given policy weight to compel environmental regulation, will this complete the circuit and drive environmental improvement?
This paper explores these questions by analysing a best-case scenario in which strongly favourable political conditions enabled the promise of measurement to be institutionalised into freshwater management. Drawing on STS, political ecology, and state theory, I distinguish how measurement operates as accountability, performativity, and mode of governance. Looking at freshwater management in Aotearoa New Zealand, I examine what happens when measurement as a mode of governance is enshrined through a concrete regulatory apparatus in a social democratic country with strong environmental values, and yet struggles to achieve its hoped-for outcomes. Within this regime I identify infrastructural, ontological, regulatory and epistemic limits encountered in the process of implementing freshwater policy that carry lessons for attempts elsewhere to institutionalise the promise of measurement. Learning from this situation is crucial because better conditions for realising the promise of measurement may not be easily found or produced. Furthermore, if the promise of measurement is not only unrealised but unrealisable, we need new imaginaries to shape the science-policy interface.
After establishing key ideas about measurement in environmental politics, I briefly outline the methods and materials for the study, and then describe how measurement is being constructed and enshrined as a mode of governing freshwater in Aotearoa New Zealand. Then, drawing on interviews with scientists and practitioners in the environmental monitoring apparatus, I identify four limits to the promise of measurement, and consider what these mean for environmental politics. Finally, I reflect on the dangers of continuing to reify and centre the promise of measurement, and point toward an alternative, reflexive agenda for reform of the science-policy interface in environmental governance.
Environmental measurement and government
Diverse research traditions, ranging from positivist environmental science to critical social theory, have elaborated roles for measurement in environmental governance. To build a working theory of measurement and government, it is helpful to establish three key sets of ideas. The first idea, drawn from state theory, focuses on the promise of measurement as a tool for accountability and demands-making in political struggle. The second idea, overlapping with science and technology studies and political ecology, concerns the material and social consequences of naming and measurement. A third idea, measurement as a mode of governance, considers how measurement fits into a wider theory of people and nature. Through these lenses of accountability, performativity, and mode of governance, it becomes possible to comprehend the actors, intentions, choices, and impacts of governing through measurement.
Measurement as accountability
The idea that ‘you can’t manage what you don’t measure’ is common parlance in environmental management, yet its foundations nevertheless merit elaboration. One important foundation lies in state theory, which emphasises the state's capacity to know. For the state to claim a legitimate monopoly of violence, as well as of taxation and expenditure in the public interest, it must demonstrate comprehensive knowledge of both its population and its territory (see Whitehead, 2017). This includes what valued natural resources exist within its territory, how these are distributed, and who relies upon them (see Scott, 1998; Whitehead et al., 2007). In contemporary environmental governance, the state is expected to demonstrate to its citizenry not only that i) it has adequate knowledge of what aspects of nature matter to people, but also that ii) it has undertaken activities to protect and enhance these values, and that iii) these activities have at least partly succeeded (Ding, 2022; Duit et al., 2016; PCE, 2019). As Höhler and Ziegler (2010: 421) put it, ‘Accountability in the political sense requires that those who exercise power show or must be able to show that they have done so properly’.
Environmental monitoring, as the repeated and consistent measurement of environmental phenomena in a place (Quality Planning, 2024), constitutes an important infrastructure through which the state knows its environmental territory. Many states undertake systematic environmental monitoring to track valued environmental outcomes such as air quality, marine fisheries, freshwater pollution, and terrestrial biodiversity (Australian Government, 2021; European Environment Agency, 2020; Government of India, 2015; MfE and Stats NZ, 2023; USEPA, 2024). Global targets for species conservation or carbon dioxide emissions mean little, for instance, without an ability to assess whether these outcomes have been achieved. Reporting this information publicly enables civil society ‘to perform epistemic checks on the validity of claims being made in regard to the achievement of environmental targets’ (Turnhout et al., 2014: 581). Crucially, as no other actor has both the resources and moral mandate to comprehensively monitor the state of the environment for the public interest, the state ‘plays a pivotal role in generating environmental knowledge’ (Duit et al., 2016: 8). Many states have developed their own monitoring infrastructure to measure environmental changes and thereby demonstrate responsiveness and accountability to their publics.
Through infrastructures of environmental monitoring, measurement constitutes an important mechanism for accountability in environmental politics (see Höhler and Ziegler, 2010). The promise of measurement, through this lens, is about science being able to identify and evaluate, with increasing accuracy and precision and decreased cost, those environmental entities and processes that are valued by the citizenry. As the instruments of measurement become cheaper, more widely available, and greater in their scale or resolution, and as environmental data become more accessible, this opens up new horizons for making claims upon the environmental state. In turn, by demonstrating a willingness and ability to manage these environmental features, the state can secure its legitimacy and demonstrate its competence to rule. Measurement as accountability thus sustains state power.
Measurement as selective and performative
A second set of ideas considers how environmental measurement, far from being an impersonal and rational process, is a contingent product of individual values and social forces, and furthermore, that measure selection privileges certain ways of experiencing the world (Forsyth, 2003; Robbins, 2001; Robertson, 2012; Turnhout, 2018). Scott (1998: 11) powerfully observed that state representations of the German forest using the metric of commercial wood ‘brings into sharp focus certain limited aspects of an otherwise far more complex and unwieldy reality’. This simplification, he continued, ‘allowed the forester to estimate closely the inventory, growth, and yield of a given forest’ which then supported attempts ‘to create, through careful seeding, planting, and cutting, a forest that was easier for state foresters to count, manipulate, measure, and assess’ (p15). This type of systematic environmental knowledge, linking forestry science and state power, underpinned efforts ‘to transform the real, diverse, and chaotic old-growth forest into a new, more uniform forest that closely resembled the administrative grid of its techniques’ (ibid: 15). Measures are selective, Scott points out, because an ecologist, in contrast to a forester, might choose instead to measure the forest for its biodiversity, abundance of certain species, functional shade, or the provision of food and fibre. While foresters might agree on standards for measuring commercial yield, this is not the point; at issue is whether the selection of ‘commercial yield’ as a measure is a sufficiently legitimate way of representing the environment as valued by the citizenry.
Much scholarship in STS and political ecology since Scott has shown how the selection and construction of environmental measurement is a value-laden enterprise (Bouleau et al., 2009; Blue and Brierley, 2016; Lave et al., 2018; Loconto et al., 2024; Mennicken and Espeland, 2019; Nost and Goldstein, 2022; Robertson, 2006; Turnhout et al., 2007). Selecting which entity to measure involves deciding what environmental entities have value, and deciding how to measure involves imputing equivalence among entities (Pine and Liboiron, 2015; Robertson, 2012; Tadaki et al., 2015). Environmental entities might be selected for measurement and monitoring because they reflect specific concerns and coalitions in local environmental politics (Bouleau, 2014, 2017), align with policy and legislative categories (Turnhout et al., 2007), or even simply because they reflect the implicit judgement of a particular group of people as to what is valuable (Blue, 2018; Pine and Liboiron, 2015). Choosing how to measure that environmental entity can in turn be influenced by organisational policies, available technology and resources, and institutional norms such as a desire for protocols that can be easily undertaken by agents with minimal training (Lave, 2012; Muller, 2018; Robertson, 2006).
Choosing entities and their measures requires judgement and produces social and material effects on the world. Measuring and classifying wetlands enables the claim that one wetland is equivalent to another, which provides a logic for spatial offsetting (Robertson, 2006). Even something as seemingly straightforward as counting Escherichia coli (E. coli) bacteria in a sample of river water requires application of judgement to i) define applicable rivers, and then ii) report counts in a way that makes comparison possible. Applying different judgements to these choices renders the world differently, as the New Zealand government did when it relaxed its E. coli classification thresholds in 2017, turning 13% of New Zealand rivers from ‘unswimmable’ to ‘swimmable’ overnight (Blue and Tadaki, 2020: 24). Similarly, Clifford (2022) showed how US state agencies applied judgement in the designation of ‘exceptional’ dust events to allow otherwise noncompliant air quality outcomes to become officially compliant (see also Arce-Nazario, 2018, for an example of water quality). While intentional manipulation of measurement for partisan ends is a real risk (see Brombal, 2017; Mansfield, 2021), unintended effects of measurement are perhaps more widespread.
What we choose to measure, and how we choose to measure it, are not just world-representing decisions, but also world-making decisions (Jasanoff, 2017; Turnhout, 2018). Environmental measures bring new worlds into being, such as Scott's (1998) commercial forest counts, by making certain aspects of the environment visible and thereby amenable to discriminatory interventions in the name of the public interest (see also Pine and Liboiron, 2015). If the state theory idea of measurement-as-accountability highlights the political utility of measurement, the ideas of measurement selectivity and performativity highlight the hazards of treating measurement as a purely epistemic exercise.
Measurement as mode of governance
A third set of ideas, from STS, environmental studies, and policy studies, focuses on how measurement organises the task of environmental governance. Here, measurement is not simply a representation, but a mode of experiencing, acting in, and governing the world (Fortun, 2004; Loring et al., 2021; Porter, 1995; Robertson, 2012; Turnhout et al., 2014). This vein of scholarship extrapolates beyond a single instance of measurement as accountability to examine the patterns of behaviour that arise in governance regimes that focus on measurement.
Ideas about measurement as a mode of governance can be parsed into the promise of measurement and a social theory of improvement. The promise of measurement refers to the idea that comprehensive measurement of the biophysical environment, that is both precise and accurate, will enable a clear characterisation and understanding of environmental changes, the drivers of these changes, and the specific human practices responsible for these drivers (Goldstein, 2022; Hesse et al., 2023; Kuch et al., 2020; Shapiro et al., 2017). This promise conjures a utopian ideal in which quantified environmental changes are tied to specific and regulate-able human actions, with science providing an unambiguous fulcrum for social regulation (Salmond et al., 2017; Shapiro et al., 2017; Shattuck, 2021). As a consequence, scientific measurement should receive priority for investment, and the resulting measurements should be accorded primacy as a foundation for political action (Goldstein, 2022; Hesse et al., 2023; Shapiro et al., 2017). Yet despite investments into the measurement frontier, scholars have noted that the goal of accurate and precise measurement is always receding beyond the horizon, and that ‘environments and landscapes can never be enumerated enough––eternally requiring further verification’ (emphasis in original, Shapiro et al., 2017: 584; see also Kuch et al., 2020) or greater certainty (Goldstein, 2022). For these reasons, the promise of measurement has been criticised as a ‘data treadmill’ (Shapiro et al., 2017: 584) that suffers from ‘a shortcoming of feasibility, of delivering on its own terms’ (ibid: 584).
Even if the promise of measurement may never reach its final destination, the act of measuring itself contributes to a wider social theory of improvement. Turnhout et al. (2014) coined the term ‘measurementality’ to theorise the proliferation of environmental monitoring and verification systems globally, and to describe the political vision underpinning many of these infrastructures. Here, measurement contributes to an ideological project of performance improvement, in which environmental measurement provides the infrastructure for evaluating governance success, and it is expected that practitioners and policy makers ‘will anticipate and reflexively adapt to the possibility of outside scrutiny… [by] making a greater effort to enhance their performance’ (Turnhout et al., 2014: 583). Measurementality names a self-propelling political dynamic in which scientific measurement enables ecological transparency, transparency drives public pressure, and public pressure drives improvements in environmental performance (see Fortun, 2004). More than just providing a mechanism of accountability, measurementality names a theory of social change by which scientific measurement alters the behaviour of different environmental actors to achieve the ‘right disposition of things’ (Foucault, 1994: 208), where incentives are logically aligned and outcomes can be assumed as assured.
An analytic of government by measurement
Ideas of measurement as accountability, performativity, and mode of governance help to conceptualise how environmental governance through measurement takes place. Accountability highlights the dynamic process of the state internalising environmental demands and values into its infrastructures and operations. Performativity highlights how the environmental entities and their methods of measurement are both contingent and political, revealing the implications of specific measures and their alternatives. As a mode of governance, measurement is placed within a wider vision for social life, in which measurement serves to direct political investment and compel right action through informational incentives.
Freshwater policy in Aotearoa New Zealand
Recent developments in freshwater policy in Aotearoa New Zealand (henceforth New Zealand) provide a unique and best-case opportunity to examine what happens when the promise of measurement and vision of performance improvement are rolled out as a programme of political reform.
Since at least the early 1990s, freshwater science has shown with increasing robustness that pollution such as nutrients and sediment from New Zealand's agricultural land use poses a major threat to its freshwater estate (Smith et al., 1993). The expansion and intensification of dairy farming in particular has been pegged as a dominant driver of degraded water quality (see PCE, 2004, 2013; Joy, 2015; Ministry for the Environment and Statistics New Zealand, 2023). Alongside this, urban water contamination from heavy metals and manufactured chemicals, industrial pollution from factories, wastewater, and infrastructure, and sediment from forestry all compound to degrade freshwater ecosystems and narrow the range of life that can flourish in them. Erosive agricultural landscapes lead to degraded habitat for fish and bugs, and excess nutrients from fertiliser and livestock can overload and choke waterways with algae.
Despite New Zealand's world-leading Resource Management Act 1991 (RMA) requiring regional authorities (henceforth ‘councils’) to evaluate and manage the environmental effects of development, freshwater quality has declined due to permissive norms of development coupled with strong international demand for New Zealand dairy (Brown et al., 2016; EDS, 2007; PCE, 2004). In part, this failure has been attributed to the devolved nature of RMA implementation: councils, funded sparsely out of local rates (land taxes) and left to implement the RMA without precise direction on water quality, found themselves averse to spending scarce resources to justify environmental regulations in court (Brown et al., 2016; EDS, 2007).
Although the RMA allowed the possibility for national policy to determine more precise freshwater objectives, this mechanism was unused for two decades. In 2011, in response to public pressure, and guided by collaborative recommendations of national stakeholders in the Land and Water Forum (see Tadaki, 2018), the centre-right National Party-led Government (2009–2017) issued New Zealand's first National Policy Statement for Freshwater Management (NPSFM). The 2011 NPSFM required councils to set numerical thresholds for water quality and quantity to sustain and protect community values for freshwater such as recreation and food harvesting (New Zealand Government, 2011). Critics pointed out, however, that the NPSFM said nothing about how protective those thresholds should be (see Tadaki, 2018 for a review). In 2014, a revised NPSFM prescribed numerical thresholds for nine attributes of water quality that must be measured and met in every region (see New Zealand Government, 2014; see also Table 1). In 2017 the policy was revised further to require councils to report on ‘swimmability’ publicly, which created new monitoring requirements for E. coli (see Blue and Tadaki, 2020). Later in 2017 a centre-left Labour-led coalition was elected to government, with freshwater protection as a key election issue, and in 2020 Labour won a clear majority to govern alone. Under the Labour-only government in 2020, the NPSFM was revised once more, expanding the list of measured attributes from 9 to 22 (see New Zealand Government, 2020).
Table 1. The 22 attributes in the NPSFM that regional authorities must measure and report on. Bracketed terms refer to the type of environment the attribute must be measured for. Attributes in Appendix 2A must meet numerical bottom lines included in the policy. Attributes in Appendices 2A and 2B must be maintained or improved. Attributes in italics were added in the 2020 NPSFM.
The NPSFM directs that waterbodies below prescribed numerical thresholds for the 10 mandatory attributes must be ‘improved’ above those thresholds, and that all other waterbodies must be ‘maintained and (if communities choose) improved’ for all 22 measured attributes (see New Zealand Government, 2020: 10). Thus, councils must monitor all 22 attributes, report these to central government and to the public, and improve any mandatory attributes that lie under the threshold.
In broad terms, New Zealand freshwater management can be described as performance improvement-through-measurement, based on an expansive monitoring apparatus. The NPSFM 2020 can be considered a reasonably strong environmental policy, with science-derived, ‘objective’ numerical referents that allow publics to hold the state to account for effective management of environmental entities. The policy leaves little room to manoeuvre or shirk – the 22 attributes cover all five pillars of ecosystem health that were proposed by a national group of freshwater scientists (see Clapcott et al., 2018). While the new right-wing National Party-led government elected in 2023 has promised to weaken freshwater policy objectives (Prickett and Joy, 2024), it seems likely that this will be accomplished by lowering the standards rather than by removing the need to monitor them.
To examine how this promise of measurement is being implemented, and what it bumps up against, I looked to the experiences of scientists at regional councils, as well as research scientists. I wanted to learn how the planned monitoring regime is unfolding, how councils are making decisions about monitoring, whether they thought the new regime is likely to achieve better freshwater quality, and why. Through professional connections and snowball sampling, I contacted and interviewed 20 freshwater scientists from all 16 of New Zealand's regional councils (see Figure 1), focusing on people with relevant experience of regional monitoring networks and science investments. I also interviewed 13 research scientists and practitioners who were involved in developing attributes and methods or interpreting them for court and other decision-making settings. Interviews took place over Zoom and in person in 2020 and 2021, and ranged in length from 40 to 90 minutes. Interviews were transcribed, checked by participants where requested, and analysed through inductive thematic coding to identify key issues confronting the promise of measurement. Quotes used are either anonymised or attributed with express permission.

Figure 1. The 16 regions of Aotearoa New Zealand. Map by F. Lee.
Limits to the promise of measurement
The NPSFM framework could be interpreted as a major policy initiative to realise the promise of measurement. It not only requires the systematic nation-wide measurement and reporting of 22 freshwater attributes, but also the maintenance or improvement of every attribute. And it does so without using market mechanisms to exchange environmental commodities (New Zealand Government, 2020).
For a country with strong environmental values, a powerful leftist government able to govern alone, and a strong regulatory apparatus, how was this utopia of regulatory measurement and performance improvement being realised in practice?
My research revealed that councils are confronting significant infrastructural, ontological, regulatory and epistemic limits in their rollout of the philosophy of the NPSFM. I call these limits because they are not merely incidental or practical challenges to overcome, but instead lie at the heart of the vision of the world and the theory of change that the promise assumes. Individually and collectively, these limits circumscribe what can be achieved through this approach to reform in environmental politics. Below, I expound each limit and show how it confines the realisability of the promise. The limits can be interpreted as the completion of the sentence: We need to measure more of the environment so that the state can be held accountable for environmental improvement, but…
Limit (1) …but resources are scarce, and priorities multiple
One limit to realising the promise of measurement is that monitoring resources are scarce, and often compete with other political priorities that can vary locally. This can make monitoring variegated across space, which in turn affects which environments are regulated.
In New Zealand's decentralised context, regional councils are charged with environmental monitoring and planning in their jurisdictions. New Zealand's 16 regions have significantly different areas, populations, environments, and rates revenue, and some councils – Auckland, Gisborne, Nelson, Tasman – are ‘unitary’ councils that discharge both regional (environmental planning) and district council functions (e.g., infrastructure investment). Furthermore, since regional and unitary councils are elected, political priorities for expenditure vary across the regions.
Table 2 illustrates the heterogeneity of freshwater monitoring across the country in 2018 by indicating each region's number of monitoring sites for river water quality, river ecology (i.e., invertebrate communities), lakes, and groundwater monitoring, along with data on regional population, area, and rates revenue. A comparison between three regions is instructive. The Waikato, Manawatū-Whanganui, and West Coast regions are of similar size, at approximately 8–9% each of New Zealand's land area, and none is a unitary council. Yet the revenue available for these regional councils to fulfil their functions – which include monitoring freshwaters across their land areas – ranges from $85M for Waikato, through $40.6M for Manawatū-Whanganui, to $4.2M for West Coast, a difference of more than an order of magnitude. Even so, despite West Coast having only 5% of the revenue of Waikato Regional Council for a similar sized area, it nevertheless prioritised maintaining 37 river water quality sites, which is 31% of the number of sites in the Waikato.
Table 2. Information from 2018 on New Zealand's 16 regional councils and their freshwater monitoring networks. Data from PCE (2019).
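The three-region comparison can be checked with a few lines of arithmetic. The sketch below uses only the revenue and site figures quoted in the text; the helper name `revenue_share` is illustrative, and the Waikato site count is back-calculated from the rounded 31% figure rather than read from Table 2, so it should be treated as approximate.

```python
# Rates revenue (NZ$ millions) for the three regions compared in the text.
revenue_millions = {
    "Waikato": 85.0,
    "Manawatū-Whanganui": 40.6,
    "West Coast": 4.2,
}


def revenue_share(region, baseline="Waikato"):
    """Revenue of `region` as a fraction of the baseline region's revenue."""
    return revenue_millions[region] / revenue_millions[baseline]


# Site count quoted for West Coast, and its stated share of Waikato's sites.
west_coast_sites = 37
stated_site_share = 0.31

# Back-calculated from the rounded 31% figure, so only approximate;
# this is an inference, not a value taken from Table 2.
implied_waikato_sites = west_coast_sites / stated_site_share

for region in revenue_millions:
    print(f"{region}: {revenue_share(region):.1%} of Waikato's revenue")
print(f"Implied Waikato site count: about {implied_waikato_sites:.0f}")
```

On these figures, West Coast operates with roughly a twentieth of Waikato's revenue while maintaining close to a third as many river water quality sites.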
Although the relationship between council revenue and environmental monitoring is multifaceted, council scientists were clear that monitoring is resource-intensive and that expanding monitoring is constrained by resource availability. Within one regional council's day-to-day functions, monitoring was described as ‘the biggest proportion of everything we spend’.
With the new requirements to monitor 22 attributes for all waterbodies in their regions, councils must reallocate scarce public resources. Implementation requires resources for new monitoring gear, lab testing, data warehousing, analysis, and reporting, plus staff time needed for sampling, processing, writing up, and presenting the data back to the regulator (the Ministry for the Environment) and the public, e.g., through public web interfaces. These monitoring costs of the NPSFM hit councils hard. To put this into perspective: for a small unitary council, Nelson City Council, the environmental monitoring team grew from two staff in 2008 to 17 by 2020, and from one freshwater monitoring specialist in 2016 to five in 2021. And that was only to implement the nine mandatory attributes of the 2014 NPSFM, not the 22 attributes required by the 2020 NPSFM. For councils these monitoring costs are massive, and even so they only reflect a fraction of the costs of implementing the NPSFM overall, which also includes policies on no net loss of wetlands, fencing, and stocking rates. One calculation of overall costs to implement the NPSFM estimated these from around $5M per annum (Nelson) through to $47M per annum (Canterbury), with the rest falling between $7–19M per annum (Castalia, 2020). Even for one of the largest councils there is a sense that ‘the NPS asks a lot of us, and it asks more of us than we’re capable of delivering.’
Confronted with limited resources, councils are leveraging the NPSFM's flexibility to triage. The NPSFM does not require councils to monitor every single waterbody, which would be unreasonable; rather, councils are required to monitor ‘freshwater management units’ (FMU) as representing wider areas (New Zealand Government, 2020). Thus, the 22 attributes must be measured not for every waterbody, but for every FMU. FMUs were left for councils to determine themselves based on local conditions (MfE, 2016). This has meant that, since 2014, councils have been redesigning their monitoring networks around FMUs. To implement the 2014 policy, for example, Greater Wellington Regional Council identified five large contiguous areas as FMUs, to reflect shared catchment basins, whereas Northland Regional Council initially selected two FMUs, lowland and hill country, that were not catchment-based. Elsewhere, councils have been realigning their monitoring networks, often reducing monitoring sites to free up resources to meet regulatory requirements for a smaller number of waterbodies. Tasman District Council, for instance, despite a 50% increase in the council budget allocated for river water quality monitoring in 2016, had to halve the number of sites in the network to monitor the new attributes. This council scientist from a different small council grieves the triage process, I’ve been in meetings where managers say ‘We have to drop sites: which?’ I don’t like those meetings ‘cause we invested time and effort, because the site is representative, and if anything you should keep it and add more. But we don’t have those conversations. I put up a case for bare minimum of what's needed to meet the NPS. I thought we needed 11 FTEs (full time equivalents) but we are getting about three in the next five years, or something like that. So that's the scale we’re talking about, in terms of keeping a lid on rate rises, and so forth.
In New Zealand, the promise of measurement enshrined in policy compels councils to ‘measure more’ about the environment, but resource scarcity and allocation mechanisms limit and direct how this plays out. Here, the significant costs of measuring more attributes mean that councils like Tasman District Council are monitoring fewer river sites. Furthermore, the extent to which councils can increase or reallocate revenue to expand monitoring is constrained by the electoral dynamics of local government finance and planning processes. New Zealand may be in the strange position of having strengthened protection for a small number of rivers while rendering previously monitored rivers less visible to regulators.
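The Tasman figures imply a sharp rise in spending per monitored site. The following is a rough back-of-envelope sketch rather than a council calculation: it assumes (hypothetically) that the river water quality monitoring budget is spread evenly across sites, and it uses only the two ratios reported above. The function name is illustrative.

```python
def per_site_spend_ratio(budget_ratio: float, site_ratio: float) -> float:
    """Change in average spend per site, given relative changes in budget and site count.

    Assumes the budget is divided evenly across sites (an illustrative simplification).
    """
    if site_ratio <= 0:
        raise ValueError("site_ratio must be positive")
    return budget_ratio / site_ratio


# Budget up 50% (x1.5) while the number of sites is halved (x0.5).
tasman = per_site_spend_ratio(budget_ratio=1.5, site_ratio=0.5)
print(tasman)  # 3.0
```

Under that simplifying assumption, each remaining site absorbs roughly three times the previous average spend, which gives a sense of how costly the additional attributes are to monitor.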
Limit (2) … but even numerical measures can be ontologically ambiguous
Measuring more things, in consistent ways, allows the state to account for environmental conditions and frame aspirations for improvement. The promise of measurement is that it offers a comprehensive account of the natural world, where different places can be compared and careful logics for intervention created. But while measurements are valued for being objective and portable (Porter, 1995), council scientists expressed frustration regarding how measures can mean different things in different contexts. If the ability of measures to represent what people care about varies from place to place, can state policy be trusted to deliver meaningful environmental improvement?
In New Zealand, the NPSFM provides standards for not only what attributes to measure, but also how to measure them. All 22 attributes in Table 1 are required to be measured and reported in a particular way. For instance, for river water quality, nitrogen is measured as nitrate from a sample of river water in mg NO3–N/L (milligrams of nitrate-nitrogen per litre) and reported against the A-D grades spelled out in the policy. Any monitoring site with an annual median concentration above 2.4 milligrams of nitrate-nitrogen per litre, or an annual 95th percentile above 3.5, is failing to comply with the policy and must ensure compliance ‘as soon as reasonably practicable’ (New Zealand Government, 2020: 38). Analogous numerical thresholds are described for the 9 other mandatory attributes, and for the other 12 attributes, where numerical grading bands indicate whether councils are maintaining or improving the attributes as required.
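The logic of such a numerical threshold can be sketched in a few lines. This is a minimal illustration, not the official grading procedure: it reads the second criterion as an annual 95th percentile, uses a simple linearly interpolated percentile (the policy's prescribed percentile method may differ), and all function and constant names are hypothetical.

```python
from statistics import median

MEDIAN_LIMIT = 2.4  # mg NO3-N/L, annual median threshold quoted in the text
P95_LIMIT = 3.5     # mg NO3-N/L, threshold applied to the upper tail of samples


def percentile(values, q):
    """Linearly interpolated percentile for q in 0..100 (an illustrative method choice)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def fails_nitrate_thresholds(samples):
    """True if a site's annual samples exceed either threshold."""
    return median(samples) > MEDIAN_LIMIT or percentile(samples, 95) > P95_LIMIT


# Hypothetical monthly grab samples for three sites (mg NO3-N/L).
print(fails_nitrate_thresholds([1.0] * 12))               # False
print(fails_nitrate_thresholds([3.0] * 12))               # True (median too high)
print(fails_nitrate_thresholds([1.0] * 10 + [5.0, 6.0]))  # True (upper tail too high)
```

The third site shows why the two criteria are distinct: its median is low, yet a couple of high readings are enough to push the upper tail over the limit.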
By measuring universal attributes in the same way, it becomes possible to report across monitoring sites in a statistically coherent manner. Each year, Land Air Water Aotearoa (LAWA), a multi-council science reporting platform, assembles data summaries of all river water quality sites (as well as lakes and groundwater) reported by councils, along with their regulatory bands (see Figure 2). Graphs also show how sites in native forest, exotic forest (i.e., forestry), pasture, and urban land use types compare (Figure 3). By treating monitoring sites as statistically aggregable entities, LAWA provides a collapsed view-from-a-distance of hundreds of monitoring stations across the country, making it possible to have headlines like ‘More than 60 percent of New Zealand rivers are unswimmable, in poor condition’ (Piper, 2020), ‘Two-thirds of New Zealand's monitored river sites ecologically impaired’ (LAWA, 2021) and ‘More than 80% of New Zealand's low-lying lakes and rivers surveyed “poor” or “very poor”’ (RNZ, 2022). The completeness of such claims speaks to the competence of the state, while the dire results reported may bolster a sense of the state's transparency and trustworthiness.

Figure 2. Percentages of monitored sites within each attribute band for six attributes in the NPSFM. Bands were calculated for 1 July 2017–30 June 2022. MCI is macroinvertebrate community index and DRP is dissolved reactive phosphorus. Note that for any Freshwater Management Unit, ammonia, nitrate, and E. coli must be improved above an E grade; the others must have action plans to either maintain or improve grades. From https://www.lawa.org.nz/explore-data/river-quality.

Figure 3. Percentages of monitored sites within each grading band for the macroinvertebrate community index (1024 sites) and dissolved reactive phosphorus (872 sites) across four land cover classes. Bands were calculated for 1 July 2017–30 June 2022. Note that for any Freshwater Management Unit, these attributes must be either maintained within a grade or improved to a higher grade. From https://www.lawa.org.nz/explore-data/river-quality.
But while these representations of rivers seem logical and even politically compelling, they are nevertheless state simplifications. It is common knowledge that monitoring sites are not evenly distributed across space (see e.g., PCE, 2019), yet LAWA summaries convey the sense that they are independent and statistically equivalent. For instance, if 90% of river monitoring stations were on a single river, the claim ‘60% of NZ rivers are unswimmable’ would be obviously invalid. LAWA works with the data that are available, yet it is well known that council monitoring networks are biased in several ways. Council monitoring sites were often established above and below major pollution sources such as wastewater treatment plants and factory outflows, and as monitoring expanded, councils focused on larger rivers and highly impacted rivers (PCE, 2019; Tadaki, 2022). Often councils placed sites in convenient sampling locations, such as alongside a bridge, and/or alongside other valued council operations, such as flood monitoring. Thus, large rivers often have several monitoring stations along them, to reflect their larger catchment area, while small streams and tributaries are notoriously under-monitored. Temporal bias is also an issue: monthly grab samples of water are usually collected at the convenience of weather and scheduling, for example, and some measures, like river metabolism, peak and trough over the course of a day, meaning that snapshot measures do not adequately represent their ecological function (Tadaki et al., 2014b).
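The gap between a statement about monitored sites and a statement about rivers can be made concrete with a toy example. The network below is entirely hypothetical, and the rule for when a river counts as failing (more than half of its sites fail) is an assumption made purely for illustration, not a LAWA or NPSFM definition.

```python
from collections import defaultdict

# Hypothetical network of (river, site_fails) pairs: nine of ten sites sit on river "A".
sites = [("A", True)] * 9 + [("B", False)]


def share_of_sites_failing(sites):
    """Fraction of monitoring sites that fail, treating every site as equivalent."""
    return sum(fails for _, fails in sites) / len(sites)


def share_of_rivers_failing(sites):
    """Fraction of rivers that fail, after grouping sites by the river they sit on."""
    by_river = defaultdict(list)
    for river, fails in sites:
        by_river[river].append(fails)
    # Assumption: a river counts as failing if more than half of its sites fail.
    failing = sum(1 for flags in by_river.values() if sum(flags) > len(flags) / 2)
    return failing / len(by_river)


print(share_of_sites_failing(sites))   # 0.9
print(share_of_rivers_failing(sites))  # 0.5
```

In this toy network, 90% of sites fail but only half of the rivers do, so a headline about ‘rivers’ drawn from site counts would overstate the problem; with a differently clustered network it could just as easily understate it.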
A second observation is that the NPSFM, and freshwater monitoring generally, focus on measuring attributes using methods that are straightforward, reproducible, and, to a certain extent, cheap. The emphasis is on measuring consistently; however, what is easy and consistent may not always be the most meaningful. Nitrate, for example, is often analysed by a professional laboratory from a monthly ‘grab sample’ of water at a particular monitoring site. Yet measuring dissolved nitrate has interpretive limitations:
The thing that I worry about with nitrate is that it goes down as algae grows, you know, the algae takes it up. So, we’re measuring it because we’re worried about growing algae and the effects of that, the secondary effects. But as the outcome happens, the amount goes down. And so all these councils go ‘oh that's great: nitrogen is reducing!’ Now that's because we’ve got a massive standing crop of algae there. (academic scientist)
You can have none measurable in the water column, but it's all in the sediment. So you’ve got a massive amount of benthic algae just basically farming the phosphorus that's in the sediment.
the shuffle index is: … you put a white tile in the stream, and you walk two metres upstream and you kick the substrate, and you look at the amount of cloud that forms in the stream. It works really well, but as the summer goes on, we get more and more periphyton in the stream. So as you get more periphyton, when you do the shuffle kick, you get that cloud of stuff which is hiding the tile, and sometimes it's not sediment; it's periphyton. (council scientist)
Limit (3) … but empirical measurement is epistemologically narrow
The promise of measurement, at least in its New Zealand configuration, places a focus upon direct empirical measurement of the environment. Monitoring, in the sense of consistent material sampling of physical, chemical, and biological entities of waterways, is valorised as the primary currency of environmental knowledge. Yet we have already seen how empirical monitoring is prone to ambiguities, which in turn pose difficulties for enacting costly regulation of land use. Is the focus on direct empirical measurement limiting the purview of environmental governance?
Consider E. coli, which can pose an acute human health risk, and which the local public need to know about to manage their swimming behaviour. This scientist explains the inadequacies of empirically monitoring E. coli for this purpose:
if my team go out sampling on a Monday to a site, it gets to the lab in Christchurch on a Tuesday, and has the result on a Wednesday evening. So [we’d] probably get the result on a Thursday afternoon, and ideally we send an email out, and we put it in the newspaper for the weekend. So we are notifying people on a Friday of what it was like at 10am on Monday morning.
[O]nce you have that, and you have good catchment knowledge, then you can use other indicators to determine the health risk attached to swimming there. [You can] have a monitoring programme that feeds into the model, and helps you work out your risk, as opposed to just telling you how much E. coli is there. (council scientist)
In addition to E. coli, sediment and nutrients were also identified as attributes better suited to management by modelling than empirical monitoring. Measures of water clarity or turbidity can provide proxies for grasping the ecological impacts of sediment, but tell you little about where the sediment came from, which activities are responsible for it, and how much is working its way down the system. For that, sediment source-tracking and catchment budget models are considered superior. For nutrients like nitrogen, there is an urgent need to calculate how to apportion nitrate ‘loads’ in a catchment to arrive within the national standards for a waterway. Measurement can quantify how much is in a river, but not where it came from and thus which land users can be held legally responsible for reducing it.
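To illustrate why apportionment requires a model and not monitoring alone, the sketch below scales hypothetical modelled source loads down to a catchment limit. It is a deliberately simple proportional rule chosen for illustration; it is not the method used by any council or by any particular nutrient model, and the source names, loads, and cap are invented.

```python
def apportion_load(modelled_loads, catchment_cap):
    """Scale each source's modelled load proportionally so the total meets the cap.

    modelled_loads: mapping of source name -> modelled annual load (hypothetical units).
    catchment_cap: total load compatible with the in-river standard (assumed given).
    """
    total = sum(modelled_loads.values())
    if total <= catchment_cap:
        return dict(modelled_loads)  # already within the cap, no reduction needed
    scale = catchment_cap / total
    return {source: load * scale for source, load in modelled_loads.items()}


# Hypothetical catchment: two sources, with a cap at half of the current total.
allocation = apportion_load({"farm_a": 60.0, "farm_b": 40.0}, catchment_cap=50.0)
print(allocation)  # {'farm_a': 30.0, 'farm_b': 20.0}
```

The point of the sketch is that every input is a modelled quantity: an in-river measurement supplies at most the total to be met, while the per-source loads that make allocation (and legal responsibility) possible have to come from somewhere else.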
Arguably, the most powerful mechanism for regulating nitrogen from individual farms came not from high density monitoring, but from the use of a model. Overseer®, a government- and industry-sponsored fertilizer management model, was used in the Manawatū-Wanganui regional plan to quantify how much nitrogen leaching each farm was responsible for (Duncan, 2014). By making certain assumptions of soil characteristics, fertilizer use, land use, and catchment ecology and hydrology, Overseer® was used to calculate quantities for nitrogen leaching that would be needed to achieve the overall river water quality objective for nitrogen. Since the Overseer® estimates had regulatory force, these were fiercely contested in Environment Court, and although the estimates were upheld, this led to intense political scrutiny of the model and contestation by industry and others over its adequacy for regulatory use (see PCE, 2018).
In addition to modelling, scientists also identified other forms of knowledge as relevant. In many places, for example, it is obvious that streams have been polluted, sedimented, and unshaded, and that remedial action should not wait for, nor be reliant upon, empirical measurement of that phenomenon. One council scientist bemoaned:
We know that a lot of these rivers have really high temperatures, particularly our lowland unshaded smaller streams in pastoral landscapes. They’re hot. And the dissolved oxygen fluctuates dramatically in the summertime. And we’ve just got to get on and plant them.
Limit (4) … but past measures have not driven needed improvement
A fourth limit to the promise of measurement is that environmental measurements affect the world in limited ways. It is often assumed that more measurement will drive improvements in environmental performance – but through what mechanism? A simple way to explore this is to consider how decision-making institutions use past and current environmental data, and then to trace how additional data are likely to be used.
In New Zealand, it could be asked: how do current institutions use existing monitoring data? My interviews with scientists revealed shared pessimism on this point. A key problem is that New Zealand has suffered from an incremental and permissive permit-by-permit system, where uncertainty in the environmental impact of a land use tends to be met with permission being granted so long as some mitigation efforts are undertaken (see Brown et al., 2016; Peart, 2007). Scientists often highlighted that the burden of demonstrating ‘adverse environmental effects’ of any prospective development has in practice been placed on councils rather than developers:
For a developer, they’re looking at us and going ‘well, if you’re telling me I can’t do this then you’re going to have to prove it, otherwise I’m going to take you to court.’ (council scientist)
Interestingly, even when councils have monitoring directly above and below a site and can isolate the environmental effect of a single land use parcel, the signal is not always as clear as hoped. An academic scientist recounts:
when I was in Environment Court over the discharge of a Fonterra waste plant into the Manawatū River […] the MCI had bottomed out way above that plant, because of all the other impacts. We couldn’t show an impact on MCI because it was already off the bottom of the scale.
So you’ve got MCI: a good robust measure of the health of a river system, based on the presence or absence of species of different sensitivity. But you know, if someone says to me ‘well you know, the MCI score for that particular stretch of water is 80 or whatever, generally poor,’ there's no magic wand I can wave and say ‘ok, well I can get that from 80 up to 100 by doing A, B and C.’ I have to actually really understand the system. Now, regional councils should be able to do that, they should be able to stand in those streams and say ‘we understand why the score is poor, and what we need to do to get it up there.’ But at the moment there's no magic bullet for that. (council scientist)
I am totally unaware of any major links between my results – where I go off and do all these monitoring and get the MCI score for example, and give them to LAWA and whatever – and what our policies and plans are doing. I have never had a policy planner come to me to say ‘oh that stream is degrading over time,’ or ‘it's not showing the MCI score we want – what can we do about that?’ […] If I don’t see those links now, I don’t see how they’re going to magically get better by changing the [monitoring] network design?
[A]ctually we do have a pretty good understanding of what's going on out there. That doesn’t mean that we have necessarily set the regulations in the right place, and set the policy in the right place to stop degradation.
But I don’t think it's through a lack of monitoring, I think it's more a lack of a will to make those decisions. I’ve produced three major State of Our River Water Quality reports since I’ve been here, and each one basically has the same conclusion. And the last time I did it, I told councillors in a workshop that I’m really getting sick of telling the same thing, and if we’re really not putting a lot of resources into doing something about it, what's the point in keeping on monitoring?
Unpacking the politics of environmental governance by measurement
Looking at the New Zealand story through the lenses of measurement as accountability, performativity, and a mode of governance helps to clarify what is at stake with the evolving NPSFM regime. In turn, analysing the New Zealand experience helps to extend critical understanding of the role of measurement in statecraft.
Through requiring monitoring and improvement of the NPSFM's 9 and then 22 attributes, we see the state taking on more accountability to the citizenry for its environmental territory. Environmental concern is being enshrined in the operating infrastructure of environmental monitoring and reporting, creating a framework for knowledge production and enabling citizens to evaluate state action. Through the 22 biophysical attributes of the NPSFM, the state has performed environmental concern in concrete yet reductive ways. Selecting attributes like nitrogen, visual clarity, and dissolved reactive phosphorus that are consistently and repeatably measurable facilitates the creation of state simplifications like LAWA showing water quality grades across different regions, attributes, or land use types, offering an authoritative ‘view from nowhere’ (Jasanoff, 2017). Finally, the NPSFM conjures measurement as a mode of governance, committing to empirical monitoring as a key source of truth and foundation for decision making. Grounded in ideas of performance improvement, the NPSFM establishes monitoring requirements and then relies on the systems of regional politics, budgeting, planning, permitting, and enforcement to improve environmental outcomes, despite little evidence that science has driven better outcomes before (e.g., Brown, 2017). The New Zealand experience thus illustrates the political dynamics of measurement in environmental governance, while also adding dimensionality to the critique.
Accountability in tension
The organisation of environmental responsibility in New Zealand complicates the story of measurement as accountability because there is no singular, unified state. Arguably, central government gains legitimacy by using expert-derived measurements to demonstrate a comprehensive and trustworthy picture of the environment, and by categorically claiming a policy of environmental improvement. However, decentralised implementation foists both costs and responsibility for improvement onto local government, a key pathology of neoliberal environmental governance (Cohen and McCarthy, 2015; McCarthy and Prudham, 2004). If the environment improves, central government can claim credit for its policy, while if the environment degrades, councils receive the blame. Councils are accountable to local ratepayers for expenditure of scarce financial resources, and while the NPSFM says councils ‘must’ measure and then maintain or improve freshwater attributes, councils juggle many ‘musts’ (see e.g., Kirk et al., 2020; Tadaki, 2022). It is also councils who are left with the complicated task of working out how to achieve required improvements, where high evidentiary standards of the courts, development pressure on politicians, and limited resources all constrain land use regulation. Only in Otago, where Ministerial direction circumvented local political processes, was a step change in freshwater monitoring capacity forthcoming. It may appear that the state is becoming more accountable for environmental attributes within its territory; but the reality is that responsibility for improvement is being devolved to cash-strapped, overloaded, politically hamstrung councils.
The accountability moment of measurement in the New Zealand case may be more performative than substantive, with central government spending effort deploying ‘visual, verbal, and gestural symbols’ (Ding, 2022: 7) toward ‘fostering an impression of good governance’ (Ding, 2022: 11, emphasis added) instead of directly regulating land use. Thus, even in democratic and socially progressive contexts like New Zealand, more measurement does not simply translate into accountability when resources and powers to regulate do not follow. Accountability through measurement, while an important foundation for claims-making in environmental politics, can risk becoming a hollow state performance.
Performativity of infrastructures as well as attributes
New Zealand's experience illuminates how monitored biophysical attributes as well as wider environmental monitoring networks are contingent and performative. The NPSFM attributes were developed with non-state experts, generating a perception that the attributes have scientific legitimacy and are not simply a reflection of an interested, captured state (see Koolen-Bourke and Peart, 2022). Attributes like nitrate, dissolved reactive phosphorus, visual clarity, and E. coli were considered suitable due to the existence of standard methodological guidelines, continuity of existing records, consistency of measurement (sensitivity), and ease of sampling, among other criteria (Clapcott et al., 2018). This emphasis on sampling consistency for statistical comparison in turn implies that monitoring sites are equivalent in their ability to represent ‘New Zealand rivers’ as an entity (see also Tadaki et al., 2014a, b; for an illustration see McDowell et al., 2024). Such a statistical view enables LAWA's summative view of water quality and allows the public to grasp the correlates of decline, but this aggregation obscures the ontological ambiguity of measurement and conceals the well-known spatiotemporal biases and limitations of monitoring site measurements (PCE, 2019; Tadaki, 2022). Furthermore, with the NPSFM forcing resource triage, we can expect less spatial coverage, meaning that some environments remain – or become – ‘inscrutable’ (see Kroepsch and Clifford, 2022) and therefore less available for management attention and effort.
Whereas previous analyses of the performativity of measurement focussed on how biophysical attributes constitute selective representations of the world and therefore management needs, New Zealand's experience reveals the world-making features of the monitoring network writ large. The spatiotemporal organisation of measurement infrastructures is also contingent – which rivers, lakes, and groundwater systems are monitored – and this will continue to shape how freshwater problems are understood (e.g., real-time swimmability versus long-term trends), and therefore how responsibility for environmental improvement is attributed and implemented. Claims about what is desirable and what has or hasn’t been achieved in New Zealand freshwater politics will be configured by these biases and therefore need to be interpreted within them.
Alternative modes of governance
As a mode of governance, the NPSFM regime conjures empirical measurement as a key object for investment and decision making in environmental politics (see also Hesse et al., 2023; Shapiro et al., 2017). Claims about what is changing, what is driving change, and what is acceptable change will likely hinge on the quality, resolution, and interpretation of monitoring data. This gives specific biophysical attributes and measurements new claim-making power, but at the same time, predicating action on empirically measured change could also crowd out other bases for making claims (e.g., modelling, Indigenous knowledge, theoretical knowledge) and having political standing in environmental governance (see also Loring et al., 2021; Shattuck, 2021). Underpinning this empiricist mode of governance lies the assumption that better knowledge of the environment can and will unambiguously direct remedial action (Kuch et al., 2020). While this assumption is core to the promise of measurement in science and environmental policy, New Zealand's experience exposes how even fine-grained state-of-the-environment data may not overturn the political economic powers driving environmental degradation unless the burden of proof for environmental harm can be reconfigured. Furthermore, councils’ reallocation of scarce resources toward monitoring infrastructure reduces resources to invest in other mechanisms of environmental improvement and regulation, such as riparian planting or nutrient modelling.
Here, the promise of measurement remains alluring because it focuses on the benefits of knowledge while the costs and dilemmas of producing desired environmental changes remain out of frame (see also Lahsen and Turnhout, 2021). This leads to a sense that, as one council scientist concluded, ‘some people would prefer more science than have nice water quality.’ The New Zealand case thus illustrates how a focus on measuring environmental condition, even with a regulatory command to improve that condition, can work to defer the difficult (and much more important) struggle over land use and pollution. This pushes the real politics of freshwater into the much less visible domains of the courts and planning, where powerful moneyed interests, under-resourced local governments, electoral politics, and a skewed burden of proof tend to favour further development despite a record of environmental decline.
Conceptualising the NPSFM approach as a mode of governance helps to identify its blind spots, and therefore elements that an expanded political program could prioritise. First is the issue of resourcing measurement, and the unevenness this entails. Options include centralising resourcing for environmental monitoring or forcing adequate provision by local governments, either by centrally raising rates or by compelling sufficient prioritisation through another mechanism. Second is to reform decision-making institutions to enable ‘good enough’ knowledge of the environment (Gabrys et al., 2016) to direct permitting decisions and thus avoid costly debates about science in court. This should include reversing the burden of proof to favour environmental protection, where the types of knowledge permitted should include empirical monitoring data but also modelling, theoretical knowledge, and Indigenous and local knowledge (see Tadaki, 2022).
Conclusion
A recent study found that to statistically detect change in E. coli in New Zealand's monitoring network within 20 years, the sampling frequency would have to double from present, quadrupling the cost (McDowell et al., 2024). Just as New Zealand has begun to make headway into driving consistent and comprehensive measurement of freshwater attributes in space and time, the promise of measurement recedes beyond the horizon. As Shapiro et al. (2017) note, the environment can never be enumerated enough; there is always more measurement to do. Given this, where should we go from here?
This paper has examined how the promise of measurement has played out in New Zealand, a potentially best-case scenario of science-informed policy where measurement has been given the political force to change the world. By exploring the issues faced by scientists tasked with realising this new freshwater management regime, this study reveals how the promise of measurement is fundamentally limited. New Zealand's experience unravels the common sense that ‘you can’t manage what you don’t measure’, illuminating how:
measurement must be justified against other demands on scarce resources;
measurement may not consistently represent phenomena of concern;
empirical observation may not pinpoint management interventions;
we often don’t effectively manage what we already measure.
A response of ‘more measurement’ will not adequately address these issues, which are more social than they are environmental. Understanding these limits as constitutive of the promise of measurement – rather than lying outside it – means that scientists can include and articulate redistributive elements within their pursuit of measurement (see also Goldstein, 2022; Hesse et al., 2023; Lahsen and Turnhout, 2021; Shapiro et al., 2017). Scientists, activists, and policymakers could work together to advocate for equitable arrangements for monitoring and science finance (e.g., PCE, 2019, 2023) and defensible regulations based on ‘good enough’ knowledge (Gabrys et al., 2016). Through engaging with the politics of environmental measurement, scientists and environmentalists can pursue a more radically responsible vision of a good Anthropocene.
Highlights
Measurement can play different roles in environmental governance, from accountability, to performativity, to a mode of governance
New Zealand has enshrined measurement as a mode of governance under favourable political conditions, yet encounters distinct limits
Recognising the limits to measurement helps to identify more productive roles for science in environmental governance
Acknowledgements
Thanks to the interviewees for sharing their thoughts and experiences on freshwater monitoring, the freshwater Special Interest Group of regional councils for sponsoring my applied report, and the anonymous reviewers for helping me lift the contribution. The Manawatū Branch of the New Zealand Geographical Society and the School of Geography at the University of Otago provided situations and support to develop the ideas. Thanks to my wonderful colleagues Joanne Clapcott, Katie Clifford, Adrianne Kroepsch, Rebecca Lave, and Kiely McFarlane, for their feedback on drafts and support throughout.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Marsden Fund (grant number CAW1901).
