Abstract
National statistics offices are increasingly exploring the possibilities of utilizing new data sources to position themselves in emerging data markets. In 2022, Statistics Norway announced that the national agency would require the biggest grocers in Norway to hand over all collected billing data to produce consumer behavior statistics which had previously been produced by other sampling methods. An online article discussing this proposal sparked a surprisingly (at least to Statistics Norway) high level of interest among readers, many of whom expressed concerns about this intended change in data practice. This paper focuses on the multifaceted online discussions of the proposal, as these enable us to study citizens’ reactions and feelings towards increased data collection and emerging public-private data flows in a Nordic context. Through an explorative empirical analysis of comment sections, this paper investigates what is discussed by commenters and reflects upon why this case sparked so much interest among citizens in the first place. It therefore contributes to the growing literature on citizens’ voices in data-driven administration and to a wider discussion on how to research public feeling towards datafication. I argue that this presents an interesting case of discomfort voiced by citizens, which demonstrates the contested nature of data practices among citizens and their ability to regard data as deeply intertwined with power and politics. This case also reminds researchers to pay attention to seemingly benign and small changes in administration beyond artificial intelligence.
Introduction
A large increase in the amount of data collected by private sector actors has urged national statistics agencies globally to reposition themselves in an emerging big data landscape by considering new data sources and “modernizing” the overall operation of official statistics (d’Alva and Paraná, 2024; Struijs et al., 2014). This ties in neatly with an overall attempt by the public sector to extract value from data through technologies such as data platforms and artificial intelligence (Esko and Koulu, 2023; Mergel et al., 2016). Subtle yet all-encompassing changes to data practices across administrative levels and areas have therefore become a key subject for policy action (Dencik and Kaun, 2020; Lämmerhirt et al., 2024; van Ooijen et al., 2019).
One such example is the increased interest of public sector actors in acquiring private sector-produced data (European Commission, 2024). In 2022, Statistics Norway announced that the public agency was requesting that all receipt data from grocers be collected to produce more detailed statistics, explaining in an online news article that “We saw that there was an opportunity to get a much richer data base and remedy major quality challenges we have had with under-reporting and other biases” (Egge-Hoveid in Gundersen, 2022). Statistics Norway's proposal caused much discussion online and led to both grocers and the public requesting that the National Data Protection Authorities step in, which ultimately stopped the collection of receipt data for official statistics in May 2023 (Coll, 2023). It is, however, not Statistics Norway's proposal or the agency's ruling that is at the core of this paper, but the discussions arising in the aftermath of Statistics Norway expressing interest in collecting billing data.
While there is an increasing scholarly focus on both algorithmic resistance and resignation (Bonini and Treré, 2024; Draper and Turow, 2019; Velkova and Kaun, 2021), subtle changes in data practice in the public sector have received less attention, both when it comes to understanding public perceptions of these changes and the methodological challenges of accessing these. At first sight, Statistics Norway's request seems like a very small change to the current data practice. The receipt data is already collected and analyzed by grocers, intensified through various loyalty programs (Turow et al., 2015). Statistics Norway already has access to large amounts of personal and anonymized data. Consumer statistics are already produced by Statistics Norway through other means such as surveys and diaries. Why then did this case spark such public engagement?
Norway is an interesting case, as it has a long history of comprehensive state data practices. The establishment of the Nordic welfare states has led to a long and wide-ranging collection, management, and use of data on citizens that provides a unique context of datafication (Tupasela et al., 2020). National registers on all aspects of social life track citizens from “the cradle to the grave” and have done so for several decades in the Nordics (Smith Jervelund and De Montgomery, 2020). National surveys show high levels of trust towards public institutions’ data practices in Norway (Datatilsynet, 2020). The increasing push for datafication in the public sector has, however, raised various issues that remain largely unseen by the public. As Broomfield and Reutter (2022) point out, the context, values, and agendas of datafication are often obscured from citizens, as datafication is largely portrayed as an apolitical issue in the public sector discourse in the Nordics. There is little news coverage on issues beyond artificial intelligence (Nguyen, 2023; Scott Hansen, 2022). Although data governance has gained scholarly and administrative attention in recent years, public views on questions of what constitutes a “good datafied life” remain largely absent in debates, as these are dominated by experts and the tech sector (Hartman et al., 2020). Nordic citizens are often portrayed as highly digitized with generally positive attitudes toward public sector digitalization and data management in policy papers (Broomfield and Reutter, 2022). As previous research on public attitudes toward data and automation shows, however, the picture is far more complex (Hartman et al., 2020; Kalmus et al., 2022; Kaun et al., 2023; Snell and Tarkkala, 2019).
It is this complex picture of the inherently contested nature of datafication which is at the core of this paper. The case of Statistics Norway is one where issues of increased public-private data flows and changing data practices are openly communicated and therefore visible and accessible to public scrutiny. It presents a sensemaking event and therefore can offer a compelling insight into how discussions on datafication might unfold in the public sphere and provide a point of departure for accessing citizens’ voices and topics of debate beyond survey and interview data. As Armstrong et al. (2023) stress, greater scholarly attention is needed to investigate the everyday datafication of citizens’ lives and their understanding and management of mundane data flows. Rather than presenting a generalizable measurement of public opinion, this paper aims to demonstrate the value of researching such discussions to supplement our methodological toolbox. Using the relatively small and limited case of Statistics Norway, the paper aims to both show what is discussed by citizens in the aftermath of the request and explore why this case sparked such public engagement despite the seemingly benign proposed change of data practice. It demonstrates how meaning-making on changing data practices goes beyond issues of privacy and surveillance and touches upon a variety of topics associated with changing data practices across citizens’ everyday lives, thereby providing an interesting counter narrative to the often-simplistic view of public opinion in policy papers. This paper therefore contributes to the emerging literature on citizen perspectives in and on datafication beyond resistance or defiance.
Datafication of public administration
Public administration has long been a data-intensive realm of society, as statistics are both used to dictate the duties of the modern state and measure its success (Crosby, 1997; Desrosières, 1998). National statistics agencies are, however, increasingly pushed to reposition themselves in relation to emerging data markets dominated by private sector actors (d’Alva and Paraná, 2024; Struijs et al., 2014). This is accompanied by an increased interest in data beyond statistical purposes in public administration globally (Dencik and Kaun, 2020; Yeung, 2023). Public administration datafication has therefore attracted scholarly attention across disciplines (Mergel et al., 2016; Tupasela et al., 2020).
The datafication of public administration consists of two interwoven processes: (a) the use of more, different, or new data in administrative practice, and (b) the introduction of ever-more complex technologies, such as machine learning and data platforms, to recirculate data into practice. It is therefore both about changing data practices and the introduction of new technologies. Statistics Norway's request is both about acquiring new data sources and the use of more sophisticated analytical methods. An emerging scholarship has, however, argued that public administration datafication is far from benign, as the ability to put together, analyze, and manage new and larger amounts of data provides the state with increased possibilities to surveil, predict, and classify citizens’ behavior (Hintz et al., 2018).
Governmental use of data has long been a focal point for critical scholars. There is a significant body of literature on state surveillance and the increased use of technologies such as CCTV and facial recognition by state actors (Introna and Wood, 2004; Kalmus et al., 2022; Kostka et al., 2021). In addition, we can see a growing scholarly interest in the use of data-driven technology in decision-making processes in public administration and its consequences for society (Choroszewicz and Mäihäniemi, 2020; Kaun et al., 2023; Lomborg et al., 2023). However, subtle changes in data practices remain understudied, especially when it comes to public perspectives on issues of datafication.
Citizen perspectives on public administration datafication
As Jørgensen (2021) argues, new data practices in public administration reconfigure citizens’ rights and power. These changes might, however, happen in a way that is barely accessible to the public.
In an age of intense datafication, the risk of widening the information and power asymmetry between the individual and the state is substantial – a risk that might be further fueled by optimistic narratives of what data-driven governance may accomplish. (Jørgensen, 2023: 12)
What then do we know about citizen experiences with and attitudes towards datafication? In their extensive literature review on public understanding and perception of data, Kennedy et al. (2020) highlight the fact that knowledge and attitudes toward changing data practices are highly heterogeneous. However, there is a general agreement in the literature that people are concerned about changes in administrative data practices. Here, emotions play an important role in the understanding and perception of data practices. As Ruckenstein (2023) points out in her work on algorithms, personal anecdotes and feelings are important sources of knowledge to balance datafication. These might, however, be difficult to access. Current methodological attempts mostly focus on interview and survey studies, which often include specific framings or leading questions (Kennedy et al., 2020). Several researchers have argued that citizens often perceive datafication as complex, elusive, and inevitable and therefore seldom engage with issues of datafication (Rijshouwer et al., 2022). As Lomborg et al. (2024) argue, however, rather than being seen as discouraging, this should be taken as an invitation to dig deeper and extend and triangulate our toolboxes.
A promising way of accessing public perception of datafication is advocated for in data activism studies, which have started to explore the various ways in which people resist and contest datafication in everyday life, focusing on organizations and social movements (Gutiérrez, 2018; Lehtiniemi and Ruckenstein, 2018; Velkova and Kaun, 2021). These contestations can take both individual and collective forms, ranging from appropriating data for other purposes to resisting data collection (Beraldo and Milan, 2019). However, data activists are themselves often experts, and activism requires a certain degree of formal organization, which is not always sufficient to access citizen perception.
The paper contributes to the emerging literature on citizen voices in datafication (Hintz et al., 2018; Kaun et al., 2023; Kennedy et al., 2020; van Zoonen, 2020). Through this paper, I aim to demonstrate that we can find the contentious politics of data in various spaces, beyond organized activism or artificially produced settings of research projects. Including studies of public sensemaking, where citizens are confronted with changing data practices and both individually and collectively attempt to make sense of these changes, will allow researchers to tap into the “complex ecologies of trust” (Steedman et al., 2020) among citizens and their perception of norms for appropriate data flow. These have already been studied in the aftermath of data scandals (see for example Dencik and Cable, 2017), but not in connection with proposed changes in data practice. It is therefore important that researchers and journalists critically engage with how issues of datafication are presented and communicated to the public (Nguyen and Hekman, 2024) and provide new avenues for research.
Researching citizen voices
As mentioned above, empirical studies on datafication often focus on user practices, understandings and experiences of data within a specific context, researching these through interview studies and surveys (Flensburg and Lomborg, 2023; Kennedy et al., 2020). Many of these studies also focus on measuring data literacy or algorithmic awareness, rather than engaging with citizen meaning-making. More bottom-up approaches are needed, where citizens themselves identify issues, rather than being fed predefined frames (Lomborg et al., 2024). Where then can we find citizen perspectives on datafication beyond artificially produced settings?
Comment sections are by no means a representative sample of the population (Toepfl and Piwoni, 2015; Weber, 2014). They often display extreme views and consist of only a small and skewed sample of society. Indeed, a 2022 study in the Norwegian context showed that 87% of the population never or rarely expressed their opinions online (Fladmoe et al., 2022), which again reminds us of the importance of not generalizing from online newspaper comment sections. This paper does not attempt to measure public opinion. Instead, it seeks to identify and discuss meaning-making activities, how these might inform future research on public perceptions of datafication, and how they might be used to triangulate findings obtained through artificial and resource-intensive methods such as workshops, interviews, or large-scale surveys (Lomborg and Kapsch, 2020).
The sample presented is relatively small, was collected over a short period of time, and covers one specific sensemaking event on a specific platform. It should nevertheless be regarded as worthwhile, in that we can find counternarratives to the dominant resignation or acceptance discourse in these spaces. To avoid the more extreme comments, I have chosen to only sample the comments on the original articles on the publisher’s web pages (Table 1). These are not moderated, but the platform uses a system that requires commenters to answer two to three questions about the article before they are able to comment. This is intended to circumvent some of the trolling often associated with comment sections. As the analysis shows, there were surprisingly few uncivil comments detected, which is consistent with earlier research on online newspaper comment sections (Rowe, 2015).
Table 1. Overview of nrkbeta articles on the case of Statistics Norway.
All comments on the initial four articles are included in the analysis presented below. In total, the sample consists of 350 individual comments (collected in the second week of May 2023), which allow for an in-depth analysis of the data material. The four articles were all published on nrkbeta, which is the technology section of the Norwegian public broadcaster's (NRK) online-only news outlet. Nrk.no is the second largest online news outlet in the country, with 1.46 million unique readers daily in 2022 (Mediebedriftenes Landsforening, 2024). An approximate number of direct clicks each article has received is provided by NRK, although this does not guarantee that the numbers provided are fully accurate, as indirect traffic is difficult to measure. However, they do give some indication about the reach of each article. Especially the first article listed here got a relatively high number of clicks compared to other articles published in the same year on nrkbeta.
The analysis consisted of several steps, starting from a descriptive analysis of all comments in context, and going on to sort similar comments into thematic groups and categories. To analyze the comments, I used an abductive analysis, relying on elements of surprise, going back and forth between the data material and previous research in the field. I was especially intrigued by comments that spread over several of the initially identified categories and by outlier comments that did not discuss the most common topics, which were (a) using cash and (b) surveillance. The surprising element here was the richness and depth of comments. It is these comments I want to draw attention to in the following analysis, as they might also contribute to answering the question of why this example sparked such intense public debate and demonstrate the complexity of the arguments put forward. To provide a more systematic overview of the empirical data, a descriptive quantitative content analysis of all comments was conducted after the initial reading and qualitative analysis. A codebook and variable definition are attached to the paper. Variables include a measurement of attitude towards the proposed change in data practice, emotional responses, topic discussed, and the overall length of the comments. I believe that both the quantitative and qualitative analysis offer intriguing insights into citizens’ perceptions of public-private data flows.
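The tallying step of such a descriptive content analysis can be sketched as follows. This is a minimal illustration only, not the procedure or codebook used for this paper: the toy records, the variable names (`attitude`, `emotion`, `topic`, `sentences`), and the category labels are all hypothetical and stand in for whatever the attached codebook defines.

```python
from collections import Counter

# Hypothetical coded comments: the variable names and category labels are
# illustrative assumptions, not the paper's actual codebook or data.
coded_comments = [
    {"attitude": "negative", "emotion": "anger", "topic": "cash", "sentences": 1},
    {"attitude": "negative", "emotion": "worry", "topic": "surveillance", "sentences": 4},
    {"attitude": "balanced", "emotion": "curiosity", "topic": "statistics", "sentences": 6},
    {"attitude": "positive", "emotion": "hopefulness", "topic": "statistics", "sentences": 3},
    {"attitude": "unclear", "emotion": "none", "topic": "cash", "sentences": 2},
]


def share_by(comments, variable):
    """Return the percentage share of each category of one coded variable."""
    counts = Counter(comment[variable] for comment in comments)
    total = len(comments)
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


def share_multi_sentence(comments):
    """Return the percentage of comments that are longer than one sentence."""
    longer = sum(1 for comment in comments if comment["sentences"] > 1)
    return round(100 * longer / len(comments), 1)


attitude_shares = share_by(coded_comments, "attitude")
topic_shares = share_by(coded_comments, "topic")
multi_sentence_share = share_multi_sentence(coded_comments)

print(attitude_shares)
print(topic_shares)
print(multi_sentence_share)
```

Keeping one record per comment, with one field per codebook variable, means that the same helper can be reused for attitude, emotion, and topic, and that comments without a clear attitude remain in the denominator rather than being silently dropped.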
The case: Statistics Norway wants your billing data
Before elaborating on the content of the comment sections, I want to briefly introduce the articles that sparked the debate to provide context for the analysis presented below. In addition, written material produced by Statistics Norway on the case and its arguments are reviewed to explore their justification for the proposal.
In March 2022, the Norwegian online news publisher nrkbeta published a long piece on Statistics Norway's request to collect anonymized billing data from Norway's biggest grocers: “Statistics Norway now needs to know exactly what Norwegians buy in the supermarket.” The online article included accounts from Statistics Norway, a short expert comment by Author 1, and expressions of concern from representatives of grocers and from a lawyer at the Data Protection Agency. In the article, the journalist explains that Statistics Norway has ordered the supermarket chains NorgesGruppen, Coop, Bunnpris, and Rema1000 to share all their receipt data. The same is required from Nets, the company that processes most in-store payments. The statistics agency stresses that it is only interested in aggregated and anonymized data, while pointing out that it believes it needs access to all the data in order to create detailed statistics at the household level. And although the data is on an individual level, Statistics Norway stresses that it is only concerned with groups of people.
This initial presentation of the issue was followed by three follow-up articles on the topic. The first follow-up article, published in March 2022, described how several political parties reacted to the request for billing data from supermarkets. This article also included a statement by Statistics Norway on the necessity of encouraging lively public debate on the topic and an update with the news that several of the grocers and Nets had now involved the Data Protection Agency. It is also interesting to note that the director of Statistics Norway wrote an opinion piece entitled “SSB [Statistics Norway] does statistics, not state surveillance” in August 2022 (Axelsen, 2022). The second nrkbeta follow-up article was published in October 2022 and updated readers on the grocers and Nets’ request to Statistics Norway to stop this proposed change to data practice, which had not been heard. The case was referred to the Ministry of Finance. In November 2022, the last article analyzed presented an interview with the Data Protection Agency, which was taking the case and assessing its necessity, alerting the public that they would probably stop Statistics Norway from collecting billing data. All four articles were written by the same journalist, Martin Gundersen.
The case took a final turn in May 2023, when the data protection authority declared: “We believe that there is no sufficient legal basis for such extensive processing of personal data” (Coll, 2023), which ultimately ended with the request for receipt data being withdrawn. Again, it is not the authority's ruling or the details of the reporting that are of interest here, but the discussions arising in the comment sections of the four articles.
Interestingly, Statistics Norway has recently published a working paper describing their rationale behind and experiences with accessing privately held data to develop a new household budget survey (Sæbø and Dimakos, 2023). In this working paper, legal and technical issues are discussed, in addition to the benefits of having such data sources. Obstacles to these data practices include “skepticism from private data owners and in the media” (11), a problem that, according to the authors, is all about communication. The slight hint of paternalism in the arguments might be offset by taking a closer look at the actual discussions generated in the aftermath of making the request for billing data public.
Contesting public-private data flows?
How did citizens perceive this proposed change in data practice? First of all, the quantitative content analysis indicates that the comments were dominated by negative attitudes toward the proposal, as 59% of comments expressed such views. Nevertheless, there were also several balanced (7%) and positive (5%) comments in the sample. Balanced here indicates comments that presented arguments both supporting the proposal and expressing understanding for more negative responses. The sample contained a significant number of threads where commenters were debating issues such as how to protect your data from being collected, sharing links to other articles, or asking questions about previous comments. Therefore, not all comments showed clear attitudes or emotions towards the proposal itself. The most dominant emotional responses were anger, annoyance, and worry (Figure 1). However, not all emotional responses were negative, as several comments also expressed curiosity or hopefulness, which were often associated with seeing potentially positive uses of billing data.

Figure 1. Emotion.
Although the cash versus card debate was the most discussed topic in the comment section, discussions on privacy and surveillance also took up a significant amount of space, together with debates on statistics and the cost-benefit balance of collecting billing data. All in all, the topics discussed (Figure 2) were surprisingly varied, ranging from issues of privacy to the future potential of misuse and suggestions on how to produce these statistics without collecting all the data.

Figure 2. Topics addressed.
Measuring the overall length of arguments reveals that 79% of the comments consisted of several sentences, often elaborating on the position taken. Many of the comments presented complex arguments, justifying their worries or annoyance, or calling for action in various ways. A systematic content analysis can provide us with some insights into the nature and content of the comments and provide us with a first overview of the data material. The following sections will nuance this picture and show some of the more surprising arguments made. Again, this is not to measure citizen literacy or pass any kind of normative judgement on the arguments.
Cash versus credit card
At first sight, the comments were dominated by discussions on the use of cash versus credit cards. Comments such as “No thanks. Might have to start using cash again then” or “I’m dealing with the consequences of this and starting to withdraw cash for daily use” can be found under all four articles. These were often among the 21% of short comments and then frequently followed by others sharing strategies on how to use cash in the Norwegian context, which is a country that has a very low usage of cash, especially after the pandemic. Commenters claiming to only use cash from now on were challenged by others who argued that this was impossible. In addition, commenters were reminded that only using cash also requires you to avoid all supermarket loyalty programs (which we will come back to later). This simple call for “cash only” indicates dissatisfaction with the proposed collection of billing data, offering a way to counteract or opt out. The dissatisfaction is then also directed toward the seeming impossibility of living a credit card free life in Norway. As Lauer (2020) argues, payment cards have become a key vehicle for consumer surveillance in the digital economy.
Discussions on cash are not just about the avoidance of producing data, but also about discussing strategies to sabotage and skew the data produced, as illustrated in this comment.
Of course, you buy everything healthy by card and the rest in cash. Only buy salt, deodorant, and cream on the card. Let them fool around a bit. Lots of fun ways to manipulate your own references for sure.
Surveillance, privacy, and democratic governance
The request represented a breach with public expectations, as shown by several comments that expressed disbelief and questioned the very essence of national identity and norms for appropriate data flow: “Norwegian values are based very much on trust in the government and public institutions, I know my limit has been reached.” A surprising observation from the initial reading of the comment section was that commenters frequently associated the change in data collection with totalitarian systems (especially through comparisons to China) and regarded the notion of collecting receipt data as un-Norwegian and undemocratic: “This is simply an attack on democracy. SSB must be scrutinized for what they do and why. Not least the last proposal.” Of course, we can argue that these comments represent a few individuals’ skepticism about the state, but many comments went beyond conspiracy theories and anti-state sentiments, questioning power relations, state responsibilities, and public values.
In addition to the obvious challenges posed by the attitudes of Statistics Norway with regard to fundamental privacy protection, there is good reason to remind Statistics Norway, the Ministry, the Government and the Storting [Norwegian parliament] of the following: Their mission is to look after the population's interests, and it is the population that is their client. Their roles exist only to safeguard the interests of the population. The population does not exist for you, it is the other way around. This is a completely elementary concept in a democracy, and when you move away from this, you also move away from democracy and basic human rights.
Is it SSB itself that makes the cost/benefit assessment? In that case, can one trust that assessment, as the agency will obviously place great emphasis on the benefit for itself and is unlikely to consider the cost to business and the population equally.
Not surprisingly, the term “surveillance” was used more than 50 times in the comments on the articles, often contrasted with issues of privacy and freedom. However, surveillance was also often connected to discussions of state power rather than to an individualized understanding of privacy.
Isn't this a form of surveillance? Invasion of people's privacy. A democracy requires vigilance. Is this THAT necessary? If you accept it then…. what could be next?
This fear of state surveillance prompted the administrative director of Statistics Norway to address such concerns in an opinion piece where he states that “Statistics Norway does statistics, not state surveillance,” going on to argue that “Official statistics are a common good that provide the factual base for democratic debate and decision-making” (Axelsen, 2022). This was, however, counteracted by commenters highlighting the potential for future misuse and the impossibility of fully anonymous data sets.
Anonymous data yes. But they are never anonymous enough. NRK has shown this before when they were able to purchase location data. This is unsustainable. I'm actually a little scared by this.
Despite these explanations and the initial account that Statistics Norway was not interested in individual linked data, commenters were not convinced. Several comments argued that complete anonymity of data was not possible, often linking their comments to previous articles written by nrkbeta on the issue of anonymous or synthetic data. Although Statistics Norway explains that it is not individual but group data they are interested in, several comments pointed out that in order to create aggregated data (for example, male, 30–40 years, rural), the billing data must at one point be linked to other information in public registers, as explained by Statistics Norway itself: “Bank transactions will be linked with a public registers of bank accounts (tax directorate) and a household register based on the central population register” (Sæbø and Dimakos, 2023). Although this will not appear in the official statistics produced, commenters were clearly concerned that these processes were required to produce the statistics.
Statistics in society
This concern is closely associated with various perceptions of what kind of role statistics play in state administration. What the purpose of statistics is and how much data is enough data are questions that are widely debated in the comment section, as indicated by the systematic content analysis.
Several commenters point out that there are other ways to obtain the information requested by Statistics Norway and stress that the request therefore seems disproportionate and unnecessary, as illustrated here.
By collecting transaction data from 10,000 random households, Statistics Norway will be able to produce equally useful real-time statistics on consumption, prices and inflation. The increase in utility/quality of monitoring 100% is marginal compared to collecting data from a representative sample.
The whole point is to find out how much money each social group spends on food. When you have this information, you can answer the question: ‘How much support should a low-paid family with 3 children and 1 adult receive in support to enable them to have normal food consumption.’ Or: ‘What is the difference in the consumption of fruit and vegetables between different income brackets and parts of the country.’
By linking detailed statistical data to issues of social support or health, this comment highlights that receipt data can provide information to the state for social welfare purposes. The SIFO budget, a reference budget that is used to calculate social welfare payments, was named several times. The commenters felt it was crucially important that the figures used in the reference budget were both accurate and continually updated in order for families to receive the right amount of money, showing solidarity and empathy for people receiving benefits and pointing out the necessity of sharing data in order to do so fairly.
When debating the issue of how much data is enough, there were also several voices asking Statistics Norway to elaborate on their justification for requesting the receipt data.
If one is to problematize data collection, it is probably a good idea to talk about the benefits so that people can form a reasonable opinion of the benefit/harm ratio. Some people oppose any data collection in a way that makes me think they don't like fact-based decisions. We should have no sympathy for the ideal of making decisions based on pure opinion and belief.
Many of the comments identified as “balanced” in the systematic analysis asked Statistics Norway to show a more nuanced understanding of the possible harm and future potential for misuse by both the public sector and hackers.
Connections to other cases of discomfort
Interestingly, the commenters make a variety of links to other changing practices and examples which they associate with surveillance or discomfort, both in the private and public sectors. As everyday life becomes increasingly datafied, commenters use their experiences with datafication as a reference point to reflect upon the request for receipt data. Several data scandals both locally and globally are mentioned in the comments.
Commenters argue that this kind of data collection might seem mundane or benign right now, but data might be misused at a later point in time. One comment pointed out that we don't know anything about future politicians’ intentions, while another considered the possibilities that new kinds of computing might provide, both indicating a certain level of uncertainty and discomfort about what might happen if this kind of data is first acquired and accessed by public sector institutions: “It is very unpredictable what this type of ‘innocent’ information can ultimately be used for.” Comments like these indicate that citizens situate this case in a wider landscape of datafication and an increased push for data use. They are interested in the long-term consequences of data collection beyond immediate individual consequences or added value for statistics.
As pointed out in a previously presented comment: “If you accept it then…. what could be next?” Misuse here is not always linked to the Statistics Agency or to grocers. Comments point out that this kind of data can be (mis)used by other actors in the public sector. Take for example this comment, which points out that the commenter trusts Statistics Norway to handle data in a responsible way but expresses concerns that other public sector organizations might use the data to profile citizens.
Although SSB itself is unlikely to misuse the numbers to profile me, I do not trust in the least that other agencies cannot, or will not be able to, request access to the information to use for their own purposes.
I can very well imagine public agencies that would be able to use my purchases and my consumption pattern as an argument for the offer or the treatment I receive.
As the Norwegian public sector has introduced several initiatives to encourage data sharing in the public sector, this is an interesting discussion to follow. The commenters here see how this kind of data can be used in public administration beyond statistics. This feeds into concerns about a decreasing universalism and increased means testing in the Norwegian public sector, as the above comment indicates, and leads us to larger discussions on private versus public sector datafication. Indeed, Statistics Norway itself does not look at this in an isolated manner, as they argue in their working paper:
Statistics based on big data may be more relevant than existing official statistics by describing new phenomena, increasing timeliness and the level of details. To improve the quality of official statistics it is necessary for the NSIs to access and reuse these data. (Sæbø and Dimakos, 2023)
This then stresses that it is not “just” billing data which is of interest for the agency, and that this acts as a test case for investigating and exploring how to feed privately held data into Statistics Norway.
The role of the private sector versus public sector
In the first analysis of the comments, the discussions on private versus public data collection, trust, and power were the ones that intrigued me the most. While there were several comments defending Statistics Norway and explaining why they didn't regard the request as a problem, I detected very few individualistic comments along the lines of “I have nothing to hide.” This is interesting because much of the literature has focused on just this issue (Lomborg and Kapsch, 2020).
Many of the commenters pointed out that supermarket chains and other private sector actors such as Google already collect huge amounts of data, especially through loyalty programs. Other commenters, however, perceived a difference between private and public collection of data, and pointed out that large-scale data collection by private actors does not automatically legitimize the public sector to do the same:
The fact that some private actors carry out the large-scale collection of consumer information does not mean that it is okay for the state to do the same, especially not without consent.
Honestly? I trust SSB SIGNIFICANTLY more than these companies that store exactly the same data through their benefits program such as Trumf, Æ, Coop membership card, etc. Rema 1000, for example, has already proven on several occasions that they cannot protect this data.
Hinting at several privacy breaches by one of the loyalty program owners, the comment above indicates that the commenter trusts public data managers more than private businesses. There is, however, also an interesting ambivalence that can be seen in the comments. While many commenters express the notion that they trust public data managers more than private data managers, this does not automatically mean that they support the data flow between the private and public sectors, as the grocers will still own the data. Several of the commenters, in fact, react to the fact that the supermarkets are against the collection of billing data. Why are the supermarkets so taken up with safeguarding buyers’ privacy? Do they have something to hide? This is especially tied to the increased cost of purchasing goods in supermarkets in Norway:
I wonder if the supermarket chains don't have their own agenda in this, in that when 'someone' (SSB) can extract detailed prices (and product names), they might be afraid that this information can be used against them, e.g., to support criticism of high prices?
These comments identify the potential for revealing bad practices among grocers.
While at first sight, the comments were overwhelmingly negative toward the new data collection, there were also several voices that defended Statistics Norway or welcomed this practice, as they saw clear benefits in the collection of more detailed data:
I am somewhat surprised that the Norwegian Data Protection Authority allows private companies such as Google, banks, insurance companies, retail chains, credit card companies and more to buy and sell information about us. For example, when we had children there were suddenly lots of advertisements for nappies, and children's clothes and equipment. How did the suppliers know that we had had a baby? If the Norwegian Data Protection Authority is going to refuse Statistics Norway to obtain information about what is normal consumption in Norway, I am somewhat surprised. After all, this is information that is used in, among other things, the SIFO reference budget, which is in turn used when calculating social benefits. We already allow commercial enterprises to do this for commercial purposes. But when the state wants to do the same, out of consideration for the good of society, all the politicians, and the Norwegian Data Protection Authority, will come out and say something about it.
Who is responsible?
Finally, I want to take a quick look at who commenters believe is responsible for fixing or addressing citizen discomfort. While there were some anti-state sentiments in the sample, these were few and far between. As mentioned above, many people regard the state as being responsible for serving not its own interests, but those of the population. In other words, commenters expect the “state” to take responsibility in some way or another, either by refusing the request itself or by using the data collected in a way that benefits society. Several comments mentioned that it was legislation that could stop this request from being granted, as they couldn't see that the request complied with GDPR or the constitution of Norway. The system should therefore be able to regulate itself.
In particular, the Data Protection Authority was regarded as being responsible for regulating and safeguarding citizen rights. Several commenters therefore called for the Data Protection Authority to become involved, as illustrated by this comment, which places the responsibility for pulling the emergency brake with the Authority:
When will someone pull the emergency brake? And say enough is enough? Because this is really way over the line and very invasive in people's lives. Hope the Norwegian Data Protection Authority puts its foot down!
I can't believe anything other than that the Norwegian Data Protection Authority will say no here, but I am shocked that Statistics Norway even thought this was a good idea.
At the same time, there were also calls for the supermarkets to take action and protect their customers from wide-ranging data collection by the state, which is interesting.
There is clearly nothing proportionate about this data collection. SSB must have completely lost their minds if they think Norwegians want to be monitored in this way. Here, the supermarket industry must stand up for its customers. Even Coop, which is probably one of the most eager to store and use data about its own customers.
However, this view on who is responsible for protecting citizens was rare, as most commenters laid the responsibility for refusing the request on the government, the Data Protection Authority, or Statistics Norway itself.
Discussion: Understanding public perceptions of public-private data flows
Public administration has produced, and still produces and manages, large amounts of detailed data on citizens, which is increasingly regarded as a key resource for improving public administration (Dencik and Kaun, 2020; Mergel et al., 2016). This push for using data in new ways is legitimized in various ways. Public-private data flows are encouraged to secure the future of the (welfare) state through supposed increased innovation and modernization of operations in various national contexts (d’Alva and Paraná, 2024; Reutter and Åm, 2024). However, this is mostly steered by economic rather than citizen interests (Valli Buttow and Weerts, 2022).
The two dominant ways of portraying citizen attitudes towards datafication are those of resignation and support. We can, for example, detect a note of paternalism among public bodies, illustrated by SSB's own working paper (Sæbø and Dimakos, 2023), where citizen concerns are simply dismissed as skepticism that can be overcome by improved communication. Similar ideas of “they don't know what's best for them” have been found in previous research on citizen voices in public administration datafication (van Zoonen, 2020). Indeed, citizens are often portrayed as requesting and supporting public administration datafication with generally positive attitudes towards data and technology in policy papers and political discourse (Broomfield and Reutter, 2022). However, the public is rarely consulted directly in questions dealing with increased datafication (Esko and Koulu, 2023; Räisänen, 2023; van Zoonen, 2020). The assumption that citizens accept existing data practices may therefore overshadow or undermine the fact that they may also contest certain data practices.
Datafication is often seen as complex, elusive, and inevitable by citizens (Rijshouwer et al., 2022), which seems to hamper discussions on the societal implications of technology for the public and scholars alike. What then can we learn from the case of Statistics Norway, a small-scale and very limited case in a very specific context? Despite the discursive construction of digitalization and datafication as apolitical in media and policy, this case demonstrates that citizens are able to understand data flows as deeply intertwined with power and politics. Ranging from concerns of totalitarianism and issues of surveillance to discussions on data ownership by private versus public bodies, trust in public institutions and future potential misuse, these comments reach beyond ideas of individualized privacy or Orwellian fears of total surveillance and provide us with a starting point to critically engage with citizen voices on datafication.
There is growing interest in understanding data resignation and activism in current scholarship (Draper and Turow, 2019; Gutiérrez, 2018). Many of the comments, however, express what I call data discomfort, a feeling of unease or “breaching the line” among citizens. The comments express annoyance or disappointment, rather than resignation. Following Ruckenstein's (2023) work, this paper therefore aims to focus attention on the affective responses of citizens to changing data practices in the public sector. In other words, it investigates how wider political and economic processes are felt in people's everyday lives and how citizens might respond to them. There is a current lack of these often-mundane everyday perspectives in both research and practice (Armstrong et al., 2023).
Coming back to the elusive nature of datafication, I would argue here that the strength and power of this case lies in its relatability. The proposal by Statistics Norway presents a very concrete example of everyday datafication that many people can relate to. After all, most people engage in the mundane everyday practice of buying food in supermarkets. This might make it a more concrete instance of datafication than, for example, more complex and general issues of how to govern AI in public sector decision-making or data collection in social media. As the analysis shows, the case offers an entry point into broader issues of the politics of datafication, as commenters linked this example to other instances of discomfort in their everyday lives or linked the example to broader political issues of datafication and the future potential for misuse. The comments uncover a complex ecology of trust, which deserves our scholarly attention. Trust and distrust in data practices are highly context-dependent (Steedman et al., 2020). The commenters operate within a data context which extends far beyond the isolated case of Statistics Norway, while simultaneously discussing the local specifics of the context.
In addition to the relatable nature of billing data, one might argue that the comments show that citizens hold the public sector more accountable and have higher expectations of it when it comes to datafication than they do of other sectors. While citizens often feel a sense of resignation in relation to big tech companies (Draper and Turow, 2019), there is still a sense of power and accountability when it comes to public administration. Linking back to people's feelings about the case, we can observe clear sentiments of annoyance, disapproval, and disappointment. Statistics Norway's data request might, therefore, have given commenters a sense of agency to express their concerns and voice their discomfort. Indeed, the proposal was deemed disproportionate by the Data Protection Authority in spring 2023, due at least in part to several inquiries from the public, which shows a degree of agency.
This paper demonstrates how citizen meaning-making activities can play out online. Although they constitute a very small, limited sample, the comments demonstrate the potential for researching datafication beyond measuring awareness and data literacy or carrying out investigations through artificially produced and resource-intensive research designs. The paper presents an interesting way to access citizen concerns through public sensemaking events via a case which engaged citizens and made datafication accessible. By analyzing what was discussed in the comments sections of newspaper articles on Statistics Norway's request for billing data and reflecting upon why this case attracted so much public attention, it offers useful insights into how debates on public administration datafication might develop in the public sphere. The case clearly shows the importance of the traditional media's role in bringing the issue of datafication to a wide audience beyond reporting on artificial intelligence (Nguyen, 2023; Nguyen and Hekman, 2024; Scott Hansen, 2022). As many of the decisions on changing data practices are taken behind closed doors, this presents a rare glimpse of the potential for public sensemaking on these issues.
Future directions on how to critically engage with citizen voices in datafication
Comments on the initial online newspaper articles are by no means generalizable measures of citizen attitudes towards datafication. They do, however, function as an interesting entry point into the inherently contested nature of datafication and bring to the fore heterogeneous discussions on datafication, which in turn can act as a point of departure for doing further research on the topic or for informing policy. There is a need to supplement empirical research on citizen voices with bottom-up approaches where citizens identify social issues themselves rather than responding to predefined framing by researchers.
A key methodological issue addressed by many scholars is the elusive and complex nature of datafication, which seems to hinder discussion. The case of Statistics Norway clearly sparked public engagement. The issue of receipt data could therefore serve as a vehicle for initiating discussions with citizens on datafication and its implications. This could include its use in citizen-centered workshops and assemblies (see for example Aitken et al., 2018; Lomborg et al., 2024; Wong et al., 2023), in the design of surveys such as those presented by Hartman et al. (2020), or as a way of problematizing public-private data flows in interviews. This case reminds us to take the mundane seriously and to situate our research in the everyday life of citizens.
Datafication is often portrayed as new and revolutionizing. However, state data collection has a long history and so do discussions on the appropriate use of data in administrative practices (Crosby, 1997). To understand which data practices are regarded as legitimate, the comments remind us to look beyond the immediate and to place discussions within a wider context of statistical and data practices in governmental initiatives. Another interesting point here is the way that citizens invoke past experiences and make connections to other datafication practices in everyday life in their meaning-making activities, while simultaneously problematizing long-term perspectives in their comments. These past and present repositories deserve further scholarly attention. How do citizens use their context and experiences to make sense of datafication and evaluate its public value?
Finally, through this seemingly benign case of Statistics Norway, I want to argue for an increased scholarly focus on data practices beyond AI or automated decision-making in the public sector. Public administration datafication is about more than introducing machine learning into the inner workings of bureaucracy. It is about using more and different data in all aspects of administration in ever-more complex ways, which deserve our scholarly attention even if they seem rather insignificant at first sight. While scholars such as Yeung (2023) direct our analytical attention toward the algorithmic ordering of administration, I want to draw attention to the more mundane and often overlooked practices and data flows that are incrementally changing the way public administration is done, in the Nordic public sector and beyond. As d’Alva and Paraná (2024: 14) write: “big data represents an element of a broader dispute between nation-states and private corporations for informational capital accumulation and control.” How official statistics agencies and other public actors position themselves within data markets and citizens’ understanding of this therefore demands further scrutiny.
Supplemental Material
Supplemental material, sj-pdf-1-bds-10.1177_20539517251320008 for Researching data discomfort: The case of Statistics Norway’s quest for billing data by Lisa Reutter in Big Data & Society
Footnotes
Acknowledgements
I would like to thank the reviewers for their thoughtful comments on this paper. I am grateful to Aviaya Leonora Thorhauge Valeur for her assistance in preparing the data material and the Datafied Living project for their comments and feedback on the initial draft of this paper.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the European Union (ERC, DATAFIED LIVING) under the Grant (947735). Views and opinions expressed are however those of the author only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
Supplemental material
Supplemental material for this article is available online.
References
