Abstract
In this piece, which frames the special issue, “The State of Google Critique and Intervention,” we provide an overview of research focusing on Google as an object of critical study, fleshing out the European interventions that actively attempt to address its dominance. The article begins by mapping out key areas of articulating a Google critique, from the initial focus on ranking and profiling to the subsequent scrutiny of user exploitation and competitive imbalance. As such, it situates the contributions to this special issue concerning search engine bias and discrimination, the ethics of Google Autocomplete, Google's content moderation, the commodification of engine audiences and the political economy of technical systems in a broader history of Google criticism. It then proceeds to contextualize the European developments that put forward alternatives and draws attention to legislative efforts to curb the influence of big tech. We conclude by identifying a few avenues for continued critical study, such as Google's infrastructural bundling of generative artificial intelligence with existing products, to emphasize the importance of intervention in the future.
This article is part of the special theme on The State of Google Critique and Intervention. For a full list of all articles in this special theme, see: https://journals.sagepub.com/page/bds/collections/stateofgooglecritiqueandintervention
From PageRank to “assetization” of audiences: Articulating Google critique
Google's celebrated PageRank algorithm was critiqued quite soon after the launch of the search engine in 1998. The innovation in search results ranking was the initial employment of the number and quality of hyperlinks a website receives to evaluate a website's value, in the tradition of citation analysis (Mayer, 2009). As early as 2000 Introna and Nissenbaum pointed to the emergence of information hierarchies by arguing that PageRank would favor large, well-connected and often commercial websites at the expense of smaller ones and would therefore undermine the early democratic ideals of the web (Hindman et al., 2003; Introna and Nissenbaum, 2000; Rieder, 2012). Empirical studies followed, such as those from the healthcare context, which reaffirmed the findings by demonstrating how SEO’d or “search engine optimized” websites such as commercial portals tended to be ranked higher than smaller websites of self-help groups (e.g. Mager, 2009; Nettleton et al., 2005; Seale, 2005). Additional research pointed towards media convergence and concentration with established institutions and commercial concerns foregrounded in search engine results (and sponsored ads) at the expense of counter-cultural or more critical voices (Eklöf and Mager, 2013; Mager, 2012a; Nettleton et al., 2005; Rogers, 2004).
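The hierarchy effect described above can be illustrated with a minimal sketch of link-based ranking. This is a simplified power-iteration version of PageRank, not Google's production system. The damping factor of 0.85, the function name, and the four-page toy web (a portal, a shop, a news site, and a self-help group page) are assumptions made purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping=0.85 is a commonly cited value, assumed here for illustration.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: the well-connected commercial sites link to each other,
# while nobody links to the self-help group page.
toy_web = {
    "portal": ["shop"],
    "shop": ["portal"],
    "news": ["portal", "shop"],
    "selfhelp": ["portal"],
}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy example the two mutually linked commercial pages end up with the highest scores, while the self-help page, which receives no inbound links, stays at the baseline. This is the dynamic the early critique warned about.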
This initial search results critique developed into a more fundamental criticism of gender and race bias in algorithmic systems. The more dominant Google became, and the more websites, data and images it ingested, the greater the biases grew over time. In her popular book, Algorithms of Oppression, Noble (2018) collected devastating examples showing that search terms like “black girls” or “gorillas” produced discriminatory results ranging from massive porn content to images of African Americans tagged as apes owing to data bias, corporate dynamics, and ill-trained image recognition software. While Google quickly patched these search associations, the structural bias and discrimination search engines and other recommender systems produce have still not been resolved. On the contrary, with the integration of more and more data-driven algorithms, analytics, and artificial intelligence (AI) in both commercial and public domains, algorithmic bias and asymmetries continue to lead to inequalities and social disadvantages (Allhutter et al., 2020; Benjamin, 2019; Eubanks, 2018). Following this line of research, three contributions to the present special issue explicitly focus on search engine bias and discrimination in the context of extreme-right dynamics of exclusion (Norocel and Lewandowski, 2023), the ethical dimensions of Google Autocomplete (Graham, 2023), and Google's balancing of suggesting and moderating offensive content (Rogers, 2023).
The empirical bias studies are forms of algorithmic auditing, which themselves draw on social scientific “auditing” traditions of uncovering particularly racial discrimination in housing and loan applications. The “fair housing audit,” for example, seeks to identify “systematic differential treatment” of the same kind of housing applicants, save their race (Galster, 1990: 165). When applied to search engines (and social media feeds) the techniques, used by researchers and journalists alike, compare the results of ostensibly the same queries, albeit with switched gender, race, or other intersectional markers (Collins, 2019). The findings often point to either the perpetuation of particular stereotypes and biases or their outright blockage (Leidinger and Rogers, 2023).
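The paired-query logic of such audits can be sketched as follows. This is a hedged illustration rather than the method of any study cited here: the query template, the markers, the canned result lists, and the function names (`paired_audit`, `fake_fetch`) are all hypothetical, and the overlap measure (Jaccard similarity) is only one of several ways to compare result sets. A real audit would replace `fake_fetch` with its own documented collection procedure.

```python
def jaccard(first, second):
    """Share of items that two result lists have in common (1.0 = identical sets)."""
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def paired_audit(template, markers, fetch_results):
    """Run the same query template once per marker and compare results pairwise.

    fetch_results is any callable that takes a query string and returns a list
    of results or suggestions; it is supplied by the auditor.
    """
    results = {marker: fetch_results(template.format(marker=marker)) for marker in markers}
    report = {}
    for index, first in enumerate(markers):
        for second in markers[index + 1:]:
            report[(first, second)] = jaccard(results[first], results[second])
    return report


# Canned stand-in data (hypothetical), so that the sketch runs without network access.
CANNED = {
    "engineer woman": ["result-a", "result-b", "result-c"],
    "engineer man": ["result-b", "result-c", "result-d"],
}


def fake_fetch(query):
    return CANNED.get(query, [])


report = paired_audit("engineer {marker}", ["woman", "man"], fake_fetch)
print(report)
```

A low overlap score flags a query pair for closer qualitative reading. The score alone does not establish that the differing results are stereotyped or discriminatory.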
A second strand of search engine critique that emerged in the 2000s focused on Google's revenue model based on consumer profiling. Van Couvering's (2008) work was among the early scholarship discussing the commercialization of search engines, tracing Google's history from its early roots in academic research at Stanford University towards the introduction of its AdWords and AdSense advertising platforms. That lineage has been discussed in terms of “informational capitalism” (Fuchs, 2010; 2011), “cognitive capitalism” (Pasquinelli, 2009) as well as “surveillance capitalism” (Zuboff, 2015; 2019). At the heart of this critique is the “service-for-profile” business model (Elmer, 2004), where users receive services for free, while paying with their data. User data are translated into user profiles and sold to advertising clients. As such, consumer profiling has been described as “an ongoing distribution and cataloging of information about desires, habits, and location of individuals and groups” (Elmer, 2004: 9). Based on users’ search histories, locations, and search terms, search engines started to develop detailed user profiles, capturing desires and intentions of individuals and groups of users. Google's multitude of services in combination with Android, its mobile phone operating system, provided “data points” for the creation of these profiles. The level of granularity of user profiling for online advertising platforms was revealed after data activists and journalists unearthed a file on the website of Microsoft's ad platform, Xandr (Keegan and Eastwood, 2023). It contains 650,000 “audience segments,” capturing and combining categories ranging from health conditions and religious preferences to mental states.
Intrusive practices of user profiling have been conceptualized in the field of surveillance studies for some time now (Christl and Spiekermann, 2016; Lyon, 1994; 2003; 2007). While Elmer (2004) discussed search engines such as Google as a “Panopticon” enabling user surveillance and shaping user behavior, Pasquinelli argued that the metaphor should be turned around: “Google is not simply an apparatus of dataveillance from above, but an apparatus of value production from below” (Pasquinelli, 2009: 153). Following a Marxist tradition, Pasquinelli (2009) argued that Google's PageRank algorithm would exploit the collective intelligence of the web since each link Google uses to measure a website's value would represent a concretion of intelligence to create surplus value. In a similar way, Fuchs (2011) elaborated how Google exploits not only website providers’ content, but also users’ practices and data. He labeled Google as the “ultimate economic surveillance machine and the ultimate user-exploitation machine” (Fuchs, 2011: 44; see also Mager, 2012b).
More recently, big tech's means and mechanisms to turn user attention into “assets” through the measurement, governance, and valuation of digital traces and user engagement have been criticized in the tradition of audience commodification by media corporations (Fuchs, 2012; Smythe, 1977), also referred to as the creation of an “attention economy” (Birch et al., 2021; Pederson et al., 2021). Accordingly, not only the accumulated data, but especially the large-scale measurements and metrics conducted by platforms like Google, and their advertising networks, enable the commodification or “assetization” of audiences and user engagement (Birch et al., 2021). Reflecting these concerns in this special issue, the commercial dynamics of Google are traced back to Brin's and Page's first description of their PageRank algorithm (Ridgway, 2023) and embedded in the political economy of “technical systems” (Rieder, 2022). Moreover, the study of Google audiences, particularly the means by which the engine directs attention, contributes not only to what is visible and amplified, but also to ignorance (Haider and Rödl, 2023).
Tracing European interventions
Starting from Google's commercial dynamics, Mager (2012b) showed that “the new spirit of capitalism” (Boltanski and Chiapello, 2007) becomes embedded in search algorithms by way of social practices. Both website providers and users should be seen not merely as passively exploited by Google and other big tech companies, but also as actively contributing to Google's “capital accumulation cycle” (Fuchs, 2011) with their own sociotechnical practices. They also co-produce its “algorithmic ideology” (Mager, 2012b; 2014a). Shifting the perspective from the political economy of search engines towards power relations in the making and stabilization of corporate search engines like Google enables us to start thinking about “social or political interventions that pave the way towards change” (Mager, 2012b: 783).
In the European context, some of the earliest discussions of alternatives were mobilized by critical librarians (Jeanneney, 2008). They also coincided with the “Googlization” critique, or the view that Google's “free” model would take over not only industry after industry but also cultural institutions such as the library (Vaidhyanathan, 2006). Buoyed by the framing of Google's hegemony as a European issue, quite a number of political and legal interventions have taken shape since then, especially after the US National Security Agency (NSA) leaks by Edward Snowden. In June 2013, Snowden revealed practices of mass surveillance conducted by American and British intelligence agencies. He accused big tech companies such as Google, Facebook, Apple, and others of collaborating with the NSA, which led to heated media debates (Mager, 2014a). With large-scale online surveillance and privacy violations pushed into the limelight, the European Union (EU) has tried to exert varying measures of control over big technology companies from the USA, and increasingly from China. The entanglements between corporate surveillance and state control in particular shaped policy debates and legislative acts within the EU. The rising salience of privacy issues led the EU to fend off lobbying attempts by big tech companies and helped privacy advocates to incorporate their interests into the General Data Protection Regulation (GDPR) (Regulation (EU), 2016/679). The Snowden revelations “saved” the GDPR, as Rossi (2018) straightforwardly concluded. Even though critique of the GDPR has emerged over the past few years, especially of its narrow, individualistic concept of personal data and its strong reliance on the “notice-and-consent” model (Mayer-Schönberger and Padova, 2016; Marelli et al., 2020; Prainsack, 2020), it is nonetheless considered a milestone in the EU's attempt to regulate big tech and their intrusive business and data practices.
In the aftermath of the NSA leaks, a number of significant court rulings and legislative acts have been passed in the EU. The European Court of Justice (ECJ), most notably, made crucial interventions. The first important ruling was the “right to be forgotten” case, which the ECJ decided against Google in 2014. Based on the former European data protection directive, the ECJ forced Google to delete illegal or inappropriate information about a person from the Google index if the person concerned requests it (at least from its European databases). This judgment has been described as remarkable, since it successfully applied European data protection legislation to a US technology company for the first time. This right to erasure was later integrated into the GDPR.
In 2015, Google was faced with its first antitrust actions when the European Commission accused the company of cheating competitors by privileging its own shopping service in its search results (Lewandowski et al., 2018). Two other cases have resulted in formal charges against the company for privileging the Android operating system as well as Google AdSense. These three cases resulted in total fines of 8.25 billion euros (Chee, 2022).
More recently, the EU has adopted a number of legislative acts aimed at containing and controlling big tech companies including the Digital Services Act (DSA) (Regulation (EU), 2022/2065), the Digital Markets Act (DMA) (Regulation (EU), 2022/1925), and the European Data Governance Act (Regulation (EU), 2022/868). A fourth, the Artificial Intelligence Act, is still under negotiation. Within these legislative efforts, the amount of rhetoric concerning the preservation of “European values” has increased and been expanded from an earlier, narrow focus on privacy issues towards such notions as digital sovereignty, transparency, fairness, and sustainability—sometimes configured under the more generic phrasing of ethics. While formal EU policy tries to promote its “human-centric approach” towards digitalization, a growing body of research has shown how supposedly European values appear to be fragile, contested, and contradictory when examined with greater scrutiny. Research on digital innovations and most importantly AI has pointed to competing imaginaries regarding the EU's normative goals of promoting European values and the EU's economic interests, which are reflected, for instance, in the long-standing notion of the Digital Single Market (Ulnicane, 2021).
While there is research focusing on the EU's digital innovation policies and the sociotechnical imaginaries surrounding them (Bareis and Katzenbach, 2022; Krarup and Horst, 2023; Mager, 2017; Mager and Katzenbach, 2021; Ulnicane, 2021), relatively little is known about technology projects and infrastructures developed in Europe. Such a paucity should be addressed, given the burgeoning interest over the past few years in building European digital technologies and platforms to address the dominant American and Chinese “platform ecosystems” and their infrastructural power (Rieder, 2022; Van Dijck, 2021a; 2021b). In fact, there are a number of European technology projects in the pipeline, often below the radar of public attention and overshadowed by Silicon Valley rhetoric. Besides big flagship initiatives such as the European Human Brain Project (Mahfoud, 2021) or the European cloud infrastructure Gaia-X (Baur, 2023), a series of digital tech projects aim at social change rather than market dominance. In the area of search, there is a multiplicity of search engines, applications, and initiatives that seek to provide an alternative to hegemonic search engines like Google.
The German/French project Quaero was one of the first search engine projects that aimed at creating a European alternative to big US-based search engines (Lewandowski, 2014). In 2005, the project was announced by Jacques Chirac as an attempt to “rival” Google and Yahoo, given the “threat of Anglo-Saxon cultural imperialism” (Litterick, 2005). Even though the project did not succeed in building a competitive European search engine, the idea of creating an alternative search infrastructure in Europe endured. In 2014, Lewandowski suggested creating an open, publicly funded web index, ideally as a “pan-European initiative” (Lewandowski, 2014). Moreover, search engines with a social cause have been created that piggyback on well-established search indexes and search results such as the “green” search engine Ecosia, meta-search engines of various sorts, or privacy-friendly search engines (Mager, 2014b). This aspect is explored further in the present special issue, in particular the manner in which European search engine providers counter-imagine and counteract hegemonic search through alternative search engine projects (Mager, 2023).
Contemporary Google studies: Special issue contributions
This special issue collects five original research articles (Haider and Rödl, 2023; Mager, 2023; Norocel and Lewandowski, 2023; Ridgway, 2023; Rogers, 2023) and two invited commentaries (Graham, 2023; Rieder, 2022), all of which are devoted to the state of Google critique and intervention by engaging critically with its study as well as prospecting for alternatives. Three contributions focus on search engine bias and discrimination in the context of right-wing extremism (Norocel and Lewandowski, 2023), Google Autocomplete (Graham, 2023), and content moderation (Rogers, 2023). Another three tackle the commercial dynamics of Google, tracing its roots to the first description by Brin and Page (1998) of their PageRank algorithm (Ridgway, 2023), embedding it in the political economy of “technical systems” (Rieder, 2022) and relating it to how search engines contribute to the creation of ignorance (Haider and Rödl, 2023). Finally, the last contribution analyzes how to step beyond big tech and create alternative search engines and infrastructures in particular in the European context (Mager, 2023).
In the first piece, Ov Cristian Norocel and Dirk Lewandowski (2023) develop a critical big data perspective to explore the manner in which search engine users may be directed towards extreme-right content, despite Google's proclaimed quality control and content moderation. Norocel and Lewandowski gauge the tentative contours of data voids whereby Google queries return skewed and manipulative content, which reflect extreme-right dynamics of exclusion in the aftermath of the 2015 humanitarian crisis in Europe (Hellström et al., 2020; Norocel, 2017). They add complexity to existing analyses of data voids by expanding the framework of investigation outside of the US context by concentrating on Germany and Sweden. Building on previous big data analytics addressing the politics of exclusion, Norocel and Lewandowski develop a catalog of queries concerning the issue of migration in both Germany and Sweden on a continuum from mainstream to extreme-right vocabularies. This catalog of queries enables specific and localized queries to identify data voids. Examining critically the results of these queries, Norocel and Lewandowski argue that Google's reliance on source popularity may lead to extreme-right sources appearing in top positions. Furthermore, using platforms for user-generated content provides a way for these localized websites to gain top positions.
In their research commentary, Rosie Graham (2023) approaches the ethical dimensions of Google Autocomplete, highlighting some of the key ethical issues raised by Google's automated suggestion tool that provides potential queries below a user's search box. Much of the discourse surrounding Google's suggestions, or “predictions,” has been framed through legal cases in which complex issues can become distilled into black-and-white questions of the law. Graham argues that a primary focus on the legal aspect obscures many other moral dimensions raised by Google Autocomplete. Building on existing typologies, Graham first outlines the legal discourse, before exploring five additional ethical challenges, each framed around a particular moral question in which all users have a stake.
In the third contribution, Richard Rogers (2023) deploys “algorithmic probing” as a means to investigate the manner in which Google balances prompting and moderating offensive results. The contribution begins with the observation that Google results were initially examined for what they privilege (in terms of the surface web, the optimized and personalized pages, and/or their own properties), but more recent scholarly efforts have concentrated on scrutinizing another topic, namely the recurrence of offensive results. Adopting “algorithmic probing,” Rogers revisits a selection of offensive and other problematic results, which had initially been identified by either journalists or other researchers. He re-runs the original queries to study the potential moderation of results in Google Web and Image Search, but mainly in Google Autocomplete. The purpose of the study is to examine the extent of moderation pertaining either to a different kind of privileging—Google's hierarchy of concerns—or specific categories or languages. Rogers finds that Google appears to heavily moderate issues of religion, ethnicities, and sexualities (though in a selective manner), whilst issues of stereotypical depictions of gendered professions and ageism are left largely untouched. Concerning languages, content in English is moderated to a greater degree in comparison to Southern European and Balkan languages. In conclusion, Rogers discusses the stakes of Google's moderation, especially with regard to its uneven coverage.
In the group of articles concerning the commercial side, Renée Ridgway (2023) examines the deleterious consequences of the manner in which Google's original sociotechnical affordances have shaped the “trusted user” by means of ubiquitous googling and smart algorithms in surveillance capitalism. Departing from the fact that Google dominates over 90% of the search market worldwide (as of late 2022), Ridgway argues that its hegemonic position in search is hardly accidental, arbitrary, or (un)intentional. She revisits Brin and Page's original paper (1998), drawing on six of their key innovations, concerns, and design choices (namely counting citations or backlinks, trusted user, advertising, personalization, usage data, and smart algorithms), in order to examine how Google's hypertext search engine technologies evolved by means of “moments of contingency,” which then led to corporate lock-ins. Building on earlier research (Zuboff, 2015), Ridgway describes the manner in which Google as an infrastructure is intertwined with big data's platformization and the ad infinitum collection of usage data, beyond personalization alone. This extraction and refinement of usage data as “behavioral surplus,” she argues, results in “deleterious consequences”: a “habit of automaticity,” which shapes the trusted user through “ubiquitous googling” and smart algorithms, whilst simultaneously generating prediction products for surveillance capitalism. As such, Ridgway contributes a new taxonomy of Google sociotechnical affordances to critical science and technology studies, media history, and web search literature.
Bernhard Rieder (2022) in his research commentary proposes a conceptual framework to enable the study of big tech companies like Google as “technical systems,” which organize their operation around the mastery and operationalization of key technologies that facilitate and drive their continuous expansion. Using Google as an example, Rieder shows how to interrogate software and hardware through the lens of transversal applicability, discussing software and hardware integration. He proposes the notion of “data amalgams” to contextualize and complicate the notion of data. The goal of his commentary is to complement existing vectors of “big tech” critique with a perspective sensitive to the materialities of specific technologies and their possible consequences.
In turn, Jutta Haider and Malte Rödl (2023) analyze the relationship between Google and different kinds of ignorance related to climate change. Haider and Rödl build their study on concepts from the field of agnotology to examine the manner in which environmental ignorances, in particular those related to the climate crisis, are shaped at the intersection of the logics of Google Search, everyday life and civil society/politics. They pursue their argument by means of four vignettes, each of which explores and illustrates how Google Search is configured into a different kind of socially produced ignorance: (1) ignorance through information avoidance: climate anxiety; (2) ignorance through selective choice: gaming search terms; (3) ignorance by design: algorithmically embodied emissions; and (4) ignorance through query suggestions: directing people to data voids. As such, Haider and Rödl highlight that while Google Search and its underlying algorithmic and commercial logic pre-figure these ignorances, they are also co-created and co-maintained by content producers, users, and other human and non-human actors, as Google Search has become integral to social practices and ideas about them. They conclude by drawing attention to a new logic of ignorance that is emerging in conjunction with a new knowledge logic.
Last but not least, Astrid Mager (2023) zooms in on the European context and studies how European search engine projects have attempted to counter-imagine and counteract Google's hegemonic position. Mager examines how developers of alternative search engines to Google have construed counter-imaginaries of search engines centered around social values, thereby competing with the corporate imaginaries centered on mere profit maximization. By means of three in-depth case studies of European search engines, Mager evinces how search engine developers build out these counter-imaginaries, which social values underpin them, and how they are intertwined with the developers’ sociotechnical practices. She shows how such notions as privacy, independence, and openness, by being treated as context-dependent and changing over time, lead to specific “value pragmatics,” which enable the projects to scale beyond their own communities of practice. Furthermore, she unveils how broader notions of Europe as “unified and pluralistic” are constructed and co-produced by and through the developers’ attempts to counter corporate imaginaries about search. In conclusion, Mager suggests three points of intervention to enable alternative search engine projects, and discusses how “European values,” in all their richness and diversity, may contribute to such an effort.
The way ahead? Concluding remarks
AI has entered the conversation about the future of search as well as the future of alternatives to Google, albeit divorced from the discussion above on alternatives following a social cause, at least to date. One example is Google's Bard, a generative AI text system which advertises itself as being helpful in “explaining to your kids why the sky is blue,” together with “helping with lines of source code” and “drafting an email,” and then importing it into Gmail (Google, 2023). Google's Bard is yet another example of interspersal product development in the Google infrastructure, touched upon above, and a possible way ahead for how Google envisages the integration of generative AI as sets of suggestions that stand alone as an answer machine, but can also be linked to other products.
By accumulating infrastructural power in using AI to couple products, Google invites AI engine critique somewhat differently from earlier search engine criticism that concentrated more on socio-epistemological concerns such as how the algorithms marginalize some sources and promote others. Such critique may focus on the way value is created from infrastructural complexity and how the integration of large language models into search products is used to extend market power by Google, but also by Microsoft as it hopes to catch up with its “AI-powered” search engine Bing and its Edge browser framed as “an AI copilot for the web” on the official Microsoft blog (2023).
The new “bundlings” of products and services pose further infrastructure governance and competition challenges. As AI search product development presses on, much of the attention and innovation is in the regulatory and legislative arena, especially in Europe, as with the AI Act discussed above. Emerging themes in the realm of platform governance are interdisciplinary oversight bodies as well as platform or social media councils. These independent monitoring bodies, unlike Facebook's oversight board, would not only be situated in the legal and ethical realm. They would also represent the interests of the users and the public interest more broadly (Efferenn, 2023). Calls for big tech and social media observatories are further evidence of the broadening of the scope of platform governance considerations. For example, Rieder and Hofmann's “European Platform Observatory” would be “driven by a public interest mandate” (2020).
These initiatives and others put Google critique and intervention into practice, building on over two decades of work studying how the search engine privileges certain voices and marginalizes others, introduces and reifies bias, extracts data and sells profiles in exchange for its free services, and creates surplus value from the collective work that is the web, as mentioned above. There have been inventories of the critique (Rogers, 2018), but few of the alternatives. Given the opportunity to study and learn from alternatives, from Europe and beyond, we could not just anticipate the AI search products ahead but provide frameworks and imaginations for critical intervention.
Acknowledgments
We would like to thank the Pufendorf Institute for Advanced Studies at Lund University for sponsoring “In Search of Search and Its Engines” (initiated by Olof Sundin and Alison Gerber) that brought together the co-editors and other contributors to this special issue. We would also like to thank the Institute of Technology Assessment, Austrian Academy of Sciences, for helping us to organize the writing workshop in Vienna (April 2022), which was supported by the Austrian Science Fund (FWF). Finally, we would like to extend our gratitude to all the workshop participants for their generous and helpful comments and feedback on earlier versions of these articles and commentaries.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work for this special issue was supported by the Austrian Science Fund's grant number V511-G29 (Mager), the Swedish Research Council's (Vetenskapsrådet) grant number 2019-03363 (Norocel), and the Digital Methods Winter School, Media Studies, University of Amsterdam (Rogers).
