Abstract
The paper examines some of the processes of the closely knit relationship between Google’s ideologies of neutrality and objectivity and global market dominance. Neutrality construction comprises an important element sustaining the company’s economic position and is reflected in constant updates, estimates and changes to utility and relevance of search results. Providing a purely technical solution to these issues proves to be increasingly difficult without a human hand in steering algorithmic solutions. Search relevance fluctuates and shifts through continuous tinkering and tweaking of the search algorithm. The company also uses third parties to hire human raters for performing quality assessments of algorithmic updates and adaptations in linguistically and culturally diverse global markets. The adaptation process contradicts the technical foundations of the company and calculations based on the initial Page Rank algorithm. Annual market reports, Google’s Search Quality Rating Guidelines, and reports from media specialising in search engine optimisation business are analysed. The Search Quality Rating Guidelines document provides a rare glimpse into the internal architecture of search algorithms and the notions of utility and relevance which are presented and structured as neutral and objective. Intertwined layers of ideology, hidden labour of human raters, advertising revenues, market dominance and control are discussed throughout the paper.
Introduction
Traditionally, media companies had to find a way to commodify cultural and information goods since these are limitless and their use value cannot be destroyed or consumed by use (Garnham, 1986). For example, advertisements are strategically placed during specific broadcasting times in order to reach the widest possible audience. Critical theorists argued that it was not information goods that were sold, but audience and its labour (Jhally and Livant, 1986; Smythe, 1981). The audience labour perspective has recently been taken up by many scholars and expanded to take into account social media platforms (e.g., Fisher, 2015; Fuchs, 2010; Fuchs and Sevignani, 2013; Mosco, 2011). The main argument is that these platforms, apart from providing space to communicate and collaborate, take advantage of the time the users spend in order to monitor their behaviour and extract value and profit from their online data traces. A vital part in this value chain is the processing and packaging of data to the primary clients of digital platforms: advertisers and marketers (Comor, 2015: 17).
Towards this end, Google commodifies users’ search queries and search results by selling consumers’ keywords to advertisers as insights into consumer interests (Turow, 2011). The process of information circulation and marketisation involves three distinct stages. First, internet users query the search engine to find information. Second, Google creates and maintains indexes of content providers that want users to reach them. Third, advertisers try to attract visitors beyond the traffic received from so-called organic results 1 and pay for placed ads. Information search and content indexing are done for free while the advertisers pay for every click and thereby finance the platform (Rieder and Sire, 2014). Because Google vertically integrates the search engine, advertising agency and the rating system (Lee, 2011), advertisers have no need for intermediary organisations that specialise in advertising, or market research and rating. Access to, and interpretation of, this information is only possible from within the company, which means that advertisers ‘have to learn the Google way of interpreting information’ (Lee, 2011: 445). In essence, Google transforms words into commodities and sells a limitless resource since infinite combinations of keywords and search queries are always possible. The accumulation of this data is to a large extent dependent on user activities and search, which is a form of unpaid labour contributing to direct accumulation of profit for the company (Fuchs, 2010). With the growth of internet infrastructure, Google expands to various markets where different linguistic combinations of keywords, search queries, user intentions, users and labour are available. New markets bring new advertisers and new profit margins.
Having its search engine succumb to irrelevance and spam is one of the chief concerns and business risks for the company: [w]eb spam and content farms could decrease our search quality, which could damage our reputation and deter our current and potential users from using our products and services (…) If our search results display an increasing number of web spam and content farms, this could hurt our reputation for delivering relevant information or reduce user traffic to our websites.
This paper draws on the social contextualisation and social arrangements of machines and algorithms (Gillespie, 2014; MacKenzie, 1984, 2014) and, more broadly, theoretical approaches from the critical political economy of communication and digital labour (Comor, 2015; Fisher, 2015; Fuchs, 2010, 2015; Fuchs and Sevignani, 2013; Garnham, 1986; Mosco, 2009, 2011; Robinson, 2015). The social, legal, political and economic dimensions of search engines are well documented and widely studied (Granka, 2010; Grimmelmann, 2009; Hargittai, 2007; Hazan, 2013; Introna and Nissenbaum, 2000; Pasquinelli, 2009; Van Couvering, 2007). Here the focus is on the technical decisions and choices for tweaking the search algorithm in a broader context of socio-economic considerations such as ideological layering, labour, advertising revenues, market dominance and control. The paper examines some of the publicly visible processes behind algorithmic changes and Google’s continuous positioning to maintain market dominance. It attempts to untangle the mechanisms that connect the culture of in-company engineers and computer scientists with hidden and invisible human raters and search engine marketers. Human raters are hired by Google via third parties to perform search quality rating in accordance with a provided set of guidelines. As such, they present a compelling case of the global division of labour.
The main data in this study consists of market reports, public documents from Google and reports by online media specialising in the search engine optimisation (SEO) business and online marketing. 4 The approach requires flexible and inductive reasoning to determine ‘contingencies’ (MacKenzie, 1984) behind technical decisions, as well as to see how, and where, they interweave with profit motives and discourses legitimising these choices. Google started as a company which provided a pragmatic, technical solution to an untraceable clutter of websites in the 1990s. Page Rank simply won out against its competition as a useful solution to web search in the developing online information system. However, with the evolution of the web in the past two decades, the notion of relevance for internet users and online advertisers became increasingly complex and moved away from simple calculations based on the number of incoming links. Maintaining the dominant position that Google built has become increasingly hard, and the company currently uses more than 200 signals besides Page Rank to determine search relevance, a point that it regularly stresses. 5 In order to maintain the relevance of its search results, Google needs to promote an ideology of a neutral and objective search engine based on technical innovations if it wants to keep the position of a dominant solution to web search.
The first section of the paper reviews the fluid relation between algorithms and users and the ways the inner workings of search are presented, constructed and described by Google to create a specific image of the company. The second section focuses on the importance of culture by examining the ideologies of Google, the quality rating process and the SEO industry. Neutrality and objectivity are complex ideologies which consist of elements such as calculated relevance and utility of search results, both drawing on computer science and engineering discourses deeply embedded in the culture of the company. The strong emphasis on mathematical and algorithmic solutions to information search, however, is contradicted by the use of human quality raters and their contextualised cultural knowledge and labour. It displays the limits of technical solutions for full interpretation of human intentions and information needs associated with the shifting notions of relevance and utility promoted by Google.
The search engine
The technological artefact, the search engine, is the point of interaction between Google, advertisers and internet users. The relationship is structured in, and through, technical decisions and political and economic structures that are hidden from sight. One of the reasons for Google's economic success was that it provided a solution for the lack of easy web navigation in the 1990s. Google’s technical autonomy was predicated on the idea that search should be as useful as possible, which would benefit the users and encourage them to search more often (Hillis et al., 2013: 36). The Page Rank algorithm calculated webpage relevance in terms of links it received from other web pages. It mimicked the citation count in academia where more citations mean more relevance. The search engine crawls publicly available web pages, gathers the pages during the process and then creates an index for retrieving these pages. The algorithms search the index to find a calculated estimate of what the users will find relevant. Google creates a facade of user control through utility improvements and streamlining of the visual design and technical characteristics of the search engine into an empty space that users fill in with search terms. Whatever information users want, Google strives to be an objective courier for that information regardless of the fact that algorithmic analysis is in itself a form of pre-selection, ordering and bias based on calculations of what is to be presented as a thing of interest, and what is to be discarded and/or annulled (Amoore and Piotukh, 2015). Over time, Google structures user habits and creates a sense of utility in everyday life that goes unquestioned. Bad usage, multiple search as well as trial-and-error search add labour time to the usage of the engine. 
As Gillespie (2014: 187) states: ‘Google’s solution is operationalised into a tool that billions of people use every day, most of whom experience it as something that simply, and unproblematically, works’.
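The link-based calculation of relevance described above can be illustrated with a minimal sketch. It is not Google's implementation: the function name `page_rank`, the four-page toy web and the damping value of 0.85 are assumptions introduced here for illustration only. The sketch repeatedly passes each page's score along its outgoing links, so that a page linked to by many others accumulates a higher score, much as a frequently cited article accumulates citations.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Iteratively score pages by the links they receive.

    `links` maps each page to the list of pages it links to.
    The damping value 0.85 is an illustrative assumption.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: pages a, b and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = page_rank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page c receives the most links and ends up with the highest score, while b and d, which receive none, share the lowest. Even this simple calculation embeds a decision about what counts as relevant, which is the point made above about pre-selection and ordering.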
The estimate of what users want is based on algorithmic calculations including Page Rank and other signals such as terms on websites, geographic location, previous search history, quality of content on websites, recommendations from users' social networks, etc. In the words of the company: ‘[o]ur goal is to get you the answer you’re looking for faster, creating a nearly seamless connection between you and the knowledge you seek’. 6 Google continually updates the algorithms in the hope of finding a technical solution for user intentions in information search by including new tests and signals of relevance. The algorithms, and displayed search results, aim to configure the user’s character and capacity (Woolgar, 1991) as well as possible future actions in relation to the machine. This limits the possibilities and future options for search and creates a form of control: ‘[i]t “forecloses the creative mutation” of the affective potential of the search subject, instead channelling that potential into pre-ordained forms’ (Jarrett, 2014: 24). On the other hand, the search engine and the advertising industry cannot operate without active users. The amount of transaction-generated information (Gandy Jr, 2011) becomes the foundation for the economic success of Google.
While engineers and computer scientists work on updates, changes and tests of the algorithm, data is also continuously produced for free by search engine users (Fuchs, 2010). The digital labour sustaining the company is, therefore, highly diversified and includes various types of paid and unpaid labour. The paid labour of in-company engineers is certainly an important part of the culture of Google and a strong promoter of the discourse of objectivity and neutrality. The less celebrated is the unpaid labour by search engine users, while the least transparent type is the hidden labour of human quality raters. The complex co-construction between the algorithms and users and between the utility of the platform and profit generation is an intricate balancing act where both technological and social affordances (Postigo, 2016) play an important role. In other words, technical and economic considerations go hand in hand. However, what keeps this act afloat is a convoluted set of values that permeate the internal culture of the company and the public image it works hard to sustain. The importance of culture for patching up this unstable relationship is the focus of the following section.
The cultural engine
The role of culture for Google’s economy is complex and multi-layered. First, a set of cultural and ideological values are important for the internal company culture and bolstering of technical neutrality and objectivity. Second, quality assessment culture is a key aspect of Google’s global expansion into new cultural and linguistic areas and markets. Through assessment, Google search taps into lived cultural meanings of its users and their local culture to provide calculated relevance in accordance with the algorithm and the needs of local advertisers. A third integral (albeit opposing) part of the cultural engine of Google is the industry of SEO. Technological tweaks to the algorithm create opportunities for search engine marketers who attempt to decipher these changes in the hope of providing up-to-date information to their clients and maximising their own profit.
Company culture
The company culture is permeated by contradicting ideologies of cultural liberation and economic profit within internal work relations and external public relations. Internally, the company promotes a type of ‘network work’ (Fisher, 2010) which is argued to be more liberating and allows for more personal expression and freedom, creativity, play and joy. It creates a ‘cultural infrastructure’ (Turner, 2009) for emerging forms of media production alongside its extremely profitable business. As Hillis et al. (2013: 46) state, Google occupies a hybrid position in which it generates mass audiences and huge profit and also maintains its association with non-economic imperatives such as refusing to mix paid and unpaid advertising. The inclusion of non-economic values helps build consumer trust and legitimacy and allows for the accumulation of economic and cultural capital.
Google is not promoting any political or socio-cultural, exclusionist ideology in a traditional sense. However, it manages to harness and steer the creative capacities of its workers to maximise profits beyond standard working hours and limits: ‘the argument that Google is changing the world and changing it for the better encourages employees to align their sense of personal mission with that of the company’ (Turner, 2009: 80). This constructed benevolence is perhaps best summarised in the official motto of the company: ‘Don't be evil’. It encapsulates the craft of opposing corporate culture while simultaneously retaining the competitive spirit and drive for profit accumulation. This openness to personal expression and creativity collides with the promotion of the technical discourse and the fixed meanings ascribed to search results. Thus, Google advocates algorithmic ideology (Mager, 2012), technological neutrality and objectivity while it simultaneously promotes both the creative capacities and interpretative resilience of its workers. The fact that there are continuous changes to the algorithm and the assessment of its results is an apt example of the basic contradiction embedded in the discourse of neutrality surrounding its search engine.
Quality assessment culture
Countering the algorithmic ideology is Google’s use of human quality raters for fighting spam from as early on as 2004. 7 Various versions of the Search Quality Rating Guidelines (SQRGs) document have reportedly been ‘leaked’ 8 with new technology pundits and online marketing experts immediately commenting and providing advice to clients seeking top search results. 9 The documents vary in size and scope 10 although they remain the same in declared purpose, which is to provide quality URL results for search queries. The so-called Quality Raters perform specific types of tasks such as rating the quality, utility and relevance of search results based on provided criteria. The examined SQRG version in this paper is the ‘official’ one available on the Google ‘Inside Search’ web page. 11 Apart from the fact that other versions are proprietary and confidential, the official version offers an insight into the curation activities and construction of users by the company. The document is presented as technical and neutral in structuring work relations, algorithmic relevance and information search on the web. It directs the work process of the human raters and embeds the results into algorithmic updates and changes. However, the process is not straightforward as any understanding of a text, including that of guidelines and manuals, is in itself a culturally ambiguous and hermeneutic process of interpretation (Ricoeur, 1971).
The advertising model embedded in the Google search engine cannot be profitable without a large number of users, attracted by the use value of the free services for searching the web and advertisers seeking to realise surplus value and users (Robinson, 2015: 46). In order to sustain this model, it needs to construct search importance and connect users’ intentions with displayed estimates of utility. Because relevance is a highly contextually dependent concept – it can be interpreted differently based on the social situation, previous experience, values, norms and interests of individuals and social groups – Google spends considerable time constructing this metric. In order to make results contextually relevant and important for the local communities and users, some human intervention is necessary. However, this type of work management is ‘handled as a computational problem’ (Irani, 2015) in which work relations are performed under clandestine conditions. To put it differently: ‘… in machinery, the capital attempts to achieve by technological means what in manufacture it attempted to achieve by social organisation alone’ (MacKenzie, 1984: 487). Labour power is cheapened and is available globally with the help of information and communication technologies and micro-management of the work process into modular assignments and search quality assessments.
The scope and details of quality rating assessment are difficult to determine precisely. Unlike other types of paid and unpaid digital labour (Fuchs, 2010; Fuchs and Sevignani, 2013; Postigo, 2016; Scholz, 2013) which create direct economic benefit for platform owners (e.g., Amazon’s Mechanical Turk, YouTube, etc.), the algorithm is an object that obfuscates value creation and work relations between the company and human raters. The work of the quality raters is hidden behind the user interface and technical design. In other words, while it has already been described how the use of the platform creates value for social media companies, it is not fully clear how the paid work of human raters impacts the Google search algorithm and, more importantly, whether it affects page rank. However, previously mentioned documents provide a rare glimpse into the internal architecture of search algorithms and the techniques of careful engineering of public discourse about Google. Close examination of these manuals provides the means to uncover where ideology and discourse intersect with direct economic benefit and how they interact around technical decisions and choices.
The 2012 SQRG document is divided into three parts: Rating Guidelines, URL Rating Tasks with Query Location 12 and Webspam Guidelines. 13 The Rating Guidelines section outlines the search quality rating program, ways of understanding the query, the language of the landing page, the rating scale, the process of rating and flags for specific types of websites. In terms of the intended purpose of the document and the overall relation between the algorithm and the users promoted by the company, this is the most important section. User intent is defined as a type of query by which a user tries to accomplish something such as finding information or purchasing an item online. The intent will be highly dependent on the language and location of the user performing the query. Hence the utility, which is defined as a measure of how helpful the page is for user intent, becomes the most important aspect of search engine quality. Needless to say, utility is the most important part of the business model of Google, since a lack of connection between queries and search results would lead to the breakdown of the advertising model based on paid search advertisements. An important aspect of quality ratings is that human raters are expected to ‘represent the user from their task location who read the task language’. It is, however, unclear from the document how this representation works and who the actual human raters are. The document implies that they should also live and work in the same location as the users whose queries and search results they are testing, which seemingly necessitates a set of raters as globally distributed as the users themselves.
There have been online reports 14 indicating that Google performs these types of tasks through third parties such as Lionbridge Technologies Inc, Appen and Leapforce. These companies offer part-time, work-at-home opportunities as well as translation and product localisation, speech and search technology services for various clients. According to the annual report for Lionbridge, 15 Google accounted for 11% of its revenue in 2015. The annual report does not make clear what types of services Lionbridge offers to Google. However, as reported on the Lionbridge website, among other services it offers so-called enterprise crowdsourcing in the context of ‘search relevance testing’: [w]e gather people from all over the world especially those who are bilingual. Using our innovative cloud technologies, and our worldwide crowd of more than 100,000 cloud workers, we provide integrated solutions that enable clients to successfully market, sell and support their products and services in global markets. 16
There are other nuances of user intent that are even more difficult to untangle. For example, the document specifies that many queries have multiple meanings, or ‘query interpretations’ in the jargon of the document. Three levels are presented in the document. First, ‘dominant interpretation’ is defined as ‘the interpretation that most users have in mind when they issue the query’. 17 Second, a query can have several ‘common interpretations’, none of which is dominant. 18 Third, ‘minor interpretations’ are defined as ‘interpretations that few users have in mind’. 19 Although possible interpretations are presented in an orderly and structured way, a human rater's insight into the level of understanding of an average user is highly dependent on the rater's own experience, knowledge, education, etc., and categories such as ‘most users’, ‘common interpretations’ and ‘few users’ are highly ambiguous. Furthermore, user intentions within the SQRG are reduced to the traits of a rational, information-seeking and information-consuming omnivore. Social and cultural meanings are scaled down into simplified classifications called ‘Action, Information, and Navigation or “Do-Know-Go” queries’. With ‘Action intent’ it is assumed that users want to ‘accomplish a goal or engage in an activity, such as download software, play a game online, send flowers, find entertaining videos, etc.’ These are called ‘do’ queries. Regarding ‘Information intent’, or ‘know queries’, users want to know something. Finally, the ‘go queries’ or ‘Navigation intent’ is assumed to be connected with the users wanting to navigate to a website or a web page.
Based on the estimate of user intent and the utility of the search results or ‘landing pages’, the human raters assign a rating to each URL that is returned after a specific query is entered into the search engine. The five-point scale runs from vital through useful, relevant and slightly relevant to off-topic or useless. There is an additional category of unratable for websites that do not load or are in a language other than English or the language of the human rater. Rating tasks involve clicking through a list of search results or URLs and providing a rating that best fits the specific URL in the list. The SQRG document continuously reiterates the importance of user intent and page utility. Specifically, it focuses on representing users from the rater’s task location, giving lower ratings to unlikely interpretations, being wary of a possible difference between users and human raters, and keeping in mind the importance of location. The connection between user intent and page utility is to be secured by careful consideration of the experience in the task location with the task language, common sense, web research and location-specific results. 20 The Rating Guidelines section concludes with directions on assigning flags to specific types of pages that include spam, pornography and malicious content.
The ‘Inside Search’ Google web page 21 emphasises the neutrality of the SQRG document by contextualising it into a technical multi-stage evaluation and experiment. The first stage consists of ‘precision evaluations’ which entail feedback from evaluators, or human raters, who evaluate based on the guidelines. The second stage involves ‘side-by-side experiments’ where evaluators are given two sets of search results: one from the old algorithm and one from the new. The third stage is ‘live traffic experiments’ where Google changes search results for a small percentage of real Google users to see how they interact with the results. Finally, there are ‘launches’ where search engineers review the data from the experiments and decide if the changes should be launched. For 2012, Google reported having performed 118,812 precision evaluations out of which 665 were approved for launch and included in the algorithm. 22
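The proportion implied by these figures can be worked out directly. The short calculation below takes the two numbers exactly as reported above and makes no claim about how Google itself counts evaluations or launches; the variable names are introduced here for illustration only.

```python
# Figures as reported for 2012 in the text above.
precision_evaluations = 118_812
launched_changes = 665

# Share of precision evaluations that ended in a launched change.
launch_rate = launched_changes / precision_evaluations

# Average number of precision evaluations behind each launch.
evaluations_per_launch = precision_evaluations / launched_changes

print(f"Launch rate: {launch_rate:.2%}")
print(f"Evaluations per launch: {evaluations_per_launch:.0f}")
```

Taken at face value, roughly 0.56% of precision evaluations resulted in a launched change, or about one launch for every 179 evaluations, which gives a sense of the volume of human rating work that sits behind each algorithmic update.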
The SQRG document is much less about what it says, and much more about what it does not say. The context of this publicly available document is rather unclear since it does not explicitly name its targeted audience. While it is written for human raters, this version is strategically placed on the ‘Inside Search’ Google web page. This indicates the intention of providing a curated and purified public form of a document that is used in a different format and in a different social context than the implied use during specific work activities of quality rating. In other words, the importance of the document for the quality assessment process is not entirely disclosed and elaborated. There is no mention of who the human raters or ‘evaluators’ are. It is unclear whether they work from inside the company or are outsourced and hired in different locations to perform constantly updated mechanical rating. Moreover, the described tasks are presented as rather straightforward and not requiring high technical know-how. Instead, the emphasis is upon the relevancy of understanding in a specific language-bound context. Both culture and geographic location are described as playing a vital role in understanding and estimating the intentions of the user and providing high utility search results. It is also clear that the ideology of neutrality and objectivity of Google algorithms masks the full extent of the need for human interventions in creating good search results and ultimately profit for the company. Google effectively taps into local cultural and contextual knowledge through the work of location-bound quality raters hired through various third parties but keeps this process clouded by a thick layer of technical and engineering discourse.
SEO culture
The advertising industry pays for placed ads on Google search while SEO techniques focus on website content that might lead the search algorithm to assign higher relevance to the websites and their owners in the organic search results. The scarcity of information on the concrete inner workings of the algorithm makes SEO experts watch every public move and change by Google in a continuous effort to decipher the algorithm, gain better knowledge of how to improve search results and make profits from their clients. 23 Google does not make any revenue from SEO and is locked in a continuous battle with SEO operators over the economic value of links and quality of search results.
The most recent set of algorithmic changes announced by Google involves, for example, updates with regard to mobile versions of websites and Twitter data. An algorithmic change was announced in 2015 in order to increase the relevance of websites with mobile phone versions in an attempt by Google to tap into the expanding market of mobile phones for internet search. SEO journals have dubbed the date of introduction ‘mobilegeddon’. 24 The algorithmic change reshuffles the rankings based on the existence or non-existence of mobile website versions and in accordance with other signals of relevance. 25 The same year it announced the introduction of more Twitter streams into the search index and the introduction of Twitter data as a new signal of relevance for the search algorithm. 26 Both of these changes highlight the dynamic nature of ‘relevance’ as defined by the company and the continuous struggle to maintain the dominant and monopolistic global position in the search engine market and online advertising. ‘Rankings are old-school’ states one online marketing expert giving advice on how to keep up with the latest developments in SEO techniques. 27
Constant changes to the search algorithm make the relationship between Google and SEO companies highly dynamic, antagonistic and fluid. Google monetises its unique visitors and information search. The connection between user intentions and useful search results is a moving target tangled in a web of socio-technical and economic decisions. By manoeuvring and manipulating this line of tension, Google frames global, online content production, making algorithmic changes much more than a simple technical exercise. First, they affect users and their search results, information habits and knowledge production/consumption by steering perceived intentions into paid websites and/or organic search results based on numerous signals that the user might not find relevant in the first place. Second, they affect the online marketing industry by changing relevance and rankings that consequently influence paid search advertising, SEO and market competition. Third, they influence content production for commercial websites which try to follow these changes and are paying for various types of digital marketing strategies to enhance the content on their websites and improve their visibility in an unstable and rapidly evolving digital economy.
Conclusion
Google employs powerful ideological engineering of neutrality and objectivity in order to keep the full context of its search engine hidden from everyday users. However, as was shown throughout this paper, ideology and economy intersect in various stages of the development of the algorithm. Google commodifies search queries in order to sell keywords and search results via a vertically integrated system (Lee, 2011) that maximises profit for the company. Simultaneously (and in order to keep the largest number of users on the platform), it carefully constructs the search engine interface and its visual design to form a facade of user control and to promote itself as an objective courier of online information empowering global internet users.
The construction of utility and its economic viability go hand in hand. A global division of paid, unpaid and hidden labour supports this cultural economy. Values embedded in the company, its hiring procedures and human resource management promote a flexible working environment. Culture also plays an important part for Google’s expansion into diverse language markets, as is visible from the SQRG document and the quality assessment process it describes. Search quality assessment by human raters hired through third parties adds a layer of hidden labour to Google’s cultural economy and the fluid construction and reconstruction of utility and relevance. Tracing changes of utility and relevance feeds the SEO industry and influences commercial, online content production.
The search logic presented by Google is a complex array of socio-technical decisions behind algorithmic changes, cultural values promoted by the company, and contextualised cultural values and interpretations derived from continuous quality assessments. Layers of ideology, labour and advertising revenues sustain its market dominance and control. Simultaneously, technical decisions, introduction of new signals of relevance and utility cloud full public scrutiny of these processes. Web search is much less a culture of significance which the users themselves have spun, to paraphrase Geertz (1973), and much more a culture that one of the most powerful and influential information and communication technology companies has engineered behind closed doors.
Acknowledgements
I wish to thank my colleagues from the Dynamics of Virtual Work COST action, especially Vladimir Cvijanović who provided valuable input in the early stages of this study. Special thanks go to Brian Beaton from California State Polytechnic in San Luis Obispo, USA for a close reading of the first draft and for highly informed and constructive feedback. I would also like to thank the anonymous reviewers from Big Data & Society for their valuable comments.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper was supported by the IS1202 European Cooperation in Science and Technology (COST) action Dynamics of Virtual Work.
