Abstract
Accounts of big data practices often assume that they target individuals. Personalization, with all the risks of discrimination and bias it entails, has been the critical focus in accounts of consumption, government, social media, and health. This paper argues that personalization through models using large-scale data is part of a more expansive change in probabilization that, in principle, is not reducible to individual or ‘personal’ attributes and actions. It describes the ‘personalization’ of an online grocery shopping recommender system to list a small number of grocery items of personal relevance for each of the millions of online grocery shoppers at a major UK supermarket chain. Drawing on a theory of probability proposed by the philosopher of science Karl Popper and anthropological work on shopping, it suggests that the attempt to generate personalized predictions necessarily incorporates impersonal relations to others and things. Using a mixture of discourse analysis and code-based reconstruction of key elements of the recommender system, it suggests that personalization is one facet of an open-ended weave of propensities associated with people and things in contemporary big data configurations. The paper explores how, in the context of recommender systems, the constitutive incompleteness of shopping lists, their propensity to expand or change, might be more important than their capacity to be personalized.
Introduction
Big data practice often converts an existing state of affairs by adding a model or predictive elements based on the extraction and analysis of data. This paper recounts the conversion of an online grocery ordering system from an older, demographically based recommendation system to a ‘personal relevance’ model. The main argument of the paper concerns how personalization occurs in big data ‘conversion events’. The term ‘conversion event’ has a dual sense. In recommender system design and web e-commerce, it refers to those occasions when a customer, user or site visitor clicks a specific hyperlink or purchases items displayed to them. I also use it to refer to the common before–after narrative forms that proponents, developers, and advocates of recommender systems adopt in describing, promoting, explaining or otherwise talking about big data, analytics or predictive modelling practices. The operational and narrative dimensions of conversion events lie at the core of this paper. It may be that the two senses of the term – the technical and the figurative – cannot be kept apart.
In recent research and opinion on recommendations, personalization has been seen as the goal of predictive modelling. A glance at the proceedings of the annual ACM RecSys (2017) conference will show many papers with ‘personalization’ in the title. Big data discourse in its promissory mode attributes potency to personalization: ‘most important, using big data we hope to identify specific individuals rather than groups’ (Mayer-Schönberger and Cukier, 2013), even if personalization has a much longer history in commerce and e-commerce in particular (Goy et al., 2007; Sarwar et al., 2000). Conversely, current debates about the problems of big data emphasize the need to protect people from personalization. Joseph Turow et al. (2015: 476) conclude in their critical account of the transformation of retail space by business analytics: ‘through it all, knowingly and not, and away from the spotlights of fierce social debate, retailers are encouraging daily routines that accept data-driven personalization as a centrifugal public’. Analyses of internet filter bubbles and of the growth of predictive platforms in industry and government (O’Neil, 2016; Pariser, 2011) also centre on personalization.
In this paper, I confirm that a transformation or conversion event is occurring, but one which also, even in the midst of much ongoing personalization in the interests of platform capitalism’s extraction of profits (Srnicek, 2016), opens the possibility of conceiving of social ordering processes afresh. Echoing Louise Amoore and Viola Piotukh’s (2015: 360) call for a new epistemology of population, I will argue that contemporary recommender systems afford us the opportunity to re-conceptualize long-standing assumptions about social order and structure. The average everydayness and familiarity of the example I discuss – online grocery shopping – allow some first-hand engagement with the messiness, the entanglements and potentials of the conversion event in ways that other interesting settings – social media for instance – do not.
Contemporary ‘data-driven personalization’ relies on predictions, themselves dependent on data practices ranging across acquisition, storage, transformation, exchange, integration, modelling, and experimentation. These are complex platform-scale systems, with many different elements. I reconstruct certain predictive elements of an online grocery shopping recommender system associated with probabilities and their role in modelling and experimentation. The reconstruction is a quasi-empirical philosophical undertaking that aims to construct an argument about transformations in probability rather than to characterize the platform situation. Although probabilistic calculations of events have long-standing importance in many settings (insurance, medicine, experimental and field sciences, operations research, risk analysis, engineering design, public health, economic modelling, etc.), probabilities have taken on an altered operational function in digital platforms. They have become mundane parts of the weave of infrastructures, transactions, and practices found in everyday life. In order to take the operational reality of probabilities into account, I understand probabilities in quasi-physicalist terms defined by the philosopher of science Karl Popper (1990: 19) as a ‘world of propensities, as an unfolding process of realizing possibilities and of unfolding new possibilities’. This operational-physicalist or sociotechnical notion of probability shifts our understanding of data analytics, especially in association with platforms (Gillespie, 2010), away from personalization. I draw on Popper’s reframing of probability as propensity to engage with grocery recommendations in their worldly enactments.
The media theorist Mark Hansen (2015: 111–112) has recently used Popper’s account to argue that ‘predictive analytics are discoveries of micrological propensities that are not directly correlated with human understanding and affectivity and that do not by themselves cohere into clearly identifiable events’. Hansen (2015: 120) links data to things via probabilities: ‘whatever explanatory and causal value predictive analytics of large datasets have is, I suggest, ultimately rooted in this ontological transformation whereby probabilities are understood to be expressions of the actual propensity of things’. Hansen presents big data as an ‘ontological transformation’ that deploys probabilities in an evermore closely woven and encompassing expression of animated, eventful, propensities of things. Predictive analytics discovers probabilistic calculation as operating on the propensities or the mutable associative agencies of things. The pragmatic and indeed empirical question for me is whether such transformations or ‘conversion’ in relations between probabilities and ‘the propensities of things’ can be detected and articulated in the prosaic setting of shopping lists and recommender systems. If it can, then the narratives of personalization that frame many big data conversion events might be recounted differently.
In exploring big data conversion events, the argument concerning probabilities unfolds in four main steps. The first main section introduces the case study – the Tesco online grocery recommender system as presented at an industry/academic conference – as an instance of the narrative of a conversion event. The second section frames grocery shopping in terms of sociological and anthropological accounts of shopping lists and social order, highlighting how both the constitutive incompleteness of lists as inscriptive devices and the group-structured relationality of shopping overflow operational notions of personalization. The defining predictive operations of contemporary recommender systems are situated by a brief archaeology of Tesco’s shift from census-based data mining to a web-based recommender system. The last section is the main body of the paper and approaches recommendations in terms of the different kinds of probability calculations that both target persons and diffuse beyond them. This section of the paper extends Popper’s propensity-based account of probability. It attends to the different kinds of probabilities subjected to calculation, the shifts in modelling practices that include many more propensities, the different operational conditions under which probability calculations take place and the implications of lists and shopping for probability.
The reconstructions of the model will explore the implications of this alternative account of probability for regular narratives of personalization. The reconstruction of the conversion event is empirical in several respects. Discussion departs from an ethnographic moment: being a member of the audience at an industry/academic conference where a data scientist was describing how personalized recommendations were created for online grocery orders. It refers to existing anthropological and sociological research concerning shopping and list-making to locate recommender systems amongst the social ordering practices of shopping and lists. It makes use of archaeological approaches, in the sense developed by Michel Foucault (1972), to identify and map functional statements and practices configuring the knowledge and order generated by such systems over time. It conducts several small-scale code-based experiments in order to reconstruct, using widely available code resources such as Application Programming Interfaces (APIs) and software libraries for machine learning, some prototypical elements of the system in question. 1 Like some recent work in science and technology studies, anthropology, and media studies (Bogost, 2012; Marcus, 2014; Marres, 2017), the paper is shaped by a practical encounter with the ‘object’ – code, data, web platforms – it describes. Code experiments, as Ian Bogost (2012: 100) writes, can ‘act as a theory, or an experiment, or question – one that can be operated’. The motivation is both phenomenological – to reactivate the sense of relationality at work in the model – and philosophically reconstructive – to ground an encounter with the troublesome power of recommendation in what John Dewey (1957: 199) terms ‘specific inquiries into specific structures’. At times, this mode of empirical philosophy may seem overly fixated on the technical minutiae of algorithms and recommender systems and inattentive to lived experience or practices. With some forbearance on the part of readers, however, the argument of the paper should resonate sociologically: it concerns the relationality of social life, or the consistency and regularity of relating and acting in a world where probabilities and operations based on probabilities are widespread.
The conversion from demographic to personalized recommendation
At one of the many industry-meets-academia events occurring in data science-oriented higher education institutions in the UK, speakers from industry, government, and commerce described their work with predictive models. 2 Their narratives often followed the ‘before-big data and after-big data’ conversion form. Shreena Patel, who holds a PhD in statistics and operations research, works as a data scientist for DunnHumby (2017), a well-known customer science company. Her work at DunnHumby focuses on online grocery shopping at the supermarket chain Tesco. Speaking to an audience of statisticians and operations researchers, Patel focused on the development and operation of predictive models underlying shopping list recommendations. The presentation was filled with graphs, numbers, and tables concerning ongoing development of the ‘Have you forgotten?’ recommender system.
Against the background of the sheer number of commodities and the distribution of their prices, Patel’s presentation offers two opportunities. First, by recounting, contextualizing, and commenting on the main steps in making the ‘Have you forgotten?’ list, we might follow some of the predictive sense-making done by data scientists and customer analytics teams working with transactional data in a typical commercial setting. Patel mentioned many of these steps only fleetingly in the presentation, for they are largely taken for granted as part of predictive analytic practice. Second, Patel focused on the renovation and updating of long-standing data mining practice via a much more explicitly ‘big data’ and ‘machine learning’-oriented implementation. Her presentation concerns a typical big data conversion event in which a long-standing recommender system was replaced by a ‘big data’-style system delivering ‘personally relevant’ recommendations. What stands out from the presentation is not any state-of-the-art innovation, but the ongoing life of the recommender system: the new predictive model is part of a long-standing and ongoing transformation of grocery shopping. Her presentation is the basis for this discussion: it shapes the choice of technical literature, code and data samples, as well as the consideration of underlying philosophical issues. From a sociological standpoint, the interest lies less in specific technical innovations and more in how the various facets of the recommender systems she describes relate to the role of prediction and probabilities in social order more generally.
Shopping list orderings
Online grocery shopping at Tesco includes recommendations for further grocery purchases under the title of ‘Have you forgotten?’ When Tesco customers shop for groceries online, a list of five recommendations appears at the checkout stage. The recommendations are the product of a recommender system, an important category of operational device in big data (see, for instance, Hallinan and Striphas (2014) for analysis of Netflix recommendation; Morris (2015) or Seaver (2015) for an account of music recommendations). The question ‘Have you forgotten?’ is followed by a list of some grocery items that could have been or are usually on a shopping list. The title of the suggestions is a bit misleading. The recommender system, as we will see, is not concerned with forgetting, with the many slips and oversights to which shopping is prone, but rather with substituting and adding items that customers had not selected, perhaps because they had never thought of buying them in the first place.
Compared to music, film, travel, fashion, and book purchases, online grocery shopping is difficult to personalize. The anthropologist Daniel Miller has argued (Miller, 2012) that all shopping negotiates discrepancies between normative and actual social order (for instance, between the ideals of health emblematized by organic products and commitments to thrift embodied in lower cost generic products). Miller (2012: 72) suggests that household shopping practices attempt to resolve differences between how people think they should live and how they actually live: ‘we have to watch how shopping helps resolve these discrepancies between the normative and the actual, but we also need some ideas of where the normative comes from in the first place’. More provocatively, Miller claims that ‘shopping is largely a technology for the expression of love’ (85). Whatever theory we have of how normative social order arises, Miller’s argument implies that shopping inhabits a mundane but highly variable space between normative/ideal and actual order. Importantly, and this will be a key consideration for the Tesco recommender systems, grocery shopping is not necessarily personal or individual. It is saturated by fluxing forms of social order concerning family and other forms of social grouping, and their associated relations (love, etc.).
In contrast to many other forms of online and offline shopping, shopping lists distinctively, albeit provisionally, stabilize the complex social ordering of grocery shopping. Shopping lists provide important clues as to how local social order is constituted, maintained and repaired in grocery shopping. As people shop, either by trawling along aisles packed with thousands of products, or scrolling down screens or searching for particular brands amidst search results, lists may filter or reduce the excessive abundance, claims on attention, dazzle, and distraction of commodities to the practical social order of domestic economy. The shopping list, whether written on the back of an envelope, saved as a list in an online grocery shopping system, or memorized, lies at the intersection of logistical flows, infrastructural orderings, and lively negotiations around actual and normative social orders. 3 Shopping lists are intersectional ordering devices that encapsulate a universe of possible references and a teeming multitude of propensities within an actual local order. 4 In contrast to much e-commerce where each purchase concerns small numbers of items, the numerous items on a grocery shopping list suggest that online grocery shopping, and attempts to personalize it, will provide both many opportunities for recommendation (and hence conversion events) and many complexities in addressing normative and actual social orders.
Handwritten shopping lists have ongoing practical importance and mixed ordering practices (see the montage of handwritten shopping lists at Grocery List). Online grocery shopping lists, by contrast, reside at the intersection of web and internet infrastructures, supply chain logistics, individualized practices and habitus, and increasingly, the predictive operations of recommender systems. Whereas the aisles and shelves of a supermarket present a densely woven mesh of objects competing for visual attention by offering distinctions of taste, thrift, expedience, novelty, indulgence, and health, online shopping recommender systems generate lists that seek to align people to products that they otherwise might have little relation to (see Turow et al., 2015 for an overview of the development of these systems).
Archaeology of recommendations: From 1984 to 2007
Although the changes Patel described are configured in Tesco-specific ways by DunnHumby, they are also broadly typical of a big data conversion event. Analytics service providers such as DunnHumby attempt to convert their customers by stories of conversion. 5 Patel’s presentation was part of this effort. The main thread of Patel’s conversion narrative concerned Tesco’s shift from a well-established loyalty card-based data mining model developed in the 1990s to a predictive, probabilistic, ‘personal relevance’ model that would append items to the shopping list in almost real time. Tesco is the largest supermarket chain in the UK and a notable success story for DunnHumby. Tesco’s customer loyalty and targeted marketing programme known as ‘Tesco Clubcard’ started in 1991. DunnHumby – founded by operations researchers Edwina Dunn and Clive Humby – is said to have convinced the CEO of Tesco sometime in 1991 that a loyalty card programme could change the supermarket chain’s relationship to its customers. 6 Clive Humby’s academic publications are hard to track down. An early paper given at the Conference of Young Operational Researchers in Nottingham in 1984 (see Figure 1) suggests the direction that he, the company Dunn and Humby formed (DunnHumby) and later Tesco would take in constructing lists (O’Keefe, 1984). The abstract for Humby’s presentation prefigures an ongoing trajectory for data mining techniques aimed at eliciting detailed information on individual customer preferences.
Humby (1989) highlights the need to add lifestage data to neighbourhood data in marketing research. Even if Tesco succeeded in data mining its customers using demographic segmentation, and perhaps became the UK’s biggest supermarket with the help of data mining in the 1990s, the shopping environment in 2017 is markedly different (Turow, 2017). It is no longer organized around campaigns involving special offers or redemption of points for demographically segmented loyal customers (DunnHumby made heavy use of UK Census data). It can no longer rely on placement of goods in carefully chosen locations in stores. It needs, or at least might want to, up-sell and cross-sell to customers who only sometimes visit the supermarket itself.
Figure 1. Abstract from a Clive Humby presentation in 1984 (O’Keefe, 1984).
How could we characterize the shift from ClubCard data mining to online grocery markets? If ‘Tesco is the clear winner in the online grocery market, in fact it takes almost 50p of every £1 spent on food shopping on the internet’ (Silverwood-Cope, 2014), then has Tesco itself undergone some kind of conversion? Patel described her work at DunnHumby as converting the recommender system from a ‘rules-based list’ to a ‘relevance model’. The relevance model affected the construction of the ‘Have you forgotten?’ list. This very localized intervention is typical of a broader reorganization of prediction. The shift in models results in a more probabilistic structuring of lists. As is so often the case in big data conversion narratives, the triviality of the ‘Have you forgotten?’ recommendations provides only peripheral signs of the complex predictive infrastructure underpinning them.
Academic researchers first began writing about personalized recommender systems in the mid-1990s. From the outset they highlighted a potential shift from demographic-driven market research or data mining techniques to personalized recommendations. For instance, writing in 1997 in a special section of the Communications of the ACM on recommender systems, Paul Resnick and Hal Varian (1997) (at that time Dean of Information Sciences at UC Berkeley, but currently Chief Economist at Google) made much of this personalization. Resnick and Varian (1997: 56) emphasized the need to distinguish the emerging practices from data mining:

In everyday life, we rely on recommendations from other people. … Recommender systems augment this natural social process. In a typical recommender system, people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients. In some cases the primary transformation is in the aggregation; in others the system’s value lies in its ability to make good matches between the recommenders and those seeking recommendations.
Probabilistic conversion events
Given that personalized recommendations stretch back two decades, what changes in the newly implemented recommender system? The components of the new recommender system – predictive models and their parameters, infrastructural provisioning to run the models, and platform-scale deployment – address the challenges of personally relevant recommendations at a specific point in time, the moment when a customer is close to finishing their grocery order. The recommender system juggles many products, changes over time, and the unstable propensities of things in their associations with people. The ‘personal relevance model’ more generally has a troubled relation to social order because grocery buying, as accounts of shopping suggest, is plurally social. The model will need to render grocery shopping calculable in a way that somehow includes the associated social ordering and negotiation.
The model, like many big data practices, predicates probabilities as a way to render the situation calculable. From the standpoint of probability, recommendations are conditional probabilities, or probabilities whose calculation takes into account the occurrence of other events. But what is a probability today? In exploring Patel’s account of the recommender system, any account of probabilities needs to reframe the operation of the recommender system in a way freed from lingering incompatibilities between calculation and social life (Dewey, 1957: 26). In reframing probability, I draw directly on the work of Karl Popper. In an essay written towards the end of his career, Popper (1990) presents a non-standard account of probabilities as real processes. He argues that probabilities have a reality equivalent to forces and fields in physics. Against standard interpretations, Popper does not identify probabilities with either degree of belief (likelihood) or frequency of events. Instead Popper (1990: 14) suggests ‘they should be regarded as inherent in a situation’. In his account, probabilities are tendencies towards realization inherent in a situation. Probabilities express or indeed are propensities, tendencies to realize the event (p. 11). While Popper’s concept of probability as propensity might seem remote from the concerns of online grocery shopping, it applies quite well to Patel’s presentation of the personal relevance model and the ways in which it seeks to calculate probabilities of purchase.
Apriori conditional probabilities
Probability has been difficult to work with philosophically, according to Popper, because dice rolls, coin tosses, urns with balls, and other seemingly random events have occupied centre stage. Although dice rolling and coin tossing have been enormously productive and transformative in scientific thought and practice, they privilege absolute probability at the expense of conditional probabilities. ‘We need’, Popper (1990: 16) urges, ‘a calculus of relative or conditional probabilities as opposed to a calculus of absolute probabilities’. Relative or conditional probabilities are propensities that depend on other events for their own realization. All events require other events, so all probabilities are conditional, even if probability calculations typically abstract or ignore the inevitable physical conditioning of their realization. 7
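To fix the distinction in standard notation (which is not Popper’s own), an absolute probability treats an event in isolation, whereas a conditional probability makes its dependence on another event explicit:

```latex
% Absolute versus conditional probability, in standard notation.
% P(A) treats the event A in isolation; P(A | B) is the probability of A
% given that another event B has been realized.
\[
  P(A \mid B) \;=\; \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0 .
\]
% In market-basket terms, the probability of buying milk given that bread is
% already in the basket, P(\text{milk} \mid \text{bread}), is conditional;
% the overall frequency of milk purchases is closer to an absolute probability.
```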
Viewed from the standpoint of probabilities, predictive systems are highly crafted arrangements for the calculation of conditional probabilities. For instance, the first element of the new recommender system consisted in a change of the underpinning algorithms and model. Patel described a move away from ‘a rules-based system’. It is likely that what Patel describes as the ‘rules-based system’ refers to the extremely well-known association rule learning or apriori technique, developed by computer scientists Rakesh Agrawal and Ramakrishnan Srikant working at the IBM Almaden Research Center in the early 1990s (Agrawal et al., 1994). A now-classic approach to ‘market basket analysis’, it was listed as one of the top ten data mining algorithms in a survey conducted amongst data miners (Wu et al., 2008) and usually attracts a chapter in data mining and machine learning textbooks (e.g., Hastie et al., 2009). The interest of apriori for our purposes is that it begins to address the problem of understanding large numbers of shopping transactions as a matter of conditional probability.
The notion of conditional probability at work in apriori is relatively simple, and assumes that the frequency of co-occurring items provides the best guide to what shoppers are likely to buy. The apriori algorithm finds sets of items that commonly occur together in transactions. In this sense, it is still oriented by a notion of absolute probability, inflected by some elements of conditional probability concerning the relations between things. Commonly occurring sets are expressed as ‘association rules’. For instance, applied to a dataset of generic, unbranded grocery purchases, the apriori algorithm counts frequencies of purchase in the overall set of all items purchased in a supermarket (the groceries dataset was acquired from a ‘local German supermarket’, Hahsler et al., 2006). Figure 2 shows how often the most common items appear. Whole milk appears most frequently.
Figure 2. Frequency of items in the grocery dataset.
The first five association rules for the groceries dataset.
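The counting that underlies apriori can be sketched in a few lines of code. The following fragment is a deliberately minimal illustration of support and confidence over a handful of invented transactions; it does not reproduce the Groceries dataset, the implementation used by Hahsler et al. (2006), or anything in the DunnHumby system.

```python
# Minimal sketch of the counting behind association rules: support (absolute
# frequency of an itemset) and confidence (conditional probability of one item
# given another). The tiny transaction list is invented for illustration.
from itertools import combinations
from collections import Counter

transactions = [
    {"whole milk", "bread", "yogurt"},
    {"whole milk", "bread"},
    {"whole milk", "root vegetables"},
    {"bread", "root vegetables", "whole milk"},
    {"yogurt", "root vegetables"},
]
n = len(transactions)

def support(itemset):
    # Fraction of all transactions in which the itemset appears.
    return sum(itemset <= basket for basket in transactions) / n

# Count co-occurring pairs (a crude stand-in for apriori's level-wise search).
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Report rules whose support clears a threshold; confidence is the conditional
# probability of the consequent given the antecedent, e.g. P(milk | bread).
for (antecedent, consequent), count in pair_counts.items():
    pair_support = count / n
    if pair_support >= 0.4:
        confidence = pair_support / support({antecedent})
        print(f"{{{antecedent}}} => {{{consequent}}}  "
              f"support={pair_support:.2f}  confidence={confidence:.2f}")
```

Support here corresponds to the absolute frequency of an itemset, while confidence expresses the conditional probability that one item appears given another, which is the relation an association rule encodes.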
2500 sauces: Apriori meets the API
Even as apriori expresses associations between things as relative probabilities, it struggles with the propensity of commodities to multiply, especially in supermarkets and grocery shopping. A simple illustration of the combinatorial problem faced by recommender systems can be drawn by bringing the grocery dataset together with the actual list of items that Tesco sells online. If we take all the items in the grocery dataset and paste them into the ‘shopping list’ box on the Tesco grocery website (or as I did, run them as searches on the TescoLab Product API, Tesco, 2016), each of the 169 generic items in the grocery dataset matches dozens and sometimes thousands of products in the Tesco inventory.
Items in the Groceries dataset proliferate into Tesco’s list of branded products. The 169 items of the Groceries dataset expand into roughly 38,114 Tesco items (see Figure 3). Recommender systems confront, I would suggest, a logistical proliferation of commodities. The association rules derived from the grocery dataset become more open to identifying sets of items that have only low association with each other. Given almost 2500 sauces and 1200 rice products listed by the Tesco API, a tremendous number of associations between sauce and rice are possible. The propensities of any given sauce and rice product to find themselves together in a shopping basket vary greatly. The proliferation of things on the shelves of a supermarket or grocery warehouse produces a combinatorial problem for data mining and machine learning approaches such as association rules. A total of 38,114 products (actually Patel mentioned 200,000 products) can be combined in many ways. If a typical shopping list has 20 items, then there are 1.711594e+73 possible lists. Most possible shopping lists have propensities or tendencies to realization expressed as probabilities close to zero. Others with somewhat higher propensities might furnish the basis of interesting recommendations.
Figure 3. Tesco grocery items with more than 50 products.
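The combinatorial claim can be checked with elementary arithmetic. The snippet below simply reworks the figures already quoted (38,114 matched products, 20-item lists) rather than querying the Tesco API.

```python
# Rough arithmetic for the combinatorial space of shopping lists discussed
# above: how many distinct 20-item lists can be drawn from 38,114 products?
import math

products = 38_114      # items matched via the Tesco Product API, as stated above
list_length = 20       # a 'typical' shopping list length, as assumed above

possible_lists = math.comb(products, list_length)
print(f"{possible_lists:.6e}")   # 1.711594e+73, matching the figure in the text

# With the roughly 200,000 products Patel mentioned, the space is larger still.
print(f"{math.comb(200_000, list_length):.6e}")
```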
More importantly, the combinatorial proliferation of association rules suggests why personalization or a ‘personal relevance’ model might become relevant. Even if the association rules provide recommendable sets of frequency-weighted associations, an apriori-based recommender system has no way of narrowing its recommendations. Its probabilization of shopping is incomplete since it only works on associations between things. The tendency of milk to find itself in a shopping basket alongside bread attests to an ordinary propensity in certain parts of the world. The conditional probabilities implicit in the association rules do not, however, include much of the world. These associations are not trivial, but they are very open ended. Put in terms of Popper’s account of probability as propensity, the rules-based system has, in principle, few means of crystallizing a limited or enclosed set of possibilities. 8
The list as relational field
Predictive models have become central technical elements in many big data conversion events because they offer a way of operationally calculating probabilities that narrow the propensities – to purchase a magnum of French champagne for instance – inherent in situations. In the new recommender system described by Patel, the business goal is to extend the list of items customers have selected for purchase with a few recommended items. In order to achieve this seemingly modest goal, the list of items selected for purchase will be extended by recommendations that have, according to a predictive model, the most chance of ‘conversion’ or actual purchase. Most recommender system designers assume that modelling ‘personal relevance’ is the best way to do this. The predictive model carries the burden of calculating the conditional probability of purchase given everything known about the person at a given moment in time.
Persons are strangely remote from such models. Patel introduced the new ‘personal relevance model’ with a data graphic familiar to machine learners and statistical modellers (a sketch of her graph appears in Figure 4). Patel assumed that the audience of data scientists understood the working of logistic regression, association rules, and random forest models, and the bulk of her presentation concerned the obstacles and problems that arise in trying to personalize recommendations in ways that lead to the much-desired ‘conversion events’ or sales. The graph plots the precision – the proportion of the recommended products that customers actually purchase – for several different statistical models. The graph indexes the ‘causal efficacy’ of the recommender system, its capacity to include and transform propensities or ‘real potentialities’ into operational events or purchases. Patel used the graph to compare the previous rules-based recommender system with some of the ‘personal relevance model’ alternatives – logistic regression, random forests, gradient boosting, and a few others – in terms of their predictions and how those predictions turned out. Patel dismissed most of the models quite quickly and focused only on one, the logistic regression model, which did as well or slightly better than the alternatives.
Figure 4. Precision measurements for different machine learning models.
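Precision, in the sense plotted on Patel’s graph, admits a very simple reading at the level of a single order: of the items recommended, what proportion were actually bought? The sketch below is an illustrative calculation with invented lists, not a reconstruction of DunnHumby’s evaluation pipeline.

```python
# Minimal sketch of the precision measure described above: the proportion of
# recommended products that the customer actually went on to purchase.
# The recommendation and purchase lists are invented for illustration.
def precision(recommended, purchased):
    recommended, purchased = set(recommended), set(purchased)
    if not recommended:
        return 0.0
    return len(recommended & purchased) / len(recommended)

recommended = ["whole milk", "tea bags", "dish soap", "bananas", "hummus"]
purchased = ["whole milk", "bananas", "bread"]
print(precision(recommended, purchased))   # 0.4: two of five recommendations converted
```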
The models Patel mentioned are all ‘classifiers’, predictive models that ‘classify’ particular outcomes by calculating their probability of membership of some class of outcome. Much of the core architecture of the machine learning classifiers received only glancing reference in Patel’s presentation. In DunnHumby’s work, the classifier at the heart of the recommender system calculates for a given customer the most likely products to be purchased. The predicted classes are binary: recommended or not recommended. Products are allocated to one of these classes depending on their calculated probability. A product with a probability greater than 0.5 would typically be recommended. The logistic regression model generates probabilities of purchase for each product for each customer. The fundamental shift from the rules-based model is that the classifier model extends the reach of the recommendations generated for a customer not only to the 200,000 items in the Tesco inventory but to any item, relationship, similarity, or event that can be constructed as a variable in the classifier model. In this sense, the recommendations shown to a customer in the ‘Have you forgotten?’ list derive from a much more extended conditional probability statement, woven from, as we will soon see, an open-ended field of relations ranging beyond associations between grocery items.
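A schematic sense of how such a classifier behaves can be given with a generic logistic regression in scikit-learn. The features, data, and threshold below are invented for illustration; they do not reproduce DunnHumby’s model, feature set, or inventory.

```python
# Schematic sketch of a logistic regression 'classifier' of the kind described
# above: for each customer-product pair it outputs a probability of purchase,
# and pairs scoring above a threshold become candidate recommendations.
# All features and data here are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Each row stands for a customer-product pair; the columns stand in for
# features such as past purchase counts or weeks since last purchase.
X = rng.random((1000, 3))
y = (rng.random(1000) < 0.2).astype(int)   # 1 = the pair led to a purchase

model = LogisticRegression().fit(X, y)

# Predicted probabilities of purchase for some new customer-product pairs.
candidates = rng.random((5, 3))
p_purchase = model.predict_proba(candidates)[:, 1]

# A simple decision rule: recommend when the predicted probability exceeds 0.5.
recommended = p_purchase > 0.5
print(np.round(p_purchase, 3), recommended)
```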
Repeating sufficiently often to matter
I will not discuss all the dimensions of probabilization implicit in the architecture of the new model but will highlight those elements that appear as part of the description of ‘personal relevance’ yet remain irreducible to personalization. These include the problem of repetition and the temporality of calculation, the platform as experimental site, and attempts to create new views of the data.
One fundamental difficulty is that the propensity of a customer to buy a recommended product changes. Popper’s notion of probabilities as tendencies to realization inherent in a situation implies this. In many cases, as Popper (1990: 17) observes, ‘the propensities cannot be measured because the relevant situation changes and cannot be repeated’. Measuring propensities and expressing these measurements as numbers between 0 and 1 becomes difficult because as a situation changes, the propensities themselves change. No doubt the contingencies associated with grocery shopping are legion, perhaps more so than for other kinds of products (books, music, films), and any attempt to measure propensities associated with particular commodities will encounter many changes.
One way in which recommender systems might attempt to address the problem of the relational dynamics of propensities is by assuming that purchases will be repeated. Patel mentioned that the new model uses ‘52 weeks of data’ for each customer. The assumption in including this data in the model is that the probability that a specific customer will buy a specific product is increased by previous purchases of that product. The history of previous purchases constitutes forms of repetition that imply a stabilization of propensities (e.g., a vegan customer will never have purchased chicken products, so the measured propensities for any of the 1000 or so chicken products sold by Tesco will remain close to 0). But the changing situation of the customer only figures here in the accumulated year of purchase data. Past purchases might be exactly what does not need to be recommended.
When the logistic regression model includes the 52 weeks of previous purchases, the conditional probability calculation undertaken by the recommender system ramifies tremendously in several respects. Potentially, each of Tesco’s 200,000 products becomes a variable in the classifier. We can imagine this as an arithmetic sum extending along a series of 200,000 terms. Practically, most of these variables will only slightly influence the sum of the probabilities for recommendations since most customers will have purchased only a fraction of the inventory. As Patel observed, again assuming that this would be obvious to the audience of data scientists, ‘we have lots of zeros’. A matrix that records associations between individual people and products is bound to be mostly empty. Say Tesco has one million online customers. Each online shopper has bought some selection of the 200,000 products. The customer–product data matrix will then contain 2e+11 cells. The product–customer matrix, the basic vector space in which all recommender systems operate, remains very sparse and unpopulated. Given that any one customer is likely to have bought only 100 or so different products, the matrix contains around 99.95% zero probabilities.
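The sparsity Patel alludes to follows directly from the figures just given. The short calculation below reworks the assumptions stated above (one million customers, 200,000 products, around 100 distinct products per customer).

```python
# Rough arithmetic for the sparsity of the customer-product matrix described
# above, using the figures given in the text.
customers = 1_000_000
products = 200_000
purchased_per_customer = 100   # distinct products bought by a typical customer

cells = customers * products
nonzero = customers * purchased_per_customer
sparsity = 1 - nonzero / cells

print(f"{cells:.1e} cells")      # 2.0e+11
print(f"{sparsity:.2%} zeros")   # 99.95%
# In practice such a matrix would be held in a sparse format
# (e.g. scipy.sparse) rather than stored cell by cell.
```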
Any dataset where the items of interest are much rarer than other values is said to be ‘unbalanced’. The purchase history data is, as Patel put it, ‘massively unbalanced’, and imbalance heavily biases the model towards common and somewhat impersonal suggestions, suggestions that might not produce the desired conversion experience for either the individual customer or DunnHumby’s renovation of Tesco’s recommender system. Since so many people buy milk, the recommendation system might end up always recommending milk. So the data needs to be ‘corrected’ by, as Patel reports, removing – ‘undersampling’ – some of the data for common purchases. ‘Having all the data’, one of the anchoring claims of big data conversion narratives, also creates the need to delete some data.
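Undersampling, in its most generic form, simply discards a random share of the over-represented class before the model is fitted. The sketch below illustrates that generic corrective on invented data; it is not a description of DunnHumby’s procedure.

```python
# Generic sketch of undersampling an unbalanced dataset: keep all of the rare
# positive rows (purchases) and only a random sample of the abundant negative
# rows, so that common non-purchases do not swamp the model.
import numpy as np

rng = np.random.default_rng(0)

y = (rng.random(100_000) < 0.0005).astype(int)   # ~0.05% positive labels
X = rng.random((y.size, 3))                      # placeholder features

pos = np.flatnonzero(y == 1)
neg = np.flatnonzero(y == 0)

# Keep all positives and, say, ten negatives per positive.
keep_neg = rng.choice(neg, size=10 * pos.size, replace=False)
keep = np.concatenate([pos, keep_neg])

X_balanced, y_balanced = X[keep], y[keep]
print(y.mean(), y_balanced.mean())   # class balance before vs after
```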
The problem of repetition, the fact that situations change and therefore propensities change too, is known to designers of recommender systems. But they do not have the ideal conditions envisaged in probability theory (unbiased coins, flipped any number of times). The model’s own operation as a data-intensive calculation needs to be adjusted to the scale of values of grocery retail, to the available computational, database and network infrastructures, and to the capacities of the online grocery system to inject recommendations into the flow of grocery orders in a timely fashion. While raw data from Tesco Online transactions feeds into the model’s dataset every hour, a recommendation list for each customer is only generated once a week. Customers shop online every few days at the most, and in some cases, only every few weeks. Updating the top 200 recommendations for several million customers demands much computation. Patel briefly mentioned specific infrastructural elements such as Hadoop (Apache Software Foundation, 2009).
The possibility of adjusting the recommendations for each customer every week depends on an infrastructure capable of collecting data and assimilating that data in a predictive model. ‘Personal relevance’ depends on a matrix of probabilities of associations between people and things that shifts in time. Hadoop and its legion of ‘big data’ variants (Mahout, Spark, Hive, Pig, YARN) operationalize repetition at an infrastructural rather than analytical scale. Patel’s quick gloss of the infrastructural deployment of DunnHumby’s relevance model is central to the conversion event: the logistic regression model at the heart of the recommender system is no longer an analytical device but an operational one because it seeks to revise recommendations as situations change.
All of these considerations – the problem of a changing customer, the need to undersample ‘unbalanced data’ for the predictive model to work, the energy and computational time costs of running models, as well as the technical complexity of staging predictions for an online platform – form part of big data conversion events, as they concretely and operationally take place. They do not belong to the data or predictions as such, but to the situation in which recommendations might become purchases.
Platform experiments reduce interfering propensities
In his account of probabilities as physical propensities, Popper (1990: 23) emphasizes why laboratory experiments are important: ‘experiments work … by creating, at will, artificial conditions that either exclude, or reduce to zero, all the interfering and disturbing propensities’.
Experimentalization, the practice of creating conditions that reduce disturbing propensities, runs deep in big data conversion events. The predictions of the recommender system are themselves the subject of experiment. Patel described the deployment of the personal relevance model in a randomized controlled A/B trial on the Tesco website. All customers were allocated to one of four categories: Test A, Test B, Control A, and Control B. In the A/B testing, customers receive recommendations from different models (the old recommender system versus the new one; a logistic regression model versus a random forest model, etc.). The randomized application of competing predictive models draws on protocols for randomized clinical trials first developed in the 1960s and is widely used in social media platforms and hence in the implementation and observation of the effects of recommender systems. 9 Random allocation of customers to the four categories adds a layer of probability to the recommender system in the name of statistical validation of the effects or the ‘uplift’ of the model. 10 Ironically, the effects of a predictive model cannot be known in advance. They can only be observed experimentally.
Random allocation of customers to control and test groups occurs without taking individual propensities into account. A/B testing seeks to statistically validate effects – the uplift – of the model on conversion rates by directly measuring the effects of the model on what people do. The uplift refers to conversion events associated with the same groups of people. Effectively an experiment in creating micrologically different worlds, the randomized controlled trial sets up a control mechanism that connects the predictive model (the logistic regression) and the conversion event more broadly. Without this experimental connection, the conversion event narrative lacks grounding in a state of affairs in the world.
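The logic of the trial can be sketched generically: customers are assigned at random to groups, each group is served a different recommender, and conversion rates are compared. The group labels below follow the four categories named above, but the allocation scheme and conversion figures are invented for illustration.

```python
# Generic sketch of randomized allocation and an 'uplift' comparison: customers
# are randomly assigned to test and control groups and conversion rates are
# compared across groups. The conversion probabilities are purely illustrative.
import random

random.seed(0)
groups = ["Test A", "Test B", "Control A", "Control B"]
customers = range(10_000)

allocation = {c: random.choice(groups) for c in customers}

# Pretend per-group conversion propensities (invented numbers).
conversion_prob = {"Test A": 0.06, "Test B": 0.055,
                   "Control A": 0.05, "Control B": 0.05}

converted = {g: 0 for g in groups}
counts = {g: 0 for g in groups}
for c, g in allocation.items():
    counts[g] += 1
    converted[g] += random.random() < conversion_prob[g]

rates = {g: converted[g] / counts[g] for g in groups}
uplift = rates["Test A"] - rates["Control A"]
print(rates)
print(f"uplift (Test A vs Control A): {uplift:.3f}")
```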
The openness of the data: New features
Whatever disturbing factors or interferences the experimental trials of different predictive models reduce, another stream of propensities runs off the terrain of personal relevance. Popper (1990: 22) suggests that the realization of tendencies is inevitably open ended: ‘What may happen in the future … is, to some extent, open. There are many possibilities trying to realize themselves, but few of them have a very high propensity, given the initial conditions’.
There are different ways of reading this statement. We can read Popper as stating a truism, an obvious consequence of his physicalist understanding of probabilities as tendencies inherent in a situation: anything can happen in the future. An alternative, perhaps more interesting reading focuses on the pivotal phrase: ‘possibilities trying to realize themselves’. What might ‘trying to realize’ mean in practice? DunnHumby’s work on the personal relevance model, and much big data practice in general, presents just such a ‘trying to realize’ in practice.
Several times in her presentation, Patel emphasized the importance of ‘good features’ in the data, and much of her presentation concerned DunnHumby’s efforts to construct ‘good features’. A ‘feature’ in the context of machine learning and predictive modelling refers to a variable included in a predictive model (Domingos, 2012). A ‘good feature’ contributes to the accuracy, precision, specificity, or any of the other measures of prediction applied to machine learning models in practice. The construction of ‘good features’, however, remains open to many different possibilities, some of which are more practically feasible than others, and some of which are more aligned with personal relevance than others.
The main efforts that DunnHumby made to construct good features were not closely focused on individuals but sought to address relations between things in the predictive model. Patel, for instance, described the problem of ‘basket similarity’. The recommender system should not recommend items that are too similar to groceries already in a customer’s basket. A customer might be willing to substitute a similar item for something they have already chosen, but they are more likely to accept a recommendation that complements already selected items. How could the recommender system avoid similarities and prefer complementarities between things for a given customer, especially since a person’s sensibilities and susceptibilities concerning similarity and complementarity are shaped by social groups, orderings, and circumstances? A predictive model could only do that if it had some sense of the relation between items already in the list and the products elsewhere.
The ‘personal relevance’ model sought to include the similarities and complementarities between items. The DunnHumby data scientists constructed a new feature from the previous purchase data measuring substitutability and complementarity between products. Taking all the baskets of items purchased on the basis of recommendations, they derived a new feature, what DunnHumby termed ‘self-learning substitutes’. Added to the relevance model, the ‘self-learning substitutes’ feature adds another data matrix, the product similarity matrix, itself generated from all previous recommendations that have led to conversion events (i.e., ‘a user clicked as the response’). The self-learning substitutes feature is complex. Patel mentioned that it took the form of a ‘design matrix of 14,000 columns’. Fourteen thousand new explanatory variables were added to the logistic regression model. Recommendations would be subtly re-weighted by this complex derived feature. 11
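One generic way of deriving an item-to-item similarity structure from past baskets is a cosine similarity matrix over item co-occurrence vectors. The sketch below illustrates that generic construction on invented data; it is not DunnHumby’s ‘self-learning substitutes’ feature, whose exact derivation Patel did not detail.

```python
# Generic sketch of a product similarity matrix of the kind a substitutes/
# complements feature might draw on: each item is represented by the baskets
# in which it appears, and cosine similarity is computed between item vectors.
# The items and baskets are invented for illustration.
import numpy as np

items = ["whole milk", "soya milk", "bread", "pasta sauce"]

# Rows: baskets (or past recommendation responses); columns: items; 1 = present.
baskets = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 1],
])

# Cosine similarity between item columns.
vectors = baskets.T.astype(float)
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
similarity = (vectors / norms) @ (vectors / norms).T

# Items that often share baskets (here, 'whole milk' and 'bread') score high;
# a derived feature could use such scores to weight complements differently
# from near-duplicates in the final design matrix.
print(np.round(similarity, 2))
```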
In a world of propensities, the different encounters with the data staged in the personal relevance model cannot be reduced to any simple probabilistic calculation. Rather, as Patel’s presentation of the different models, the logistics of running the models in a platform setting, the experimental validation of the model in A/B testing, and the construction of new features that sought to increase complementarities and reduce similarities all suggest, the expression of propensities occurs in a changing weave of infrastructure, mathematics, history, and logistics. None of these has a particularly strong relation to personhood or individual experience, even if personalization is the main way this weave is figured.
Conclusion
The personalization of recommendations has been a distinctive feature of big data conversion narratives and practice. The Tesco recommender system, starting from its early experiments with demographic data mining, its later adoption of a rule-based ‘market-basket’ analysis system, and its recent implementation of a machine learning ‘personal relevance model’, documents the trajectory of a personalizing conversion event. Its recent big data conversion event largely takes the form of personalization. Individualizing personalization, however, is ill suited to the socially complex negotiations of grocery shopping and tends to obscure other implications. Conversion events, both in the sense of purchases made on the basis of recommendations and in the sense of the narratives of changes in modelling practice related by data scientists, do not easily map onto personalization. They include relations running between people and things in time. They include an ongoing process of adjusting, scaffolding, intervening, and configuring orders – in several senses of that term – that range across supply chain management.
I have suggested that we might understand recommender systems as part of the ongoing operationalization of propensities described by Popper. The conversion described in this case is a propensity in actualization, but a propensity that has the character of expressing propensities through predictions and recommendations. From the perspective of a world of propensities, the constitutive incompleteness of shopping lists and their propensity to expand or change might be more important than their capacity to be personalized. Only because grocery shopping is personally pre-ordered by lists can a predictive recommendation, somewhat conducive to conversion events, gain traction.
We might understand what surrounds and exceeds personalization as probabilization. The framing of prediction as a ‘flat difference of degree, such that it appears as though everything is calculable’ (Amoore and Piotukh, 2015: 361) affirms the importance of calculation but risks misunderstanding how propensities tend towards realization, and how models express those tendencies. If, as Popper (1990: 14) argues, a propensity-based account of probabilities ‘amounts to generalizing and extending the idea of forces again’, then we should see the computing platforms, databases, software libraries, various predictive models, web interfaces, apps, and global supply chain logistics as part of this conversion event, as tendencies or propensities in the process of realization. When we track what is practically done to construct a personal relevance model and apply it to a shopping list, we see conditional probabilities assembled in predictive models, but also in the rhythms of platform and infrastructural configurations and experiments that gather and process data, and in the open-ended construction of features in data. In various ways, these tendencies shift and complicate the associations between people and things, people and people, and things with things. They cut across boundaries between the personal and impersonal. If social order is made of propensities to associate, if to be social is a propensity to associate, then big data conversion events operationalize association in matrices of propensity.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
