Inflated granularity: Spatial “Big Data” and geodemographics

Abstract

Data analytics, particularly the current rhetoric around “Big Data”, tend to be presented as new and innovative, emerging ahistorically to revolutionize modern life. In this article, we situate one branch of Big Data analytics, spatial Big Data, through a historical predecessor, geodemographic analysis, to help develop a critical approach to current data analytics. Spatial Big Data promises an epistemic break in marketing, a leap from targeting geodemographic areas to targeting individuals. Yet it inherits characteristics and problems from geodemographics, including a justification through the market, and a process of commodification through the black-boxing of technology. As researchers develop sustained critiques of data analytics and its effects on everyday life, we must do so with a grounding in the cultural and historical contexts from which data technologies emerged. This article and others (Barnes and Wilson, 2014) develop a historically situated, critical approach to spatial Big Data. This history illustrates connections to the critical issues of surveillance, redlining, and the production of consumer subjects and geographies. The shared histories and structural logics of spatial Big Data and geodemographics create the space for a continued critique of data analyses’ role in society.

Keywords

Big Data, geodemographics, data science, data analytics, black box, critical data studies

Dear current resident,

Congratulations! You’ve been pre-approved for our special offer …

Some marketing firm thinks that I, or at least a “current resident” of my neighborhood, am a good (but not great) credit risk. I toss the junk mail onto a growing pile and move on to a more important task: dinner. With many restaurants closed on Monday, I turn to my smartphone to look for ones that are open. Based on my location, past searches, and other information, a targeted ad pops up for a new Indian restaurant a block away. It’s an easy choice. Using spatial “Big Data”, the advertising successfully triggered a craving for rogan josh.

On paper, delivered to every mailbox on my street, the credit promotion feels worlds away from the advertisements for restaurants and clothing brands on my smartphone, targeted at me using location-indexed Big Data.¹ The geographical Big Data targeting of ads on my phone may seem new and exceptional, but it has precursors, including neighborhood geodemographic targeting like the credit offer. Current technical definitions of Big Data tend toward data that pushes existing technology to its limits in three ways: volume, velocity, and variety (Horvath, 2012; Laney, 2001). Such data “forces us to look beyond the tried-and-true methods that are prevalent at that time” (Jacobs, 2009: 44), to a mythological belief that “large data sets offer a higher form of intelligence and knowledge” (Boyd and Crawford, 2012: 663). In a general sense, spatial Big Data is Big Data that incorporates digital location information. Currently, the majority of spatial Big Data is generated through the use of location-based services found on mobile devices (Laurila et al., 2012). It is highly valued for providing “rich information” about the lives of individual end-users, such as my location and penchant for South Asian food (Long et al., 2012) and sits at the “intersection of people, places, and technology” (Evans, 2013: 3).

Despite the concerns over Big Data’s influence on society raised by critical scholars (Boyd and Crawford, 2012; Crampton et al., 2013; Dalton and Thatcher, 2014; Wilson, 2012), for many “[T]he big ethical issue … is that nobody thinks this is an ethical issue” (Paul, 2013). Despite a literature that stretches back to at least 1997 (Cox and Ellsworth, 1997), Big Data advocates and practitioners can avoid ethical and social considerations by framing the field as perpetually new and innovative, thus legitimizing itself as natural and inevitable (Leszczynski, 2014; Puschmann and Burgess, 2014).

We undermine this ahistorical myth by contextualizing spatial Big Data, charting a recent history through a confluence of geographic, capitalist, and technological interests and impetuses. Spatial Big Data did not emerge from a vacuum, but presenting it that way helps attract significant capital investment. Such an ahistorical approach facilitates a simplistic and self-interested recounting of spatial knowledge that obscures the asymmetric relations of power and profit it produces. In particular, we build on recent critical work on the geoweb² and neo-geography (Barnes and Wilson, 2014; Kingsbury and Jones, 2009; Leszczynski, 2014; Wilson, 2012), and older critical Geographic Information Systems (GIS) work on geodemographics (Goss, 1995a; Phillips and Curry, 2003; Graham, 2005), to begin to develop a historically grounded critical data studies approach (Dalton and Thatcher, 2014).

Spatial Big Data is big business. Consumers bought approximately 1.3 billion location-aware smartphones in 2014 alone, each of which collects loads of personal data including location, purchases, status updates, social media connections, calendars, calls, web-browsing, etc. (Arthur, 2014). Companies that collect and process large consumer datasets with a geographic component are increasingly valuable.³ According to BIA/Kelsey senior analyst Michael Boland, location-aware applications offer the “holy grail of advertising:” to be able to tell if any given ad resulted in the targeted consumer going to the advertiser’s store (Peterson, 2013). Location data is a “hot commodity” (Profitt, mobile technology expert, quoted in McBride and Oreskovic, 2013) as it, and the companies built around its creation, analysis, and control, are valued for the targeted advertising opportunities that their data makes possible.

Spatial Big Data isn’t just business, it’s “big” science as well. Universities are forming partnerships with private industry and government agencies to develop algorithms for analyzing “big” spatial datasets. Graphics computer chip manufacturer NVIDIA recently launched the NVIDIA CUDA Center of Excellence Program, which “recognizes, rewards, and fosters collaborations” with research institutions such as UNC-Charlotte’s Center for Applied GIScience (NVIDIA Corporation, 2014a, 2014b). As Kitchin (2014b) suggests with Big Data in general, spatial Big Data is reshaping science (Goodchild, 2013; Gorman, 2013) and society as marketers (Ratner, 2004), urban planners (Townsend, 2013), political analysts (Ansolabehere and Hersh, 2012), and national security agencies (Crampton et al., 2014) use it to understand, model, and attempt to shape the world.

Such initiatives are administratively valorized and well-funded, but often give little consideration to the broader impacts of Big Data science. In this article, we contextualize recent spatial Big Data developments within the longer history of geographically targeted marketing. Historically situating spatial Big Data opens a possibility to learn from existing critical approaches and to develop new ones with the promise of better informed research, critique, and resistance involving Big Data. To that end, the article proceeds in three sections: First, we detail how both geodemographic and spatial Big Data analyses attempt to quantify social identity, though spatial Big Data promises an epistemic break by focusing on an individual person, not a geodemographic area. Second, the shared history illustrates three shared logics that shape how spatial Big Data emerges from the milieu of geodemographic marketing: their market orientation, technological black-boxing, and the promises of ever more fine, ever more relevant analysis. Third, with that foundation, we present approaches to geodemographics, both applied and theoretically oriented, to better understand spatial Big Data. Together, these sections situate spatial Big Data in terms of the past, highlighting its underlying structural logics, issues, and limits.

All in the family: Geodemographics’ and Big Data’s shared foundations

Data is getting big(ger) (Farmer and Pozdnoukhov, 2012). As data storage capacity gets larger and computation faster, the exact technical composition of “big” has endured a “relentless march from kilo to mega to giga to tera to peta to exa to zetta to yotta” and beyond (Doctorow, 2008). For this reason, while there exist a myriad of definitions of Big Data (cf. Kitchin and Lauriault, 2014; Laney, 2001; Manyika et al., 2011; Mayer-Schonberger and Cukier, 2013, etc.; for a general review of 12 definitions see Press, 2014) most emphasize data that stress existing technology, often in terms of “three Vs:” data volume, velocity, and variety (Horvath, 2012; Laney, 2001). At the same time, Big Data promises more than simply large data sets. For its boosters, it has created a “breathtaking time in science” (Frankel and Reid, 2008: 30), in which the “enterprise become[s] a full-time laboratory” (Bughin et al., 2010). In a recent webinar, HP CEO Meg Whitman suggested that Big Data is going to quite literally transform “everything” (Whitman and Youngjohns, 2014). As recent critical scholarship shows, such views are modern myths (Boyd and Crawford, 2012) with their own set of epistemological commitments (Thatcher, 2014). Even as few practitioners focus on the ethics of spatial Big Data (Paul, 2013), popular (Marcus and Davis, 2014), academic (Kitchin, 2014a), and state (Executive Office of the President, 2014) sources are beginning to question both the efficacy and morality of Big Data. These critiques begin to explore how data are always expressions of power (Wilson, 2014a) that are never ontologically prior to their interpretation (Boellstorff, 2013). Spatial Big Data has forerunners in other modern attempts to represent, model, and ultimately produce social geographies of consumption. Situating its preconditions in terms of a history of geodemographics makes clear the shared structural logics, criticisms, and resulting basis for a promised epistemic break in spatial Big Data.

Proponents of geodemographics define it as “the analysis of socio-economic and behavioral data about people, to investigate the geographical patterns that structure and are structured by the forms and functions of settlements” (Harris et al., 2005: 225) or simply the “analysis of people by where they live” (Sleight, 1997: 6). It can involve a range of topics including policing or urban planning, but in practice, geodemographics is chiefly dedicated to consumer profiling and opinion polling based on residency (Burrows and Gane, 2006; Longley, 2005; Sleight, 1997). To develop a new geodemographic system, experts identify a number of statistical socio-economic clusters (profiles) based on dozens of indicators from demographic and/or consumer datasets. Similar clusters are lumped into labeled groups which can range from the “Upper Crust” to “Affluent Achievers” to “Thriving Greys” to “Hard-Pressed Families” to “The ‘Have-Nots’” (Batey and Brown, 1995: 94; Harris et al., 2005: 13). In practice, the geodemographic system uses a proprietary algorithm to assign a cluster/group designation to each geographic area, such as a postal code, in a study region. The underlying geographic logic is that people tend to reside near similar people, making for a socio-economically homogeneous geographical unit. This is typically expressed in the literature in terms of deterministic clichés such as “birds of a feather flock together” (Burrows and Gane, 2006; Flowerdew and Leventhal, 1998; Harris et al., 2005; Longley, 2012; Nelson and Wake, 2005) and “You are where you live” (Burrows and Gane, 2006; Leslie, 1999; Mitchell and McGoldrick, 1994; Phillips and Curry, 2003). Once applied, the classification is a social profile for the postal code, allowing companies and public agencies to better allocate their resources geographically. For example, a company selling luxury cars can focus its advertising on postal codes identified as “Thriving Greys” whereas sub-prime mortgage lenders can market to postal codes identified as “Hard-Pressed Families.”
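The classification step described above can be illustrated with a deliberately simplified sketch. Commercial geodemographic algorithms are proprietary, so this is not any firm's method: it assumes plain k-means clustering, only two invented indicators per postal code (real systems use dozens, usually standardized first), and hypothetical postal codes and values. Only the two group labels are taken from the literature quoted above.

```python
from math import dist

# Hypothetical indicators per postal code: (median income in $1000s, median age).
# Codes and values are invented for illustration only.
AREAS = {
    "A1": (95.0, 58.0),
    "A2": (90.0, 61.0),
    "B1": (32.0, 34.0),
    "B2": (28.0, 37.0),
    "B3": (35.0, 31.0),
}


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group ends up empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def classify(areas, centroids, labels):
    """Give every postal code the label of its nearest cluster centroid."""
    return {code: labels[nearest(p, centroids)] for code, p in areas.items()}


# Fixed starting centroids keep the example deterministic.
centroids = kmeans(list(AREAS.values()), centroids=[(100.0, 60.0), (30.0, 30.0)])
profiles = classify(AREAS, centroids, labels=["Thriving Greys", "Hard-Pressed Families"])
print(profiles)
```

Note that the output is one label per postal code: every resident of an area inherits the same profile, whatever their individual circumstances.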

Geodemographics practitioners point to their own precursors in Charles Booth’s maps of poverty in London in the 1880s–1890s, the later Chicago School of Sociology, and mid-century factorial ecology and social area analysis (Harris et al., 2005; Singleton and Spielman, 2014). Geodemographics began to develop as a field and thereafter as an industry in the mid-20th century amidst geography’s quantitative revolution. Scholars, most notably William Warntz, developed forms of spatial analysis based on social physics, a monistic idea that social relations follow the laws of physics, to analyze geographic areas using contemporary computers (Barnes and Wilson, 2014; Warntz, 1964). In the early 1960s, newly granular data in the form of ZIP codes in the US and similar neighborhood data in the UK allowed academic social researchers to study areas comprising 15,000 people or fewer. Researchers in both countries, including Jonathan Robbins in the US and Richard Webber in the UK, developed algorithms using that demographic data to identify areas in need of public subsidies (Harris et al., 2005).

Direct marketers quickly took notice of how these methods could be used for targeted advertising. Both Robbins and Webber entered the private sector as geodemographics experts. Through the 1970s and 1980s, they and others built the geodemographic powerhouse firms Claritas, CACI, and Demographics Inc. (now Acxiom). Analyses in the United Kingdom often used general-purpose, open national classification systems at fine geographic scales, whereas analyses in the United States tended to be more topically focused and closed, utilizing coarser spatial scales, but more demographic clusters (profiles) (Singleton and Spielman, 2014).

Technologically, the implementation was the same. This nascent geodemographics industry grew in connection with contemporary developments of GIS. Connecting tabular and spatial data in a GIS facilitates the spatial designation of geodemographic profiles to areas and subsequent geographic analysis. However, geodemographics had little connection with academic geography research, a tendency apparent in the small number of publications about geodemographics in academic geography journals, particularly in the United States (Harris et al., 2005: 228, 241; Openshaw, 1989; Singleton and Spielman, 2014). By the early 1990s, the combination of unprecedented computing power in GIS, growing market demand, and readily available capital fed an explosion in the geodemographics industry. Geodemographic services included profiles of existing customers to identify those most likely to make more purchases, generating new business through direct mail marketing, credit scoring, analyzing media markets for advertisers, survey design, and planning (Mitchell and McGoldrick, 1994; Phillips and Curry, 2003).

The networked consumer

In the late 1990s and early 2000s, web-based companies outside geodemographics, such as Overture, Yahoo!, and Google, revolutionized targeted marketing. Instead of classifying neighborhoods, they used non-geographic means such as users’ web searches and web browsing histories to offer personalized batches of advertisements (Battelle, 2005). As early as 2003, it was clear to many of these technology firms that location held great potential for personalized marketing (Dalton, 2013). In the mid-2000s, early “Web 2.0” firms, such as Facebook and Twitter, applied this personalized approach using additional user-contributed data such as status updates and tweets in conjunction with location. Tech companies began acquiring or developing geographical functions for their services by adding location information to their consumer data collection. These web-based location-linked services became the basis for many of today’s geoweb applications including Google Maps, Foursquare, Twitter’s geographic services, and millions of secondary geoweb applications (Hoetmer and Marks, 2013). At the same time, firms that performed traditional geodemographic analyses, such as Claritas, Epsilon, and Acxiom, emulated the technology firms by offering individualized consumer profiles that involve spatial data (Kitchin, 2014b).

Critical scholars today describe spatial Big Data as part of the shift in “production, dissemination, and institutionalization” of spatial media (Leszczynski, 2014: 62) occurring as mobile applications move from simply capturing consumption patterns to attempting to actively shape them (Wilson, 2012). The difference is not in the intent, as shaping consumption patterns has long been the goal of geodemographic marketing (Goss, 1995a), but in the scale and methods through which spatial Big Data functions. Both processes are market driven and demand “a correspondence between the dispositions, attitudes, and socioeconomic characteristics of a significant majority of the data subjects and the classification itself” in order to generate value (Uprichard et al., 2009: 2827). However, geodemographics uses the neighborhood or postal code, an area, as the unit of measure, whereas spatial Big Data promises the same outcomes at the individual level.

Spatial Big Data commodifies the individual. One’s personal locations, dispositions, attitudes, and socioeconomic characteristics are the object of analysis, rather than geographically located populations. Industry hype promises this shift will create the “killer application of the 21st Century”: individually targeted location-specific ubiquitous advertising (Krumm, 2011). Such applications use location data, combined with other information, to serve advertisements. For example, if an application on my phone knows that I’m at Lowes, it could remind me that my mother’s birthday is coming up and suggest I purchase some tomato plants for her garden. If the application records me moving at jogging speed over long distances, it might advertise new running shoes or a gym with a track in the winter. Mobile devices, enabled with location-tracking, accelerometers, pedometers, and even heart rate monitors, transform individual people into both sensors of the surrounding world (Goodchild and Li, 2012) and sensors of themselves. Wolf, Kelly, and others in the Quantified Self movement engage this shift on a personal level (Wolf, 2011). On a wider level, those who purchase, analyze, and leverage this data collect an individual’s information to target ads at that individual person, not families or people residing in the same postal code. More flows of data, not only from one’s purchase history, but also from one’s locations, speed, and even heart rate, facilitate more personalized targeting.

On a fundamental level, both geodemographics and spatial Big Data assume that social identity can be reduced to measurable characteristics that can be algorithmically classified. Furthermore, as with social physics, this assemblage of personal data is predictive, or can be made to be so, commodifying it as valuable in marketing and ultimately making a sale. While geodemographics and spatial Big Data are hardly alone on this point, this commodification proceeds in specifically geographic ways. A neighborhood’s assigned geodemographic class or an individual’s assembled profile can become a self-fulfilling prophecy as people respond to advertising as consumers. Feedback loops may form in which consumers are presented with options based on available data. Their choices are then used by marketers to target subsequent advertisements, advancing some options and limiting the consumer’s perceived choices (Lohr, 2012a). In this way, geodemographics and spatial Big Data do not merely represent or target people, they produce social relations and geographic spaces of consumption (Burrows and Gane, 2006; Goss, 1995a, 1995b; Zook and Graham, 2007). Advertising, buying, and subsequently using running shoes may lead a consumer to walk more often, contributing to demand for tracks in parks and more athletic shoe production. That consumer may also be manipulated into spending too much of his/her income on shoes. These processes are already ongoing in everyday life; for example, the online dating service Match.com aggregates and analyzes a variety of individual data points, including location, to determine who is romantically matched with whom (Lohr, 2012b). The stakes of both geodemographic and Big Data analyses are not merely about data, they are about who we are and how we live.

Shared traits: Markets, black boxes, and epistemologies

Market orientation, market epistemology

Building from the shared foundational concept of a measurable, malleable social geographic identity, both geodemographics and spatial Big Data rely on exploratory correlations to arrive at analytical outcomes, instead of more rigorous geographic or sociological methods. As far back as the 1970s, Richard Webber proposed that geodemographic classification was an inductive approach for identifying new insights (1975). “In this regard, geodemographics is regarded as a data exploration tool, not a statistical method of hypothesis confirmation or rejection” (Harris et al., 2005: 15). Regardless of these limitations, geodemographic methods provide sufficient grounds for corporate decision making.

[Geodemographics] has been used in the business sector for 25 years now … and it is still here, stronger than ever! Given the nature of business decisions, the cost of using geodemographics would not be borne if the technique could not prove its worth. (Harris et al., 2005: 225)

The goal is not to understand geographic phenomena, but to be able to effectively target consumers using geographic criteria.

Spatial Big Data is similarly market oriented. Correlative algorithms identify people geographically who are likely interested in a given product; explanation is not the point. At an extreme, this view argues “Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity” (Anderson, 2008). In this formulation, Google “conquered the advertising world” by leveraging more and “better” data for profit without concerning itself with explanation (Anderson, 2008). Anderson’s “end of theory” argument has been criticized both by Big Data evangelists (Silver, quoted in Marcus and Davis, 2014) and critics (Bollier and Firestone, 2010). Though some scholars attempt to critically evaluate the epistemology of Big Data (Boyd and Crawford, 2012; Burns and Thatcher, 2014; Dalton and Thatcher, 2014; Miller and Goodchild, 2014), such debates also contribute to the myth of its newness, serving a market function by building buzz around Big Data as a business necessity, a must-have for competitive targeted advertising. The underlying market orientation, with its goals of predictive, actionable data over austerely correct information, remains in both Big Data and spatial Big Data.

The focus on data correlation for capital gain is apparent in Facebook’s recent deletion of “fake” accounts and imposition of more stringent identification requirements for “authentic” personal accounts. For Facebook, value is generated through a “like economy” whereby a single user’s social actions⁴ are instantly turned into valuable consumer data and enter multiple cycles of multiplication and exchange (Gerlitz and Helmond, 2013). Facebook feared “fake likes” from automated accounts, because they meant that “like” data no longer correlated to reality, a gap that could potentially cost the company revenue if advertisers lost faith in the marketing value of “likes” (Fung, 2014). Therefore, Facebook radically altered who could participate and thus what data is created under the impetus of its explicitly profit-driven orientation.

A similar impetus drove Foursquare’s recent reinvention as a geographically indexed place recommendation service. With spatial Big Data, the use of mobile smartphones with GPS technology transformed physical location and experience into a digital commodity that may be bid for, bought, and sold (Thatcher, 2013). Leveraging the data of their 45 million users, Foursquare “can build a cache of personal information [the company] then uses it to provide [users] with suggestions on where to go in the future” (Larson, 2014). By knowing where users are and what they like, Foursquare can sell ads targeted at the individual, as represented by his/her data. Achieving the “killer app,” Foursquare offers the possibility that “when a shopper goes to Best Buy, Samsung will be able to send them an ad for its TVs” (Frier, 2013). Building the “promise of a promise” of profits (Cooper, 2007: 142), Foursquare received a $41 million loan and $50 million in additional investment (Delo, 2014) to achieve this goal. As Foursquare moved more aggressively into spatial marketing, it expanded its ad sales staff four-fold and allowed all of its registered merchants to purchase location-targeted ads (Frier, 2013). Its profits have allegedly grown from $2 million in 2012 to somewhere between $15 and $20 million in 2013 (Carr, 2013). While geodemographics operates on a different geographical scale than Foursquare, they share a correlative epistemology of “accurate enough to turn a profit.”

“Big” black boxes

The most valuable assets of both IT companies like Foursquare and geodemographics firms like Claritas are their data and the algorithms that produce it. Companies keep such data and algorithms as proprietary trade secrets, resulting in black boxes: an analytical technology defined by “its inputs and outputs and not on its internal complexity” (Latour, 1999: 304). As a result, “scientific and technical work [to create that technology] is made invisible by its own success” (Latour, 1999: 304). For both spatial Big Data and geodemographics firms, black-boxing analytical algorithms and the resulting data is an act of privatization, ensuring that outsiders cannot know in detail how geodemographic or spatial Big Data analytics work. Privatization is a core means of the commodification process (Castree, 2003) and is part of a framework that allows geodemographic and spatial Big Data systems to accumulate capital. In practice, input data is fed into a company’s proprietary “secret sauce”, the algorithms which process and then present the data (Miller, 2009).

The commodification of the analytic process itself, of digital information rather than a physical good, creates a monopoly on data output. This commodified, proprietary output data may then be used to target ads internally, or portions of it may be sold to other firms. Those who buy ad services and data, but who do not know the algorithms that go into its parsing, cannot know the details of how it was created. However, following the market epistemology, so long as it correlates well enough to the real world, it holds value.

The high valuation of software and data in a market context reinforces companies’ drive to classify their analytical procedures and resulting data as secret intellectual property to protect their assets. After all, it is much easier for a competitor to steal and replicate an algorithm than an oil rig or silicon chip. Spatial Big Data and geodemographics share the market orientation and resulting black-boxing of analysis. The exact algorithms that firms such as Claritas used to classify postal codes in the 1990s were closely guarded trade secrets, just as “all of the specific metrics and data-porn that Google considers of competitive significance” that drive its core search algorithm, PageRank, are secret today (Doctorow, 2008: 18).

The black-boxing of analytics and data for the accumulation of capital has profound effects on who has access and thus what can be known through that process. The market orientation of both geodemographics and spatial Big Data means that the accuracy of a company’s output data is verified in competitive marketplaces, rather than through more formal scientific or similar scholarly verification processes. As a result, data is “good enough” because it facilitates competitive advantage. On a deeper level, Burgess and Bruns (2012) show that the very structure of data, and how it is accessed through networked streams of data, shapes what can be known through said data. Outside researchers may be unable to conduct research because a data source is cut off for market reasons. For example, once a market developed for Twitter’s data services, it ended its long-standing policy of making its data stream open to researchers free of charge (Thatcher, 2014). Currently, its full, unsampled “firehose” stream is open only to hand-picked researchers selected by Twitter and its resale partner GNIP (GNIP, 2014) and those who can afford to purchase the data. Though rates fluctuate, in August of 2014, DataSift⁵ was selling tweets from its firehose access at $0.10 per 1,000. At the current rate of approximately 500 million tweets a day,⁶ purchasing a full, undiscounted index of Twitter data from DataSift would cost roughly $50,000 a day. Blackboxed tweets hold market value, leaving Twitter little incentive to share them with outside analysts.
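The cost estimate above is simple arithmetic, reproduced here as a short sketch. The two input figures are those quoted in the text for August 2014; the function name and the annual extrapolation (which assumes, unrealistically, that both price and volume stay constant for a year) are our own illustrative additions.

```python
# Figures as quoted in the text (August 2014); both fluctuate in practice.
PRICE_PER_1000_TWEETS = 0.10   # US dollars, DataSift's quoted rate
TWEETS_PER_DAY = 500_000_000   # approximate daily tweet volume


def daily_cost(tweets_per_day, price_per_1000):
    """Undiscounted cost in dollars of buying one day's full stream."""
    return tweets_per_day / 1000 * price_per_1000


cost_per_day = daily_cost(TWEETS_PER_DAY, PRICE_PER_1000_TWEETS)

# Illustrative extrapolation only: assumes constant price and volume.
cost_per_year = cost_per_day * 365

print(f"${cost_per_day:,.0f} per day, ${cost_per_year:,.0f} per year")
```

Under those assumptions, a full year of undiscounted access would run to roughly $18 million, well beyond the reach of most academic research budgets.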

This market logic inhibits scientific work by preventing researchers from emulating an approach, much less replicating its results (Longley, 2012). Conversations about understanding how or to what standards the knowledge is produced, much less built-in problems or biases, are cut off by the trade secrets of production. As a result, the commodified trade secrets of geodemographics and spatial Big Data conceal some of their own analytical limits.⁷

One foundational limit remains clear: both the market orientation and its resulting black-boxing rest upon a belief in the quantification of representation. For geodemographics, this quantification occurred at the level of an areal unit, while spatial Big Data promises to go a step further, to fully represent an individual—and allow for the market targeting thereof.

From quantified space to the quantified individual

The ties between geodemographics and spatial Big Data go beyond their market orientation and privatization of analyses and data. The structure and inherent limits of geodemographics laid the epistemological groundwork for spatial Big Data as it exists today. Geodemographics demonstrated the market value of geographically targeted advertising, but it suffered from two core epistemological uncertainties that undercut its promises: first, the lack of diverse data sources and second, the heterogeneity of human populations. Spatial Big Data is the logical outcome of long-running attempts to resolve these two built-in uncertainties of geodemographics. It does so through the promise of representing a fully measured, quantified, geolocated individual, rather than the homogenized, quantified areal units of geodemographics.

For decades, geodemographic analyses in the US and Europe relied primarily on publicly generated data, chiefly census data, but also voting records, housing registries, and other, similar sources (Batey and Brown, 1995). Over that time, geodemographic experts recognized populations to be increasingly diverse and that people had increasingly selective tastes as consumers. Modeling heterogeneous tastes required additional data beyond the census and its categorical limits (Longley and Harris, 1999; Phillips and Curry, 2003). Furthermore, relying on governmental data created a boom and bust cycle around governmental data releases and subsequent geodemographic activity (Mitchell and McGoldrick, 1994). As a commodified product, geodemographic firms needed to produce a continual stream of sales (Leys, 2001) that accurately represented increasingly selective consumer tastes. To meet this need, geodemographics needed to diversify its data sources and produce continually relevant-looking results. To this end, the 1990s saw geodemographic firms increasingly turning to “lifestyle” data from consumer surveys, retail purchases, and credit records for their analyses (Debenham et al., 2003; Longley and Harris, 1999; Phillips and Curry, 2003). New data sources involved additional analytical issues. Unlike a census, lifestyle data necessarily represents an incomplete population. It also required other kinds of quantification. As opposed to age or median family income, these inputs involved criteria such as interests in “home baking” or “theatre” from consumer surveys. Practitioners quantified such interests in variables based on standardized check box answers on the surveys (Longley and Harris, 1999; Phillips and Curry, 2003).

Beyond the push for more diverse, continually accessible data sources, practitioners recognized epistemological problems with the areal units at the heart of geodemographics. First, such areal units fall victim to an ecological fallacy, an “error of deduction that involves deriving conclusions about individuals solely on the basis of an analysis of group data” (O’Dowd, 2003). Geodemographics ascribes common, quantified characteristics to everyone in a given area, such as a postal code, based on its analysis. Few areas are actually that socially homogeneous. This problem was recognized as early as the 1890s by Charles Booth in his attempts to map economic classes by city block in London. His cartographic solution defined eight economic classes and represented a given street as a pure- or mixed-class using seven different colors (Booth, 1902; Harris et al., 2005). Booth was prescient in recognizing the problem. However, his fix, along with later increasingly complex geodemographic models, did not resolve the ecological fallacy that haunted geodemographics for the next 120 years. Second, geodemographic analyses are subject to the modifiable areal unit problem (MAUP). Since geodemographic areas are not naturally occurring, the geographic scale and boundaries between studied areas can affect analytic results (Openshaw, 1984).
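The two problems can be made concrete with a small worked example. The following Python sketch uses invented incomes for eight households along a hypothetical street; the figures and function names are illustrative assumptions, not data or methods drawn from the geodemographic literature cited here. Aggregated into two zones, the street appears uniformly middle-income, a description that fits no actual household (the ecological fallacy). Changing the zone size or shifting the boundaries changes the picture entirely (the MAUP).

```python
from statistics import mean

# Hypothetical toy data: eight households in street order, each with an
# invented yearly income (in thousands). Not taken from any real source.
incomes = [20, 22, 80, 85, 21, 23, 82, 84]


def zone_means(values, zone_size):
    """Average the values within consecutive zones of `zone_size` households."""
    return [mean(values[i:i + zone_size]) for i in range(0, len(values), zone_size)]


def shifted_zone_means(values, zone_size, offset):
    """Same zone size, but with every boundary moved by `offset` households."""
    head = [mean(values[:offset])] if offset else []
    return head + zone_means(values[offset:], zone_size)


two_zones = zone_means(incomes, 4)              # coarse areal units
four_zones = zone_means(incomes, 2)             # finer areal units
shifted = shifted_zone_means(incomes, 4, 2)     # coarse units, moved boundaries

print(two_zones)   # [51.75, 52.5]
print(four_zones)  # [21, 82.5, 22, 83]
print(shifted)     # [21, 52.25, 83]
```

In this toy case, both coarse zones average roughly 52, although every household in them earns either under 25 or at least 80. The finer zones and the shifted zones, built from the same households, instead suggest sharply alternating poor and affluent areas.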

Geodemographics practitioners from the 1970s through the early 2000s attempted to address these problems with perpetually smaller geographic units. Perceived accuracy was valuable for geodemographics firms, and smaller units looked more accurate. In the US:

They made their locational analysis more and more precise in the desperate belief that at some level – if not 40,000 people then 1,000 people, and if not there, well, then 40 people – they could discover and resuscitate the ideal refuge of a like-minded group of neighbors. (Phillips and Curry, 2003: 144)

Pushing the limits of available data and computational resources, geodemographics could not escape the ecological fallacy. Ultimately, only granularity at the individual level could produce a truly homogeneous unit of measure within the context of today’s multitude of subcultural consumer niches, but it remained out of reach. As early as 1989, Openshaw dreamed of modeling individualized consumer behaviors, but thought it was impossible with contemporary resources (1989). There was a market need for quantified, geographically individualized targeted marketing, but no technological means to achieve it.

Spatial Big Data purports to resolve both of geodemographics’ core issues. It offers continuous streams of diverse data generated at the individual level. While Big Data often involves public data sources, like national level censuses, it often also includes other, more granular consumer information such as credit card transactions, frequent customer programs, web-browsing histories, and a variety of social media information such as Facebook profiles, Twitter accounts, and Instagram feeds. For example, Foursquare recently began using its data on users to target them across platforms. Thus, a Foursquare user may see the same advertisement from Foursquare on their phone and Facebook on their computer (Delo, 2014). When used for purposes akin to geodemographics, spatial Big Data adds an additional layer of geo-located information given off through the use of location-enabled devices, such as smartphones, and traditional Internet Protocol (IP)⁸ addresses. These range from an individual’s reported GPS location at a given time to their phone maintaining contact with cell towers, to their recorded locations at the times of their last 1000 tweets. By linking location information from multiple sources across devices, as the case of Foursquare and Facebook makes clear, companies have begun to utilize spatial Big Data for marketing to individuals located in both time and space.

In its totality, Big Data necessarily includes more diverse discourses than geodemographics. For example, Big Data includes the abductive research currently being done on genes (O’Driscoll et al., 2013) and the information generated by the Large Hadron Collider (Doctorow, 2008), both areas where geodemographics has no purview. However, spatial Big Data, generated through mobile device use and for the purpose of targeted advertising, shares foundational assumptions and addresses long-running problems of geodemographics. Ubiquitous, individualized advertising irons out the ecological fallacy of geodemographics, while the continual stream of sensor and consumption information given off by smartphone use solves geodemographics’ boom-bust cycle of data relevancy. Both new companies, such as Foursquare and Twitter, and traditional geodemographic powerhouses, such as Claritas and Acxiom, market their spatial Big Data services as new and powerful. While the pitch of technological new-ness makes sales, it doesn’t explain how spatial Big Data works, its limits, or possible consequences. Spatial Big Data’s entanglement with geodemographics illuminates the foundational logics of a market-based epistemology, proprietary algorithms and data, and the drive to geographic individualization (Table 1). Given this shared ancestry, it also opens possible lines of criticism into spatial Big Data.

Table 1. Geodemographics and spatial Big Data.

	Geodemographics	Spatial Big Data
Geographic scale	Postcode (subject to the ecological fallacy & MAUP)	Individual
Epistemology	Market-based: “Given the nature of business decisions, the cost of using geodemographics would not be borne if the technique could not prove its worth.” (Harris et al., 2005)	Market-based: “Better than Google … better than Yelp.” (Hern, 2014)
Analytics are blackboxed trade secrets (unless open source)	Yes	Yes

What can critical data studies learn from geodemographics?

The parallels and connections between geodemographics and spatial Big Data begin to situate the latter within its historical context. As these histories of Big Data and its connections to social physics (Barnes and Wilson, 2014) and geodemographics emerge, its rhetoric of exceptionalism and newness is diminished. Drawing concrete lines between the past and present opens the door to a more rigorous, critical analysis of not only what is, but what might be (Horkheimer, 1995; Wilson, 2015). When spatial Big Data is situated, earlier critical assessments of geodemographics provide useful points of reference. Building from these earlier critiques, this final section outlines both practical and theoretical approaches to spatial Big Data moving forward.

Practical approaches

The proponents of geodemographic analysis continue to address its epistemological uncertainties through the narrowing of concern to methodological issues. They acknowledge the problem of the data’s scale, discussed above, noting concerns with the ecological fallacies of data and the modifiable areal unit problem (Debenham et al., 2003; Duckham et al., 2001). Nevertheless, such foundational questions are set aside as geodemographic systems’ relation to the real world and hence utility remain judged by their competitiveness on the market (Burrows and Gane, 2006; Harris et al., 2005). This can have unforeseen consequences when, due to funding cutbacks and neoliberalization of geographic government services (Burns, forthcoming; Leszczynski, 2012), commercial geodemographic systems, with their inherent biases and gaps, are applied to public sector issues, such as public housing and elderly care.

While scientists have raised concerns over Big Data’s methodological issues (Lazer et al., 2014), spatial Big Data practitioners lean on market justifications for their products akin to their geodemographic brethren. In the competitive market of multiple mobile applications, Foursquare sells itself as simply better than its competitors, rather than absolutely correct. According to their CEO, they are “better than Google in a lot of cases” and “better than Yelp almost always” (Hern, 2014).

As scholars incorporate spatial Big Data into their analyses, some have articulated other methodological criticisms. First, blackboxing of methods continues to create roadblocks for researchers. Full spatial big datasets can be difficult and costly to access, if they are available at all. In addition, Big Data researchers outside key companies typically do not have access to the core, proprietary algorithms that process and interpret the data. Consequently, the limits of their analysis may be shaped or inhibited in ways that may remain entirely opaque to the researcher. Second, unlike total population data such as a census, geodemographic “lifestyle” data tends to offer less than a complete population, presenting issues of representation and bias in the data around class, language, and use of technology. Recent research has demonstrated that spatial Big Data, like Yelp reviews, has blind spots akin to the digital divide (Baginski et al., 2014). Furthermore, the user-generated nature of many sources makes it extremely difficult and potentially impossible to assess the veracity of such data, at the very least requiring new techniques of quality assurance (Goodchild and Li, 2012). In a market context, such data suffices, so long as it is tied to profit generation, but it presents challenges for the practice of scholarly geographic analysis. Such issues raise concerns that geodemographics in the age of Big Data is becoming less “scientific” (Longley, 2012: 2228), even as that fear overlooks the shared history of a market-oriented epistemology.

Theoretical approaches

Other scholars offer more theoretical critiques of geodemographic analysis that retain relevance for critically understanding spatial Big Data. Goss (1995a) outlines how geodemographic systems are a strategy for producing a “control society” and the social subjects within that context. Surveillance technologies and practices first collect data about individuals that is classified within geodemographic systems. The resulting socially classified knowledge defines social and geographical subject positions through advertising, reproducing those social, geographical categories in people’s material consumptive practices. In this way, a geodemographic system “produc[es] the conditions of its own reproduction” (Goss, 1995a, 1995b). At stake in such systems is not merely the invasion of privacy, but also the very autonomy to choose individual paths and life outcomes. For example, if based on census data, a particular neighborhood fits into the “Hard-Pressed Families” geodemographic class, that neighborhood may become the target of a direct mail campaign for sub-prime mortgage refinancing. As residents of that neighborhood refinance their properties, the neighborhood itself is materially reproduced by and in the terms of that geodemographic class, making it ever more ripe for subsequent sub-prime marketing.

Though powerful, Goss’ critique is contingent on geodemographic targeting and the resulting advertising actually functioning as well as proponents claim. However, geodemographics need not be wholly effective to raise social and ethical questions.

Lyon (2002) argues that geodemographics relies on a “phenetic fix” of classification with consequences for social opportunity. Geodemographic systems “capture personal data triggered by human bodies and … use these abstractions to place people in new social classes of income, attributes, preferences, or offences, in order to influence, manage or control them” (Lyon, 2002). In practice, some classes of people will be more promising for particular ends than others and will therefore garner more or better advertised opportunities, while others are ignored or offered inferior options. Parker et al. (2007: 917) highlight the recursive nature of this process: “Class places people into different types of places,” in turn producing the spaces of particular classes. “The application and impact of geodemographic classification recursively reinforces this spatialization of class” (2007: 917) by quantifying and codifying it through the presentation of advertising and services meant for that recursively constituted class. Within trends towards more splintered urban realities (Graham and Marvin, 2001), Phillips and Curry (2003) suggest a darker consequence: codifying classes through geodemographic divisions constitutes a form of redlining based on geodemographic classes and geographic units, and therein a loss of the public domain.

Spatial Big Data’s production of social subjects presents similar issues, but at a more personalized scale. For example, a Big Data analysis of data from a smartphone locative application, such as Waze or Gas Buddy, would show that a particular motorist regularly drives a particular stretch of road. Based on that knowledge, gas stations along that route could offer advertisements or even discounts on food with the purchase of gas, producing or reproducing that motorist as a consumer subject at that location. If the motorist in question stops and takes advantage of a deal, that additional data is entered into the related datasets for subsequent marketing. Such a system has the possibility of creating a feedback loop wherein incentivized behavior is fed into the system to shape future marketed options (Lohr, 2012a). Much like geodemographics (Goss, 1995b), social identity is produced through a surveillance-heavy strategic system focused on consumption that by functioning, or even appearing to function, validates itself in the market.
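The logic of such a feedback loop can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions, not a description of any actual advertising system: the station names and starting counts are invented, the hypothetical system always advertises the place the motorist has visited most, and the motorist is assumed to accept every offer.

```python
# Hypothetical starting point: invented prior visit counts for one motorist.
visits = {"gas_station_A": 3, "gas_station_B": 2, "gas_station_C": 2}


def run_feedback_loop(counts, rounds):
    """Each round, advertise the most-visited place and log the resulting visit."""
    counts = dict(counts)  # copy so the caller's data is left unchanged
    shown = []
    for _ in range(rounds):
        target = max(counts, key=counts.get)  # current front-runner gets the ad
        shown.append(target)
        counts[target] += 1  # assumption: the motorist always takes the deal
    return counts, shown


final_counts, ads_shown = run_feedback_loop(visits, 10)
print(final_counts)  # {'gas_station_A': 13, 'gas_station_B': 2, 'gas_station_C': 2}
```

Under these assumptions, a one-visit lead is enough for a single station to receive every subsequent advertisement. The system's own outputs become the evidence that appears to confirm its initial classification of the motorist.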

The charge of class-based division and redlining is no less relevant to spatial Big Data, though again at a different scale. Even personalized spatial Big Data algorithms rely on some form of classification to match ads with consumers, such as calculated relevance or distance to a given consumer subject. Foursquare, for example, promises to target ads to users based upon “what they do, where they go, and what their friends are doing” (Carr, 2013). This may have material effects on where individuals go and what they consume. In the gas station example, someone who does not generate data through smartphone use will not receive the same discounted food prices. Just as geodemographics produced redlining based on its own classes, spatial Big Data has the ability to produce similar results on the individual level. Similarly, businesses in poor areas have less of a digital footprint with which to attract digitally savvy consumers (Baginski et al., 2014). Much like the practical, methodological concerns, critical approaches to geodemographics help highlight fundamental issues in spatial Big Data. Researchers must understand the inherent processes of surveillance, unequal opportunities, and limited, self-reproducing consumer subjects and spaces at work in these fields.

Conclusions for a critical data studies

Whatever the sales pitch, spatial Big Data is definitively tied to the problems and limits of its precursors. Drawing the connections to geodemographics highlights the shared foundational assumption of quantifiable, predictable social identity. From that common foundation spring shared traits of a market-based epistemology and black-boxing, as well as the problems of data diversity and scale that helped lead to current spatial Big Data applications.

As practices shaped by Big Data fade into the banality of everyday life, it is vital to remember the social contingencies that led to these services and technologies. For consumer users, this context is a means to prevent the complete naturalization of Big Data services. It is important to continually explore creative possibilities found within large data sets and to highlight the development of alternative relations to them.

For scholars and Big Data practitioners, the stakes are even higher. Basing research on individualized spatial data run through black boxed processes with an epistemology of market competition raises several serious issues. What research is too private or connects too many bits of personal information to be ethical? Who defines those standards? With as few as four spatio-temporal points necessary for unique identification (de Montjoye et al., 2013), the power of cutting edge data analytics has far outstripped the protections offered by traditional Institutional Review Boards, and no formal or legal ethical standards exist in the private sector. Beyond fundamental questions of what ethical data is, even with such data in hand, researchers must determine when the results of an analysis reflect internal algorithmic processes and what biases those processes bring. How can spatial Big Data studies be replicated and what is the measure of significance? Is “Better than Google” good enough (Hern, 2014)?
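The re-identification concern can be illustrated with a toy calculation. The sketch below is loosely inspired by, but far simpler than, the unicity measure reported by de Montjoye et al. (2013), and it should not be read as their method. The three traces, the place names, and the functions `matching_users` and `unicity` are all hypothetical. The calculation asks what share of every possible set of k known (place, hour) points matches exactly one person.

```python
from itertools import combinations

# Hypothetical traces: each user is a set of invented (place, hour) observations.
traces = {
    "a": {("home_A", 8), ("cafe", 9), ("office", 10), ("gym", 18)},
    "b": {("home_B", 8), ("cafe", 9), ("office", 10), ("gym", 18)},
    "c": {("home_C", 8), ("cafe", 9), ("office", 10), ("bar", 20)},
}


def matching_users(traces, points):
    """Return the users whose trace contains every point in `points`."""
    return [user for user, trace in traces.items() if set(points) <= trace]


def unicity(traces, k):
    """Share of all k-point subsets of the traces that match exactly one user."""
    unique = total = 0
    for trace in traces.values():
        for points in combinations(sorted(trace), k):
            total += 1
            if len(matching_users(traces, points)) == 1:
                unique += 1
    return unique / total


for k in range(1, 5):
    print(k, round(unicity(traces, k), 2))
# prints: 1 0.33, 2 0.61, 3 0.83, 4 1.0 (one pair per line)
```

Even in this invented example, where all three people share a cafe and an office, a handful of known points quickly separates them. Anonymizing names does little when the pattern of places and times is itself identifying.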

The technological basis of data collection and analysis also points to a variety of social issues. As Goss (1995b) and Parker et al. (2007) argued about geodemographics, spatial Big Data can create a societal feedback loop, creating individual subjects and a society whose views and actions reflect the limited choices that their technological devices optimize to their constructed class profile. Technology enables and constrains actions and thoughts (Feenberg, 1999), and we must ask whether our phones have unintentionally locked us into sets of epistemologies and identities; how these tools, which enable so much in our daily lives, simultaneously constrain what we know and what we do. Eating at an Indian restaurant? Purchasing the latest pair of running shoes? Furthermore, as Phillips and Curry (2003) pointed out concerning geodemographics, spatial Big Data need not even be successful in that pursuit to have serious social implications. Spatial Big Data presents not only a splintered urban environment (Graham and Marvin, 2001), but one of uneven development globally (Smith, 2008). Spatial Big Data and its analyses force consideration of data divides: between those who produce data and those who don’t (Kelley, 2014) as well as those who have the tools to analyze it and those who don’t (Andrejevic, 2014). The targeting and personalization that spatial Big Data facilitates reflect this uneven geography and social reach. Even in the US and UK, personalized spatial Big Data services by their very definition create different, unequal choices and opportunities, depending on who and where you are. Redlining in the 21st century need not be by neighborhood; it is individualized.

Any spatial Big Data initiative must be prepared to face these issues, for Big Data’s rhetoric of exceptional newness may distract from them, but it cannot resolve them. As scholars and practitioners using spatial Big Data, we are in part responsible for the knowledge and social relations that these issues create and re-create. In this context, it is crucial to develop critical (and self-critical) perspectives and approaches to spatial Big Data and subsequent technologies (Dalton and Thatcher, 2014). Just as GIS practitioners cannot stand aside from the processes and consequences of the technology (Crampton, 2010), we scholars and practitioners of spatial Big Data must evaluate our situatedness and positionality (Haraway, 1991; Harding, 2004) and the forms of knowledge and social relations we help produce. Such an approach entails more than refining practices. It requires reflexively analyzing spatial Big Data and its own analytics in context, not as indicators of some event, but as phenomena and epiphenomena in and of themselves (Wilson, 2014b). Establishing the historical, social context of a technology is a key step in demystifying and denaturalizing it. The historical pre-conditions for spatial Big Data set by geodemographics and earlier developments (Barnes and Wilson, 2014) allow us to learn from those earlier processes to better critically evaluate current technologies and knowledge. Big Data has historical antecedents and so too does critical thought on technology (Feenberg, 1999; Marcuse, 1982; O’Sullivan, 2006) and its spatial aspects (O’Sullivan, 2006; Pickles, 1995; Schuurman, 2000). These approaches can and should inform reflexive, critical engagements with spatial Big Data.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Anderson

(2008) The end of theory: The data deluge makes the scientific method obsolete. Wired. 23 June. Available at: http://www.wired.com/science/discoveries/magazine/16-07/pb_theory (accessed 24 October 2014).

Andrejevic

(2014) The Big Data divide. International Journal of Communication 8: 1673–1689.

Ansolabehere

Hersh

(2012) Validation: What big data reveal about survey misreporting and the real electorate. Political Analysis 20(4): 437–459.

Arthur

(2014) Smartphone explosion in 2014 will see ownership in India pass US. The Guardian: Global Development. Available at: http://www.theguardian.com/technology/2014/jan/13/smartphone-explosion-2014-india-us-china-firefoxos-android (accessed 14 June 2014).

Baginski

Sui

Malecki

(2014) Exploring the intraurban digital divide using restaurant reviews: A case study in Franklin County, Ohio. The Professional Geographer 66(3): 443–455.

Barnes

Wilson

(2014) Big data, social physics, and spatial analysis: The early years. Big Data & Society 1(1): DOI: 1177/2053951714535365.

Battelle

(2005) The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, New York, NY: Penguin.

Batey

Brown

(1995) From human ecology to customer targeting: The evolution of geodemographics. In: Longley

Clarke

(eds) GIS for Business and Service Planning, London: Longman, pp. 77–103.

Boellstorff

(2013) Making big data, in theory. First Monday 18(10): Available at: http://firstmonday.org/ojs/index.php/fm/article/view/4869/3750 (accessed 24 October 2014).

10.

Bollier

Firestone

(2010) The Promise and Peril of Big Data, Washington, DC: Aspen Institute, Communications and Society Program.

11.

Booth

(1902) Life and Labour of the People of London, London: Macmillan.

12.

Boyd

Crawford

(2012) Critical questions for big data. Information, Communication & Society 15(5): 662–679.

13.

Bughin J, Chul M and Nayika J (2010) Clouds, Big Data, and smart assets: Ten tech-enabled business trends to watch. New York, NY: McKinsey & Company. Available at: http://www.mckinsey.com/insights/high_tech_telecoms_internet/clouds_big_data_and_smart_assets_ten_tech-enabled_business_trends_to_watch (accessed 4 November 2014).

14.

Burgess

Bruns

(2012) Twitter archives and the challenges of “Big Social Data” for media and communication research. M/C Journal 15(5): Available at: http://journal.media-culture.org.au/index.php/mcjournal/article/viewArticle/561 (accessed 18 August 2015).

15.

Burns

(2015, forthcoming) The Inequality of the Geospatial Web: How Emerging Geographic Technologies Influence Humanitarian Practices. PhD Thesis, University of Washington, WA.

16.

Burns

Thatcher

(2014) What’s so big about big data? Finding the spaces and perils of big data. GeoJournal. Guest Editorial. Epub ahead of print October 2014. DOI: 10.1007/s10708-014-9600-8.

17.

Burrows P and Frier S (2013) Apple said to buy HopStop, pushing deeper into maps. In: Bloomberg. Available at: http://www.bloomberg.com/news/2013-07-19/apple-said-to-buy-hopstop-pushing-deeper-into-maps.html (accessed 24 October 2014).

18.

Burrows

Gane

(2006) Geodemographics, software and class. Sociology 40(5): 793–812.

19.

Carr A (2013) Why Yahoo and Apple want Foursquare’s data. In: Fast Company. Available at: http://www.fastcompany.com/3016250/why-yahoo-and-apple-want-foursquares-data (accessed 4 November 2014).

20.

Castree

(2003) Commodifying what nature? Progress in Human Geography 27(3): 273–297.

21.

Cooper

(2007) Life as Surplus: Biotechnology and Capitalism in the Neoliberal Era, Seattle: University of Washington Press.

22.

Cox M and Ellsworth D (1997) Application-controlled demand paging for out-of-core visualization. In: Proceedings of the 8th conference on visualization ’97. Phoenix, AZ, 18–24 October 1997, pp. 1–11. USA: IEEE Computer Society Press.

23.

Crampton

(2010) Mapping: A Critical Introduction to Cartography and GIS, Malden, MA: Wiley-Blackwell.

24.

Crampton

Graham

Poorthuis

(2013) Beyond the geotag: Situating ‘Big Data’ and leveraging the potential of the geoweb. Cartography and Geographic Information Science 40(2): 130–139.

25.

Crampton

Roberts

Poorthuis

(2014) The new political economy of geographical intelligence. Annals of the Association of American Geographers 104(1): 196–214.

26.

Dalton

(2013) Sovereigns, spooks, and hackers: An early history of Google geo services and map mashups. Cartographica 48(4): 261–274.

27.

Dalton

Thatcher

(2014) What does a critical data studies look like, and why do we care? Seven points for a critical approach to ‘big data’. Society and Space Open Site: Run by the Editors of the Journal Environment and Planning D: Society and Space. Available at: http://societyandspace.com/material/commentaries/craig-dalton-and-jim-thatcher-what-does-a-critical-data-studies-look-like-and-why-do-we-care-seven-points-for-a-critical-approach-to-big-data/ (accessed 24 October 2014).

28.

Debenham

Clarke

Stillwell

(2003) Extending geodemographic classification: A new regional prototype. Environment and Planning A 35(6): 1025–1050.

29.

de Montjoye

Hidalgo

Verleysen

(2013) Unique in the crowd: The privacy bounds of human mobility. Scientific Reports. 3: 1–5.

30.

Delo C (2014) How Foursquare uses location data to target ads on PCs, phones. In: Advertising Age, 27 February, 14. Available at: http://adage.com/article/digital/foursquare-location-data-target-ads-web/291883/ (accessed 24 October 2014).

31.

Doctorow

(2008) Big data: Welcome to the petacenter. Nature 455: 16–21.

32.

Duckham

Mason

Stell

(2001) A formal approach to imperfection in geographic information. Computers, Environment and Urban Systems 25(1): 89–103.

33.

Evans

(2013) Spatial data analytics for urban informatics. PhD Thesis, University of Minnesota, MN.

34.

Executive Office of the President (2014) Big Data: Seizing Opportunities, Preserving Values. Executive Office of the President of the United States of America. Report. 1 May. Available at: https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf (accessed 30 May 2015).

35.

Farmer C and Pozdnoukhov A (2012) Building streaming GIScience from context, theory, and intelligence. In: Proceedings of the workshop on GIScience in the Big Data Age. Columbus, OH, pp. 5–10.

36.

Feenberg

(1999) Questioning Technology, New York, NY: Routledge.

37.

Flowerdew

Leventhal

(1998) Under the microscope. New Perspectives 18: 36–38.

38.

Frankel

Reid

(2008) Big data: Distilling meaning from data. Nature 455: 30.

39.

Frier

(2013) Foursquare gets $41 million investment, time to grow. Bloomberg Businessweek: Technology. 11 April, 13. Available at: http://www.businessweek.com/articles/2013-04-11/foursquare-gets-41-million-investment-time-to-grow (accessed 25 October 2014).

40.

Fung

(2014) This blogger paid Facebook to promote his page. He got 80,000 bogus likes instead. The Washington Post: The Switch. 10 February, 14. Available at: http://www.washingtonpost.com/blogs/the-switch/wp/2014/02/10/this-blogger-paid-facebook-to-promote-his-page-he-got-80000-bogus-likes-instead/ (accessed 24 October 2014).

41.

Gertlitz

Helmond

(2013) The like economy: Social buttons and the data-intensive web. New Media & Society 15: 1348–1365.

42.

GNIP (2014) GNIP and Twitter to offer academic data grants. In: GNIP press release. Available at: http://gnip.com/company/news/press-releases/gnip-and-twitter-to-offer-academic-data-grants/ (accessed 31 October 2014).

43.

Goodchild

(2013) The quality of big (geo)data. Dialogues in Human Geography 3(3): 280–284.

44.

Goodchild

(2012) Assuring the quality of volunteered geographic information. Spatial Statistics 1: 110–120.

45.

Gorman

(2013) The danger of a big data episteme and the need to evolve geographic information systems. Dialogues in Human Geography 3(3): 280–284.

46.

Goss

(1995a) Marketing the new marketing: The strategic discourse of geodemographic information systems. In: Pickles

(ed.) Ground Truth: The Social Implications of Geographic Information Systems, New York, NY: Guilford Press.

47.

Goss

(1995b) We know who you are and we know where you live: The instrumental rationality of geodemographic systems. Economic Geography 71(2): 171–198.

48.

Graham

(2005) Software-sorted geographies. Progress in Human Geography 29(5): 562–580.

49.

Graham

Marvin

(2001) Splintering Urbanism: Networked Infrastructures, Technological Mobilities and the Urban Condition, New York, NY: Routledge.

50.

Graziano

(2013) iPhone growth stalls as Android continues to nip Apple’s market share. BGR. 5 May, 13. Available at: http://bgr.com/2013/05/09/tablet-smartphone-market-share-q1-2013/ (accessed 20 October 2014).

51.

Haraway

(1991) Simians, Cyborgs, and Women: The Reinvention of Nature, New York, NY: Routledge.

52.

Harding SG (ed.) (2004) The Feminist Standpoint Theory Reader: Intellectual and Political Controversies. New York, NY: Routledge.

53.

Harris

Sleight

Webber

(2005) Geodemographics, GIS, and Neighbourhood Targeting, Hoboken, NJ: John Wiley & Sons, Inc.

54.

Hern

(2014) Why Foursquare should be on everyone’s phone. The Guardian. 26 May, 14. Available at: http://www.theguardian.com/technology/2014/may/26/why-foursquare-isnt-just-the-timeline-spammer-you-thought-it-was-and-should-be-on-everyones-phone (accessed 4 November 2014).

55.

Hoetmer K and Marks M (2013) Google Maps: Into the future. In: Google I/O conference, Google Inc., San Francisco, 15 May, p. 13. Available at: https://developers.google.com/events/io/sessions/326458345 (accessed 15 November 2014).

56.

Horkheimer

(1995) Critical Theory: Selected Essays, New York, NY: Continuum.

57.

Horvath I (2012) Beyond advanced mechatronics: New design challenges of social-cyber systems (Draft paper), In: Proceedings of the ACM Workshop on Mechatronic Design, Linz, Austria.

58.

Jacobs

(2009) The pathologies of big data. ACM Queue 7(6): 1–12.

59.

Kelley

(2014) Urban experience takes an informational turn: Mobile internet usage and the unevenness of geosocial activity. Geojournal 79: 15–29.

60.

Kingsbury

Jones

III (2009) Walter Benjamin’s Dionysian adventures on Google Earth. Geoforum 40: 502–513.

61.

Kitchin R (2014a) Short presentation on the need for critical data studies. In: The Programmable City blog. Available at: http://www.nuim.ie/progcity/2014/04/short-presentation-on-the-need-for-critical-data-studies/ (accessed 24 October 2014).

62.

Kitchin

(2014b) The Data Revolution: Big Data, Open Data, Data Infrastructure and Their Consequences, London: Sage Publications.

63.

Kitchin

Lauriault

(2014) Small data in the era of big data. GeoJournal. Epub ahead of print October 2014. DOI: 10.1007/s10708-014-9601-7.

64.

Krumm

(2011) Ubiquitous advertising: The killer application for the 21st century. Pervasive Computing. January–March 11, 66–73.

65.

Laney

(2001) 3D data management: Controlling data volume, velocity, variety. Application Delivery Strategies. Meta Group File 949. Available at: http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf (accessed 14 May 2015).

66.

Larson S (2014) Foursquare CEO: How we’ll tell you where to eat and what to order. In: ReadWrite, 17 March, 14. Available at: http://readwrite.com/2014/03/17/foursquare-dennis-crowley-ceo-anticipatory-computing (accessed 24 October 2014).

67.

Latour (1999) Pandora’s Hope: Essays on the Reality of Science Studies, Cambridge, MA: Harvard University Press.

68.

Laurila, Gatica-Perez, Aad, et al. (2012) The mobile big data challenge. Nokia Research. Available at: http://research.nokia.com/files/public/MDC2012_Overview_LaurilaGaticaPerezEtAl.pdf (accessed 20 October 2014).

69.

Lazer, Kennedy, King, et al. (2014) The parable of Google Flu: Traps in big data analysis. Science 343(6176): 1203–1205.

70.

Leslie (1999) Consumer subjectivity, space, and advertising research. Environment and Planning A 31(8): 1443–1458.

71.

Leszczynski (2012) Situating the geoweb in political economy. Progress in Human Geography 36(1): 72–89.

72.

Leszczynski (2014) On the neo in neogeography. Annals of the Association of American Geographers 104(1): 60–79.

73.

Leys (2001) Market Driven Politics, London: Verso.

74.

Lohr (2012a) Sure, big data is great. But so is intuition. The New York Times. 29 December, 12. Available at: http://www.nytimes.com/2012/12/30/technology/big-data-is-great-but-dont-forget-intuition.html (accessed 24 October 2014).

75.

Lohr (2012b) The age of big data. The New York Times. 11 February. Available at: http://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html?_r=1&scp=1&sq=Big%20Data&st=cse&gwh=F7CD17271AF71E11F85924B08AF1A18A&gwt=pay&assetType=opinion (accessed 30 May 2015).

76.

Long X, Jin L and Joshi J (2012) Exploring trajectory-driven local geographic topics in Foursquare. In: Proceedings of ACM Ubicomp ’12. Pittsburgh, PA, pp. 927–934.

77.

Longley (2005) Geographical information systems: A renaissance of geodemographics for public service delivery. Progress in Human Geography 29(1): 57–63.

78.

Longley (2012) Geodemographics and the practices of geographic information science. International Journal of Geographical Information Science 26(12): 2227–2237.

79.

Longley and Harris (1999) Towards a new digital data infrastructure for urban analysis and modelling. Environment and Planning B: Planning and Design 26: 855–878.

80.

Lyon (2002) Surveillance studies: Understanding visibility, mobility and the phenetic fix. Surveillance and Society 1(1): 1–7.

81.

Manyika, Chui, Brown, et al. (2011) Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute Report. New York, NY: McKinsey & Company. May, 11. Available at: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation/?p=1 (accessed 30 May 2015).

82.

Marcus and Davis (2014) Eight (no, nine!) problems with big data. The New York Times. 7 April, 12. Available at: http://www.nytimes.com/2014/04/07/opinion/eight-no-nine-problems-with-big-data.html?_r=3 (accessed 24 October 2014).

83.

Marcuse (1982) Some social implications of modern technology. In: Arato and Gebhardt (eds) The Essential Frankfurt School Reader, New York, NY: Continuum, pp. 138–162.

84.

Mayer-Schonberger and Cukier (2013) Big Data: A Revolution that Will Transform How We Live, Work, and Think, New York, NY: Houghton Mifflin Harcourt.

85.

McBride and Oreskovic (2013) Google eyes Waze as Facebook circles hot web maps property. Reuters. 24 May, 13. Available at: http://www.reuters.com/article/2013/05/24/us-waze-google-idUSBRE94N02H20130524?irpc=932 (accessed 14 June 2014).

86.

Miller and Goodchild (2014) Data-driven geography. GeoJournal. Epub ahead of print 10 October 2014. DOI: 10.1007/s10708-014-9602-6.

87.

Miller (2009) Googlepedia: The Ultimate Google Resource, 3rd ed. New York, NY: Pearson Education.

88.

Mitchell and McGoldrick (1994) The role of geodemographics in segmenting and targeting consumer markets: A Delphi study. European Journal of Marketing 28(5): 54–72.

89.

Nelson and Wake (2005) Geodemographic segmentation: Do birds of a feather flock together? Foresee Change. (Inc.), 5 August. Available at: http://www.foreseechange.com/Geodemographic%20Segmentation.pdf (accessed 14 November 2014).

90.

NVIDIA Corporation (2014a) CUDA Center of Excellence (CCOE) Program. Available at: https://research.nvidia.com/content/cuda-center-excellence-ccoe-program (accessed 25 October 2014).

91.

NVIDIA Corporation (2014b) UNC Charlotte CUDA Research Center Summary. Available at: https://research.nvidia.com/content/unc-charlotte-crc-summary (accessed 25 October 2014).

92.

O’Dowd (2003) Ecological fallacy. In: Miller and Brewer (eds) The A-Z of Social Research, London: Sage.

93.

O’Driscoll, Daugelaite and Sleator (2013) ‘Big data’, Hadoop and cloud computing in genomics. Journal of Biomedical Informatics 46(5): 774–781.

94.

O’Sullivan (2006) Geographical information science: Critical GIS. Progress in Human Geography 30(6): 783–791.

95.

Openshaw (1989) Making geodemographics more sophisticated. Journal of the Market Research Society 31(1): 111–132.

96.

Openshaw (1984) Ecological fallacies and analysis of areal census data. Environment and Planning A 16(1): 17–31.

97.

Paczkowski J and Gannes L (2013) Apple confirms HopStop acquisition. In: All Things D. Available at: http://allthingsd.com/20130719/apple-confirms-hopstop-acquisition/ (accessed 24 October 2014).

98.

Parker, Uprichard and Burrows (2007) Class places and place classes: Geodemographics and the spatialization of class. Information, Communication & Society 10(6): 902–921.

99.

Paul F (2013) Data scientists are sexy, and 7 more surprises from the rockstars of big data. In: Smartbear: Software Quality Matters Blog. Available at: http://blog.smartbear.com/development/data-scientists-are-sexy-and-7-more-surprises-from-the-rockstars-of-big-data/ (accessed 24 October 2014).

100.

Peterson T (2013) Why Google wanted Waze: The local ad market is going mobile. In: Advertising Age. Available at: http://adage.com/article/digital/google-wanted-waze-local-ad-market-mobile/242041/ (accessed 24 October 2014).

101.

Phillips and Curry (2003) Privacy and the phenetic urge: Geodemographics and the changing spatiality of local practice. In: Lyon (ed.) Surveillance as Social Sorting: Privacy, Risk, and Digital Discrimination, New York, NY: Routledge, pp. 137–152.

102.

Pickles J (ed.) (1995) Ground Truth: The Social Implications of Geographic Information Systems. New York, NY: Guilford Press.

103.

Press (2014) 12 Big data definitions: What’s yours? Forbes. 3 September, 14. Available at: http://www.forbes.com/sites/gilpress/2014/09/03/12-big-data-definitions-whats-yours/ (accessed 30 May 2015).

104.

Puschmann and Burgess (2014) Big data, big questions: Metaphors of big data. International Journal of Communication 8: 1690–1709.

105.

Ratner (2004) Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, Boca Raton, FL: CRC Press.

106.

Schuurman (2000) Trouble in the heartland: GIS and its critics in the 1990s. Progress in Human Geography 24(4): 569–590.

107.

Singleton and Longley (2009) Creating open source geodemographics: Refining a national classification of census output areas for applications in higher education. Papers in Regional Science 88(3): 643–666.

108.

Singleton and Spielman (2014) The past, present, and future of geodemographic research in the United States and United Kingdom. The Professional Geographer 66(4): 558–567.

109.

Sleight (1997) Targeting Customers: How to Use Geodemographics and Lifestyle Data in Your Business, Henley-on-Thames: NTC Publications.

110.

Smith (2008) Uneven Development: Nature, Capital and the Production of Space, Athens: University of Georgia Press.

111.

Thatcher (2013) Avoiding the ghetto through hope and fear: An analysis of immanent technology using ideal types. GeoJournal 78(6): 967–980.

112.

Thatcher (2014) Living on fumes: Digital footprints, data fumes, and the limitations of spatial big data. International Journal of Communication 8: 1765–1783.

113.

Townsend (2013) Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, New York, NY: WW Norton & Company.

114.

Uprichard, Burrows and Parker (2009) Geodemographic code and the production of space. Environment and Planning A 41(12): 2823–2835.

115.

Warntz (1964) A new map of the surface of population potentials for the United States, 1960. Geographical Review 54(2): 170–184.

116.

Webber (1975) Liverpool Social Area Study, 1971 Data: Final Report. PRAG Technical Papers, TP 14, London: Planning Research Applications Group, Centre for Environmental Studies.

117.

Wilson (2012) Location-based services, conspicuous mobility, and the location-aware future. Geoforum 43(6): 1266–1275.

118.

Wilson (2014a) Geospatial technologies in the location-aware future. Journal of Transport Geography 34: 297–299.

119.

Wilson (2014b) Morgan Freeman is dead and other big data stories. Cultural Geographies 22(2): 345–349.

120.

Wilson (2015) New lines? Enacting a social history of GIS. The Canadian Geographer 59(1): 29–34.

121.

Whitman M and Youngjohns R (2014) Meg Whitman: Big data changes everything. In: HP Inc. Available at: https://h71044.www7.hp.com/campaigns/2014/promo/big_data/index.php (accessed 25 October 2014).

122.

Wolf G (2011) What is the quantified self? In: Quantified self: Self knowledge through numbers blog. Quantified Self Labs. Available at: http://quantifiedself.com/2011/03/what-is-the-quantified-self/ (accessed 30 May 2015).

123.

Zook and Graham (2007) The creative reconstruction of the internet: Google and the privatization of cyberspace and digiPlace. Geoforum 38(6): 1322–1343.