Friction, snake oil, and weird countries: Cybersecurity systems could deepen global inequality through regional blocking

Abstract

In this moment of rising nationalism worldwide, governments, civil society groups, transnational companies, and web users all complain of increasing regional fragmentation online. While prior work in this area has primarily focused on issues of government censorship and regulatory compliance, we use an inductive and qualitative approach to examine targeted blocking by corporate entities of entire regions motivated by concerns about fraud, abuse, and theft. Through participant-observation at relevant events and intensive interviews with experts, we document the quest by professionals tasked with preserving online security to use new machine-learning based techniques to develop a “fairer” system to determine patterns of “good” and “bad” usage. However, we argue that without understanding the systematic social and political conditions that produce differential behaviors online, these systems may continue to embed unequal treatments, and troublingly may further disguise such discrimination behind more complex and less transparent automated assessment. In order to support this claim, we analyze how current forms of regional blocking incentivize users in blocked regions to behave in ways that are commonly flagged as problematic by dominant security and identification systems. To realize truly global, non-Eurocentric cybersecurity techniques would mean incorporating the ecosystems of service utilization developed by marginalized users rather than reasserting norms of an imagined (Western) user that casts aberrations as suspect.

Keywords

Regional blocking, machine learning, classification, inequality, discrimination, security

A blogger in Ghana attempts to purchase a Wordpress template. A teenager in Brazil tries to order beauty products. A student in Russia submits a job application to an international nonprofit. An entrepreneur in Nigeria hopes to become a seller on Amazon. A freelancer in Palestine requests payment from a client. A traveler in Singapore goes to book a flight. A shopper in Trinidad looks to browse the latest fashions at a store she likes. A consultant in Cameroon pursues a Bitcoin purchase.

In each of these cases, people can find their intentions online thwarted by their locale: not by slow or non-existent physical infrastructure, differences in language and currency, or government censorship, but by the decisions of corporate actors to restrict access based on geolocation. In this moment of rising nationalism worldwide, governments, civil society groups, transnational companies, and web users complain of increasing regional fragmentation online, fearing that the Internet is “breaking up into loosely coupled islands of connectivity” (Drake et al., 2016: 3) as more barriers are erected blocking users attempting to interact with services hosted in other countries. On social media, some users, upon discovering their inability to access commercial websites from their current location, make impassioned parallels to current restrictions on travel and immigration, asking questions like, “Does Macy's want to build a wall, too?” referencing the infamous US–Mexico border wall proposed by American President Donald Trump.

National governments have come under fire as a source of this fragmentation. Scholars have extensively studied restrictions on the flow of information due to government regulations or censorship (Clark et al., 2017; Cory, 2017; Drake et al., 2016; Gebhart et al., 2017; Goldsmith and Wu, 2008; Mueller, 2017). The choices made by corporations around network neutrality, intellectual property, and walled gardens have also been scrutinized (Drake et al., 2016; Goldsmith and Wu, 2008). However, some forms of commercial regional blocking online do not fall under any of these rubrics. Restrictions that so far have received the most discussion, particularly data protection legislation and reduced inter-operability due to proprietary and pay-walled services, cause disruption to a wide range of internet users, including wealthy and well-connected Americans and Europeans. In this article, we identify other forms of fragmentation that are more narrowly targeted. They often specifically impact non-affluent users in less sought-after markets. As with physical border crossings, the exclusion online of people due to national origin is often legitimated through the rhetoric of security, where visitors from certain regions are associated with threats of fraud, abuse, and theft. Yet in many of these cases, decision-makers are private companies operating without the authority or accountability of public governance. While such practices may have practical benefits, without ideological malice, for resource constrained organizations trying to minimize their online threat surface, these efforts are consequential in aggregate, generating unintentionally harsh consequences that extend beyond the immediate limitations.

This paper explores blunt access restrictions (in particular, blocking users from online services based on the country where they reside) and the quest to achieve a “fairer” system through techniques, often drawing from machine learning, intended to more precisely differentiate legitimate and illegitimate use. By speaking with professionals in network security, IT management, fraud detection, and e-commerce fields and by attending industry conferences where best practices and new technologies are proposed to address challenges of providing services to a global customer base, we seek to better understand the underlying reasoning and motivation behind regional blocking, and its broader implications. Professionals tasked with preserving online security hope automated identification of patterns of “good” and “bad” usage will produce more accurate and less discriminatory methods for determining who should be able to access their services and who should be flagged as a concern. However, without understanding the systematic social and political conditions that produce differential behaviors online, these systems may continue to embed unequal treatments, and further disguise discrimination behind more complex and less transparent automated assessment. To realize truly global, non-Eurocentric cybersecurity techniques would mean respecting the ecosystems of service utilization developed by marginalized users and focusing on preventing actual incursions wherever they occur, rather than reasserting norms of an imagined (Western) user that casts aberrations as suspect.

Methods

In this study, we took a qualitative approach employing a combination of participant-observation and in-depth interviews, and explored public social media to identify user complaints about regional blocking. We interviewed 11 experts (see Appendix 1) on topics including the rationales for blocking, its efficacy, the global financial marketplace online, and methods for fraud detection and identity verification in both online and offline settings. We used an unstructured interview approach in which certain topics were planned in advance but follow-up questions were determined in the course of the interview. This allowed for novel insights to be explored. Where possible, we looked for other evidence to further triangulate expert assertions, such as public websites or policy documents. Quotes in this article were chosen either because they represented sentiments repeated by multiple people and invoked themes that arose across data sources, or because they provide unique insight into the blocking phenomenon that can help explain otherwise opaque practices.

The first author attended industry conferences and community meetings and forums in Summer 2017 to establish the broader context in which these blocks occur, cultural patterns in their rationale, and their potential wider implications. These conferences, meetings, and fora include:

The Online Dating and Dating Industry Conference (iDate) in June 2017 in Los Angeles, California, a small (approximately 150 person) two-day event that focused primarily on niche and emerging companies in the online dating field. In addition to attendees who owned or worked for dating websites and apps, many participants worked as vendors for third-party technologies such as payment processing, identity verification, fraud detection and prevention, affiliate marketing, live video streaming, automated emotion detection and sentiment analysis, gift shop plugins, and chatbot creators. We decided to attend this event because of the second author's findings on regional blocking in online dating sites.

“Money20/20 Europe,” a large (over 4000 attendees and over 400 speakers) three-day financial technology conference, in June 2017 in Copenhagen, Denmark. Described as “Europe's largest FinTech event,” about three-quarters of attendees were from the financial services, payments, or general technology and internet industry, another 10% from data and analytics firms, mobile and telecom companies, investment, and retail, and the rest from government, non-profits, academia, or other fields.

“Community meeting about PayPal and Palestine,” hosted by South Bay Jewish Voices for Peace in San Jose in March 2017. This meeting had around 50 attendees and focused on explaining the background of the larger #PayPalforPalestine campaign to expand access to PayPal to Palestinians and on developing campaign strategies.

The Internet Governance Forum—USA during July 2017 (IGF-USA) in Washington, DC, a one-day regional multistakeholder meeting with presentations from civil society, government, technologists, research scientists, industry, and academia.

At these events, the first author took extensive notes on the context and content of presentations, collected marketing materials, and spoke informally with participants about this project and related themes, with these conversations then incorporated into fieldnotes.

We pursued an inductive analytical process to allow for the discovery of unanticipated themes and insights. Altogether this qualitative and inductive approach was geared toward uncovering the logic and underlying rationale behind these practices from the perspective of people who build and deploy such systems. The primary goal was to understand as fully as possible the scope and range of these practices. Our selection of interviewees and events was likewise guided by a sampling approach that sought breadth and diversity of views. This approach does not, however, allow us to make strong claims about the prevalence or typicality of regional blocking practices.

Relevant literature

In our approach to research on cybersecurity, we explore traditional questions of fair access to network resources and the challenge of distinguishing good and bad actors on the network. However, we do so in a way that unsettles normative ideas of imagined users which appear to be modeled after Western consumers in advanced capitalist economies. We draw insight from critiques of globalization and postcoloniality that decenter this normative “user,” pointing to the way marginality is easily cast as aberrant and illegitimate (Burrell, 2012; Dourish and Mainwaring, 2012; Irani et al., 2010). We have worked with research collaborators exploring the question of regional blocking in parallel, but from within methodological traditions in computer science (see Afroz et al., 2018). In their work, they set about validating the existence and prevalence of “server-side” regional blocking utilizing automated techniques. Our approach pursued another set of questions. Guided by Irani et al.'s (2010) framework of postcolonial computing, “a project of understanding how all design research and practice is culturally located and power laden,” we sought to critically examine the cultural assumptions of blocking technologies and how they are influenced by the particular social locations of professionals who develop and deploy such practices.

Previous work has shown the limitations of mainstream cybersecurity approaches in providing protection to users when particular social and geographic locations and practices are not taken into account. Freed et al. (2018) identified how the standard threat models in cybersecurity did not account for network security violations of stalkers in the context of intimate partner violence. Thus there were few appropriate measures developed to effectively handle them. Ben-David et al. (2011) describe the many contextual factors absent from traditional security practices when applied to the Global South. For example, limited bandwidth can make it impossible to download security patches. As a result many machines remain unpatched and vulnerable. Similarly, an overzealous focus on one set of threats and practices of combatting them via regional blocking can leave other vulnerabilities under-examined.

We extend work in this area by analyzing the way that traditional cybersecurity practices overlook or misunderstand the online behaviors of populations that diverge from expected norms and thus entrench differential access along traditional lines of socio-political power. In addition we leverage insights and framing questions from a new literature on machine learning fairness and bias. Despite ongoing debates about how to define and operationalize concepts such as “fairness” in machine learning applications (Hardt et al., 2016; Kleinberg et al., 2016), scholars have consistently established diminished accuracy in algorithmic categorization for minoritized groups and predictions that are biased by historically skewed and prejudicial data patterns (Barocas and Selbst, 2016; Bolukbasi et al., 2016; Buolamwini and Gebru, 2018; Eubanks, 2017).

The literature on fairness in machine learning has especially considered application domains such as criminal justice, lending, and social services where mechanisms of allocation impinge upon civil rights. Cybersecurity has been largely left out of this examination, though it is a key application area for machine learning that is widely applied in spam filtering and fraud detection. A key question is whether such systems work as well for users from one part of the world as they do in another. Are false positives (users identified as ‘bad’ actors who are in fact legitimate users) more prevalent in certain geographies or among certain subpopulations? Furthermore, could the definition of what constitutes legitimate use be reconsidered? These questions resonate with literature in predictive policing and algorithmic identification of child exploitation, where concerns about disparate impact and harm on marginalized communities are crowded out by public safety rationales (Selbst, 2017) and “proximity to innocence” (Thakor, 2018). We take up these questions by examining the framings of those who design, use, and request such systems, exploring how built in expectations about users have the potential to produce unintended consequences.
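The question of whether false positives are more prevalent in certain geographies can be stated concretely. The following is a minimal sketch, not a description of any system we observed: it assumes that each access attempt can be labeled after the fact as legitimate or fraudulent, which in practice is rarely possible for users who were turned away. The function name, record layout, and region labels are all hypothetical.

```python
from collections import defaultdict

def false_positive_rate_by_region(records):
    """Compute, per region, the share of legitimate users who were flagged.

    records: iterable of (region, flagged, is_fraud) tuples. The ground-truth
    is_fraud label is an assumption; blocked users seldom get one in practice.
    """
    legit = defaultdict(int)
    flagged_legit = defaultdict(int)
    for region, flagged, is_fraud in records:
        if not is_fraud:
            legit[region] += 1
            if flagged:
                flagged_legit[region] += 1
    # Only regions with at least one legitimate user appear in the result.
    return {region: flagged_legit[region] / legit[region] for region in legit}

# Entirely made-up toy records for two hypothetical regions "A" and "B".
sample = [
    ("A", False, False), ("A", False, False), ("A", True, False), ("A", True, True),
    ("B", True, False), ("B", True, False), ("B", False, False), ("B", True, True),
]
rates = false_positive_rate_by_region(sample)
```

In this toy data both regions contain the same number of fraudulent and legitimate users, yet legitimate users in region "B" are flagged twice as often as those in region "A". A disparity of this kind is what the question above asks about.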

Regional blocking and its discontents

“What the hell is the threat model that's solved by blocking a region of the world? And the answer is, no one knows.” (William¹, IT Director, Transportation Company)

In her 2012 book Invisible Users: Youth in the Internet Cafes of Urban Ghana, the second author of this article documented the closing down of opportunities for Ghanaians to interact with people from around the world online in response to suspicion about fraud and theft, particularly through internet dating websites. She observed,

The shift toward IP address blocking and other administrative filtering makes exclusions systematic, total, and materially concretized, no longer a matter of a case-by-case judgment of individual users at the content level but rather programmed into the non-negotiable realm of service configuration by Webmasters and network administrators. (Burrell, 2012: 196)

In the years following, such automated blocking based on country-level identification, whether through IP address, self-identification, or credit card BIN (bank identification number, geographically linked to particular bank locations), has proliferated.

This fragmentation has been aided by a parallel impulse among US-based companies to revoke access to certain sites and services from visitors in some regions due to regulatory requirements, from censorship and hate speech to data protection laws (Drake et al., 2016; Goldsmith and Wu, 2008; Mueller, 2017). This issue drew headlines recently when some EU citizens found themselves blocked from accessing a range of services following the implementation of the latest General Data Protection Regulation (Cimpanu, 2018). However, pre-existing restrictions by other countries had long had similar effects (Cory, 2017).

Meanwhile, even as many companies decried these legal barriers for preventing free market momentum, many quietly continued to put into practice their own systems that filtered out traffic or membership from places that seemed “risky” due to implied or stated security concerns. Regarding one large e-commerce site that she had worked with to develop a security strategy, Samantha, an anti-fraud professional, said that along with places blocked due to economic sanctions, they do not ship to “weird countries” like Ghana, Nigeria, Philippines and Malaysia, and instead would automatically cancel an order from those locales, because “they don't bother, [they] don't do business out there.” Yet despite its ongoing popularity, many were dismissive of this practice as crude and ineffective. William, an IT director for a large transportation related company, dubbed the practice a form of “security snake oil.” He explained,

This is part of both the actual features built in to the security software, as well as the marketing material of like, [sarcastically] ‘Well, you could block China. Now, you'll want to block Thailand. Like, you know, those Eastern European countries that you don't want touching your stuff. You'll want – you can use our product to block them.’ So, whether this is a thing that technology decision makers want to do or not, it is being presented to them as a thing that they want to do, and that sort of self reinforcing thing means that when weird stuff starts happening, then the technology leader says, ‘Well, let's just block China! …we can block China, that won't have any actual negative impact on our business, and I can say, well, we blocked China – and it's still happening – shit! – well, I did something!’

The efficacy of these practices is dubious both in addressing genuine security concerns and in their ability to accurately identify location. In response to a discussion of small e-commerce network administrators trying to block all of Africa, Dan, a former software engineer at a content distribution network which offers its clients the ability to block traffic based on regional affiliation, pointed out that these administrators would be using 10-year-old configuration techniques, where the designated IPs may not relate to Nigeria anymore. William similarly shared an incident where traffic from his company's call center offices in the US was being internally labeled as being in Malaysia, because “these databases are imperfect.” He suspects this is due to “traffic being routed through a third-party provider” and “that there's people in Malaysia using the same provider,” so that traffic is erroneously classified as being on the other side of the globe. These inaccuracies undermine the legitimacy of security and fraud concerns being addressed through regional blocking.

How widespread is regional blocking? There is no straightforward way to measure the exact scale of the problem. However, there are two distinct measures worthy of consideration: (a) potential users blocked from sites and web services given the region where they reside and (b) users actually seeking those services who are blocked. To provide an answer to the first measure, Afroz et al. (2018) crawled the top 500 Alexa websites and found consistent patterns of regional differences. Comparing the availability of these sites between the US and Pakistan, the authors found 14 sites they identify as “true server side blocking,” and another 28 where blocking was in place but the source of the block was unverified, for a combined total of 8.4% of the sampled top 500 sites. A comparison between the US and South Africa found 19 sites inaccessible from the latter but available in the former, a total of 3.8%, with 13 of those sites labeled as server-side blocking. However, our observations and interviews, as well as prior work, suggest that regional blocking may be more concentrated in West African and Southeast Asian countries not represented in that study.
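The percentages reported above follow directly from the site counts and the sample size of 500. The short check below makes the arithmetic explicit; the variable and function names are ours and purely illustrative.

```python
SAMPLE_SIZE = 500  # top Alexa sites crawled, as reported by Afroz et al. (2018)

def share_of_sample(count, total=SAMPLE_SIZE):
    """Return count as a percentage of the sample, rounded to one decimal."""
    return round(100 * count / total, 1)

# US vs. Pakistan: 14 verified server-side blocks plus 28 of unverified source.
pakistan_blocked = 14 + 28
# US vs. South Africa: 19 sites inaccessible from South Africa.
south_africa_blocked = 19

pakistan_share = share_of_sample(pakistan_blocked)          # 8.4
south_africa_share = share_of_sample(south_africa_blocked)  # 3.8
```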

As Afroz et al. point out, their methodology underestimates the prevalence of regional blocking because it does not account for sites that will load but fail to allow people to create an account, list their truthful location in their user profile, make a purchase, ship to the desired country, or process a money transfer. As anti-fraud professional Samantha verified, some users could have their purchases canceled, even when an order initially seemed to reach completion. Leaders of a campaign to advocate for making PayPal accessible in Palestine explained that workarounds, such as claiming to be in Israel or Jordan where PayPal is accessible, might work in the short term, but that PayPal would subsequently shut down those accounts. PayPal had various ways of identifying users who were actually in Palestine based on their IP address or bank information.

The second measure of prevalence, whether users from blocked regions seek to access these sites and consequently encounter blocks, would certainly show a smaller number who are actually impacted by this practice. When functioning as intended, regional blocking often flies under the radar due to the small number of ‘legitimate’ users in a blocked country. Network administrators and other IT security professionals feel comfortable utilizing it for the same reason. As William mentioned, country-level blocking is perceived by firms as something “that won't have any actual negative impact on our business.” However, blocking becomes self-reinforcing: few people in a location use a service and so the company feels justified in blocking that region, meaning that few people in the region attempt to utilize the service, or are even aware of its existence. For this reason, regional blocking is most likely to be noticed by international students, businesspeople, or other cosmopolitan travelers (and those in their extended social networks) who are able to compare experiences of the Internet between two locales. Our preliminary exploration of self-reporting of this phenomenon on social media, including on Twitter, the public Facebook pages of companies that employ regional blocking (e.g. Macy's.com), and specialized sites like isitdownrightnow.com, yielded numerous complaints from prospective site visitors who fit the description of globe-trotting transnationals.

Going frictionless and unbiased—The rise of the machines

At the 2017 Money20/20 conference, a consistent tension surfaced, aptly summarized in one panel there titled, “How can secure customer authentication and frictionless customer experience co-exist?” This problem was of particular concern given two frequently discussed trends: the rise of “cross border” transactions and a more general appeal to the cosmopolitan global outlook exemplified by the conference's tri-annual meetings on three continents (Europe, Asia, and North America). There was also the looming double-edged threat of fraud and other security problems on one side, and overzealous regulation on the other. In this context, Uber's Chief Payment Counsel fiercely defended the free market as the solution to any security threats, lamenting, in regard to proposed legislation, that he didn't think government should be setting how fraud is managed, and that “most of what is being proposed will make the experience less magical” for consumers. Over and over again, corporate presenters grappled with strategies for rapidly expanding markets without being weighed down by bureaucrats or criminals. The latter were framed as sophisticated cartels with nothing to lose, “organized crime, not a guy in his bedroom,” according to more than one panelist, who were “funding nasty stuff including terrorism.” These criminals were seen as more nimble than legitimate companies, and always on the verge of being one step ahead in the battle with multi-billion dollar corporations because they were not hampered by legal compliance. Government regulators were often the second-place adversary in this scenario, the accidental handmaiden of fraud and theft by stifling innovation over paternalistic concerns that might be a smokescreen for economic protectionism, or even censorship.

A quieter faction raised another set of hopes, adjacent to frictionlessness and ostensibly more rooted in moral responsibility than profit—inclusion. This faction noted that certain regions differentially face “friction”—or outright refusal—in the current financial system, online and off and in their increasing overlap. Lisa, an interviewee specializing in emerging markets for a payments company, explained that their systems for evaluating costs and benefits were ultimately subject to bias. She framed the problem as stemming from an information deficit that not only discriminated against certain populations, but also failed to be a rigorous measure of market opportunity:

In a perfect world I would have information on risk – I would have a risk score for every country in the world, that's like some objective source. But, instead, we kind of just rely on our own feelings. Which I think is not as robust as we would like to make it out to be.

Traditionally, the only way to avoid this problem and include those in places that might substantially differ in material conditions from those in existing major markets was to have long-standing and resource-intensive familiarity with a wide variety of local situations. In this light it is notable that two populous continents, Africa and South America, were excluded from the rotating Money20/20 event. Thus, Western Union boasted at Money20/20 about its role as “truly a global company” with “feet on the ground” in 200 countries and a well-developed system for compliance with a web of financial regulations. Other companies wanted in on this population without having to deal with all the red tape. Both Melissa, who works on policy issues for tech companies, and Jesse, a marketing employee at an identity verification company, emphasized their beliefs that businesses do not want to “turn away customers” if they can help it. The former reasoned, “Pottery Barn probably wants to sell in any country they're legally able to” in order to expand viewership. Finding solutions that could incorporate the “unbanked” and “underbanked” was an increasingly attractive prospect, if they could find the magic bullet that would allow for inclusion without sacrificing profit.

One broad technique was repeatedly raised as a potential cure for these diverse ailments of friction, fraud, regulation, exclusion, and bias—machine learning and artificial intelligence. Using pattern recognition built on Big Data, such systems could, many presenters claimed, facilitate broader, faster, easier, and safer access to potential customers, while identifying and neutralizing problematic users. Vivek Bajaj of IBM argued at Money20/20 that incorporating AI into the calculation of risk and fraud was “not only about catching bad guys, as much as about letting legitimate clients get on with their lives in easier fashion,” citing “huge problems with false positives”—cases where legitimate consumers had their business refused over security flags. Sarah Kocianski of Business Insider agreed, arguing that “older systems which rely on rules and reputation lists are failing to cope with the sophistication of fraud,” and were facing “growing numbers of false positive declines … [and] failing to cope with growing number of cross border transactions and card not present” sales. Jesse, of the identity verification company, who had also attended Money20/20, explained, in his interview, that AI-based systems would succeed in evaluating the “risk that you actually pose” and support increasing acceptance rates for more sales. AI thus was a solution for multiple problems at once—as one Money20/20 panelist put it, “AI is not the new oil, it's the new electricity, it's going to become invisible, you don't even think about it, it's just there, it works.”

Yet simultaneously, these very professionals highlighted the myriad ways that machine-learning based systems for fraud detection, identity verification, and behavioral recognition are likely to contribute to the continued exclusion of marginalized user populations. These systems are frequently designed in a way that ignores or misunderstands the on-the-ground realities of socio-economically and geographically diverse users. Further, they obscure the monetary incentives undergirding the technical infrastructure of exclusion, which continues to reward the expansion of blocking mechanisms.

Assumptions about behavior, norms, and access

Roughly, there are three approaches to blocking or flagging that can involve regional information:

Block all visitors from a country wholesale.

Develop a set of explicit rules that determine entry, which can include location related data alongside payment methods, objects purchased, number of transactions attempted in a session, etc.

With machine learning, let the system determine patterns that are associated with fraudulent or abusive behavior, utilizing all available data rather than pre-determined criteria.

In addition to blocking website access outright, each means of determining problematic usage can also be implemented to refuse web services (such as purchases or account creation). The latter systems may allow greater access to some potential users who would be blocked under the first set of rules. However, they also expand the potential failure points, by incorporating more information about user behavior and greater dependencies on third party sources to verify a user's identity and reputation. These sources often ignore differentiated global conditions or marginalize those who do not or cannot assimilate to the patterns of the imagined user, facilitating ongoing regional discrepancies in blocking.
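The three approaches can be contrasted in a short sketch. Everything in it is hypothetical and for illustration only: the field names, the placeholder country codes, the specific rules, the threshold, and the stand-in scoring function are our assumptions and do not describe any interviewee's system.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    country: str                   # e.g. inferred from IP address or card BIN
    payment_method: str
    transactions_in_session: int

BLOCKED_COUNTRIES = {"XX", "YY"}   # placeholder country codes

def wholesale_block(visit: Visit) -> bool:
    """Approach 1: refuse every visitor from a listed country."""
    return visit.country in BLOCKED_COUNTRIES

def rule_based_flag(visit: Visit) -> bool:
    """Approach 2: explicit rules that combine location with other signals."""
    risky_mix = (visit.country in BLOCKED_COUNTRIES
                 and visit.payment_method == "prepaid_card")
    return risky_mix or visit.transactions_in_session > 10

def learned_flag(visit: Visit, score, threshold: float = 0.8) -> bool:
    """Approach 3: defer to a model's risk score (score is any callable)."""
    return score(visit) >= threshold

def toy_score(visit: Visit) -> float:
    """Stand-in for a trained model that has absorbed a regional pattern."""
    return 0.9 if visit.country in BLOCKED_COUNTRIES else 0.1

visit = Visit(country="XX", payment_method="credit_card", transactions_in_session=1)
decisions = (
    wholesale_block(visit),          # blocked by country alone
    rule_based_flag(visit),          # passes the explicit rules
    learned_flag(visit, toy_score),  # flagged again by the learned score
)
```

In this toy case the visitor who would pass the explicit rules is still flagged by the scoring function, because the stand-in model has effectively learned country as its dominant signal. The sketch shows only that a learned approach can reproduce wholesale blocking while making the regional criterion harder to see.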

Peter, who runs an online fraud detection company, explains that there are cases where anomalies must indicate some sort of problem (though not necessarily a malicious one), and other cases where the flagging of an issue is based on probability, rather than a rigid binary classification of right and wrong. The latter case is prone to capturing “real” traffic alongside manufactured attacks or automated visits attempting to parade as unique visitors. He says,

If a man is two meter high – no, no – ten meter high, there is something wrong, right? Definitely, there can be no person like ten meter high. But these are easy types of fraud – easy to detect, easy to do. The more complex cases, they are always more about machine learning or something like this, trying to cluster the data … to find some kind of discrepancies and abnormalities in traffic distribution. … . there definitely can be false positives, can be mistakes.
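Peter's distinction between impossible values and merely improbable ones can be sketched as follows. This is a simplified stand-in under our own assumptions: it uses a z-score against past observations rather than the clustering Peter mentions, and the plausibility limit, threshold, and sample heights are invented for illustration.

```python
from statistics import mean, stdev

MAX_PLAUSIBLE_HEIGHT_M = 3.0  # assumed hard limit; no person is taller

def is_impossible(height_m: float) -> bool:
    """Hard rule: the value cannot occur, so something is certainly wrong."""
    return height_m <= 0 or height_m > MAX_PLAUSIBLE_HEIGHT_M

def is_unusual(value: float, history: list, z_threshold: float = 2.0) -> bool:
    """Probabilistic rule: flag values far from what has been seen before."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# Made-up heights (in meters) of previously observed users.
observed = [1.6, 1.7, 1.75, 1.8, 1.65]

ten_meters_impossible = is_impossible(10.0)       # caught by the hard rule
two_meters_impossible = is_impossible(2.0)        # a real person can be 2 m tall
two_meters_unusual = is_unusual(2.0, observed)    # yet flagged against past data
```

The two-meter person is real but rare relative to the observed sample, so the probabilistic rule flags them. This is the kind of false positive Peter acknowledges.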

When utilized in a probability framework, these “mistakes” can create new barriers to entry for potential customers. Whether or not these barriers become as inflexible and biased as the initial country level blocking depends primarily on what happens when something is flagged—whether that creates an automatic cancellation or refusal, or simply initiates an investigation into the nature of the unusual use. George, who runs a global network of dating sites, explained that he responds to situations that suggest something uncommon is happening by following up with particular users to get a fuller understanding of each situation. For example, he first suggested that anytime someone chose “Native American” as their ethnicity on the site, “that one is always a fraudster.” But he then modified that assessment, explaining that while this was a common indication of fraud, there was also genuine confusion involved:

fraudsters like to pick the first – you know, the first American, they just go ‘boom’. Native American. I’m like, well, wait a minute. You look like you're from Africa, but you got Native American. But we’ve seen people, who were Americans – they don't quite get it, and they think hey, I’m a native American!

George is referring here to an instance where a person who lives in and is from the United States, but not of indigenous ancestry, mistakenly interprets the question to refer to their natal origin. However, this example also highlights how minoritized populations can face hassles, misinterpretation, and exclusion when systems, human or machine, seek to identify anomalies based on past data. After all, the reason for listing “Native American” in the ethnicity category in the first place is not as a trap to catch fraudsters, but because this is a valid identity category. George's network simply has not encountered Native American (i.e. indigenous) users on its platform (as far as he knows). George's response to these kinds of anomalies is to reach out to his clients and ask further questions. IT Director William and fraud detection software CEO Peter both emphasized that anomalies would often be a starting point for investigation, rather than a stopping point for user access. These explorations of unusual patterns can provide surprising insights and an opportunity to improve their systems when particular use cases diverge from the imagined user as originally envisioned. The result may be stronger security and usability, preserving access for users who diverge from the norm.
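The distinction Peter draws between impossible values and merely improbable ones, together with the practice of treating a flag as the start of an investigation, can be sketched in a few lines. This is a minimal illustration and not any interviewee's actual system: the `triage` function, the `Outcome` labels, the z-score threshold, and the sample session lengths are all assumptions introduced here.

```python
from enum import Enum
from statistics import mean, stdev


class Outcome(Enum):
    ALLOW = "allow"
    REVIEW = "review"    # a person follows up with the user
    INVALID = "invalid"  # value is impossible, whatever the cause


def triage(value, history, hard_min, hard_max, z_threshold=3.0):
    """Classify one observation against a hard bound and a statistical one."""
    # Hard rule: outside the possible range, something is wrong
    # (not necessarily malicious), as in the "ten meter" case.
    if not hard_min <= value <= hard_max:
        return Outcome.INVALID
    # Soft rule: how many standard deviations from the historical mean?
    mu, sigma = mean(history), stdev(history)
    z = abs(value - mu) / sigma if sigma > 0 else 0.0
    # An outlier starts an investigation; it does not end access.
    return Outcome.REVIEW if z > z_threshold else Outcome.ALLOW


# Hypothetical session lengths in minutes (illustrative only).
past_sessions = [10, 12, 11, 9, 13, 10, 11, 12]
print(triage(11.5, past_sessions, hard_min=0, hard_max=1440))
print(triage(30, past_sessions, hard_min=0, hard_max=1440))
print(triage(5000, past_sessions, hard_min=0, hard_max=1440))
```

Whether such a system reproduces the rigidity of country-level blocking depends on what is attached to `REVIEW`: if it is wired to an automatic refusal, the false positives Peter mentions become barriers rather than questions.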

Yet depending on the implementation, this too can be hurtful and suggestive of an unwelcoming environment should someone recognize that their very existence in a space sets off alarms, especially if integral aspects of one's identity are framed as security threats. Information Studies scholar Hoffmann (2018) argues this point in relation to incidents of Native American Facebook users being kicked off Facebook under the platform's “real names” policy and transgender passengers being flagged by TSA agents based on body scans. Sociologist Tufekci (2017) similarly explores a case where Western assumptions about user characteristics disproportionately exclude those who fall outside of normative expectations, recounting an instance where Facebook sanctioned a friend for having a name that was equivalent to a vulgar word in English. In order to reinstate her account, this friend had to provide government identification of her legal name. While the bias embedded in Facebook's real name policies has been roundly critiqued, this raises further concerns about the reliance upon standardized government identification in order to sign up for a service or keep from having an account suspended. In many parts of the world, possessing government identification is not required and is uncommon.

Many in network security roles are eager to find “new approaches” using machine learning to address fraud in cross-border payments “without putting up so many hurdles it makes service unusable,” as one panelist described it at a Money20/20 session. To limit these kinds of harmful biases, cautious professionals propose developing profiles either for different types of users or based on each individual's specific patterns of behavior, comparing users to themselves rather than to a larger pool of others who might differ significantly. For example, Jesse, an employee for an identity verification system, stressed the need for any machine-learning based process to be sensitive to regional variations. He argued that what counts as normal should be specific to each locale, so that behaviors that might be particularly common in a context like India or Africa, such as lower value and more frequent transactions, “could generate flags if [occurring] outside of those” locations. A representative from online security firm Kount argued at the iDate conference that more specific and individualized pattern recognition was not only more fair, but more effective: “what can't be broken is you … behavior is hard to mimic.” However, as a Money20/20 panelist explained, the only way to “understand customers better” at such a fine-grained level is to “combine [data on the platform] with other data on the device, accounts, take a whole suite of data points and use machine learning to identify fraud patterns.” In other words, in order for people not to be subject to the tyranny of the norm, they would have to forfeit their privacy. Particularly for those whose data stands out in a crowd—often those most vulnerable to varying forms of violence and discrimination—the price to gain access to these online systems would be their consent to massive amounts of surveillance.
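The difference between judging a user against a pooled norm, a regional norm, or their own history can be illustrated with a simple z-score check. The sketch below is an assumption-laden toy: the function `is_flagged`, the threshold, and every number are hypothetical and are not drawn from the study or from any vendor's product.

```python
from statistics import mean, stdev


def is_flagged(value, baseline, z_threshold=3.0):
    """Flag a value that sits far from the mean of the chosen baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical daily transaction counts; not data from the study.
region_a = [1, 2, 1, 1, 2, 1, 2, 1]          # few, larger purchases
region_b = [8, 10, 9, 11, 10, 9, 12, 11]     # many, smaller purchases
own_history = [10, 12, 11, 10, 11, 12, 11]   # one region-B user's past days

today = 11  # the same user's transaction count today
flag_vs_region_a = is_flagged(today, region_a)    # judged by another region's norm
flag_vs_region_b = is_flagged(today, region_b)    # judged by the local norm
flag_vs_self = is_flagged(today, own_history)     # judged by the user's own past

print(flag_vs_region_a, flag_vs_region_b, flag_vs_self)
```

In this toy case the same behavior is flagged only when measured against the wrong baseline. Note what the individualized comparison requires: `own_history` exists only if the platform retains each user's past activity, which is the surveillance cost described above.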

Professionals who advocate for such systems are familiar with the problems of bias in machine learning. At Money20/20, Pedro Fonseca of CrowdProcess questioned, “Should neighborhoods be allowed? Should this data be allowed? People in poor neighborhoods would be inherently behind on your credit score.” He explained that non-discrimination is a difficult but important problem, because otherwise society would “end up with dystopia where you have to consider who you're friends with on Facebook—only friends with rich people on Facebook and keep everybody else on Whatsapp!” This got a hearty laugh from the audience, but can be extended to regionally correlated data and behavioral patterns that can operate in similarly exclusionary ways. Ironically, one set of common flags for machine learning systems that may inadvertently penalize those in certain regions are remnants of the very blocks those systems seek to replace.

Workarounds as flags

Internet users worldwide are innovative and often adapt quickly to barriers put in the way of using web-based services. There are entire markets that have risen up providing the means to navigate dominant commercial blockages or to offer local alternatives. These include proxies, VPNs, freight forwarding, and using someone else's payment or address information when conducting an online transaction. However, these workarounds developed to bypass group-level blocks are, in turn, being picked up as a signal by systems that determine which users to block access to. Such strategies are coming to be viewed as suspicious or as a likely cover for criminal activity.

Current strategies for navigating blocking

Technically savvy internet users and members of elite corporate or educational networks can turn to proxies, VPNs, and anonymity systems to appear like they are accessing the internet from a non-blocked locale. There are numerous VPNs available for purchase by those without such connections, such as the evocatively named Unblock-Us. These services are also frequently marketed as having the added benefit of providing greater privacy to users who do not wish to have their movements tracked across the web, including those who may be wary of government surveillance.

If access to site browsing is available but purchasing is not, people may ask friends or family to make a purchase for them. Upon being told that website access was denied because Macy's did not ship to Vietnam, one customer complained that where they would previously rely on others to purchase objects they chose, now they were unable to even browse. This statement was followed by a string of sob emojis. Other commenters on both sides of the arrangement agreed. This kind of third-party purchasing is well established for cosmopolitan networks. Technically, such practices may run afoul of applicable taxes, but these would be dependent on the particular circumstances—for example, one commenter argued, “I live in the US and I want my niece in Argentina to pick a dress she likes […] and she cannot even SEE the page. Ridiculous.”—describing a situation in which tariffs would not be relevant. It is further worth noting that Macy's uses the Borderfree service to ship to a number of countries around the world, which adds in relevant taxes and takes care of the necessary infrastructure. Borderfree serves Malaysia and Vietnam, but Macy's does not, suggesting that Macy's choice to limit access for people in those locations is not grounded in concerns about the supporting infrastructure.

Having a package sent to friends or family or to a part-time address in the US not only relies on the goodwill and resources of others, but also means a longer delay between purchase and reception for the eventual owner. For those who do not have informal networks to rely on, a number of services have developed to fill in the gaps that US based e-commerce companies are uninterested in bridging. These services exist to adapt to nascent and flawed shipping and address infrastructure and to supply goods to people who are not eligible for direct shipping. People can order something online and have it sent to the freight forwarding address in the US, which will then send the package along to its final destination for a fee and, usually, applicable taxes. Some companies offer additional services like personal shopping, also helping to alleviate the website blocks. These companies may deliver directly to a home address, and can involve scheduling time of delivery, similar to UPS services in the US. They can also have a centralized location at which customers can pick up their goods. They navigate relevant border crossings and customs. Intermediary services like iguama, which seeks to bridge between US merchants and Latin American consumers, work not only with global consumers but also directly with international brands, offering a win-win situation for US based companies that do not want to go through the full internationalization process. Platforms like Ahonya, available to Ghanaians, even allow people to purchase goods from Amazon using Ghanaian currency, providing a full pathway through the e-commerce process.

Bad habits and personal failures: How workarounds become blacklisted

These third-party workarounds are patterns based on lived conditions in many locations, and thus, following the logic laid out by many professionals pushing for machine-learning systems, should be accepted as familiar behavior. However, they are instead consistently used as factors in flagging behavior and blocking access. As Khattak et al. (2016) have established, anonymity networks such as Tor increasingly receive “second-class treatment […] from outright rejection to limiting their access to a subset of the service's functionality or imposing hurdles such as CAPTCHA-solving” as a response to accusations that such systems are employed for malicious use (p.1). Many media streaming websites have also adjusted their blocking policies to try to prohibit VPN access. Facebook security researchers refer to this process as an adversarial cycle: defenders of security target features of bad actors that are hard to change, such as IP address; the adversary nevertheless adapts to avoid such detection, here by using a VPN or proxy network; and the defender subsequently flags this response (Stein et al., 2011). Thus, because people use VPNs to get around regional blocks, VPNs themselves become the target of blocks. Peter, the owner of a fraud detection company, said of Tor:

[t]he so called dark net that is behind the Tor browser … that is where you can find anything like, order a human. [laughs] Or, uh, find some, I don't know, drugs anything. This is where you can find the credit cards, and I'm sure you can find this kind of proxies all over the world.

Fraud detection services like Kount, a vendor at both iDate and Money20/20 Europe, promote their “Proxy Piercing” services as countering fraud, arguing, “Proxy servers are used because a fraudster conducting a stolen credit card transaction wants to appear to be in the same location as the owner of the stolen card.” This ignores the use cases of those who are attempting to circumvent regional blocks in order to make legitimate purchases or simply to browse a website, and casts a negative valence on their behavior. Because these services can obfuscate the identity and personal information of a buyer, they are sometimes targets for scammers.

Sociologists Fourcade and Healy (2017) extrapolate the implications of such projects, arguing that as behavioral patterns that are correlated with negative outcomes are activated to sort people into categories of risk, “scores become ethically meaningful indexes of one's character […] Bad outcomes are nothing but the mechanical translation of bad habits and personal failures” (p.24). Because proxies and VPNs are associated with criminal activity, using them becomes a marker of being a likely criminal, even though the very same system of categorization and blocking might lead a “good” actor to use these services to begin with.

Overall, country-level blocks, justified by a preponderance of bad behavior, push people in blocked countries to develop workarounds that are further coded as risky, which then get used as variables in personalized systems, which accumulate judgments that may re-validate group-level blocks. As regional barriers continue to coexist with personalized categorization and sorting systems, the former can start to blur to look like the latter: if you are being blocked, you must have done something to deserve it. These workarounds engaged by legitimate users take on the characteristics of the “other” in a recursive and swelling fashion. Being unable to engage “appropriately” within sites that block them, people in these countries are forced into “abnormal” behaviors, further cementing their status as outliers. This problem represents another strike against the industry hope that collecting enough data about regional variation will eventually ameliorate discrimination. Long-standing patterns of economic and social inequality mean that rates of fraud may genuinely be higher among those in certain regions, which can then be used to justify further profiling.
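A base-rate calculation shows why a workaround makes a poor proxy for intent once legitimate users are pushed toward it. The sketch below uses entirely hypothetical counts, and the function `flag_precision` is introduced here for illustration; it computes the share of flagged (VPN-using) visitors who are actually fraudulent.

```python
def flag_precision(fraud_on_vpn, legit_on_vpn):
    """Share of VPN-flagged visitors who are actually fraudulent."""
    total = fraud_on_vpn + legit_on_vpn
    return fraud_on_vpn / total if total else 0.0


# Hypothetical counts, chosen only to illustrate the arithmetic.
# Before a regional block: few legitimate visitors bother with a VPN.
before_block = flag_precision(fraud_on_vpn=90, legit_on_vpn=200)

# After a regional block: legitimate visitors adopt VPNs to reach the site,
# while the number of fraudulent VPN users stays the same.
after_block = flag_precision(fraud_on_vpn=90, legit_on_vpn=2000)

print(round(before_block, 3), round(after_block, 3))
```

Under these assumed numbers the flag becomes much less informative after the block, yet a system that treats VPN use as a fixed risk signal will penalize the newly arrived legitimate users all the same.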

Removing those who cannot conform

Regional blocking can be “effective” not only by preventing fraud, but also by removing data that could complicate the smooth functioning of algorithmic recognition and sorting practices, taking away “friction” for those users who are considered valuable. At first, the refusal to collect data on people from blocked countries by refusing to provide services to them seems antithetical to the drive described by Fourcade and Healy to capture as much behavioral data as possible to sort people into appropriate categories. But by turning away those who might foreground the limitations of the algorithms, these systems can then boast higher rates of accuracy for those who are allowed in, meaning those who already match pre-determined criteria. As Dourish and Mainwaring (2012) argue, ubiquitous computing has frequently obscured “particular geographical, institutional, commercial, and historical settings,” claiming to develop “global” solutions that actually just reinforce the dominance of one “global authority” over all other ways of being. Thus websites that want to serve a particular group of people while claiming omnipotence are better positioned to do so if they eliminate the anomalies of people from places less prone to producing profits.
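The arithmetic behind this selective accuracy is simple. The following sketch uses hypothetical counts, and the `accuracy` helper and group labels are assumptions introduced here; it compares the accuracy a classifier could report when it only evaluates admitted users with the accuracy it would have if blocked users were included.

```python
def accuracy(correct, total):
    """Fraction of users the classifier labels correctly."""
    return correct / total


# Hypothetical evaluation counts as (correctly classified, total users).
admitted = (95, 100)   # users who match the system's expected patterns
excluded = (60, 100)   # users from blocked regions, had they been served

reported = accuracy(*admitted)
inclusive = accuracy(admitted[0] + excluded[0], admitted[1] + excluded[1])

print(reported, inclusive)
```

Nothing about the classifier improves between the two figures; only the population it is measured on changes.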

Conclusion

Considering access to online shopping, dating sites, and payment systems, those types of services where regional blocking seems to occur frequently, might be critiqued as a shallow way to address discrimination online. But access to these arenas increasingly has implications for people's social, economic, and political well-being. Dating sites provide romantic and economic opportunity (Burrell, 2012). Online shopping can provide enormous cost savings not only on consumer goods, but also on equipment necessary for one's livelihood, but only for those able to access it. Furthermore, shopping online can help avoid the embarrassment or risk of purchasing stigmatized products in person (Boellstorff et al., 2013). Payment systems are a foundational necessity for those who wish to provide goods and services online to a broader, global customer base. The ability to be paid strikes to the core of economic security. And while some current experiences of regional blocking may simply qualify as a nuisance, increasingly the workarounds users develop are being targeted as aberrant and suspicious behavior, compromising their ability to build up a “good” behavioral history online. There are also some signs that these barriers could begin to spill over into other domains directly tied to life chances, such as employment and housing. Third-party systems that handle blocking and verification efforts stand to profit by being the brokers of expansion. For example, Jumio's identity verification system is moving from the financial world of know-your-customer policies into the sharing economy and dating ecosystem, offering promises of greater physical security for users, and being adopted to help enforce rental prohibitions. Data from these processes can then in turn be fed back into fraud detection and other security algorithms, so that a person ends up “flagged” elsewhere, a cascading set of blockages.

By paying close attention to the evolving deployment of regional barriers on commercial websites now, we have the opportunity to influence the design of systems that may have more influence over people's life chances in the future. Only by facing these realities as they presently exist can we work towards a more just alternative.

Acknowledgments

The authors would like to thank our participants for their willingness to share their insights. We are also very appreciative of collaboration and discussion with Sadia Afroz and Michael Tschantz at the International Computer Science Institute, technical guidance from Dmitry Isakov at FraudScore, and helpful feedback from colleagues including Shazeda Ahmed, Morgan Ames, Eleanor Cawthon, Coye Cheshire, Max Curran, Nicholas Doty, Paul Duguid, Gennie Gebhart, Anushah Hossain, Noura Howell, Nick Merrill, Lavanya Singh, Steve Weber, Richmond Wong, and our anonymous reviewers.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the National Science Foundation under Grant No. 1651770 as well as a grant from UC Berkeley’s Center for Long-Term Cybersecurity.

References

Afroz S, Tschantz M, Sajid S, et al. (2018) Exploring server-side blocking of regions. ArXiv 1805.11606.

Barocas S and Selbst AD (2016) Big Data's disparate impact. California Law Review 104(1): 671–729.

Ben-David Y, Hasan S, Pal J, et al. (2011) Computing security in the developing world: A case for multidisciplinary research. In: NSDR '11 proceedings of the 5th ACM workshop on networked systems for developing regions, NSDR '11, 28 June 2011, Bethesda, Maryland, USA, pp.39–44.

Boellstorff T, Ihsan AD, Mahardika A, et al. (2013) Landscaping mobile social media and payments in Indonesia. Final Report. Institute for Money, Technology, and Financial Inclusion, Irvine, CA, July.

Bolukbasi T, Chang K-W, Zou JY, et al. (2016) Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In: Lee DD, Sugiyama M and Luxburg UV (eds) Advances in Neural Information Processing Systems Vol. 29, Curran Associates, Inc, pp. 4349–4357.

Buolamwini J and Gebru T (2018) Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research 81: 1–15.

Burrell J (2012) Invisible users: Youth in the Internet cafés of urban Ghana, Cambridge, MA: MIT Press.

Cimpanu C (2018) New service blocks EU users so companies can save thousands on GDPR compliance. Bleeping Computer. Available at: https://www.bleepingcomputer.com/news/security/new-service-blocks-eu-users-so-companies-can-save-thousands-on-gdpr-compliance/ (accessed 4 March 2019).

Clark J, Faris R, Morrison-Westphal R, et al. (2017) The Shifting Landscape of Global Internet Censorship, Cambridge, MA: Internet Monitor, June.

Cory N (2017) Cross-Border Data Flows: Where are the Barriers, and What do They Cost?, Washington, DC: Information Technology & Innovation Foundation, May.

Dourish P and Mainwaring SD (2012) Ubicomp's colonial impulse. In: Proceedings of the 2012 ACM conference on ubiquitous computing – UbiComp '12, Pittsburgh, PA, USA, September 2012.

Drake WJ, Cerf VG and Kleinwächter W (2016) Internet fragmentation: An overview. Future of the Internet Initiative White Paper.

Eubanks V (2017) Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor, New York, NY: St. Martin's Press.

Fourcade M and Healy K (2017) Classification situations: Life-chances in the Neoliberal Era. Historical Social Research 42(1): 23–51.

Freed D, Palmer J, Minchala D, et al. (2018) A Stalker's paradise. In: Proceedings of the 2018 CHI conference on human factors in computing systems – CHI '18, Montreal, Canada. ACM Press, pp.1–13.

Gebhart G, Anonymous and Kohno T (2017) Internet censorship in Thailand: User practices and potential threats. In: Proceedings – 2nd IEEE European symposium on security and privacy, EuroS&P '17, 26–28 April 2017, Paris, France, pp.417–432.

Goldsmith J (2008) Who Controls the Internet?: Illusions of a Borderless World, Oxford: Oxford University Press.

Hardt M, Price E and Srebro N (2016) Equality of opportunity in supervised learning. Available at: http://papers.nips.cc/paper/6373-equality-of-opportunity-in-supervised-learning (accessed 4 March 2019).

Hoffmann AL (2018) Data violence and how bad engineering choices can damage society. Medium. Available at: https://medium.com/s/story/data-violence-and-how-bad-engineering-choices-can-damage-society-39e44150e1d4 (accessed 4 March 2019).

Irani L, Vertesi J, Dourish P, et al. (2010) Postcolonial computing: A lens on design and development. In: Proceedings of the SIGCHI conference on human factors in computing systems, CHI 2010, 10–15 April 2010, Atlanta, Georgia, USA, pp.1311–1320. DOI: 10.1145/1753326.1753522.

Khattak S, Fifield D, Afroz S, et al. (2016) Do you see what I see? Differential treatment of anonymous users. In: Proceedings of the network and distributed system security symposium (NDSS). Available at: https://www.eecs.berkeley.edu/~sa499/papers/ndss2016.pdf (accessed 4 March 2019).

Kleinberg J, Mullainathan S and Raghavan M (2016) Inherent trade-offs in the fair determination of risk scores.

Mueller ML (2017) Will the Internet Fragment? Sovereignty, Globalization, and Cyberspace, Malden, MA: Polity Press.

Selbst AD (2017) Disparate impact in big data policing. Georgia Law Review 52: 109–195.

Stein T, Chen E and Mangla K (2011) Facebook immune system. In: Proceedings of the 4th workshop on social network systems (SNS '11), Salzburg, Austria, April 10–13, 2011. ACM, Article 8.

Thakor M (2018) Digital apprehensions: Policing, child pornography, and the algorithmic management of innocence. Catalyst: Feminism, Theory, and Technoscience 4(1).

Tufekci Z (2017) Twitter and Tear Gas: The Power and Fragility of Networked Protest, New Haven, CT: Yale University Press.