Abstract
The potential for biases being built into algorithms has been known for some time (e.g., Friedman and Nissenbaum, 1996), yet the literature has only recently demonstrated the ways in which algorithmic profiling can result in social sorting and harm marginalised groups (e.g., Browne, 2015; Eubanks, 2018; Noble, 2018). We contend that with increased algorithmic complexity, biases will become more sophisticated and difficult to identify, control for, or contest. Our argument has four steps: first, we show how harnessing algorithms means that data gathered at a particular place and time, relating to specific persons, can be used to build group models applied in different contexts to different persons. Thus, privacy and data protection rights, with their focus on individuals (Coll, 2014; Parsons, 2015), do not protect from the discriminatory potential of algorithmic profiling. Second, we explore the idea that anti-discrimination regulation may be more promising, but acknowledge its limitations. Third, we argue that in order to harness anti-discrimination regulation, it needs to confront emergent forms of discrimination or risk creating new invisibilities, including invisibility from existing safeguards. Finally, we outline suggestions to address emergent forms of discrimination and exclusionary invisibilities via intersectional and post-colonial analysis.
Keywords
Algorithmic profiling
The data revolution has been driven by rapid innovation in ‘ubiquitous computing’, which some claim has resulted in widespread ‘datafication’ of the ‘surveillance society’ or ‘information civilization’ (Dencik, et al., 2016; Lyon, 2001; Matzner, 2014; Zuboff, 2015). Central to this is the exponential increase in data, expanded surveillance to gather more and more of it, and dynamic new ways to analyse it (Kitchin, 2014). Algorithmic profiling is a way of detecting patterns and making predictions on the basis of them. This occurs in a range of contexts including insurance, finance, differential pricing, education, employment, marketing, governance, security, and policing (Ferguson, 2017; O’Neil, 2016; Stalder, 2002). More specifically, we understand algorithmic profiling 1 as a method of inferential analysis that identifies correlations or patterns within datasets that can be used as indicators to classify a subject as a member of a group (Hildebrandt, 2008; Schreurs et al., 2008). 2 These categories are formed from ‘probabilistic assumptions’ (Leese, 2014: 502) that are de-individualised (Schermer, 2013). A decision on a loan application may not be made on the basis of individual risk of default, but on the basis of postcode or neighbourhood, which may operate as an indirect proxy for other indicators such as the socio-economic or racial composition of one’s neighbours. This leads to concerns about social sorting and discrimination.
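To make this proxy mechanism concrete, the following minimal sketch (written in Python, with entirely hypothetical data and an invented decision rule rather than any real lending system) shows how a profile that never processes group membership can still produce systematically divergent outcomes along it, simply because postcode is statistically entangled with that membership:

```python
# Illustrative only: hypothetical population in which group membership is
# correlated with postcode (e.g. through residential segregation).
import random

random.seed(0)

def make_applicant():
    group = random.choice(["A", "B"])
    if group == "A":
        postcode = "1000" if random.random() < 0.8 else "2000"
    else:
        postcode = "2000" if random.random() < 0.8 else "1000"
    return {"group": group, "postcode": postcode}

applicants = [make_applicant() for _ in range(10_000)]

def profiling_decision(applicant):
    # The rule only ever "sees" the postcode, an apparently neutral attribute.
    return "approve" if applicant["postcode"] == "1000" else "refer"

for g in ("A", "B"):
    members = [a for a in applicants if a["group"] == g]
    approved = sum(profiling_decision(a) == "approve" for a in members)
    print(f"group {g}: approval rate {approved / len(members):.2f}")

# Output shows roughly 0.80 vs 0.20 approval rates: group membership never
# enters the decision, yet postcode operates as an indirect proxy for it.
```

The point of the sketch is not the particular numbers, which are invented, but the structure of the problem: the discriminatory effect is produced without the protected attribute ever appearing in the data that are processed.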
Social sorting and discriminatory potential
Algorithmic profiling may result in social sorting and other discriminatory outcomes (see e.g., Lyon, 2003, 2014; Parsons, 2015). Research in Australia (Mann and Daly, 2019) and North America (Browne, 2015; Eubanks, 2018; Noble, 2018; Peña Gangadharan, 2012; Sandvig et al., 2016) demonstrates how algorithmic profiling targets marginalised groups, such as racial minorities, individuals of low socio-economic status, and women. Browne (2015) argues that algorithmic profiling perpetuates hierarchies predicated on the enmeshing of identity characteristics. Discriminatory practices become self-reinforcing through feedback loops, as datasets are constructed that disproportionately contain data about certain people, leading to over-monitoring and over-policing of those groups (see e.g., Ferguson, 2017). Importantly, discriminatory effects also occur when data on features such as gender, race or ethnicity are not directly processed. 3 In fact, algorithmic profiling can easily identify ‘proxies’, i.e., combinations of input data which are accurate predictors of the discriminatory categories (Harcourt, 2010; Kleinberg et al., 2016). This illustrates that the data which are used, as well as implicit assumptions made while programming (and training, in the case of machine learning algorithms), carry discriminatory potential (Campolo et al., 2017).
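The following sketch illustrates one way such proxies could, in principle, be surfaced. It is again written in Python, with simulated data and invented feature names (neighbourhood, browsing behaviour, type of car) standing in for the kinds of attributes mentioned above; if an auxiliary model can predict the protected category from the ‘allowed’ inputs with accuracy well above chance, those inputs jointly act as a proxy for it:

```python
# Minimal proxy audit on simulated data (illustrative assumptions throughout).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# Protected category, never given to the profiling model itself.
protected = rng.integers(0, 2, size=n)

# Seemingly neutral features, two of which are statistically entangled with it.
neighbourhood = protected * 0.8 + rng.normal(0, 0.5, size=n)
browsing = protected * 0.6 + rng.normal(0, 0.7, size=n)
car_type = rng.normal(0, 1.0, size=n)  # genuinely unrelated noise
X = np.column_stack([neighbourhood, browsing, car_type])

X_train, X_test, y_train, y_test = train_test_split(
    X, protected, test_size=0.3, random_state=0)

audit = LogisticRegression().fit(X_train, y_train)
print(f"protected category recoverable from 'neutral' inputs: "
      f"{audit.score(X_test, y_test):.0%} accuracy")

# Accuracy well above 50% signals that the combination of inputs is a proxy,
# even though none of them names the protected category directly.
```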
Data protection and the ‘right’ not to be subject to automated decisions
Currently, many challenges to algorithmic profiling refer to data protection law, especially the new General Data Protection Regulation (GDPR) of the European Union (EU), since it contains dedicated rules for algorithmic profiling that have resonated in academic critique internationally (see e.g. Edwards and Veale, 2017; Selbst and Powles, 2017; Vedder and Naudts, 2017). We focus specifically on the EU due to the recent introduction of the GDPR, 4 which replaced the Data Protection Directive, 5 seeks to regulate the use of personal data, 6 and includes a ‘right’ not to be subject to automated decisions (Article 22 of the GDPR). The GDPR is designed to uphold data protection rights under Article 8 (Protection of Personal Data) 7 of the EU Charter of Fundamental Rights. Although this new regulatory regime is generally regarded as the global ‘gold standard’ (see e.g., Buttarelli, 2016; Safari, 2017), there are limits to the application of data protection law in countering algorithmic profiling and the drawing of sensitive or discriminatory inferences (see also Wachter and Mittelstadt, 2019). Therefore, we argue that data protection law may not be a good resource for challenging the problems of algorithmic profiling introduced above. Instead, we show that anti-discrimination law may offer a more promising outlook; however, existing protections should be amended or extended in order to cope with new forms of discrimination that emerge, or that do not pertain to known protected identities, but rather represent patterns that have little or no intuitive meaning to human practice.
There is a lack of consensus as to whether algorithmic profiles, or algorithmic inferences made about an individual, are considered personal data. This is because, according to Article 4 of the GDPR, 8 information must relate to an identified or identifiable natural person to be considered personal data. 9 Koops (2014) argues that with ongoing technological innovations, what counts as personal data is becoming obscured, a point also made by Purtova (2018), who argues that the distinction between personal and non-personal data should be suspended. 10 Purtova (2018) concludes that all data processing with the potential to impact people should trigger protection, and at the very least should be assessed for the likely impact it will have. Purtova’s argument does not entail that data protection regulations suffice to deal with all kinds of data, but rather that the blurring of the distinction between personal and other data underlines the need for new normative grounds to assess the impact of data processing. Wachter and Mittelstadt (2019) note that the guidance provided by the Article 29 Working Party 11 supports inferences being considered personal data, particularly if there is potential to impact an identifiable individual’s rights and interests. Yet they also point to conflicting decisions of the European Court of Justice that adopt a more constrained interpretation of personal data. The consequence is that those impacted by algorithmic profiling may enjoy only limited data protection rights, such as access rights, which in turn may impede their ability to correct or rectify inaccurate inferences, or to assess the lawfulness of data processing (Wachter and Mittelstadt, 2019). 12 Complicating matters further, anonymised data 13 can be used as a basis to construct profiles and draw sensitive inferences. Therefore, ‘by using data about people not linked to a particular individual, or by purposefully anonymising data prior to drawing inferences and constructing profiles, companies can thus avoid many of the restrictions of data protection law’ (Wachter and Mittelstadt, 2019: 55).
A significant aspect of the GDPR is that it grants a ‘right’ ‘not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her’ (Article 22). This has been the subject of debate. The Article 29 Working Party Guidelines on ‘Automated Individual Decision-Making and Profiling for the Purposes of Regulation 2016/679’ provide further guidance on the specific provisions that establish the general prohibition on decision-making based solely on automated processing. They argue that ‘interpreting Article 22 as a prohibition rather than a right to be invoked means that individuals are automatically protected from the potential effects this type of processing may have.’ However, it is also argued that the scope of Article 22(1) is confined to decisions based solely on automated processing, such that even nominal human involvement in a decision may place it outside the provision’s protection.
Given the above, there have been calls for entirely new rights to be recognised under data protection law. Wachter and Mittelstadt (2019) argue for a ‘right to reasonable inferences’ to be incorporated into the GDPR. This would, in principle, require the data controller to establish whether an inference is reasonable. A key limitation of this proposal is that it does not specifically relate to managing differential treatment or discriminatory outcomes on the basis of sensitive inferences. Therefore, data protection, and suggested improvements such as a ‘right to reasonable inferences’, may not be an ideal framework for responding to the challenges presented by algorithmic profiling. Questions about the suitability of data protection law intersect with the recent developments in the critique of algorithms that we discussed in the introduction: that is, the problems with algorithmic profiling are not limited to processing personal information or drawing sensitive inferences. Rather, the impact of algorithmic profiling on the actions, lives and personalities of profiled persons might derive from input that seems inconspicuous from the point of view of data protection, due to the fact that algorithmic profiling works on classes, aggregates and patterns. This of course falls within a long-discussed limit of privacy in general, and not just data protection, if understood in an individualising manner. Gilliom (2001: 122) argues:

To the extent […] that the privacy paradigm relies on and maintains the idea of the autonomous individual and the idea of surveillance as mere visitation, it risks a massive misrepresentation of the full impact of surveillance in our lives. The positioning of extensive and ongoing surveillance in the modern state promises to recast the citizen into the frames and terms of bureaucratic analysis and translate our ongoing actions into tactics of compliance, evasion, and above all, calculation.
Anti-discrimination as an alternative
Given the potential for algorithmic profiling to facilitate discrimination, anti-discrimination law may provide a more promising avenue for responding. Gellert et al. (2013: 61) draw attention to the important distinction that data protection is concerned with certain actions (principally data ‘processing’), whereas anti-discrimination law is concerned with the outcomes of such actions, that is, with whether people are treated unequally on prohibited grounds.
In the EU, both data protection 16 and non-discrimination 17 are fundamental rights enshrined in the Charter of Fundamental Rights of the EU as ‘regulatory human rights’ (Gellert et al., 2013: 61). Article 14 (Protection from Discrimination) of the European Convention on Human Rights prohibits discrimination on the basis of demographic characteristics, including any possible ‘other status’. 18 Further, Title III of the Charter of Fundamental Rights of the EU is dedicated specifically to equality, and is composed of a general provision on anti-discrimination and equality (Article 21), and provisions for specific demographics such as cultural, religious and linguistic diversity, gender, the rights of the child, the elderly, and persons with disabilities (Articles 22–26). 19 Gellert et al. (2013: 65) argue that the specific types of discrimination (i.e., Articles 22–26) represent ‘a more conceptually refined notion of discrimination’ in comparison to the general principle of equality. Yet, we contend that with respect to the abstracted nature of profiling and the drawing of inferences, it may not be possible to identify grounds for discrimination as per specific protected grounds, and that a broader and more diversified approach to anti-discrimination may be an avenue to explore. This is a point we return to, but first a brief comment on direct and indirect discrimination is required.
Direct and indirect discrimination
Direct discrimination focuses on situations whereby an individual has been treated unfairly on the basis of protected grounds, whereas indirect discrimination refers to apparently neutral provisions, criteria or practices that may nonetheless place persons with a protected characteristic at a particular disadvantage.
Thus, algorithmic profiling complicates the notion that a discriminatory outcome can be linked to a protected identity in a two-fold manner: first by enabling proxies, and second, by using new categories that have no clear meaning to human interpretation. This is significant in the context of arguments made by Leese (2014: 505) in relation to a ‘deep-seated epistemological conflict between an anti-discrimination framework that conceives of knowledge as the establishment of causality and data-driven analytics that build fluid hypotheses on the basis of correlation patterns in dynamic databases.’ This means that ‘discrimination will not concern any of the protected grounds, but rather attributes such as income, postal code, browsing behaviour, type of car, etc., or complex algorithmic combinations of several attributes’ (Gellert et al., 2013: 80). Therefore, it is necessary to identify whether attributes, and complex algorithmic combinations of attributes, that do not themselves constitute protected grounds nevertheless function as proxies for them and so produce discriminatory outcomes (Gellert et al., 2013: 80).
Reconceptualising discrimination
On a very fundamental level, every form of algorithmic judgment could be treated as discriminatory. Profiling aims at making predictions; that is, it uses statistical methods or machine learning to predict pieces of information which are not directly available – otherwise profiling would not be necessary. Very generally, it looks for differences among people so that they can be treated differently. Anti-discrimination asks us to treat people equally despite their differences.
There are challenges for individuals to even identify that they have been subject to differential treatment on the basis of a protected ground. This is because, for example, an individual denied a mortgage on the basis of their neighbourhood is ‘not a member of a protected group. She is a victim, not because of her race, but because of the race of the people that live in, and help determine the profile of her neighborhood’ (Danna and Gandy, 2002: 382). Moreover, when the aim of the algorithmic system is to identify and manage risks, some outcomes may never arise: ‘we must also consider the fact that as system objectives more routinely come to be framed in terms of the identification, minimization, or management of risks, rather than the achievement of objectively measured goals or achievements, the consequences of systematic error will be more difficult to observe and control’ (Gandy, 2010: 39). Our main contribution is the development of new concepts that capture a more precise notion of discrimination and that enable emergent classifications to be recognised as discriminatory. To that aim, there is a need to connect with intersectional perspectives, which also do not map easily onto existing protected identities.
Intersectional discrimination
Intersectional forms of discrimination have been shown to be important expansions of existing views on discrimination on the basis of one protected ground (Crenshaw, 1989). Intersectional theory argues that the specific combination of identities, e.g. a woman of colour, cannot be understood simply in terms of the discrimination that people of colour, or women, experience separately. That is, intersectionality highlights the entanglement of protected identities. In one of the first texts to define intersectional analysis, Crenshaw argues that the combined effects of discrimination are particular forms of discrimination that are experienced by people with a specific combination of (protected) identities – and cannot be reduced to any one of its ‘elements’ (Crenshaw, 1989: 149). Thus, intersectional theory highlights the specificity of discrimination: it may be more specific than one protected ground.
With the promise of personalisation through algorithms that tie in many more features than human judgement does, the results become much more specific than just the intersection of two or three prominent markers. The same is true of emergent discrimination: it might be much more specific than the intersection of two or more identities. Crenshaw illustrated that such forms of discrimination are hard to prove statistically, even when they concern ‘just’ race and gender, as the necessary data might not be available or the statistical populations too small (Crenshaw, 1989: 146). This sensitivity to experiences of discrimination that are hard to prove statistically, or to objectify in another manner, is an important insight from intersectional thought that can be carried over to the analysis of emergent discrimination. The fact that a large proportion of the populace with a protected feature is processed in a ‘fair’ manner is no guarantee that discrimination does not take place.
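The statistical point can be made with a deliberately stylised example (all numbers are invented): outcomes that look even along each protected ground taken separately can still conceal a marked disadvantage at their intersection, which only becomes visible when the intersectional subgroup is examined in its own right:

```python
# Toy illustration: single-axis parity can coexist with intersectional disparity.
cells = {
    # (gender, race): (approved, applicants) -- invented numbers
    ("woman", "white"): (90, 100),
    ("woman", "black"): (40, 100),
    ("man", "white"): (40, 100),
    ("man", "black"): (90, 100),
}

def rate(selector):
    approved = sum(a for key, (a, n) in cells.items() if selector(key))
    total = sum(n for key, (a, n) in cells.items() if selector(key))
    return approved / total

print(f"women overall:    {rate(lambda k: k[0] == 'woman'):.0%}")
print(f"men overall:      {rate(lambda k: k[0] == 'man'):.0%}")
print(f"black applicants: {rate(lambda k: k[1] == 'black'):.0%}")
print(f"white applicants: {rate(lambda k: k[1] == 'white'):.0%}")
print(f"black women:      {rate(lambda k: k == ('woman', 'black')):.0%}")

# Every single-axis rate is 65%, yet black women face a 40% rate: the
# disparity is invisible to comparisons along one protected ground at a time,
# and the affected subgroup is also the smallest statistical population.
```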
Significantly, algorithmic profiling that facilitates the inclusion of different sources and types of data is likely to contribute to increasing entanglements of protected identities, thus creating new categories and groups of people who experience forms of intersectional discrimination. Intersectional theory has shown that safeguards against discrimination wrongly assume that all forms of discrimination function similarly or independently. Although intersectional theory was conceived against the backdrop of US anti-discrimination law, a similar disregard of the specifics of particular social positions and intersectional identities has been diagnosed for the EU (see e.g. Verloo, 2006). Algorithmic systems, for which everything is yet another potential input feature or proxy variable for correlative analysis, might further entrench that assumption. These aspects highlight the problems of new forms of discrimination that emerge from complex intersectional combinations of protected grounds, or from correlative abstractions of them.
Emergent discrimination
In addition to these more complex intersectional forms of discrimination, completely new forms of discrimination may emerge. Leese (2014: 504) calls these ‘non-representational’ to express that these new classes of discriminated people might be formed by combinations of input features that do not even make sense as the intersectional combination of identities. That is, both the input and output of algorithmic systems may have no direct relation to a protected ground, but it might still be the case that an algorithmic system systematically disadvantages persons with, say, a specific combination of browsing history, make of computer, and favourite bands (for example, Facebook likes). However, it is not clear that this would count as discrimination. A first rebuttal could be that such outcomes are not discriminatory at all. Given that the system works well, it tracks existing statistical differences – and if one does not want to call that discriminatory in general, as explained above, that is just how the system works. After all, if it could not find any differences, the system would not work. Following this line of thought, if someone is incorrectly profiled by such a system, and as a consequence suffers some form of disadvantage, that would be an individual error, not discrimination. However, discrimination is not an issue of wrong classifications. In fact, that a system used for algorithmic profiling should be as error-free as possible is a matter of course. Anti-discrimination safeguards carry a stronger intuition than protection against erroneous treatment: even if there are differences in the world, we might better not differentiate along them. Thus, even if the algorithm in this case was not ‘wrong’ in a narrowly conceived epistemic understanding, applying the principles of anti-discrimination might mean refraining from using this information to make discriminatory decisions.
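One way to give this abstract worry some shape is the following sketch, written in Python with an invented scoring rule and invented features standing in for browsing history, device and music preferences. Rather than auditing outcomes only along protected grounds, it compares outcomes across the feature combinations the system itself conditions on, which is where an emergent, ‘non-representational’ disadvantaged group would show up:

```python
# Illustrative audit across emergent feature combinations (hypothetical rule).
import itertools
import random

random.seed(1)

FEATURES = {
    "browser": ["chrome", "firefox"],
    "device": ["laptop_x", "laptop_y"],
    "likes": ["band_a", "band_b"],
}

def profiling_score(person):
    # Invented learned rule: one innocuous-looking conjunction of features
    # happens to be scored much lower than everything else.
    if (person["browser"], person["device"], person["likes"]) == (
            "firefox", "laptop_y", "band_b"):
        return 0.2
    return 0.8

people = [{k: random.choice(v) for k, v in FEATURES.items()} for _ in range(5_000)]

for combo in itertools.product(*FEATURES.values()):
    members = [p for p in people if tuple(p[k] for k in FEATURES) == combo]
    if not members:
        continue
    avg = sum(profiling_score(p) for p in members) / len(members)
    print(f"{combo}: average score {avg:.2f} (n={len(members)})")

# None of these features is a protected ground, yet one combination forms a
# consistently disadvantaged group, visible only when outcomes are compared
# across the categories the system itself creates.
```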
In consequence, the problem is how to single out results that count as discriminatory and should be avoided. One approach would be the perspective of ‘data justice’ advanced by Dencik et al. (2016), who advocate for a critique that scrutinises processes of datafication in relation to broader concerns of social justice.
Making exclusionary invisibilities visible
There are differences between the newly emerging forms of discrimination and intersectional forms, as intersectional theory focuses on identities that are already recognised as a source of discrimination. Emergent forms of algorithmic discrimination stem from features and indirect proxies that themselves, on face value, seem harmless. However, it is a combination of such seemingly harmless features that might lead to emergent forms of discrimination. In this regard, however, parallels become visible. Intersectionality has put the focus on people with complex identities that suffer discrimination that is not visible from the perspective of singular protected grounds. This is repeated structurally with emergent forms of discrimination in a more complex way. Thus, remedies for intersectional analysis can point towards possible approaches that bring greater attention to emergent forms of discrimination.
There is a need for new strategies or methods for showing discrimination that do not rely on direct comparisons, as the discrimination may be so specific – or personalised – that any comparison becomes meaningless (see e.g. Mercat-Bruns, 2018). In relation to intersectional discrimination in the EU, Mercat-Bruns (2018: 49) argues that ‘more efficient institutional monitoring’ is required, and we agree that this is also the case in relation to emergent forms of discrimination. Fredman (2016: 8) argues that intersecting relationships of power can be analysed and counteracted along four dimensions: ‘(i) the need to redress disadvantage, (ii) the need to address stigma, stereotyping, prejudice and violence, (iii) the need to facilitate voice and participation; and (iv) the need to accommodate difference and change structures of discrimination.’ We argue that these arguments for improvements based on intersectional theories of equality can inspire countermeasures for emergent forms of algorithmic discrimination. As above, the second dimension may be difficult to realise in the case of complexly intersectional or emergent forms of algorithmic discrimination, since these are hard to identify precisely because they do not provoke a socially recognisable form of stigmatisation. However, the other dimensions can be readily extended to emergent algorithmic discrimination. This starts with ensuring that anti-discrimination institutions and officers are attentive to the possibilities of emergent discrimination. Thus, possibilities for challenging algorithmic verdicts and demands for redress need to be available to all, regardless of belonging to a specific protected group. Yet, this will only be possible via broad anti-discrimination logics and protections that operate independently of specific protected grounds, for example by embracing provisions such as Article 14 of the European Convention on Human Rights, which prohibits discrimination on the basis of any possible ‘other status’. Learning from arguments raised about amending anti-discrimination protections to encompass intersectional discrimination, there should be recognition of ‘the risks of compartmentalisation generated by the existence of [specific] grounds for discrimination’ (Mercat-Bruns, 2018: 48). In turn, this will contribute to the third dimension, increased participation and voice, not only for representatives of certain groups, but for all who may be impacted by discriminatory processes. Anti-discrimination officers working in the field of algorithmic profiling should work less in the name of particular groups and more towards broader dimensions of equality. Apart from legally institutionalised forms of voice, ideally practices like participatory design can raise awareness of complex forms of discrimination.
This would also conform to Fredman’s (2016: 80) suggestion ‘that in designing proactive measures, groups should be defined not merely in terms of their status markers, but with reference to the particular aims of equality.’ Fredman (2016: 66) continues that ‘new intersectional groups should be recognised in their own right,’ and argues for entirely new grounds of discrimination – an argument that could also be applied to emergent forms of discrimination, provided they can be identified. Following Crenshaw’s (1989) seminal piece there was wide recognition and acceptance, including within the judiciary, of intersectional forms of discrimination (see e.g. Mercat-Bruns, 2018). Drawing attention to the possibilities of emergent forms of discrimination in algorithmic profiling in this way may also contribute towards a rethinking of anti-discrimination approaches, particularly when they connect to exclusion and marginalisation: ‘this recognition might narrow the focus on those who are most often disenfranchised at the intersection of multiple forms of subordination’ (Mercat-Bruns, 2018: 47, citing Crenshaw). Moreover, the existing safeguards for the legally encoded protected groups need to be improved. Crenshaw writes that such simple lists

are not grounded in a bottom-up commitment to improve the substantive conditions for those who are victimised by the interplay of numerous factors. Instead, the dominant message of anti-discrimination law is that it will regulate only the limited extent to which race or sex interferes with the process of determining outcomes (Crenshaw, 1989: 151).

This insight can be carried over to algorithmic judgment as well. It might help to construct safeguards for protected grounds in a way that reflects that discrimination might not be experienced by all members of a group (e.g. all women), but only by some. This would help wherever new emergent forms of discrimination include members of protected groups or categories.
Another important approach to diagnosing and protecting against emergent forms of discrimination comes from the post-colonial view that practices of control and power have been developed in complex back-and-forth traffic between the West and its colonies, and that these practices have always included data gathering and processing (Foucault, 2003; Legg, 2007; Thatcher et al., 2016). Increasing global flows of data, and the relative ease of tapping into them, have made algorithmic profiling an important tool that extends the reach of states’ institutions beyond their national borders. The ‘Five-Eyes’ spying collaboration of the US and four Commonwealth states, including the UK, imports the British colonial legacy into the very structure of the internet (Mann and Daly, 2019). Thus, when new criteria are formed through algorithms, they have to be assessed against the backdrop of a global surveillance system that transports its own norms and processes of suspicion. Therefore, post-colonial attention to (in)visibilities and marginalisation is important, especially regarding algorithmic profiling that is used to control borders, migration and other ‘outsides’ (Adey, 2012; Monahan, 2017). As Mann and Daly (2019) show, algorithmic profiling continues many of the colonial practices of creating margins, outsides, and invisibilities of excluded subjects. For example, data-based border controls have become decisive in the processing of migration and asylum requests, and in ensuing actions such as moving people to detention camps (Mann and Daly, 2019). Here, the verdicts of algorithmic profiling are directly related to exertions of power. These, then, are further instances where seemingly harmless data are directly invested with powerful measures that are hard to challenge. Further, as Monahan (2017: 202) argues, these types of marginalising surveillance produce new forms of ‘exclusionary invisibility’, where algorithmic profiling is aimed at persons who are hardly visible, and who often do not fall under the scope of existing protections. This social invisibility is mirrored and augmented if the emergent categories are also ‘invisible’ from the point of view of existing anti-discrimination protection. It becomes an invisible production of invisibilities.
An important cue for analysing newly created categories is the question of whether they enforce, facilitate, or legitimise such exclusionary invisibilities. Thus, the fight against emergent forms of discrimination through anti-discrimination protections runs the risk of continuing the colonial practice of providing safeguards by creating exclusions. Here we are making the dynamics of exclusionary invisibility, in both algorithmic profiling and anti-discrimination logics, more visible. There is a need for further research in order to make such invisibilities visible. This may include algorithmic accountability and auditing initiatives that seek to identify when, why, and how emergent discrimination is occurring, yet opening the ‘black-box’ (Amoore, 2011; Pasquale, 2015) is likely to be challenging. Perhaps one way of doing so, aligned with current movements in the field of Artificial Intelligence (AI), is incorporating such post-colonial sensibilities for power structures into the development of ethical frameworks, and specifically into measures of ‘fairness, accountability and transparency’, although we acknowledge critiques of ‘ethics washing’ as a way to side-step hard law and regulation (Wagner, 2018). However, a more successful route for implementing such attention to invisible shifts of power might lie in social and political forms of oversight, and in potential new legislation that addresses which forms of data should be allowed for algorithmic profiling and thus be scrutinised, and which actions should or should not be invested with the power that algorithmic profiling creates.
Conclusion
In this article, we have analysed algorithmic profiling as a process of knowledge construction from large sets of data that often bear no direct relation to the protected grounds of anti-discrimination laws. Still, they form complex intersectional and non-representational categories that may bring about systematic disadvantage for a hitherto unnoticed group of people. We term this process emergent discrimination. There are limits to the applicability of both data protection and anti-discrimination law in responding to new forms of discrimination that emerge, or that do not pertain directly to protected identities, but rather represent patterns that have little or no intuitive meaning to human practice. However, we have shown that the intuition of anti-discrimination law can be carried over to these new forms of discrimination. Inspired by intersectional reconceptualisations of justice and the ensuing proposals for institutional amendments, we have outlined potential remedies. Furthermore, post-colonial attention to (in)visibilities is required to counter the risk of continuing marginalisation. These insights should inform ethical assessments, design processes, and other proactive protective measures in creating and applying algorithmic profiling.
Acknowledgements
We acknowledge the excellent research assistance provided by Ms Harley Williams, and QUT for financing Harley’s contribution. We would like to thank Associate Professor Peta Mitchell, Dr Ian Warren, Dr Angela Daly and the three anonymous reviewers for their helpful comments on previous versions of this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Monique Mann received funding as part of her Vice-Chancellor's Research Fellowship (Technology and Regulation) at QUT.
