Abstract
Automated decision-making systems are commonly used by human resources to automate recruitment decisions. Most automated decision-making systems utilize machine learning to screen, assess, and give recommendations on candidates. Algorithmic bias and prejudice are common side-effects of these technologies that result in data-driven discrimination. However, proof of this is often unavailable due to the statistical complexities and operational opacities of machine learning, which interfere with the abilities of complainants to meet the requisite causal requirements of the EU equality directives. In direct discrimination, the use of machine learning prevents complainants from demonstrating a prima facie case. In indirect discrimination, the problems mainly manifest once the burden has shifted to the respondent, and causation operates as a quasi-defence by reference to objectively justified factors unrelated to the discrimination. This paper argues that causation must be understood as an informational challenge that can be addressed in three ways. First, through the fundamental rights lens of the EU Charter of Fundamental Rights. Second, through data protection measures such as the General Data Protection Regulation. Third, through the future liabilities that may arise under incoming legislation such as the Artificial Intelligence Act and the Artificial Intelligence Liability Directive proposal.
Introduction
Alongside the rise of densely interconnected information-based societies and platform economies, the term ‘algorithmic management’ has aptly arisen to describe Artificial Intelligence (‘AI’) and machine learning (‘ML’) as a technological infrastructure that shapes the nature of work and status of workers in the modern workplace. 1 Within this infrastructure, automated decision-making systems (‘ADMs’) assume a central role in human resources (‘HR’) and are standard in most recruitment practices. 2 Many ADMs include advanced ML algorithms that operate on various predictive and inferential analytics, natural language and audio-visual processing systems. 3 These screen candidates through asynchronous video interviews or gamified assessments, 4 as well as compile profiles that forecast the behaviour and success chances of a candidate at a given company, and in some cases make recommendations to HR on whether to accept or reject the application. 5
Bias, discrimination, inequality and unfair treatment are common side effects of recruitment ADMs. 6 Female candidates seeking technical roles at Amazon suffered such a fate when the ADMs penalized résumés containing words with female connotations because the algorithmic model was trained on past résumés submitted to Amazon by predominantly male candidates. One year after launch, the model was abandoned for systematically discriminating against female candidates. 7 Recently, in Filcams CGIL Bologna and others v. Deliveroo Italia SRL, Italian trade unions sued Deliveroo for using ADMs that scored drivers on their participation and reliability in the company's booking system. 8 Drivers who were less active, or had previously cancelled bookings, were ranked lower and were less likely to be allocated work. The Bologna Labour Tribunal found that the ADMs failed to consider the reasons behind the drivers’ non-participation or cancellation. Without consideration of these reasons, it was likely that protected groups would suffer a particular disadvantage, which the court found to be indirectly discriminatory in a first-of-its-kind judgment.
Concerns about the general impact of AI on work, and particularly on the abilities of individuals to access labour, have consequently surged in the global legal community. Data-driven discrimination is one of the main challenges that has drawn the European Commission to question the technological viability of the protections available to workers against discrimination found in the EU equality directives. 9 Similar sentiments are expressed in the United States, where researchers metaphorically describe the regulation of these matters within US anti-discrimination law as the act of forcibly hammering a ‘square peg in a round hole’. 10 Algorithms, unlike humans, lack agency, making it necessary to attribute liability for discrimination to the human behind the machine. Attribution in the EU equality directives requires sufficient evidence of causation. This is problematized by ADMs that use advanced ML algorithms to produce complex data-driven predictions through correlative processes, since these are not easily translatable into a causal language, and often constitute inscrutable black boxes that shield their inner operations from external scrutiny. 11 As a result, ADMs may camouflage evidence of human bias, prejudice or social stereotyping as seemingly objective mathematical criteria in the decision-making process. Notwithstanding that ML is itself unable to discriminate for lack of agency, it can nonetheless enable human discrimination with detrimental consequences. Opacity in this sense becomes a de facto mask, permitting recruiters to hide behind the ADMs, and in doing so, distance themselves from the practical as well as legal consequences of the procedure.
This article studies causation as a key, but often overlooked, challenge in data-driven discrimination cases that gives rise to two critical legal problems in EU law: (i) how to infer discrimination from ADMs and connect it to a human, and (ii) how to access the information necessary to make causal inferences. To examine these causal issues, the article focuses on complex and opaque ML in discriminatory recruitment ADMs and explores how this limits complainants’ abilities to causally explain data-driven discrimination and discharge their burdens of proof. To overcome this gap, the article argues that it is necessary to inspect causation as an informational challenge that marries non-discrimination and equality issues with data access and evidence disclosure issues. This reflects the multidimensionality of data-driven discrimination as a by-product of what has famously been described by Citron and Pasquale as the scored society effect of big data economies, 12 and responds to the need to address the digital, as well as economic, political and social, concerns of ADMs.
The article proceeds as follows. Section 2 briefly outlines the nature of data-driven discrimination in ADMs. Section 3 presents the substantive and procedural challenges of addressing data-driven discrimination through the EU equality directives, with a focus on (i) the Race Equality Directive (2000/43/EC), (ii) the Gender Equality Directive (2006/54/EC) and (iii) the Framework Employment Directive (2000/78/EC). 13 Section 4 examines alternative remedies complainants have at their disposal to claim data-driven discrimination and procedurally access evidence of discriminatory recruitment ADMs. Special consideration will be placed on Article 21 of the EU Charter of Fundamental Rights (‘CFREU’), 14 the data and privacy protections found in the General Data Protection Regulation (‘GDPR’), 15 and the recent legislative developments in the forthcoming EU Artificial Intelligence Act (‘AIA’) and the Artificial Intelligence Liability Directive (‘AILD’) proposal. 16 Section 5 concludes.
Data-driven discrimination in automated decision-making systems
The European Commission broadly defines algorithms as ‘computer instructions that, based on a series of input data, can produce a certain value or set of values as output’. 17 Algorithms vary greatly in their operation and should not be homogenized. Some are simple rule-based procedures that reach a particular decision through a series of pre-programmed instructions. 18 Others, such as ML, constitute non-parametric procedures that autonomously learn and optimize their predictions over the course of their operation, for instance through the process of backpropagation. 19 The process itself must exist before the programme is run, but the combination of weights and factors that optimizes the function of ML is not known in advance. 20 ML is the main toolkit to achieve AI, 21 and the fundamental technology behind most ADMs. 22
There are three technical modes in which ML operates. 23 The first is supervised learning, where the algorithm is trained against a set of labelled data. The second is unsupervised learning, where ML aims to automatically extract features from the data using clustering and regression techniques. The last is reinforcement learning, where ML is instructed with a specific aim and then calibrated with positive or negative feedback, depending on whether the aim is met.
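To make the first of these modes concrete, the following minimal sketch trains a classifier against labelled historical hiring outcomes and then scores a new candidate; supervised learning is the mode on which most recruitment ADMs rely. The library (scikit-learn), the feature values and the labels are assumptions chosen purely for illustration.

```python
# Minimal sketch of supervised learning (hypothetical data; scikit-learn assumed).
from sklearn.linear_model import LogisticRegression

# Each row describes a candidate by numeric features
# (e.g. years of experience, normalized assessment score);
# the labels are past hiring decisions.
X_train = [[2, 0.60], [5, 0.75], [7, 0.80], [1, 0.55], [10, 0.90], [3, 0.65]]
y_train = [0, 1, 1, 0, 1, 0]  # 1 = hired, 0 = rejected

model = LogisticRegression()
model.fit(X_train, y_train)            # weights are learned from the labelled data

new_candidate = [[4, 0.70]]
print(model.predict(new_candidate))        # predicted outcome for a new applicant
print(model.predict_proba(new_candidate))  # probability score usable for ranking
```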
In current ML practice, neural nets discover correlations that work without a prior causal model. 24 This is not incidental but precisely due to the usual design configurations of ML, which dispense with a prior model of the world in favour of discovering patterns as they emerge in the data. Data scientists such as Pearl consequently argue that current ML can never show causation, but future ML might do so. 25 Whilst these correlations may evidence causal relations, they are not equivalent to them. Instead, they describe associations between variables as statistical distributions that are de facto blind to the human attributes they correlate with, unless they are included as labels. 26 ML predictions are therefore arbitrary aggregations of variables that may or may not be intelligible to humans. 27 To date, emerging explainable AI programmes (‘XAI’) cannot be applied to ML without great difficulty since they can only identify decision-relevant parts of the model, that is, the parts that contributed to the accuracy of its training and predictions. 28 The imputation of a causal framework also generates risks of its own, such as oversimplifying or misrepresenting ML.
As Veale and Binns explain, a result reached by an algorithm may, despite being perfectly valid as a statistical prediction, constitute an outcome that is not acceptable by contemporary social, political or ethical standards. 29 This typifies what computer scientists call the ‘Garbage In, Garbage Out’ principle and reinforces the fact that algorithmic biases are, strictly speaking, effects, not causes. 30 The principle has long existed, as can be seen in Charles Babbage's seminal response to the UK House of Commons when asked about the polynomial functions of what was then known as the Difference Engine: ‘Pray, Mr Babbage, if you put into the machine wrong figures, will the right answers come out?’ … ‘I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.’ 31
In accordance with Babbage's principle, human bias can become encoded in ML, 32 or otherwise emerge in three main ways. First, ML can have a representational bias where the training data does not adequately reflect a diverse or representative user-base. This often emerges from sampling errors such as the collection of data from a skewed sample, over- and under-sampling, limited feature choices, proxies, redundant encodings and human bias behind the sampling collection. 33 Second, labelling bias can occur where human attributes are utilized as markers on the training data. Even where labels are audited to remove individual characteristics, bias can alternatively be introduced via the model design in situations where it identifies unanticipated proxies via inferences that are associative of these characteristics. Third, there is self-training bias. This occurs where ML replicates, and potentially even amplifies, the biased training data when making future predictions and selections. Generally, bias is introduced into the underlying data set or programming code through an external actor or procedure and then either carried forth, or potentially amplified, through the internal programme. 34
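By way of illustration only, the following sketch shows how the second and third mechanisms can interact even when the protected characteristic itself is withheld from the model: biased historical labels are reproduced through a correlated proxy. The data, the variable names (such as the postcode proxy) and the coefficients are invented assumptions, not drawn from any real system.

```python
# Minimal sketch of proxy-mediated bias (synthetic data; scikit-learn assumed).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)                          # protected characteristic (never given to the model)
postcode = (group + rng.random(n) < 1.2).astype(int)   # hypothetical proxy correlated with the group
skill = rng.random(n)

# Biased historical labels: group 1 was hired less often at equal skill.
hired = ((skill - 0.3 * group + 0.1 * rng.random(n)) > 0.4).astype(int)

# The model is trained only on the proxy and the skill score.
features = np.column_stack([postcode, skill])
model = LogisticRegression().fit(features, hired)
pred = model.predict(features)

for g in (0, 1):
    print(f"group {g}: predicted hire rate {pred[group == g].mean():.2f}")
# The gap persists because the postcode lets the model reconstruct the group.
```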
Biased applications of these kinds often go unnoticed and unopposed by humans. Bigman et al. in a psychological study observed the reactions of 3,900 participants to the use of biased artificial agents in job recruitment decisions. 35 The researchers found that participants were less morally outraged by algorithmic (vs human) bias and thus less likely to find companies legally liable for discrimination. Bonezzi and Ostinelli trace the asymmetrical perception of bias to the mistaken assumption that algorithms, as computational processes, must be neutral entities that operate solely on data-driven as opposed to thought-based inputs. 36 A similar point was noted by the European Commission during the drafting of the 1995 Data Protection Directive as the result of the ‘apparently objective and incontrovertible character [of automated processes] to which a human decision-maker may attach too much weight, thus abdicating his own responsibilities’. 37
Without correcting the underlying bias, discrimination becomes systematically embedded and potentially even amplified through the generation of feedback loops and effects. ML not only becomes increasingly unreliable and skewed against the fair representation of persons outside of its training set population but normalizes the outcome of future predictions on the basis that they correlate with previous outcomes. 38 Kim uses the example of an algorithm that assigns management positions to male workers and classifies female candidates with lower scores. 39 Over time, and if used sector-wide, female workers will perceive these positions as gender-typed and adjust their expectations to what they consider to be normal, which may deter women from applying for future managerial roles, as well as reduce their incentive to learn relevant skills for certain types of jobs and opportunities. 40 Barocas and Selbst consequently reach the unfortunate conclusion in their seminal research on Big Data that ‘remedying the corresponding deficiencies in the law will be difficult technically, difficult legally, and difficult politically’. 41
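The self-reinforcing dynamic Kim describes can be illustrated with a toy calculation. The hiring rates and applicant numbers below are invented solely for illustration: the model reproduces the historical rates, the under-selected group adjusts its expectations and applies less often, and its share of new hires falls year on year.

```python
# Minimal sketch of a feedback loop (purely illustrative, synthetic figures).
men_rate, women_rate = 0.60, 0.40   # historical hiring rates learned by the model
men_apps, women_apps = 100, 100     # applications in year 1

for year in range(1, 6):
    men_hires = men_apps * men_rate
    women_hires = women_apps * women_rate
    share = women_hires / (men_hires + women_hires)
    print(f"year {year}: women's share of new hires = {share:.2f}")
    # Deterred by the visibly lower success rate, fewer women apply next year.
    women_apps *= women_rate / men_rate
```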
The substantive and procedural causal requirements of the EU equality directives
The main protections against data-driven discrimination for workers derive from three EU equality directives: (i) the Race Equality Directive (2000/43/EC), which confers a principle of equal treatment between persons, irrespective of racial or ethnic origin, (ii) the Gender Equality Directive (recast) (2006/54/EC), which prevents gender discrimination in the employment context and, lastly, (iii) the Framework Employment Directive (2000/78/EC), which provides general protections against discrimination on grounds of religion or belief, disability, age or sexual orientation as regards employment and occupation. These directives operate on a two-fold distinction between direct and indirect discrimination. The former imposes an absolute ban on discrimination against an individual on grounds of a protected characteristic (‘PC’), subject to very limited legislative exceptions. 42 The latter prohibits the use of a facially neutral provision, criterion or practice (‘PCP’) that places individuals at a particular disadvantage, unless it is justified. 43 To prove either direct or indirect discrimination, a complainant will have to demonstrate a prima facie case. Once the complainant establishes this, discrimination is presumed and the burden of proof shifts to the respondent to refute the claim. 44
Direct discrimination: Identifying the causes of effects
In direct discrimination, causation follows an upstream approach that attributes the disparity in outcome to a reason behind the decision. To invoke the presumption of discrimination, complainants will need to inspect the inner workings of the ADMs and explain how this caused discrimination. Except where there is indisputable evidence of discrimination, such as where a complainant possesses factual proof that a PC was used as an express label or as a ground truth on which ML operates, or otherwise as an illicit masking device to hide their underlying bias, it will not be possible to overcome the informational challenges of the causal requirements due to two practical limitations. The first limitation arises from the need to interpret whether there is a statistical relation between the variable and a PC. The second limitation arises from the additional need to explain that the PC constitutes the relevant ground for the discrimination.
Correlations with protected characteristics
ML variables often intersect with different characteristic nuances of a PC. 45 Fabris et al. note in their multidisciplinary survey on AI fairness and bias in hiring practices that algorithmic intersectionality of this kind risks multiplying and compounding bias against particularly vulnerable individuals as well as making evidence thereof less traceable. 46
The EU equality directives strictly maintain a single-axis framework. In its two leading judgments, Parris and Z, the European Court of Justice (‘CJEU’) imposed the need for applicants to prove discrimination on the grounds of either one, or otherwise multiple, PCs. 47 The CJEU has maintained this requirement in the series of challenges brought by Muslim women who were penalized for wearing a headscarf at their workplaces. 48 Although the EU Council's compilation of the draft Horizontal Directive expressly includes protection for cases of discrimination falling outside of the list of PCs, 49 the directive has, to date, not been adopted by the co-legislators. 50 Xenidis draws on multiple discrimination as a safety valve to catch instances of intersectional discrimination. 51 Support for multiple discrimination may also be derived from Recitals 2, 3 and 10 of the Framework Employment Directive (2000/78/EC). These provisions contemplate the interaction between religion and other rights, as well as between age and gender rights, in Article 4(2) and Article 6(2) respectively. Multiple discrimination does recognize the fact that discrimination is not unidimensional, but it does not necessarily grasp the interconnection between the different PCs and their discriminatory effect.
Further complications arise where proxies are used as surrogate variables. Proxies can be captured by the doctrine of associative discrimination, which has been recognized as giving rise to direct and indirect discrimination by the CJEU in Coleman and CHEZ respectively. 52 Theoretically speaking, if an employer creates an algorithm to avoid hiring pregnant candidates as a means of avoiding costs of paying for maternity leave, this will clearly amount to a case of direct discrimination on grounds of sex. 53 However, in Jyske Finans the CJEU interprets associative discrimination as a very narrow extension to direct discrimination that presupposes the establishment of a ‘direct or inextricable link’. 54
Legal scholars disagree on the degree of correlation sufficient to invoke the protection of associative discrimination. 55 Adams-Prassl et al. argue that the causal requirements of direct discrimination are met where proxies are ‘so strongly correlated to protected grounds that the affected individuals were still held to have suffered less favourable treatment because of their protected characteristics’. 56 Others such as Hacker and Xenidis agree that it is theoretically possible to classify proxies as direct discrimination but note that the ‘degree of overlap required between a proxy and a given protected group to give rise to discrimination is unclear’. 57 Both approaches suggest that correlation can evidence a causal relationship. It would then naturally follow that the stronger the correlation between two variables, the more conclusive such evidence should be of causation.
However, there are theoretical as well as practical problems with these approaches. They do not interrogate the type of connection between the proxy and the PC that is required in this legal determination. The connection could be measured by the size of the coefficients to determine the strength and direction of the relationship between the variables, 58 or alternatively by reference to statistical measurements of how closely the model matches its training data. 59 Notably, a larger coefficient in the former case is not in itself stronger evidence of causation, and the statistical measures in the latter case merely reveal the data-model fit. Establishing causation on the mere strength of statistical correlation may therefore rest on spurious correlations and lead to false positives. 60
A case in point is the Berkeley University sex discrimination litigation in the United States. 61 When analysing the admissions rate of the relevant year in 1973, it appeared that women were significantly less likely to obtain admission than their male counterparts. However, this discrepancy was not attributable to any discriminatory practices since many of the faculties in fact exercised positive bias towards female applicants. Instead, the discrepancy resulted from the historical propensity of women to apply to faculties that were more competitive, and therefore had lower admission rates, than those to which men applied. 62 Correlations are indicative of causation, but they can equally be attributable to confounding factors. 63 Without further explanations, correlation cannot meet the causal requirements of direct discrimination since it is an indicator of causation but not necessarily conclusive proof thereof.
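A compact numerical illustration of this confounding effect follows; the figures are invented and only loosely modelled on the Berkeley pattern. Each department admits women at an equal or higher rate than men, yet the aggregated rate points the other way.

```python
# Minimal sketch of a confounding factor reversing an aggregate comparison
# (Simpson's paradox); all figures are hypothetical.
applications = {
    # department: (men applied, men admitted, women applied, women admitted)
    "less competitive": (800, 480, 200, 130),   # 60% of men vs 65% of women admitted
    "more competitive": (200, 50, 800, 220),    # 25% of men vs 27.5% of women admitted
}

men_app = men_adm = women_app = women_adm = 0
for dept, (ma, mad, wa, wad) in applications.items():
    print(f"{dept}: men {mad / ma:.0%}, women {wad / wa:.0%}")
    men_app += ma
    men_adm += mad
    women_app += wa
    women_adm += wad

# Aggregated over departments, the ordering reverses because women apply
# disproportionately to the more competitive department.
print(f"overall: men {men_adm / men_app:.0%}, women {women_adm / women_app:.0%}")
```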
Whether correlation amounts to sufficient evidence of causation may further often depend on the type of algorithmic model used. Most ML uses features that operate on remote associations of specific variables conditional upon other variables. It will consequently be harder for complainants to obtain evidence of direct discrimination in a more complex model compared to a straightforward linear model or rule-based algorithm. 64 Even in the latter case, the production of evidence may depend on the number of rules as well as the specificity level of the model. 65 Liability in this sense would not be contingent upon a recruiter's choices per se but additionally upon the technical operation of ML. Where highly advanced algorithms that operate on artificial neural networks are used, such as deep learning, it is less likely that any correlation, even a weak one, can be proven between the variables of interest and a PC. Similar applies to non-linear models such as tree ensembles, forests and multiple classifier systems. 66
Correlation is therefore a potentially powerful, but equally unreliable, indicator of causation. A legal enquiry of this kind would additionally require costly resources such as XAI. This is not a feasible option for many complainants due to cost and resource implications, particularly where advanced ML such as neural networks is used, which is far more arduous to interpret to the point where the answer is proximate enough to a comprehensible human explanation without oversimplifying it or otherwise reducing its technical accuracy. 67 XAI will also not yield helpful results in recruitment decisions without further disclosure of information such as the criteria and selection of successful candidates, which may engage a plethora of privacy and confidentiality issues.
Causes of the less favourable treatment
The EU equality directives involve a further causal enquiry that consists of a comparison between the complainant and either an actual or hypothetical comparator. 68 This counterfactual exercise studies what would have happened in the recruitment process, had it not been for the complainant's PC. The reliability of the counterfactual assessment as an indicator of causation depends on constructing the counterfactual to the closest possible fictitious equivalent of the real-life occurrences. The problem that arises with ADMs is that it is not only exceedingly difficult for complainants to establish the identity of a suitable comparator, actual or hypothetical, but also to explain whether there has been any differential treatment that can give rise to liability.
The characteristics of the appropriate comparator, real or hypothetical, are essentially predetermined by the selection of the PC, and consequently problematized by the difficulty of attributing variables to PCs. In cases where discrimination is intersectional, multi-dimensional or proxy-based, the construction of the comparator and their counterfactual treatment becomes increasingly uncertain and speculative, which makes it difficult to determine whether there has, or has not, been any less favourable treatment. 69 These issues are not unique to discriminatory recruitment ADMs, since they frequently arise in cases involving associative and multiple discrimination. 70 Likewise in age discrimination cases, courts grapple with the challenges of constructing counterfactuals based on non-binary variables. 71 The added complication in the ML context is that the variables used may be completely innocuous and incongruous to the human mind. 72
ML additionally problematizes the estimation of counterfactual treatments. Without knowing the technical reason why the complainant was disadvantaged by the model, the counterfactual treatment of the comparator cannot be estimated with precision. Again, ML models are uncertain by design, not by accident, since this allows the necessary technical flexibility on which they update their prediction process. At the same time, however, it needs to be stressed that the nature of this process interferes with the reconstruction of the model as a counterfactual simulation. The application of the counterfactual is therefore circular where ML is used in discriminatory recruitment ADMs since the counterfactual assumes prior causal knowledge as a way of framing the treatment of the comparator. For example, without knowing the criteria on which another candidate was hired and how these were computed by ML, it cannot be estimated with certainty on what basis the complainant who is seeking redress for discrimination was rejected by the ADMs, and whether such rejection was based on their PC or not. Similarly relevant is the fact that ML may produce completely different predictions for two different scenarios due to the lack of a prior causal model since there is no guarantee of consistency in outcome where the model is given different data. 73
Indirect discrimination: Identifying effects of causes
Indirect discrimination is known as the default solution to discriminatory recruitment ADMs. 74 The doctrinal logic of indirect discrimination is suited to the computational logic of data-driven discrimination since it is designed to capture discrimination arising from measures that are apparently neutral, but nonetheless cause protected groups a particular disadvantage that is unjustified when compared to a suitable comparison group. 75 Indirect discrimination therefore avoids many of the informational challenges that arise in direct discrimination claims, precisely because it concentrates less on the form and more on the substance of the discrimination. Notwithstanding this, indirect discrimination still requires the establishment of two separate causal requirements: a causal connection between the PCP and the particular disadvantage, and a causal connection between the PCP and a protected group. Causal explanations also operate as a quasi-defence to indirect discrimination where respondents can produce evidence that a discriminatory effect is explicable by objectively justified factors unrelated to the discrimination. 76
Particular disadvantage to a protected group by a provision, criterion or practice
The CJEU generally adopts a flexible approach when determining the threshold of particular disadvantage needed to prove that a protected group suffered a particular disadvantage because of a PCP. 77 Danfoss, Seymour-Smith and Bilka-Kaufhaus confirm that the CJEU will, once a pattern of adverse impact emerges, readily shift the burden to the respondent to justify the indirect discrimination. 78 It is consequently easier for complainants to produce a prima facie case to invoke the presumption of discrimination due to the focus of indirect discrimination on the effects of, rather than the reason behind, the discrimination. Because of this, complainants can circumvent many of the problems arising from the interpretation and explanation of ML. 79 Instead, the legal enquiry is on the production of statistical evidence to prove the disparate effect of the ADMs. 80 In these cases, the burden of proof will be influenced by the choice of statistical metric, since computer science has advanced competing methods for the measurement of data-driven discrimination. 81
One measure is group fairness, which is the principle of statistical parity across different demographic groups. The mathematical formula closely mirrors the four-fifths rule of the Equal Employment Opportunity Commission that is deployed by US courts in disparate impact cases. 82 The main challenge of this formula is that it equates basic fairness with equality of outcome in the very strictest sense. 83 Under such an approach, unfair disparity results where a recruiter places too much emphasis on the qualifications or CV experiences of candidates with the result that the intake of workers does not equally represent the demographic groups of applicants. The competing measure is individual fairness, which presupposes that individuals should be treated alike, even if this does not result in group parity. An example is to take the socio-economic backgrounds of candidates into account when evaluating job applications and awarding recognition to those who have not shared the same privileges as other applicants. In computer science, this is achieved through error parity to measure whether the outcome is equally accurate for individuals, regardless of their membership of a particular group. 84
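The contrast between the two measures can be made concrete with a short sketch over synthetic outcomes; the selection rates, group sizes and the hypothetical ground truth of suitability are assumptions chosen only for illustration. The first block computes the selection-rate ratio tested against the four-fifths rule, the second computes error parity by comparing accuracy across the two groups.

```python
# Minimal sketch of group fairness (four-fifths rule) versus error parity
# (synthetic outcomes; numpy assumed).
import numpy as np

rng = np.random.default_rng(2)
group = rng.integers(0, 2, 500)          # 0 = comparator group, 1 = protected group
selected = rng.random(500) < np.where(group == 1, 0.35, 0.55)   # the ADM's selections
qualified = rng.random(500) < 0.5        # hypothetical ground truth of suitability

# Group fairness: compare selection rates; the four-fifths rule flags a ratio below 0.8.
rate0 = selected[group == 0].mean()
rate1 = selected[group == 1].mean()
print(f"selection rates: {rate0:.2f} vs {rate1:.2f}, ratio {rate1 / rate0:.2f}")

# Error parity (one operationalization of individual fairness):
# compare how accurate the selections are for each group.
acc0 = (selected == qualified)[group == 0].mean()
acc1 = (selected == qualified)[group == 1].mean()
print(f"accuracy: {acc0:.2f} vs {acc1:.2f}")
```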
These methods therefore operate on measures of statistical difference, which are not equivalent to factual proof of a particular disadvantage since the test does not necessarily evidence the amount of disadvantage suffered between the protected group and the comparator group. In other words, even as an effect rather than reason-based approach, indirect discrimination is not solely satisfied with the establishment of a given statistical threshold per se, but requires a linkage between the effect of the PCP and a protected group, which also uses a counterfactual comparison between a protected group and a comparator group as a means of establishing causal inferences. Whilst complainants will therefore not face the same interpretability issues, since there is no need to discuss the neutrality of the ADMs, it is still necessary for complainants to causally explain how ML discriminated against them by reference to a comparator group. There is a crucial difference in the construction of the comparator group in cases of indirect discrimination. Direct discrimination requires a homogeneous comparator in the sense that the comparator must not share the same PCs as the complainant. Indirect discrimination is, on the other hand, constructed on a heterogeneous group. Under this construction, the protected group has an overrepresentation of PCs and the comparator group has an underrepresentation. 85
Objectively justified factors unrelated to discrimination
Once the burden shifts, the respondent may defend the PCP by reference to objectively justified factors unrelated to discrimination. The syntax of the EU equality directives and the specific legislative choice to use the word ‘unrelated’ are key to understanding the causal implications of the defence stage. The informational challenges arising from causation are different in this context, since the explanatory burden is on the respondent exclusively to justify the ADMs. To achieve this, a respondent must show that a PCP (i) pursues a legitimate aim and (ii) is necessary and proportionate to the achievement of that aim. 86
In relation to the legitimate aim requirement, most recruitment ADMs will pass this first hurdle. 87 ML can screen and assess job applications by retrieving the necessary data from CVs, cover letters and written applications, as well as by conducting performance assessments and interviews to identify the most suited candidates for a position. Thus ADMs can easily be regarded as a legitimate means of identifying the quality and promise of candidates for a given job. This reflects the reality that ADMs are not inherently bad but have various benefits attached to their uses as well, particularly in comparison to a human system of recruitment. 88 Bias, prejudice and social stereotyping in the latter are common occurrences. 89 Due to these perceptions, recruitment ADMs are in fact viewed by many lawyers, computer scientists and regulators alike as anti-bias measures in HR. 90
Imagine a recruiter knowingly deploys ADMs where the model is deliberately labelled in a way that causes ML to distinguish between individuals on the basis of a PC, or otherwise where a PC in fact constitutes the ground truth of the model. However, the recruiter continues to use this in the genuine belief that the ADMs they are using will yield more objective decisions than a human decision-maker. Despite their genuine beliefs, the use may amount to direct discrimination if proof of the deliberate labelling choices or the ground truth can be adduced. Yet, where there is no proof, the same hypothetical example amounts to indirect discrimination that the recruiter will seek to justify as an anti-bias measure. This raises the importance of clarifying the assumptions behind the human-machine comparison and importantly asks whether the law adopts an optimistic or cynical view on the capacities of humans to be neutral. One possible assumption is to consider human decision-making as inherently subjective and thus inferior to ADMs. Under this assumption, it would consequently follow that discriminatory recruitment ADMs would always be justifiable except for unnecessary and disproportionate uses. The alternative is to assume a perfectly objective human comparator. This would generate the practical consequence that respondents would face significantly higher legal standards in justifying their ADMs. These standards, however, may not necessarily be reflective of the realities of human decision-making, particularly in recruitment decisions.
Importantly, respondents will not want to engage with these legal considerations since the former – the assumption of an imperfect human counterpart – may expose them to further liabilities from past recruitment decisions, and the latter – the assumption of the perfect human counterpart – increases the legal standard that the ADMs must meet.
Furthermore, the requirements of necessity and proportionality assess whether there are equally effective but less discriminatory ways of achieving the result. Applying this criterion to discriminatory recruitment ADMs requires a further comparison between man and machine. In Motherhood Plan the English High Court permitted the use of ADMs in the administration of the Self-Employment Income Support Scheme because it found this to be more efficient and cost-effective than a human alternative. 91 The High Court was persuaded by the factual reality that it is sometimes practically necessary to make use of AI, particularly in contexts where there are high demands, pressures or resource considerations. In this case, it was significant that the disputed measure was introduced in response to COVID-19 and therefore was considered necessary to help the administrators deal with the overwhelming paperwork involved in the claims. 92 It is likely that a similar rationale applies to the use of discriminatory recruitment ADMs. Take the following example. HR are often inundated with applications, such as in the early stages of recruitment, where it is impossible to have a human dedicate time and effort to each of the incoming applications. The use of ADMs is unavoidable at this stage of the application cycle if each application is to be given some kind of review or consideration. In contrast, it is arguable that ADMs would not be considered as essential in later stages of the process where there are fewer candidates for HR to review and the success chances of the remaining candidates are proportionately higher.
The Bologna Labour Tribunal reached a different conclusion to the English High Court, even though it is arguable that it is equally necessary to use ADMs as a means of allocating work to drivers. 93 In contrast to the English High Court, the Bologna Labour Tribunal discussed the legal significance of the ADMs having a transparent process and found that Deliveroo's refusal to demonstrate the inner workings of the algorithm was grounds to uphold the claimant's case. This relates to the fact that a significant portion of the dispute revolved around the failure of the ADMs to consider the reasons for the drivers’ performance and participation in the app when devising its score. The algorithm was therefore flawed in the sense that it was blind to these considerations.
In a similar vein, Hacker argues that the assessment will depend on the type of algorithmic bias in dispute. He distinguishes between cases involving biased training data and proxy discrimination. Hacker argues that ADMs with biased training data will only be justified where an unbiased set of training data that contains the same rate of predictive accuracy is unavailable or unaffordable. He further opines that ADMs using discriminatory proxies will, in contrast, often be justified since the use of the proxy is even more intrinsic to the predictive accuracy of the ML model. 94 Under this view it would further follow that any alternative procedure should ideally attain the same predictive accuracy as the disputed algorithm. In the example of removing bias from training data, this would raise an inherent trade-off between fairness and accuracy in computer science, 95 as well as involve pre-processing or post-processing methods of AI bias mitigation.
Evidential barriers to causal explanations
Causation in the context of discriminatory recruitment ADMs amounts to an informational challenge that depends on the abilities of complainants to access evidence and causally explain the data-driven discrimination. Without access to evidence, the existence of bias, prejudice or social stereotyping may cause serious harm to individuals but will not give rise to liability for discrimination. Compared to the substantive protections, the CJEU has not focused as much on the procedural rights conferred on complainants to access evidence or request the disclosure of material facts in the EU equality directives.
In the context of equal pay litigation, the CJEU found that the employer is obliged to ensure that a PCP is transparent, once the complainant has discharged their burden of proving a prima facie case of discrimination against a large number of employees. 96 Similarly, in Commission v. France, the CJEU applied principles of transparency to recruitment decisions to protect the equal access rights of the complainants. 97 However, the court did not specify the content of such principle or confirm what its procedural operation practically entails, which leaves both the strength of the principle and its scope of application uncertain. 98 Consequently, the CJEU held in Meister that the EU equality directives do not entitle a candidate who has demonstrated that they meet the requirements for the job to request access to information indicating whether the recruiter hired another applicant at the end of the recruitment process. 99 The court did, however, state that the refusal to grant this is nonetheless one of the factors to consider in the context of establishing facts from which it may be presumed that there has been direct or indirect discrimination. 100 Likewise in Kelly, despite awarding the complainant access to redacted information concerning their unsuccessful application to a vocational training course, the complainant was not entitled to access information held by the provider about the qualifications of other applicants. 101
The CJEU consequently distinguishes between the information rights of complainants in cases where the dispute involves access to labour and cases where the complainants already possess a contractual relationship with the respondent that resembles a working relationship. The court also draws a distinction in the nature of information requested by the complainants. In Meister and Kelly, the complainants sought confidential information that engaged the privacy rights of others. From this perspective, and unlike in Danfoss, the CJEU had to engage with ‘rules governing confidentiality which follow from European Union legal acts’. 102
Advocate General Mengozzi shared similar views in his opinion in Meister: …there is nothing in the wording or spirit of Article 8(1) of Directive 2000/43, Article 10(1) of Directive 2000/78 and Article 19(1) of Directive 2006/54 to rebut that decision. Indeed, their wording does not once expressly refer to a right to information held by the person ‘suspected’ of discrimination. For the most part, the interested parties who have lodged written observations noted that, while the Commission tabled a proposal intended to establish a right to information for victims of discrimination, such a proposal has never been adopted in the final texts. In those circumstances, the absence of an express reference to a right to information in the aforementioned provisions must be interpreted, not as an oversight on the part of the legislature but, on the contrary, as the manifestation of its intention not to affirm such a right. 103
The only way around these evidential limitations in recruitment cases is where complainants can demonstrate the voluntary disclosure of overt bias by the respondent. For instance, in the preliminary referrals in Feryn and ACCEPT, there was no need to consider issues of causation since the recruiter had in both cases made public statements that from an evidential perspective provided sufficient proof of bias to engage the presumption of discrimination. 105 Other than these instances, access to evidence is rather limited and will not be likely to provide complainants with the information they need to meet the causal requirements of either direct or indirect discrimination claims in the EU equality directives.
Bridging the causal gap: Looking beyond the EU equality directives
Article 21 of the CFREU and the general principle of non-discrimination provide an alternative framework of protection that can capture new and innovative forms of social disparity, such as the various sources of algorithmic bias that result in data-driven discrimination, and would consequently create more generous protections that practically translate into less onerous causal requirements. Alternatively, data protection and technology laws such as the GDPR, and recent legislative innovations such as the forthcoming AIA as well as the AILD proposal, enable greater access to evidence of data-driven discrimination, and should be regarded as essential to the litigation of discriminatory recruitment ADMs. The GDPR provides rights that are directly conferred on the candidate and can be invoked against the recruiter. In contrast, the forthcoming AIA will, once enacted, predominantly regulate the relationship between the manufacturer of the ADMs and the recruiter. In its current draft form, the AILD proposal responds to this with a system of direct recourse, in the form of non-contractual civil liabilities, for candidates affected by discriminatory recruitment ADMs, capturing cases where ex ante compliance through the AIA alone does not prevent data-driven discrimination.
Charter of fundamental rights
Unlike the GDPR, AIA and the AILD proposal, which strengthen complainants’ access to evidence of data-driven discrimination by supporting their procedural rights to information, the CFREU is not a procedural system of evidence but a substantive source of non-discrimination protections that provides either an independent claim, or otherwise operates as an interpretive tool to safeguard and strengthen the application of the EU equality directives to data-driven discrimination. Within the language of fundamental rights and general principles, there are two ways in which the CFREU can protect workers against data-driven discrimination.
The first is to invoke the legal protections of Article 21 CFREU instead of the EU equality directives. Following the CJEU decisions in Egenberger and IR, Article 21 has direct horizontal effect. 106 However, the Charter does not impose a general rule of direct horizontal effect. Whilst the CJEU has recognized the direct horizontal effect of Article 31, 107 conferring the right to fair working conditions including paid annual leave, and Article 47, 108 conferring the right to an effective remedy and fair trial, it has not done so in the context of other articles. 109 Notwithstanding this, it is settled that the CFREU does at least provide direct horizontal effect in the context of Article 21 and should therefore be able to catch data-driven discrimination that would otherwise fall outside of the narrower scope of the EU equality directives. This is due to the distinct phrasing of Article 21(1) as a prohibition: ‘Any discrimination based on any ground such as sex, race, colour, ethnic or social origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation shall be prohibited.’
The CFREU includes many more PCs than the directives, such as national minority and genetic features, and expressly states that discrimination can be established on ‘any ground’, which mirrors the non-exhaustive list of PCs in Article 14 of the European Convention of Human Rights (‘ECHR’). In contrast to the European Court of Human Rights (‘ECtHR’), which has extended the grasp of PCs in its case law to capture new and intersectional characteristics, 110 the CJEU has not readily made use of the interpretative potential of Article 21. For instance, when determining whether obesity should amount to a PC in Kaltoft, the CJEU avoided all consideration of Article 21 by finding that the complainant's case fell outside of the scope of EU law with the effect that the legal protections of the CFREU were not invoked. 111 Unfortunately, neither the travaux préparatoires nor the CFREU itself explains why Article 21 was drafted with non-exhaustive PCs, leaving it ambiguous as to what types of data-driven discrimination will, and will not, be captured by the CFREU.
In light of this, the general principles expressed in the CFREU must be considered as well. These have the constitutional implication that secondary laws cannot be interpreted restrictively. 112
It is likewise of significance that the general principle of non-discrimination applies horizontally to private parties, as is clear from Mangold. 113
For these reasons it is important to draw on the opinion of Advocate General Jääskinen in Kaltoft: There is a general principle of non-discrimination in EU law covering grounds not explicitly mentioned in Article 21 of the Charter. Examples of such prohibited grounds of discrimination might lie in physiological conditions such as appearance or size, psychological characteristics such as temperament or character, or social factors such as class or status. 114
Neither Article 21 nor the general principles will guarantee a complete gap-filling mechanism that protects against all manifestations of data-driven discrimination falling outside the scope of the EU equality directives and should therefore not be viewed or interpreted as such. But what they may do is capture borderline cases of direct discrimination where there are variables that correlate with certain PCs, which would otherwise fall outside of the scope of direct discrimination for want of sufficient correlation. Gerards and Borgesius in a similar vein describe the CFREU as a ‘safety net’ to capture instances of data-driven discrimination that would otherwise escape liability under EU law. 117 To argue this, the authors draw on the social-historical evolution of the CFREU as a relatively new source of EU law and point to the inclusion of genetic features as a PC in Article 21 CFREU to illustrate the instrument's adaptability to continuously evolving social norms of ‘a priori unfair grounds for differentiation’. 118
Under the CFREU, complainants would face fewer challenges in identifying the variable as a PC, and could instead focus on attributing it as the cause of discrimination. Since a more open list naturally allows for more reasons as grounds for discrimination, explaining their causal occurrence is procedurally easier, as it is unnecessary for complainants to draw such strenuous connections between the ML process and the attribute or vulnerability in question. Because of this, the legal onus on complainants to produce a prima facie case and shift the burden of proof would be less onerous and require less evidence of the inner workings of the ADMs. The CFREU therefore has most impact on direct discrimination, since these are the instances where data-driven discrimination is at greatest risk of eliding the protection of the EU equality directives, subject to very narrow exceptions where the underlying bias of the human decision-maker is overtly declared or obvious to infer.
In indirect discrimination there is less need for the CFREU to act as a gap filler, since the protections are already quite broad given the ostensible neutrality behind the PCP requirement, as well as the fact that the effects-based focus is more logically aligned with the computational logic of data-driven discrimination. If it came to the issue of selecting appropriate statistical measurements and the setting of attainable evidential standards of proof for a particular disadvantage, the CFREU may encourage a rights-based approach to the determination of this by the CJEU or legislator, depending on how the issue is to be addressed. It could also help inform the CJEU on whether the ADMs constitute a legitimate aim at the justification stage of indirect discrimination by bringing fundamental rights into that assessment. However, the main impact of the CFREU, working on the assumption of horizontal effect, is to bring borderline cases, which would fall into the category of indirect discrimination in the EU equality directives, back into the remit of direct discrimination so that complainants can benefit from the stronger protections conferred thereunder.
General Data Protection Regulation
The GDPR procedurally strengthens the protections against discrimination in the EU equality directives by increasing a complainant's access to evidence. The overlap between discrimination and data transparency issues is explicitly acknowledged by the WP 29 Guidelines, which note that such practices can ‘perpetuate existing stereotypes and social segregation’ as well as ‘lead to inaccurate predictions, denial of services and goods and unjustified discrimination in some cases’. 119 Similar concerns are reflected in the language of Recitals 71, 75 and 85, which provide further interpretative support to the application of the GDPR provisions.
The GDPR does not provide an unchallengeable right to explainable AI. Instead, it includes a cluster of rights and duties that minimize the extent to which respondents can shield themselves from liabilities through the use of ML. Notably, this includes the right to access information in Article 15 and Recital 63, as well as the notification duties in Articles 13–14 and Recitals 60–62. The content of what needs to be disclosed from the ADMs is referenced in Articles 13(2)(f), 14(2)(g) and 15(1)(h) as: ‘The existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and, at least in those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.’
The crucial issue is whether these articles provide complainants with enough access to information to discharge the evidential burdens of causation under the EU equality directives. There are several key limitations that need to be considered. For one, the information may be limited to ADMs that come within the criteria of Article 22(1) and (4). Many recruitment ADMs would not meet this requirement since they often, albeit to varying degrees, involve HR in the process. 120 The wording of these provisions suggests that Article 22 only applies to ‘solely automated decisions’ and requires evidence that this creates ‘legal or similarly significant effects’. EU jurisprudence has, however, gradually become more expansionist in its interpretation of the technologies that are caught by Article 22 in recent case law.
For example, the Austrian Federal Administrative Court found that the AI processing of the allocation of funding to jobseekers did not amount to a solely automated decision because a counsellor reviewed and, if necessary, diverged from the results of the algorithm. In contrast, the first instance had found that the counsellor's role was merely to confirm the decision and that any intervention made was therefore based solely on the findings of the automated decision. 121 Following the recent CJEU decision on the preliminary referral in Schufa, it is likely that these ambiguities have been put to rest. In the case, a German credit reference agency used ADMs to score the applicant's creditworthiness. The results were then sent to a bank, which subsequently rejected the application on the basis of the result reached by the ADMs. The CJEU held that the determining role of the credit reference agency was sufficient to invoke the protections of Article 22. 122 Both the CJEU and, in his Opinion, Advocate General Pikamäe emphasized that Article 22 would otherwise create a ‘lacuna in legal protection’, as well as frustrate the operation of Article 15. 123
Consideration should also be given to the fact that Articles 13(2)(f), 14(2)(g) and 15(1)(h) stipulate that they apply ‘at least in those cases’, which may therefore catch instances of ADMs that are not caught within Article 22(1) and (4).
Significantly, these articles only refer to ‘meaningful information about the logic involved’ as well as the ‘significance’ and ‘envisaged consequences’ of ADMs. These terms suggest that a data subject will not be entitled to full access to the training data or feature selection of a model but will rather be limited to information explaining the rationale and criteria of the decision-making process behind the ML model. In the recruitment context, this translates to information on the hiring criteria used by the recruiter and the scoring attached to these criteria by the ADMs. But it will not necessarily entitle candidates to access information relating to the profiles of other candidates that may have been used as training data. If, on the other hand, certain human attributes were used as labels for variables or given certain weightings by the ML model, it is likely these would need to be disclosed under these rules.
It is further uncertain how specific the information would be, since the term ‘meaningful’ imposes an ambivalent threshold considering its inherent subjectivity as a choice of phrasing. Hacker argues with specific reference to Article 15(1)(h) that complainants will be able to access ‘aggregate information on the existence of algorithmic bias if bias can be understood as part of the consequences of processing for the data subject’. 124 This accords with the scientific understanding of data-driven discrimination. The interesting question is whether it would grant access to all types of ML bias or whether courts would draw distinctions on the type of bias in question. Selbst and Powles argue that the term ‘meaningful’ should be understood as enabling rather than defeating access to information. They argue this by reference to the Article 5 requirement that data processing be lawful, fair and transparent to the data subject, which they interpret as a minimum standard of disclosure below which the information provided cannot fall. It then follows that the information should be sufficient for a data subject to contest a decision under Article 22(3). 125 Under this view, the GDPR is not to be interpreted restrictively but functionally, which would allow the disclosure of algorithmic bias, if this does not conflict with confidentiality or privacy interests.
Others, such as Wachter et al., interpret the articles more restrictively and argue that they should not be understood as conferring general rights to explanations. 126 Rather, the GDPR provides a more restricted right to be informed, which only grants data subjects ‘a limited right to explanation of the functionality of [ADMs]’. 127 For similar reasons, Ebers and Navas conclude that data protection and privacy laws should not be seen as a cure-all for discriminatory ADMs. 128
Notwithstanding this, Advocate General Pikamäe in his Opinion in Schufa unequivocally introduces the notion of explainable ADMs as a corollary to the right of data subjects to have a human in the loop under Article 22: ‘The human intervention to be envisaged in this type of automated data processing ensures that the data subject has the opportunity to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge it in the event of disagreement with the decision.’ 129
This evidence will allow complainants to make out claims for direct discrimination in exceptional circumstances where a PC was either (i) used as an express label or (ii) utilized as the ground truth on which the ML model operates. For indirectly discriminatory recruitment ADMs, the same evidence may determine whether the respondent can justify the algorithm by reference to arguments of necessity and proportionality based on the accuracy and predictability of the model, or whether the complainant possesses sufficient evidence to the contrary to rebut the respondent's justificatory arguments.
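The two evidential routes just described can be illustrated with a minimal, hypothetical sketch: checking whether a PC appears as an express feature of the model, and whether the historical hiring labels used as ground truth already differ sharply across protected groups. Column names and figures are illustrative assumptions only.

```python
# A minimal sketch of the two evidential routes: (i) a protected characteristic
# (PC) appearing as an express feature of the model, and (ii) a biased ground
# truth, i.e. historical hiring labels differing sharply across protected groups.
import pandas as pd

training_data = pd.DataFrame({
    "gender":           ["f", "m", "f", "m", "m", "f", "m", "m"],
    "assessment_score": [82,  75,  90,  70,  65,  88,  72,  68],
    "hired":            [0,   1,   0,   1,   1,   0,   1,   1],  # ground-truth labels
})

model_features = ["gender", "assessment_score"]  # features the ADM was trained on

# (i) express label: is a PC directly among the model's input features?
print("PC used as express feature:", "gender" in model_features)

# (ii) biased ground truth: hiring rates in the training labels by group
print(training_data.groupby("gender")["hired"].mean())
```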
Artificial Intelligence Act
Following the political agreement reached on 8 December 2023, the AIA was adopted by the European Parliament on 13 March 2024, with 523 votes in favour, 46 against and 49 abstentions on the landmark legislative reform. 130 The Explanatory Memorandum of the 2021 draft AIA describes the legislation, inter alia, as a means to ‘complement existing Union law on non-discrimination with specific requirements that aim to minimize the risk of data-driven discrimination’. 131 The AIA explicitly responds to the rise of algorithmic management in the workplace, with the Commission's earlier White Paper declaring that ‘the use of AI applications for recruitment processes as well as in situations impacting workers’ rights would always be considered “high-risk”’. 132
As of 13 March 2024, the AIA establishes eight categories of high-risk AI systems in Annex III. These include AI systems used in decisions regarding employment, worker management and access to self-employment. Annex III also specifically covers situations where AI is intended to be used for ‘the recruitment or selection of natural persons, in particular to place targeted job advertisements, to analyse and filter job applications, and to evaluate candidates’. 133
The requirements for high-risk AI systems are found in Chapter 3, Section 2 of the AIA. These include the establishment of a risk management system under Article 9, as well as the obligations in Article 10 that ensure data sets are subject to appropriate data governance and management practices. Within these obligations, Article 10(5) is the most relevant provision for discriminatory recruitment ADMs, since it expressly permits the processing of special categories of personal data for the purposes of bias monitoring, detection and correction, subject to appropriate safeguards for the fundamental rights and freedoms of the natural persons whose data is processed. There are also obligations of technical documentation, 134 record-keeping, 135 transparency and provision of information to deployers 136 and human oversight, 137 as well as obligations relating to accuracy, robustness and cybersecurity. 138 These provisions, however, are of limited use to complainants of discriminatory recruitment ADMs, since many of the obligations apply between the AI provider and the deployer rather than to the person who is ultimately affected by the system. 139 The act defines providers as those who develop and commercialize AI. 140 Deployers are those who use AI under their authority. 141
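By way of illustration only, the following sketch shows one simple form that the bias monitoring and detection contemplated by Article 10(5) could take in practice: comparing an ADM's positive-recommendation rates across protected groups and flagging large disparities. The groups, outputs and the 0.8 threshold (borrowed from the US ‘four-fifths’ rule of thumb) are assumptions for the example, not requirements of the AIA.

```python
# A minimal sketch of output-level bias monitoring: compare positive-recommendation
# rates across protected groups and flag a disparity for review. All values are
# illustrative assumptions.
from collections import defaultdict

# (group, recommended) pairs as an ADM's outputs might be logged
outputs = [("f", 0), ("f", 1), ("f", 0), ("f", 0), ("m", 1), ("m", 1), ("m", 0), ("m", 1)]

totals, positives = defaultdict(int), defaultdict(int)
for group, recommended in outputs:
    totals[group] += 1
    positives[group] += recommended

rates = {g: positives[g] / totals[g] for g in totals}
ratio = min(rates.values()) / max(rates.values())

print("selection rates:", rates)
print("disparity ratio:", round(ratio, 2),
      "-> flag for review" if ratio < 0.8 else "-> within threshold")
```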
The obligations relevant to the deployer are instead found in Article 26 of the AIA, which requires them to adopt appropriate organizational and technical measures of compliance and specifically provides that ‘deployers who are employers shall inform workers’ representatives and the affected workers that they will be subject to the use of the high-risk AI system’. Nonetheless, the main utility of the act for discriminatory recruitment ADMs derives from the addition of Article 68c in the consolidated text of 26 January 2024, which has, as of 14 March 2024, been included as Article 86 in the text formally adopted by the European Parliament. 142
The new Article 86 provides those who are subject to a high-risk AI system with the ‘right to obtain from the deployer clear and meaningful explanations of the role of the AI system in the decision-making procedure and the main elements of the decision taken’. Furthermore, the right to clear and meaningful explanations applies only ‘to the extent that the right referred to in paragraph 1 is not otherwise provided for under Union law’. Consequently, the new right can be seen as an important legislative response to the debates regarding the right to explanation in the parallel context of the GDPR. From this perspective, Article 86 bears significant similarities with the logic of the transparency obligations in the GDPR when read together with Article 22 GDPR. The enforcement of the AIA in the context of discriminatory recruitment ADMs is, crucially, less of an issue since the act does not face the restriction of applying only to ADMs ‘based solely on automated processing’ like Article 22. Rather, discriminatory recruitment ADMs are certainly caught within its legislative ambit since they are defined as ‘high risk’ in the AIA: AI systems ‘used in employment, workers management and access to self-employment, in particular for the recruitment and selection of persons, for making decisions affecting terms of the work-related relationship, promotion and termination of work-related contractual relationships, for allocating tasks on the basis of individual behaviour, personal traits or characteristics and for monitoring or evaluation of persons in work-related contractual relationships, should also be classified as high-risk, since those systems may have an appreciable impact on future career prospects, livelihoods of those persons and workers’ rights…Throughout the recruitment process and in the evaluation, promotion, or retention of persons in work-related contractual relationships, such systems may perpetuate historical patterns of discrimination, for example against women, certain age groups, persons with disabilities, or persons of certain racial or ethnic origins or sexual orientation. AI systems used to monitor the performance and behaviour of such persons may also undermine their fundamental rights to data protection and privacy.’ 143
In contrast to the text adopted by the EU Parliament, the amendments submitted on 20 June 2023 expressly extended the right to clear and meaningful explanations to (i) the role of the AI system, (ii) the main parameters of the decision taken and (iii) the related input data. 144 The adopted text may therefore be narrower in the scope of information that comes within its regulatory ambit, since the right to clear and meaningful explanations mentions only two aspects of the individual decision-making: (i) the role of the AI system in the decision-making procedure and (ii) the main elements of the decision taken. It is also noteworthy in relation to (ii) that the word ‘parameters’ has been changed to ‘elements’. Perhaps the new wording has been chosen to incorporate the related input data into this limb of the text, since the term ‘elements’ is more generic. It is therefore arguable that the new wording applies equally to related input data, despite the omission of a direct reference to it.
Within the adopted text of Article 86, limb (i) is relatively straightforward and most likely refers to the degree of automation, that is, whether the HR decision is reached entirely by ADMs or shared with a human recruiter who oversees the AI system. It likewise includes an explanation of what tasks the ADMs complete: for instance, whether the ADMs source and screen candidates, evaluate and select them, or compile success profiles, as well as whether this involves résumé screening, chatbot or video interviews, or general performance or engagement analytics. 145 What remains unclear, however, is whether the AIA endorses a more functional or a more technical type of explanation by the deployer. Arguably, the words ‘clear and meaningful’ are context-specific, since different cases will require different degrees of insight in the explanation provided by the deployer. The reference to the ‘role’ of the AI system, however, suggests that the explanation under this limb is directed towards the organizational purpose of the AI system.
Limb (ii) could potentially capture more technical aspects of discriminatory recruitment ADMs. This reading is supported by an interpretation that includes related input data within the definition of ‘main elements’. The main elements of the decision-making could, under this view, include access to data indicating how bias emerged in the training data, in label and feature selection, as well as potentially through self-learning bias in ML. This view is further supported by the fact that the AIA expressly stipulates that the explanation ‘should provide a basis on which the affected persons are able to exercise their rights’. 146 A more limited supply of data would not enable complainants to do so and would thus fall short of this legislative expectation. However, the precise choice of wording, namely the reference to ‘main elements’, may alternatively restrict the explanation to information that attains a certain level of significance. The threshold at which elements attain this significance is not defined in the act and is therefore uncertain. A further limitation would arise from an interpretation under which the right to explanation only covers the main elements of the actual decision reached by the ADMs, as an end result, rather than the main elements constitutive of reaching that decision in the first place.
Artificial Intelligence Liability Directive
To complement the AIA, the European Commission issued the AILD proposal on 28 September 2022 to regulate the non-contractual civil liability arising from AI. 147 As the Explanatory Memorandum states, the background to the proposal lies in the ‘specific characteristics of AI, including complexity, autonomy and opacity (the so-called “black box” effect), [that] may make it difficult or prohibitively expensive for victims to identify the liable person and prove the requirements for a successful liability claim’. 148
The AILD proposal bridges the enforcement gaps of the AIA for private litigants who are not users in the sense the AIA requires, since it provides a private right of action to individuals who are, or may potentially be, affected by the harms envisaged by the AIA. It would thus offer a crucial safeguard to complainants seeking redress against discriminatory recruitment ADMs. The AILD proposal would also functionally distinguish itself from the AIA in that the former is a remedial, rather than a preventative, measure against algorithmic harm, and would therefore be more amenable to the use of such technologies in recruitment, as well as in general organizational and managerial practices in the workplace. The AILD proposal consequently responds to the causal challenges faced by complainants seeking to evidence algorithmic harms in two main ways.
First, it introduces rules on the disclosure of evidence and a rebuttable presumption of non-compliance in Article 3. To benefit from such disclosure, the complainant must ‘present facts and evidence sufficient to support the plausibility of a claim for damages’ pursuant to Article 3(1). Article 3(2) further requires that the complainant has undertaken all proportionate attempts to gather the relevant evidence from the respondent, and Article 3(4) limits the disclosure of information to strict criteria of necessity, proportionality and legitimate interests. Although a respondent's refusal to disclose such information may trigger a presumption of non-compliance under Article 3(5), it is questionable how much more access to information the AILD proposal will grant complainants in practice compared to the GDPR.
Second and relatedly, Article 4 introduces a rebuttable presumption establishing a causal connection between the fault of the respondent and the output of the AI system, provided that all of the following criteria are fulfilled. Under Article 4(1)(a), the respondent must be non-compliant with a duty of care laid down in the AIA or in other rules set at Union or national level. Article 4(1)(b) requires that it can be considered ‘reasonably likely, based on the circumstances of the case’ that the fault has influenced the relevant AI system output or the failure to produce one. Article 4(1)(c) requires the complainant to demonstrate that the AI system output, or the failure to produce one, gave rise to the damage. It is further noteworthy that Article 4(4) establishes an exception to the presumption in the case of high-risk AI systems, where the respondent can show that sufficient evidence and expertise is reasonably accessible for the complainant to prove the causal link referred to in Article 4(1).
The AILD proposal does not entail a reversal of the burden of proof that completely alleviates the complainant's onerous burden of demonstrating the inner workings of the AI; it only provides a presumption where the regulatory criteria are fulfilled by the complainant. Many argue that this does not go far enough in addressing the informational imbalances between the parties. 149 These critics place particular emphasis on the greater recognition of so-called immaterial damages, such as harm to the fundamental rights to freedom of expression and human dignity and to freedom from discrimination. 150 Others, such as the JUST-AI Jean Monnet Research Group, further challenge the extent to which the AILD proposal can be characterized as a presumption of fault in a strict legal sense, since Article 4 does not technically propose a finding based on ‘facts held as established without prior evidence (like the presumption of innocence, for example)’. 151 Rather, the proposal simply requires evidence of a different kind and degree.
Thus, whilst it is in theory possible that the AILD proposal applies to discriminatory ADMs in recruitment, the scope and function of its provisions remain moot until the draft has passed further legislative stages. Notwithstanding this, the current legislative agenda indicates that the EU leans towards the enactment of the AILD proposal as a private avenue of civil action to support the forthcoming AIA, alongside other technology law reforms such as the PLD proposal that was issued in combination with the AILD. 152
Conclusion
In his book, Pasquale calls for an intelligible society that understands how biased and prejudiced inputs into algorithms generate discrimination and social stereotyping. 153 This idea is deeply embedded in the requirement that complainants causally explain the data-driven discrimination by recruitment ADMs in order to discharge their procedural burden under the EU equality directives.
These causal requirements are, however, problematic where ADMs are operated through ML, which introduces a series of informational challenges that complicate complainants' ability to discharge their burdens of proof. Because of these informational challenges, most cases of discriminatory recruitment ADMs cannot be considered directly discriminatory, since ML prevents complainants from causally evidencing that discrimination occurred on the grounds of a factor that correlates with their PC. In indirect discrimination, the problems generally arise at the justificatory stage, where ML assists respondents in defending claims by reference to objectively justified factors that are unrelated to discrimination. Other sources of EU law must therefore be considered as additional measures to provide convincing evidence of data-driven discrimination by the ADMs.
One solution is to draw on the CFREU, particularly in cases where discriminatory recruitment ADMs could amount to direct discrimination. Under this approach, borderline cases of data-driven discrimination that arise from the use of intersectional variables, or proxies that are inextricably linked to PCs, could amount to direct discrimination under the CFREU, unlike under the EU equality directives. Complainants could then benefit from stronger protections against discriminatory recruitment ADMs, since direct discrimination is much harder to justify than indirect discrimination. Whilst there is less general need for the CFREU in indirect discrimination claims, it may have some residual utility on issues that are as yet undecided in the case law, such as the setting of a precise statistical metric or threshold, since the Charter could contribute to a rights-informed discourse on these issues by courts and potentially legislators alike.
Alternatively, access to information rights and related disclosure obligations can be enforced directly through the GDPR or, indirectly, via the forthcoming AIA if the AILD proposal is successfully passed into legislation over the coming years. These instruments address the current evidential limitations of the EU equality directives. For direct discrimination, the expansive interpretation of the relevant articles may facilitate access to evidence of data-driven discrimination, which will assist cases where a PC was used as an express label in a model, or where the ground truth of the ML is biased. This will avoid the risk of rare cases of directly discriminatory recruitment ADMs falling into the less stringent protections of indirect discrimination. For indirect discrimination, where the legal logic is already closely aligned with the computational logic of data-driven discrimination, greater access to evidence will allow courts to scrutinize the source of bias in ML more closely, which may make the difference between whether recruitment ADMs attract liability or are justified. This enhances complainants' protections against indirectly discriminatory recruitment ADMs, not with the same strictness as direct discrimination, but to a practically sufficient degree that recognizes that recruitment ADMs are not inherently bad: like so many other automated technologies, despite their benefits, they can easily become dangerous if not used properly by humans.
