
Abstract

In this article, I want to draw attention to a class of cases that is ignored, sometimes deliberately, in the debate about the probative value of naked statistical evidence (NSE). I am talking about cases in which ‘statistical’ propositions are the principal subject of proof. I will show that they are legally relevant and remain immune to the arguments against NSE put forward in the evidence literature. All of this, I will conclude, makes it advisable to pay more attention to them. First, I will highlight the hypothetical cases discussed in the NSE debate and the kind of propositions that must be proved in them: ‘singular’ propositions. Second, I will show that there are other cases, also relevant in legal systems, in which the principal factum probandum consists of ‘statistical’ propositions. Third, I will argue that this difference is substantial for the NSE debate.

Keywords

Free will, incentives, naked statistical evidence, reference class problem, sensitivity, statistical propositions

Introduction

How should we justify decisions about proved facts in a trial in order to be rational? This question leads to an important debate in the evidence literature about how to explain evidential reasoning: What sort of inference is it, how is it formally structured and what are its criteria of correctness? We will focus here on a small slice of this discussion, which concerns the probative (or epistemic) value to be assigned to statistical evidence when it is the only kind of evidence available for decision-making.

By ‘statistical evidence’ we will understand any assertion of statistical propositions or laws, that is, statements that refer, among other parameters, to the relative frequency of occurrence of certain types of events.¹ When statistical evidence is the only kind of evidence available to determine whether the proposition expressed by a factual statement should be declared proved, we will refer to it as ‘naked statistical evidence’ (NSE henceforth).² Leaving aside the question of what value is actually assigned to NSE in trials,³ we will focus our attention on the studies that address the value that should be assigned to it.

In this article, I want to draw attention to a class of cases that has been ignored, sometimes deliberately, in the debate about the probative value of NSE: a missing piece of the puzzle. I am talking about cases, such as those of indirect discrimination, in which ‘statistical’ propositions are the principal subject of proof (or factum probandum). I will show that they are legally relevant and, moreover, that they remain immune to the arguments against NSE put forward in the evidence literature. All of this, I will conclude, makes it advisable to pay more attention to such cases, either in delimiting the probative value of NSE or in formulating models of evidential reasoning.

My argument will be structured as follows. First, I will reconstruct some features of the debate about the probative value that corresponds to the NSE in evidential reasoning. In particular, I will highlight the hypothetical cases discussed and the kind of propositions that (principally) must be proved in them, which I will call ‘singular’ propositions. Second, I will show that there are other cases, also relevant in legal systems, in which the principal factum probandum are propositions that I will call ‘statistical’. I will pay particular attention to the elucidation of the statements which express these propositions. Third, I will argue that the difference between the two types of cases (and the corresponding propositions) is substantial for the NSE debate, because it changes what can plausibly be said about the probative value that corresponds to the NSE. To justify this claim, I will review some of the most influential arguments against NSE put forward in the contemporary evidence literature and note that they are implausible when applied to cases in which statistical propositions are the principal subject of proof. Fourth, I will draw some conclusions.

Let us proceed in that order.

A debate on the proof of singular propositions

The academic debate about NSE in legal evidential reasoning seems to have taken its first steps in the early 1970s.⁴ It is a little over 50 years old and is currently enjoying good health, as evidenced by the papers published in recent years.⁵ Among the many aspects of this discourse, I will concentrate here on a few very precise features.

The question it seeks to answer can be summarised as follows. What probative value, if any, should be assigned to statistical evidence when it is the only kind of evidence available for deciding whether a proposition should be declared proved?

Looking for an answer to that question, the evidence literature often relies on hypothetical cases that serve to identify the possible paradoxes that would result from one or another proposed solution. According to this method, one proposal is better than another if it leads to decisions that are intuitively more acceptable in such laboratory situations.⁶

In this section I would like to highlight some common features of such examples. To do so, it seems a good idea to take three of the examples that are still among the most discussed in the literature. Let us look at the terms in which they were originally formulated:⁷

Blue Bus case:

Plaintiff is negligently run down by a blue bus. The question is whether the bus belonged to the defendant. Plaintiff is prepared to prove that defendant operates four-fifths of all the blue buses in town. What effect, if any, should such proof be given? (Tribe, 1971: 1340–1341) ⁸

Gatecrasher case:

Consider, for example, a case in which it is common ground that 499 people paid for admission to a rodeo, and that 1000 are counted on the seats, of whom A is one. Suppose no tickets were issued and there can be no testimony as to whether A paid for admission or climbed over the fence. So by any plausible criterion of mathematical probability there is a ·501 probability, on the admitted facts, that he did not pay. The mathematicist theory would apparently imply that in such circumstances the rodeo organizers are entitled to judgement against A for the admission-money, since the balance of probability (and also the difference between prior and posterior probabilities) would lie in their favour. But it seems manifestly unjust that A should lose his case when there is an agreed mathematical probability of as high as ·499 that he in fact paid for admission. (Cohen, 1977: 75)

Prison Riot case:

In an enclosed yard are twenty-five identically dressed prisoners and a prison guard. The sole witness is too far away to distinguish individual features. He sees the guard, recognizable by his uniform, trip and fall, apparently knocking himself out. The prisoners huddle and argue. One breaks away from the others and goes to a shed in the corner of the yard to hide. The other twenty-four set upon the fallen guard and kill him. After the killing, the hidden prisoner emerges from the shed and mixes with the other prisoners. When the authorities later enter the yard, they find the dead guard and the twenty-five prisoners. Given these facts, twenty-four of the twenty-five are guilty of murder. Suppose that a murder indictment is brought against one of the prisoners—call him Prisoner I. If the only evidence at trial is the testimony of our distant witness, it would seem that a verdict of acquittal must be directed for the defendant. The prosecution's best case is purely statistical. Nothing distinguishes Prisoner I from the other twenty-four prisoners. (Nesson, 1979: 1192–1193)

These hypothetical cases have several features in common. Two of them are of interest here. On the one hand, they represent a situation in which it has to be decided, on the basis of statistical evidence alone, whether it is acceptable to consider it proved that a person has behaved unlawfully (committing a civil or criminal offence). They present judges with the challenge of deciding, on the basis of the NSE, whether it is correct to say ‘It is proved that p’, where p is the meaning of a factual statement such as ‘The accident suffered by the plaintiff on such and such a day and time was caused by a bus belonging to the defendant’ (for the Blue Bus case); or ‘The defendant is one of the people who did not pay to attend the event organised by the plaintiff on such and such a day’ (for the Gatecrasher case); or ‘The defendant is one of the people who participated in the murder of James Smith, the guard at the prison where he was being held, on such and such a day and time’ (for the Prison Riot case). On the other hand, the cases share a second common feature: all of these statements, whose epistemic status is to be decided, refer to unique and unrepeatable events, that is, to that particular traffic accident suffered by the plaintiff on that day and time, to that particular attendance by the defendant at the event held on such and such a day, to the murder of James Smith that occurred on that day and time. Such pieces of language therefore consist of what we might call ‘singular’ statements.

According to their grammatical characteristics, we can define singular statements as those atomic statements that refer to the occurrence of a unique event (also called ‘single’ or ‘individual’, ‘case’ or ‘fact’). A ‘singular proposition’ is the meaning (or expression) of a singular statement.

A statement is ‘atomic’ if it is complete and cannot be decomposed into simpler statements, such as ‘The accident suffered by the plaintiff on such and such a day and time was caused by a bus belonging to the defendant.’ On the other hand, it is ‘molecular’ if it is composed of two or more atomic statements linked by symbols that serve as connectors.⁹ An event is ‘unique’ if it occurs at a certain time and place and there is no other event with the same attributes, at least for the person referring to it. ‘The accident suffered by the plaintiff on such and such a day and time…’ is an example of what I am talking about. But not because the plaintiff did not suffer another injury at the same time (that is contingent), but because the speaker wants to speak of a single event (‘the accident’), even if he may not be precise enough for his interlocutors in a certain context.¹⁰

I argue that such an emphasis in the NSE debate on proving singular propositions limits the scope of its conclusions. To understand this, however, the argument needs further elaboration.

A missing piece: The proof of statistical propositions

Having highlighted some of the features of the hypothetical cases used in the literature, I will shift focus and begin to demonstrate my claim: that legal systems also give relevance to substantially different cases that are ignored in the NSE literature. In this section, I will set out the kind of cases I have in mind and the propositions that must be proved in them, which I will call ‘statistical’ propositions. I will leave for later the explanation of why I consider the difference between these cases and the cases examined so far to be substantial for the NSE debate.

Taking anti-discrimination law as an input

To prove my claim, it is sufficient to find, in the universe of generic cases to which legal systems assign normative consequences,¹¹ at least one case that differs (substantially) from the hypothetical cases that we have just examined. I will do this in the following pages.

Anti-discrimination law provides the input I need. According to a distinction generally accepted in legal systems of both common law and civil law traditions, unlawful discrimination involves at least two types of cases or variants: direct on the one hand, and indirect on the other.¹² This distinction serves our purposes because it invites us to develop a broader view of what must be proved in trials. That is, what types of propositions constitute the principal subject of proof or factum probandum.

Let us start with the best known cases. When person X sues person Y, alleging that Y has directly discriminated against him by a particular action, X must prove (typically) one or more singular propositions. For example, in an employment discrimination case, X may have to prove claims such as ‘Y wanted to fill a vacancy in his company’, ‘X was interviewed by Y for the job’, ‘X was the most qualified among the applicants’, ‘Y hired someone else anyway’ and ‘X was not hired because of his ethnicity’ (or because of some other protected characteristic he possesses). They all refer to unique and unrepeatable events.

However, the universe of unlawful discrimination is not limited to such cases. At the same time, legal systems generally forbid indirect discrimination as well. According to a widely accepted definition, found (among other sources) in the European Directive 2006/54/EC on non-discrimination on grounds of sex, indirect discrimination occurs ‘where an apparently neutral provision, criterion or practice would put persons of one sex at a particular disadvantage compared with persons of the other sex’ (Article 2.1.b).¹³ The protected characteristics may change; indeed, there are provisions that protect attributes other than sex, such as ethnicity, disability, age and so on. What does not change, however, is that for a case of indirect discrimination to be instantiated, the factor in question (the provision, criterion, or practice) must cause a ‘particular disadvantage’ to a group of people with a protected characteristic.

The interesting point of the last requirement for our purposes is the following. As the case law has made clear, it is a necessary condition for a finding of ‘particular disadvantage’ that propositions such as those expressed in the following statements have been proved:

‘56% of all pupils placed in special schools in Ostrava were Roma. Conversely, Roma represented only 2.26% of the total number of pupils attending primary school in Ostrava. Further, whereas only 1.8% of non-Roma pupils were placed in special schools, the proportion of Roma pupils in Ostrava assigned to special schools was 50.3%.’¹⁴

‘[T]he combined method [for calculating disability] was applied in 4,168 cases in 2009, that is to say, in approximately 7.5% of all the decisions on disability. Of this total of 4,168, 4,045 cases (in other words, 97%) concerned women and 123 (3%) concerned men.’¹⁵

‘Black and Minority Ethnic (BME) candidates and older candidates had lower pass rates than white and younger candidates. […] The BME pass rate was 40.3% of that of the white candidates. The pass rate of candidates aged 35 or older was 37.4% of that of those below that age. In each case, there was a 0.1% likelihood that this could happen by chance.’¹⁶

‘[I]n 2008 it is estimated that [in Mexico] 2.3 million people are dedicated to housework and nine out of ten are women.’¹⁷

Moreover, it is also necessary (and jointly sufficient) that the state of affairs described by such statements be negatively evaluated, that is, that the interpreter (a judge or a jury) understands that such a situation harms a protected group.

Based on the above-mentioned statements, the following factors were found to be (indirectly) discriminatory: (a) the system for assigning pupils to special schools in the Czech Republic, as set out in the 2005 School Act and its regulations (for disadvantaging Roma pupils); (b) a method of calculating disability in Switzerland, as applied by the Disability Insurance Office of one of its cantons (for disadvantaging women); (c) the requirement of passing a test of core skills for promotion in the UK Home Office (for disadvantaging BME and older candidates); (d) the voluntary social security registration system for domestic workers in Mexico, as set out in Article 13 (II) of the Social Security Act (for disadvantaging women).

In light of the above, it can be argued that indirect discrimination cases require a statistical approach to the facts: an approach that focuses not on the situation of the individual members of a disadvantaged group, but on the overall (statistically measurable) impact of the challenged factor on the group as a whole.¹⁸ The ‘particular’ character of this impact is established by comparing it with the impact of the same factor on other assimilable groups. The quantities commonly compared in indirect discrimination cases include proportions: the relationship between (the number of) members of one group in one situation and (the number of) members of another group in a similar situation.¹⁹
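The comparison of proportions described here can be sketched in a few lines. The two rates are taken from the Ostrava statement quoted above; the function name `disparity_ratio` and the choice of a simple ratio are assumptions made for illustration, not a test that the courts are reported to apply.

```python
def disparity_ratio(rate_protected: float, rate_comparator: float) -> float:
    """Ratio between the rates at which a factor affects two groups (hypothetical helper)."""
    if rate_comparator <= 0:
        raise ValueError("the comparator rate must be positive")
    return rate_protected / rate_comparator


# Rates quoted in the Ostrava statement above.
roma_rate = 0.503      # proportion of Roma pupils assigned to special schools
non_roma_rate = 0.018  # proportion of non-Roma pupils placed in special schools

ratio = disparity_ratio(roma_rate, non_roma_rate)
print(f"Placement rate of Roma pupils relative to non-Roma pupils: {ratio:.1f}")
```

A ratio of this kind only describes the state of affairs. Whether it amounts to a ‘particular disadvantage’ remains a matter of evaluation for the interpreter.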

Consequently, when it is claimed that a factor indirectly discriminates against a protected group, it is (typically) necessary to prove one or more propositions that are substantially different from the singular propositions and which, for obvious reasons, we may call ‘statistical’ propositions.²⁰

Elucidating statistical statements

A statistical proposition is the meaning (or expression) of a ‘statistical statement’. As with singular statements, I will attempt to elucidate (at least in part) what statistical statements consist of. The effort is worthwhile because, as we have just seen, there are cases in which they express what must be proved, principally, as a condition for the application of this or that legal rule, e.g., the rule forbidding indirect discrimination. However, this is usually ignored in the evidence literature. Although some authors note that such cases may be legally relevant, they immediately ignore them without drawing all the consequences that this implies, among other issues, for how to delimit the probative value of NSE or, more generally, for how to model evidential reasoning.²¹

Statistical statements are a way of speaking about an accumulated plurality of individual facts. More precisely, they could be understood as statements that refer to the value that one or more numerical variables take as a function of some measurement parameter in a finite set of individual facts. For example, ‘70% of the people who failed the exam belong to group A’, where ‘having failed the exam’ and ‘belonging to group A’ are the features that the variable describes, and the proportion in which they occur together (70% of the time) is the parameter by which the variable quantifies them.²²

Statistical statements have in common with singular statements that both refer to unique and unrepeatable events.²³ But they differ in other content-related aspects. In contrast to singular statements, which refer to such-and-such individual fact, statistical statements provide information corresponding to a set of individual facts of a certain type, those having the general features described by such-and-such variables. However, they do not report on any particular event. Unlike singular statements, which can transmit data of different kinds and amounts, statistical statements transmit only numerical data, that is, they refer to facts as instances of a certain feature or attribute (belonging or not belonging to group A) that can be counted (7 out of 10 people who failed the exam belong to it).²⁴ This implies that they never contain the data that individualise each event and make it unique. Reducing the amount of individual information transmitted seems to be one of the pillars of statistical wisdom, whose contribution to knowledge is precisely to sacrifice depth in order to gain a broader perspective.²⁵

Numerical variables in statistical statements can take on different kinds of measures or parameters. One such measure quantifies proportions. ‘70% of the people who failed the exam belong to group A’ is an example of a statistical statement that refers to proportions. A second type of measure stresses the central tendency of the values of the variables. Among others, it is possible to measure their ‘mean’ (the arithmetic average of all values), their ‘median’ (the value that lies in the middle of all values), or their ‘mode’ (the value that occurs most often among all values). ‘The average of the scores obtained in the exam was 60 out of 100’ is an example of a statistical statement that refers to the mean value of a variable. A third type of measure considers the variability of the values of the variables with respect to their central value. Measures may include their ‘variance’ (the average squared deviation of each value from the mean) or their ‘standard deviation’ (the square root of the variance). For example: ‘The standard deviation of the exam scores was 15 points.’ Finally, a fourth type of measure stresses the relationships between the values of one variable and the values of another variable or variables. Among the options, the ‘regression coefficient’ stands out. It measures the marginal change in the value of a variable when, ceteris paribus, the value of another variable changes by one unit. For example: ‘In exams, belonging to group A explains why the average grade is 10 points lower than that of someone who does not belong to this group.’
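The four types of measure can be illustrated with a short sketch. The exam scores and group labels below are invented for the purpose and do not come from any of the cases cited. The regression coefficient is computed as a simple least squares slope, which is only one of several ways of estimating it.

```python
import statistics

# Hypothetical exam data, invented for illustration only.
scores = [40, 50, 50, 60, 70, 70, 70, 90]  # score of each candidate, out of 100
in_group_a = [1, 1, 1, 1, 0, 0, 0, 0]      # 1 if the candidate belongs to group A

# First type: a proportion (share of candidates who belong to group A).
share_group_a = sum(in_group_a) / len(in_group_a)

# Second type: central tendency.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Third type: variability. Population formulas are used because the set
# of individual facts is finite and fully observed.
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

# Fourth type: regression coefficient of score on group membership,
# computed as covariance(x, y) / variance(x).
mean_x = statistics.mean(in_group_a)
slope = (
    sum((x - mean_x) * (y - mean_score) for x, y in zip(in_group_a, scores))
    / sum((x - mean_x) ** 2 for x in in_group_a)
)

print(f"share in group A: {share_group_a:.0%}")
print(f"mean: {mean_score}, median: {median_score}, mode: {mode_score}")
print(f"variance: {variance}, standard deviation: {std_dev:.2f}")
print(f"regression coefficient for group A membership: {slope}")
```

With these invented numbers the coefficient equals the difference between the mean score of group A (50) and that of the other candidates (75), that is, 25 points lower for group A.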

Accordingly, we can distinguish at least two kinds of activities that can be done by asserting statistical statements. One is to describe a state of affairs: the values that a variable takes in a reference class (this is done, for example, in ‘70% of the people who failed the exam belong to group A’). The other is to explore the causes of a state of affairs described by other statistical statements: to find the explanation of why the values of a variable vary in a reference class (this is done, for example, with ‘In exams, belonging to group A explains why the average grade is 10 points lower than that of someone who does not belong to this group’).

Among the measures that can be expressed as proportions, we are particularly interested in those that quantify the number of times that individual events of a certain type occur in a given time-space occasion. In particular, those that quantify the relative frequency with which instances of a type of event (having failed the exam) possess a certain additional feature (belonging to group A). Statistical statements referring to relative frequencies can, for obvious reasons, be called ‘frequency’ statements.²⁶ Their basic form can be expressed as f_n(B|A), meaning: the relative frequency with which instances (or outcomes) of property B occur among ‘n’ actual outcomes²⁷ of property A. The class denoted by property A is called the reference class.²⁸
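A frequency statement of the form f_n(B|A) can be computed as in the following sketch. The records are invented, and the function name `relative_frequency` is an assumption introduced for illustration. The example also shows that swapping the reference class and the counted feature yields a different statement with a different value.

```python
from fractions import Fraction


def relative_frequency(outcomes, in_reference_class, has_feature):
    """f_n(B|A): share of the n outcomes in reference class A that also have feature B."""
    reference = [o for o in outcomes if in_reference_class(o)]
    if not reference:
        raise ValueError("the reference class is empty")
    hits = sum(1 for o in reference if has_feature(o))
    return Fraction(hits, len(reference))


# Hypothetical records: (failed the exam?, group of the candidate).
records = (
    [(True, "A")] * 7 + [(True, "B")] * 3
    + [(False, "A")] * 2 + [(False, "B")] * 8
)

# Reference class: those who failed. Feature counted: belonging to group A.
f_group_a_among_failed = relative_frequency(
    records, lambda r: r[0], lambda r: r[1] == "A"
)

# Reference class: members of group A. Feature counted: having failed.
f_failed_among_group_a = relative_frequency(
    records, lambda r: r[1] == "A", lambda r: r[0]
)

print(f_group_a_among_failed)  # 7/10
print(f_failed_among_group_a)  # 7/9
```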

Statistical statements, especially frequency statements, are to be distinguished from the so-called ‘statistical laws’ with which they are associated in technical language. In contrast to the former, laws are empirical generalisations.²⁹ That is, they predicate over an (assumed as) infinite reference class: they claim to reach all their actual and potential instances.³⁰ Moreover, these laws are also to be distinguished from the axioms and theorems of mathematical probability theory, which are also general statements but, unlike them, have no empirical content but are logical in nature.³¹

A substantial difference

Until now, we have seen that the NSE debate focuses on certain hypothetical cases and ignores (sometimes deliberately) others that are also legally relevant. I will now argue that the difference between the two types of cases is substantial for this debate. Why? Because the arguments it advances against NSE, even if they were acceptable when applied to the former cases (involving singular propositions), make no sense when applied to the latter (involving statistical propositions). To justify this claim, I will review some of the most influential arguments against NSE put forward in the contemporary evidence literature, and note that they are implausible when applied to cases in which statistical propositions are the principal subject of proof.³²

Versions of the not-just-probabilities argument

Let us speak of a ‘not-just-probabilities argument’ to refer to any argument that denies the probative value of NSE on the basis of some defect that statistical evidence would have and other kinds of evidence would not.³³ As I said, I will review some versions of this argument as they are presented in the literature. Since my strategy is to argue that even if they were acceptable for the cases considered under ‘A debate on the proof of singular propositions’, they are not for the cases considered in ‘A missing piece: The proof of statistical propositions’, I will omit the objections to their adequacy for the former cases.³⁴

(a) Reference classes. According to this argument, the problem with NSE is an epistemic one. In using NSE to infer whether a particular event has occurred, it should be noted that the same individual event can be subsumed under (or qualified as an instance of) an infinite number of reference classes. The same traffic accident could belong to the classes ‘traffic accidents on Fridays’, ‘traffic accidents at night’ and ‘traffic accidents caused by a bus’, among others. It should also be noted that each of these (infinitely many) reference classes may have different probability values associated with it. For example, traffic accidents may be rare on Fridays (only 5% of the total), but very common at night (90% of the total). Given these two factors, the problem is that the conclusion of any NSE-based inference about the occurrence of an individual event depends on the reference class under which one chooses to subsume the event in question, and not on the truth or falsity of its occurrence; that is, it depends on a judgement.³⁵

Allen & Pardo (2007, 2021) highlight the problem and use the Blue Bus hypothetical to illustrate it:³⁶

Suppose a witness saw a bus strike a car but cannot recall the color of the bus; assume further that the Blue Company owns 75 percent of the buses in the town and the Red Company owns the remaining 25 percent. The most prevalent view in the legal literature of the probative value of the witness's report is that it would be determined by the ratio of Blue Company buses to Red Company buses […]. But suppose the Red Company owns 75 percent (and Blue the other 25 percent) of the buses in the county. Now the ratio reverses. […] Each of the reference classes leads to a different inference about which company is more likely liable […] (Allen and Pardo, 2007: 109, emphasis mine)

In such situations, ‘nothing in the natural world privileges or picks out one of the classes as the right one; rather, our interests in the various inferences they generate pick out certain classes as more or less relevant’ (2007: 112). Thus, the reference class problem is an epistemological limitation (2007: 115).³⁷ It demonstrates that ‘objective probabilities based on a particular class of which an item of evidence is a member cannot typically (and maybe never) capture the probative value of that evidence for establishing facts relating to a specific event’ (2007: 114, emphasis mine).
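The reversal described by Allen and Pardo can be reproduced in a minimal sketch. The shares are those given in the quoted passage; the dictionary layout and the function name `more_likely_owner` are assumptions made for illustration.

```python
# Ownership shares under two reference classes, as given in the quoted hypothetical.
shares_by_reference_class = {
    "buses in the town": {"Blue Company": 0.75, "Red Company": 0.25},
    "buses in the county": {"Blue Company": 0.25, "Red Company": 0.75},
}


def more_likely_owner(shares):
    """Return the company with the highest share in the chosen reference class."""
    return max(shares, key=shares.get)


for reference_class, shares in shares_by_reference_class.items():
    print(f"{reference_class}: {more_likely_owner(shares)}")
```

The same accident yields opposite conclusions depending only on which reference class is selected, which is the point the argument relies on.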

(b) Sensitivity. According to this argument, the problem with NSE is an epistemic one: it is unable to support a belief that is sensitive to the truth of the proposition to which it refers, that is, a belief that we would not hold if the associated proposition were actually false. By definition, ‘S's belief that p is sensitive’ if and only if ‘[h]ad it not been the case that p, S would (most probably) not have believed that p’ (Enoch et al., 2012: 204).³⁸ The NSE offers no sensitivity because what it expresses may be true or false regardless of the truth or falsity of the belief (about a singular proposition) it is intended to support.³⁹

Suppose, in the Blue Bus hypothetical, we rule for the (injured) plaintiff and against the Blue Bus company, relying solely on statistical evidence: its market share. In this scenario, ‘whether or not the finding matches the facts seems to be a matter of luck; we do not base our finding on anything that tracks the truth. And accordingly, Sensitivity is not satisfied’ (Enoch and Fisher, 2015: 575). The problem is that ‘had it not been one of its buses that caused the harm, nothing would have been different regarding the market shares […] In such a case, we would still have the exact same statistical evidence available to us’ (Enoch et al., 2012: 206–207).⁴⁰

(c) Free will. According to this argument, the problem with NSE is a moral one: using it to attribute culpability to an individual for a crime (a unique event) ‘requires presupposing that the individual's behaviour was determined by a certain causal factor which renders her behaviour unfree’, in a context where ‘it is necessary to presuppose the exact opposite: that the individual is free to determine her own behaviour’ (Pundik, 2017: 190). The idea is based on three basic premises.⁴¹ First premise: NSE that can be correctly used for predictive purposes are generalisations that reflect causal relationships between factors, either directly or indirectly (via a common cause). Second premise: If a certain behaviour is causally determined by a factor that lies outside the agent's will, so that the agent cannot refrain from it, then it is not free.⁴² Third premise: It is not morally justified to attribute culpability to an individual for behaviour done without free will.⁴³

Therefore, if the available NSE expresses a causal connection (and thus allows predictive use), it should not be used to hold a person liable. In Pundik's words, ‘a generalisation which requires presupposing that the individual's behaviour was determined by a property which the individual shares with other group members and rendered his behaviour unfree should not be used to attribute culpability to that individual’ (2017: 213).⁴⁴

(d) Incentives. According to this argument, the problem with the NSE is a moral one: a conviction based on it ‘does not contribute in a positive way to the incentive structure for lawful behaviour, since the evidence is not caused by unlawful behaviour’ (Dahlman, 2020: 167).⁴⁵ If we did not reject its use, a person could be condemned only because he belongs to a certain reference class, regardless of the specific behaviour (a unique event) he has committed. Therefore, anyone in a similar situation in the future would have no incentive to behave lawfully.⁴⁶

In the Prison Riot case, the defendant could be convicted for killing the prison guard during the riot merely because he was one of the participants in the riot, regardless of whether he also participated in the killing (on that particular occasion). This would send the wrong signal to anyone who might find themselves in a similar situation in the future. ‘Since participation in the riot [would be] sufficient for conviction, the prisoner [would know] that he will be convicted for participating in the killing, whether he decides to participate in the killing or not’ (Dahlman, 2020: 174).

A limit of the not-just-probabilities arguments

Not-just-probabilities arguments may be plausible for the cases considered under ‘A debate on the proof of singular propositions’, where singular propositions must be proved. But they are not plausible when applied to the cases considered under ‘A missing piece: The proof of statistical propositions’, where statistical propositions must be proved. This limits the scope of their conclusions. Let me briefly explain the reasons that support this claim by examining each of the versions mentioned above.

(a) Reference classes. The reference class problem arises only in ‘single case’ inferences or ‘inductive-statistical’ explanations, that is, in inferences or explanations (non-deductive)⁴⁷ whose conclusions are singular statements, those that refer to the occurrence of individual events.⁴⁸ The problem thus arises in the cases considered under ‘A debate on the proof of singular propositions’, but not in the cases considered under ‘A missing piece: The proof of statistical propositions’.⁴⁹

The reason is the following. A singular statement refers to the occurrence of an individual fact which, since it can be subsumed under an infinite number of reference classes, can also be associated with an infinite number of (true) values of frequency probability. Herein lies the problem: Which of these values should be used to determine whether a singular proposition is proved? In contrast, a statistical statement refers to certain information corresponding to a set of individual facts. The interesting point is that any set of facts, if precisely delimited, allows for only one true frequency probability value. Therefore, the problem of the reference class does not arise in proving statistical propositions.⁵⁰

For a better illustration, let us return to one of the statements given as an example in ‘Taking anti-discrimination law as an input’:

[T]he combined method [for calculating disability] was applied in 4168 cases in 2009, that is to say, in approximately 7.5% of all the decisions on disability. Of this total of 4,168, 4045 cases (in other words, 97%) concerned women and 123 (3%) concerned men.⁵¹

In this frequency statement, the reference class can be defined as ‘the cases of application of the combined method for calculating disability’ (A) and the parameter to be measured is the proportion of cases of application concerning women (B). For this parameter, f_n(B|A), there can be only one true value, whether 97% or some other figure. Therefore, if this value is known, the problem of discretionary choice among several (equally correct) values does not arise.
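The arithmetic behind this point can be checked directly. The following minimal Python sketch uses only the counts quoted in the passage (4168 applications, of which 4045 concerned women and 123 concerned men); the function name `relative_frequency` is an illustrative assumption, not something taken from the judgment or from the literature.

```python
from fractions import Fraction


def relative_frequency(count_b_in_a: int, count_a: int) -> Fraction:
    """Relative frequency f_n(B|A): the share of members of class A that are also B."""
    if count_a <= 0:
        raise ValueError("the reference class A must contain at least one case")
    if not 0 <= count_b_in_a <= count_a:
        raise ValueError("the count of B within A must lie between 0 and the size of A")
    return Fraction(count_b_in_a, count_a)


# Counts quoted in the passage above.
total_applications = 4168  # reference class A
concerning_women = 4045    # cases in A concerning women (B)
concerning_men = 123       # cases in A concerning men

f_women = relative_frequency(concerning_women, total_applications)
f_men = relative_frequency(concerning_men, total_applications)

print(f"f_n(women | A) = {float(f_women):.1%}")
print(f"f_n(men | A)   = {float(f_men):.1%}")
```

Once the class A and the attribute B are fixed, the function can return only one value for the given counts, which is the sense in which no choice among rival values arises.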

(b) Sensitivity. Suppose that the NSE is not sensitive to beliefs about singular propositions, as some authors suggest. Even if this were the case, the same would not hold for beliefs about statistical propositions: the NSE might be sensitive to them.

To make my point, it is useful to recall the definition of ‘statistical evidence’ adopted at the outset: it is any assertion of statistical propositions or laws. With this in mind, there are two possible scenarios that are relevant to the sensitivity issue. On the one hand, there could be an identity relationship between what is asserted by an NSE and the statistical proposition to be proved in a trial, that is, they could be one and the same proposition.⁵² In this scenario, the truth of one proposition is necessarily sensitive to the truth of the other. On the other hand, the two propositions could differ in meaning but be linked by an inferential relationship, one that could be established, to mention just one method, by ‘statistical estimation’.⁵³ Through this inferential method, a statistical statement expressing the value of a variable in a sample is used to infer another statistical statement expressing the value of the same variable in the population from which the sample was drawn. In this scenario, there may also be sensitivity: if the sample has been drawn according to a procedure that ensures its representativeness, it is usually expected to reflect approximately how things are in the population; or conversely, if things are so and so in the population, they are expected to be reflected in the sample.⁵⁴
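A rough sketch may help to show what statistical estimation involves in its simplest form. The Python code below estimates a population proportion from a sample proportion and attaches an approximate 95% interval using the textbook normal approximation. This is only one of several available techniques and is not presented here as the method any court or author actually uses; the sample figures (485 out of 500) and the function name `estimate_proportion` are hypothetical.

```python
import math


def estimate_proportion(successes: int, sample_size: int, z: float = 1.96):
    """Estimate a population proportion from a random sample.

    Returns the sample proportion together with an approximate interval based
    on the normal approximation; z = 1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("the sample must contain at least one case")
    if not 0 <= successes <= sample_size:
        raise ValueError("successes must lie between 0 and the sample size")
    p_hat = successes / sample_size
    standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    margin = z * standard_error
    # A proportion cannot fall outside [0, 1], so the interval is clipped.
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical sample, invented for illustration: 500 randomly drawn decisions,
# 485 of which concern women.
p_hat, low, high = estimate_proportion(485, 500)
print(f"sample proportion: {p_hat:.3f}")
print(f"approximate 95% interval for the population: [{low:.3f}, {high:.3f}]")
```

The interval narrows as the sample grows, which is one way of expressing the idea that a representative sample is expected to track how things are in the population.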

(c) Free will. The free will objection applies to the attribution of criminal liability for individual behaviour. According to Pundik, it applies ‘whenever a generalisation is used to attribute culpability to an individual and this use requires presupposing the existence of some causal factor outside the agent's control’ (2017: 203, emphasis mine). By contrast, it does not work for those ‘areas of law that do not require culpability as a condition for liability’ (2020: 254) such as

compensation for loss of earnings (to calculate the life expectancy the claimant would have likely enjoyed had the defendant not unlawfully harmed them); toxic torts (to prove causation); employment law (to prove group-based discrimination); human rights (to prove the extent of damage caused by the violation of the claimant's human rights); and competition law (to calculate the economic damage resulting from price-fixing). (Pundik, 2020: 255, emphasis mine)⁵⁵

This observation is correct, but should be expanded as follows: the free will argument does not work in any case where statistical propositions are the principal subject of proof. First, depending on the type of statistical proposition at issue, its proof could involve merely the description of a state of affairs: the values assumed by a variable in a reference class.⁵⁶ Second, investigating the causes of the states of affairs described by (other) statistical statements typically involves establishing relationships between the values of a numerical variable (e.g., the hiring rate in a particular type of job) and the values of one or more other variables (e.g., the sex of those hired), always at the statistical level.⁵⁷ It is never a matter of attributing to an individual the consequences of a (single) individual behaviour that must be assumed to be free. What can be attributed to that individual, at most, is the statistical bias of a set of individual behaviours (e.g., that being a woman, ceteris paribus, reduces the probability of being hired for a particular type of job). In such scenarios, attention is not focused on each individual hiring action, but rather on the variables that explain a set of hiring actions.⁵⁸
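To illustrate what it means to relate variables ‘at the statistical level’, the following Python sketch compares hiring rates across two groups. The counts are hypothetical and invented for the example, and the raw comparison deliberately ignores the ceteris paribus condition mentioned above: a real analysis would have to control for other relevant variables.

```python
from fractions import Fraction

# Hypothetical counts for one type of job, invented for illustration only.
applicants = {"women": 200, "men": 200}
hired = {"women": 30, "men": 60}

# Hiring rate per group: a property of the set of hiring decisions,
# not of any single decision.
rates = {group: Fraction(hired[group], applicants[group]) for group in applicants}

rate_difference = rates["men"] - rates["women"]
rate_ratio = rates["women"] / rates["men"]

for group, rate in rates.items():
    print(f"hiring rate, {group}: {float(rate):.0%}")
print(f"difference in rates: {float(rate_difference):.0%}")
print(f"ratio of rates (women to men): {float(rate_ratio):.2f}")
```

Nothing in this computation says anything about why any particular applicant was or was not hired; the quantities describe only the aggregate.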

(d) Incentives. Finally, the incentives argument does not work for the cases considered under ‘A missing piece: The proof of statistical propositions’, for reasons similar to those invoked for the sensitivity argument. As we have seen, even if the NSE is not sensitive to beliefs about singular propositions, it might be sensitive to beliefs about statistical propositions. Thus, if the legal system classifies the states of affairs described by statistical statements as unlawful, as in cases of indirect discrimination, a judgement based on an NSE that affirms or allows an inference of the existence of such a state of affairs contributes positively to the incentive structure for lawful behaviour. Those responsible for creating or maintaining the situation at issue will have incentives to avoid or change it. Consider, for example, the case of the combined method for calculating disability:⁵⁹ if the NSE proves that this method results in a ‘particular disadvantage’ for women, a remedy based on this evidence ordering its discontinuation would create incentives to use another method that does not result in similar disadvantages.⁶⁰

Conclusions

This paper has made a very modest contribution to the NSE debate, which can be summarised as follows:

First, the hypothetical cases discussed in the debate about the probative value of NSE were characterised. In addition, the ‘singular’ statements that express what must be proved in these cases were elucidated.

Second, it was shown that there are other cases, also relevant in legal systems, in which ‘statistical’ propositions are the principal subject of proof (or factum probandum). One section was devoted exclusively to the clarification of the statements expressing these propositions.

Third, it was argued that the difference between these two classes of cases is substantial for the NSE debate because it changes what is acceptable to claim about the evidential value of this kind of evidence. To justify this claim, some of the most influential arguments against NSE put forward in the contemporary evidence literature were reviewed, and it was noted that they are implausible when applied to cases in which statistical propositions are the principal subject of proof.

Please note that I neither claim nor need to claim that there is any author who denies or attempts to deny the conclusion I have drawn. My aim was merely to highlight a missing piece of the puzzle, an overlooked aspect that I think deserves more attention from evidence scholars.

Some additional conclusions can be drawn.

On the one hand, the evidence literature could defend itself by saying that, while it does not cover the whole universe of legally relevant cases, it focuses on those that most often end up in court. After all, the complexity of things always forces us to prioritise, and if one has to choose, it is always better to deal with the central cases than with the peripheral ones. If this information about the cases were true, such a prioritisation would not seem to me a bad strategy when necessary. However, since the cases in which singular propositions must be proved are not the only relevant cases in legal systems, a discourse that focuses only on them should, for the sake of rigour, make it clear that its conclusions are of limited scope and apply only to a particular kind of case within a universe that extends beyond it. It is therefore desirable that the evidence literature be as explicit and precise as possible in this regard. The elucidation work done here (under ‘A debate on the proof of singular propositions’ and especially under ‘A missing piece: The proof of statistical propositions’) can be seen as a contribution to delimiting more precisely the scope of arguments for or against NSE.

On the other hand, since the criticism of the NSE cannot (plausibly) be extended to cases in which statistical propositions must be proved, it should be conceded that it is acceptable to assign positive probative value to this type of evidence.⁶¹ Once this step is taken, however, at least two crucial questions arise that, for obvious reasons, are not answered in the evidence literature: Under what conditions is positive probative value to be assigned to an NSE? According to which criteria is the probative value of an NSE to be weighed?⁶² Those who agree that the cases considered under ‘A missing piece: The proof of statistical propositions’ are important to the legal system will surely agree that these and other questions deserve more attention from evidence scholars. Opponents, by contrast, might argue that the cases under ‘A missing piece: The proof of statistical propositions’ are peripheral in the universe of cases that end up in court and should not distract us. To these people I would reply that even if these cases were indeed peripheral (I have no information to confirm or deny this), they would still be important for another reason that has nothing to do with their volume. I am talking about the high social, political and economic significance of the decisions made in their context, which I think is easy to see if we recall the factors declared discriminatory in the examples given under ‘Taking anti-discrimination law as an input’.

Finally, there is a conclusion to be drawn from the legal relevance of the cases considered under ‘A missing piece: The proof of statistical propositions’ that is related to the formulation of models of evidential reasoning. I can only sketch it here for further elaboration on a later occasion. Given the difference in form and in criteria of correctness between some of the inference methods used in science for proving statistical propositions⁶³ and the methods used for proving singular propositions, it might be advisable to formulate at least two different models of judicial evidential reasoning: on the one hand, a model of Singular Evidential Reasoning (dealing with singular propositions) and, on the other hand, a model of Frequency Evidential Reasoning (dealing with statistical propositions).

Footnotes

Acknowledgements

The author would like to thank Professor Jordi Ferrer Beltrán (University of Girona) for the stimulus he gave to the writing and submission of this manuscript.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the University of Genoa and by the project PID2020-114765GB-I00 of the University of Girona, funded by MCIN/AEI/10.13039/501100011033.

ORCID iD

Alejo Joaquín Giles

Notes

References

Alchourrón and Bulygin (1971) Normative Systems. New York: Springer.

Alessandra (1988) When doctrines collide: Disparate treatment, disparate impact, and Watson v. Fort Worth Bank & Trust comments. University of Pennsylvania Law Review 137(5): 1755–1790.

Allen (2021) Naturalized epistemology and the law of evidence revisited. Quaestio Facti. Revista Internacional Sobre Razonamiento Probatorio 2: 253–284.

Allen and Pardo (2007) The problematic value of mathematical models of evidence. The Journal of Legal Studies 36(1): 107–140.

Allen and Pardo (2021) Generalizations and reference classes. In: Dahlman, Stein and Tuzet (eds) Philosophical Foundations of Evidence Law. Oxford: Oxford University Press, 301–313. Available at: https://dx.doi.org/10.2139/ssrn.3704970 (accessed 8 June 2023).

Ayer (1963) Two notes on probability. In: The Concept of a Person: And Other Essays. London: Macmillan Education UK, 188–208.

Becker (2018) The sensitivity response to the Gettier problem. In: Hetherington (ed.) The Gettier Problem. Classic Philosophical Arguments. Cambridge: Cambridge University Press, 108–124.

Blome-Tillmann (2015) Sensitivity, causality, and statistical evidence in courts of law. Thought: A Journal of Philosophy 4(2): 102–112.

Chopin and Germaine (2021) A comparative analysis of non-discrimination law in Europe 2021. European Commission. Available at: https://op.europa.eu/en/publication-detail/-/publication/7bcd5c53-8570-11ec-8c40-01aa75ed71a1/language-en/format-PDF/source-276548425 (accessed 8 June 2023).

Climenhaga (2018) Intuitions are used as evidence in philosophy. Mind: A Quarterly Review of Psychology and Philosophy 127(505): 69–104.

Cohen (1977) The Probable and the Provable. Oxford: Clarendon Press.

Colyvan and Regan (2007) Legal decisions and the reference class problem. The International Journal of Evidence & Proof 11(4): 274–285.

Colyvan, Regan and Ferson (2001) Is it a crime to belong to a reference class. Journal of Political Philosophy 9(2): 168–181.

Dahlman (2020) Naked statistical evidence and incentives for lawful conduct. The International Journal of Evidence & Proof 24(2): 162–179.

Dahlman and Pundik (2021) The problem with naked statistical evidence. In: Dahlman, Stein and Tuzet (eds) Philosophical Foundations of Evidence Law. Oxford: Oxford University Press, 332–345. Available at: https://dx.doi.org/10.2139/ssrn.3758860 (accessed 8 June 2023).

DeGroot and Schervish (2012) Probability and Statistics, 4th ed. Boston, MA: Addison-Wesley.

Enoch (2021) How to theorize about statistical evidence (and really, about everything else): A comment on Allen. Quaestio Facti. Revista Internacional Sobre Razonamiento Probatorio 2: 285–298.

Enoch and Fisher (2015) Sense and ‘sensitivity’: Epistemic and instrumental approaches to statistical evidence. Stanford Law Review 67(3): 557–611.

Enoch and Spectre (2019) Sensitivity, safety, and the law: A reply to Pardo. Legal Theory 25(3): 178–199.

Enoch, Spectre and Fisher (2012) Statistical evidence, sensitivity, and the legal value of knowledge. Philosophy & Public Affairs 40(3): 197–224.

Ferrer Beltrán (2007) La Valoración Racional de La Prueba. Madrid: Marcial Pons.

Finkelstein and Fairley (1970) A Bayesian approach to identification evidence. Harvard Law Review 83(3): 489–517.

Finkelstein and Fairley (1971) The continuing debate over mathematics in the law of evidence: A comment on trial by mathematics. Harvard Law Review 84(8): 1801–1809.

Frosini (2002) Le Prove Statistiche Nel Processo Civile e Nel Processo Penale, 1st ed. Milano: Giuffrè.

Gardiner (2018) Legal burdens of proof and statistical evidence. In: The Routledge Handbook of Applied Epistemology, 1st ed. London: Routledge, 179–195.

Gillies (2000) Philosophical Theories of Probability, 1st ed. London: Routledge.

Guthrie (2000) Framing frivolous litigation: A psychological theory. The University of Chicago Law Review 67(1): 163–216.

Hájek (2007) The reference class problem is your problem too. Synthese 156(3): 563–585.

Hempel (1965) Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: The Free Press.

(2008) A Philosophy of Evidence Law: Justice in the Search for Truth. Oxford: Oxford University Press.

Kaye (1979) The laws of probability and the law of the land. University of Chicago Law Review 47(1): 34–56. Available at: https://chicagounbound.uchicago.edu/uclrev/vol47/iss1/3 (accessed 8 June 2023).

Kaye (1980) Naked statistical evidence. The Yale Law Journal Finkelstein M (ed.) 89(3): 601–611.

Kaye (1982) The limits of the preponderance of the evidence standard: Justifiably naked statistical evidence and multiple causation. Law & Social Inquiry 7(2): 487–516.

Koehler (2002) When do courts think base rate statistics are relevant? Jurimetrics 42(4): 373–402.

McGinley (2011) Ricci v. DeStefano: Diluting disparate impact and redefining disparate treatment. Nevada Law Journal 12(3): 626–639. Available at: https://scholars.law.unlv.edu/facpub/646/ (accessed 8 June 2023).

Nagel (1939) Principles of the theory of probability. In: International Encyclopedia of Unified Science. Chicago, IL: The University of Chicago Press, vol. 1, no. 6, 1–80.

Nagel (1961) The Structure of Science: Problems in the Logic of Scientific Explanation. New York: Harcourt, Brace & World, Inc.

Nesson (1979) Reasonable doubt and permissive inferences: The value of complexity. Harvard Law Review 92(6): 1187–1225.

Pardo (2019) The paradoxes of legal proof: A critical guide. Boston University Law Review 99: 233–290.

Picinali (2016a) Base-rates of negative traits: Instructions for use in criminal trials. Journal of Applied Philosophy 33(1): 69–87.

Picinali (2016b) Generalisations, causal relationships and moral responsibility. The International Journal of Evidence & Proof 20(2): 121–135.

Posner (1999) An economic approach to the law of evidence. Stanford Law Review 51: 1477.

Pritchard (2008) Sensitivity, safety, and antiluck epistemology. In: Greco (ed.) The Oxford Handbook of Skepticism, 1st ed. Oxford: Oxford University Press, 437–455.

Pundik (2011) The epistemology of statistical evidence. The International Journal of Evidence & Proof 15(2): 117–143.

Pundik (2017) Freedom and generalisation. Oxford Journal of Legal Studies 37(1): 189–216.

Pundik (2020) Predictive evidence and unpredictable freedom. Oxford Journal of Legal Studies 40(2): 238–264.

Pundik (2008a) Statistical evidence and individual litigants: A reconsideration of Wasserman’s argument from autonomy. The International Journal of Evidence & Proof 12(4): 303–324.

Pundik (2008b) What is wrong with statistical evidence? The attempts to establish an epistemic deficiency. Civil Justice Quarterly 27: 61.

Redmayne (2008) Exploring the proof paradoxes. Legal Theory 14(4): 281–309.

Reichenbach (1949) Hutten and Reichenbach (trans.) The Theory of Probability: An Inquiry into Logical and Mathematical Foundations of the Calculus of Probability, 2nd ed. Berkeley and Los Angeles, CA: University of California Press.

Ross (2020) Recent work on the proof paradox. Philosophy Compass 15(6): e12667.

Russell (2010) [1918] The Philosophy of Logical Atomism. London: Routledge.

Salmon (1971) Statistical explanation. In: Salmon (ed.) Statistical Explanation and Statistical Relevance. Pittsburgh, PA: University of Pittsburgh Press, 29–87.

Salmon (2006) Four Decades of Scientific Explanation, 2nd ed. Pittsburgh, PA: University of Pittsburgh Press.

Sanchirico (2001) Character evidence and the object of trial. Columbia Law Review 101: 1227–1311.

Schurz (2021) Probabilistic truthlikeness, content elements, and meta-inductive probability optimization. Synthese 199(3): 6009–6037.

Spottswood (2021) Paradoxes of proof. In: Dahlman, Stein and Tuzet (eds) Philosophical Foundations of Evidence Law. Oxford: Oxford University Press, 317–331. Available at: https://dx.doi.org/10.2139/ssrn.3747260 (accessed 8 June 2023).

Stigler (2016) The Seven Pillars of Statistical Wisdom. Cambridge, MA: Harvard University Press.

Thomson (1986) Liability and individualized evidence. Law and Contemporary Problems 49(3): 199–219.

Tillers (2005) If wishes were horses: Discursive comments on attempts to prevent individuals from being unfairly burdened by their reference classes. Law, Probability and Risk 4(1–2): 33–49.

Tribe (1971) Trial by mathematics: Precision and ritual in the legal process. Harvard Law Review 84(6): 1329–1393.

Venn (1876) The Logic of Chance: An Essay on the Foundations and Province of the Theory of Probability, with Especial Reference to Its Application to Moral and Social Science, 2nd ed. London: Macmillan.

von Wright (2001) A Treatise on Induction and Probability. Oxfordshire: Routledge.

Wasserman (1991) The morality of statistical proof and the risk of mistaken liability decision and interference litigation. Cardozo Law Review 13: 935–976.

Wells (1992) Naked statistical evidence of liability: Is subjective probability enough? Journal of Personality and Social Psychology 62: 739–752.

A missing piece in the debate about naked statistical evidence
