Abstract
Evidence scholars have observed probabilistic fallacies in legal fact-finding and given them names since the 1980s (for example, the ‘Prosecutor's Fallacy’ and the ‘Defense Attorney's Fallacy’). This has produced a rather unorganised list of over a dozen different probabilistic fallacies. In this article, the author proposes a systematic account in which the observed probabilistic fallacies are organised into categories. Hierarchical relations between probabilistic fallacies are highlighted, and some fallacies are renamed to reflect the category they belong to and their relation to other fallacies in that category. All fallacies are precisely defined and illustrated with examples from real cases where they were committed by fact-finders. The result is a list of 12 probabilistic fallacies organised into 7 categories.
Introduction
It is well known that legal fact-finders are prone to errors in probabilistic reasoning. In 1987 William Thompson and Edward Schumann published an article that drew attention to probabilistic fallacies in the evaluation of legal evidence and coined the terms ‘Prosecutor's Fallacy’ and ‘Defense Attorney's Fallacy’ (Thompson and Schumann, 1987). Since then a number of scholars have pointed out and named other probabilistic fallacies in legal evidence (Aitken and Taroni, 2004; Cole, 2004; Dahlman, 2015; Dahlman et al., 2016; Fenton and Neil, 2000; Koehler, 1993; Koehler, 2008; Koehler and Thompson, 2006; Martire et al., 2013; Pilditch et al., 2019; Saks and Koehler, 2008). In an article published in 2014, Jonathan Koehler presents the following list of 17 documented fallacies of probabilistic reasoning in the law (Koehler, 2014: 211–212).
Prosecutor's Fallacy
Defense Attorney's Fallacy
Transposed Conditional
Source Probability Error
Numerical Conversion Error
Probability of Another Match Error
False Positive Fallacy
Base Rate Fallacy
Selection Bias
Individualization Fallacy
Fingerprint Examiner's Fallacy
Uniqueness Fallacy
Conjunction Fallacy
Disjunctive Errors
Imperfection Fallacy
Misconception of Chance
Pseudo-Diagnosticity
Similar lists of probabilistic fallacies in the evaluation of legal evidence have been compiled by Aitken and Taroni (2004: 78–87) 1 and other scholars. 2
This research has been important for identifying the sources of error in legal fact-finding, but the received list is not very systematic. Different names are used for the same fallacy (or variations of the same fallacy) as if they referred to separate fallacies. For example, the fallacy of equating a conditional probability, P(A|B), with the inverted conditional probability, P(B|A), is sometimes referred to as the ‘Inverse Fallacy’ (Guthrie et al., 2000; Kaye and Koehler, 1991) and sometimes as the ‘Fallacy of the Transposed Conditional’ (Aitken and Taroni, 2004; Browne, 1998; Koehler, 2014).
In addition to this terminological disorder, the received account of probabilistic fallacies is inadequately systematic in several other respects. To begin with, it does not map out hierarchical relations where fallacies are variations of a more general fallacy. For example, there are different ways in which the ‘Inverse Fallacy’ can be committed, where different conditional probabilities are being inverted, and these variations should be distinguished from each other and labelled as different sub-fallacies of the ‘Inverse Fallacy’. One sub-fallacy is the inversion of the probability for a true positive, P(E|H), 3 incorrectly equating it with the probability that the hypothesis at issue is true given the evidence, P(H|E). A second sub-fallacy, often referred to as the ‘Prosecutor's Fallacy’, is the inversion of the probability for a false positive, P(E|¬H), incorrectly equating it with the probability that the hypothesis at issue is false given the evidence, P(¬H|E). In a systematic account of probabilistic fallacies these errors should not be hidden under the generic label ‘Inverse Fallacy’ but distinguished from each other with individual labels that clarify what they have in common and how they differ. In this article I will propose that they be labelled True Positive Inversion and False Positive Inversion.
Another thing that is lacking in the received treatment of probabilistic fallacies is attention to consequential relations between fallacies, where one fallacy leads to another. For example, when a fact-finder commits the ‘Inverse Fallacy’ (True Positive Inversion) of equating P(E|H) with P(H|E), that error will inevitably lead to the fact-finder disregarding the prior probability P(H), a fallacy commonly described as the ‘Base Rate Fallacy’. A distinction should be made between cases where the ‘Base Rate Fallacy’ is the reasoning error being committed and cases where it is the consequence of another reasoning error.
In this article, I will offer a more systematic and transparent account of probabilistic fallacies in legal fact-finding. I have used Koehler's list as a benchmark and developed a more systematic account by grouping the fallacies into seven categories. To make the terminology more transparent I have given some fallacies new names that better reflect what they are about, or to which category they belong. The result of this endeavour is that some fallacies on Koehler's list have merged, while others have been divided up into sub-fallacies. Most of the fallacies on my re-worked list are also on Koehler's list, in one form or other, but my list also adds some fallacies that are not on Koehler's list. My list contains the following 12 fallacies organised into 7 categories. In the section ‘A systematic account of probabilistic fallacies in legal fact-finding’, they will all be explained with examples.
Inversion Fallacies
    True Positive Inversion
    False Positive Inversion
Prior Probability Fallacy: Base Rate Neglect
Zero-Sum Fallacies
    Mutual Increase Neglect
    False Elimination
Reference Class Fallacy: Mismatch Neglect
Doubling Errors
    Double Counting
    Double Omission
Convergent Evidence Fallacies
    Convergence Neglect
    Product Fallacy
    Dependence Neglect
Link-Skipping
This is not a complete list of every probabilistic fallacy that could be committed by legal fact-finders and it does not include every error in probabilistic reasoning that has been discussed in the literature on legal evidence. I am tempted to say that my list contains the ‘most fundamental’ fallacies, since it covers the basic elements of probabilistic reasoning with legal evidence and identifies what can go wrong in each of them, but it is problematic from a logical point of view to say that one fallacy is more fundamental than another, and I will therefore simply refer to it as a list of 12 fallacies organised into 7 categories.
The existing literature on probabilistic fallacies in the legal context does not distinguish between fallacies that are committed by judges or jurors and fallacies that are committed by other actors in the legal system, for example fallacies committed by expert witnesses who testify in court and report their findings. Koehler presents his list as fallacies of ‘probabilistic reasoning in the law’ and mixes fallacies committed by judges or jurors in their fact-finding with fallacies in forensic reporting. This is problematic, since the received list is often referred to as errors that judges and jurors make in fact-finding. Some fallacies can be committed across different categories of actors, but several fallacies on Koehler's list cannot be committed by judges or jurors in their fact-finding. ‘Pseudodiagnosticity’, for example, is not an error in the assessment of the evidence presented in court but an error in the gathering of evidence to be presented in court (Doherty et al., 1979) and is therefore not an error ‘in legal fact-finding’ but an error in police investigations and other evidence-seeking activities. My list is limited to fallacies that properly speaking occur ‘in legal fact-finding’ (see ‘Some conceptual distinctions’ below). It is limited to fallacies committed by judges and jurors when they decide what has been proven and does not include fallacies in forensic reporting. It is, of course, also important to identify and investigate fallacies in forensic work, but it is not the topic of this article.
My purpose is not to survey to what extent legal fact-finders actually commit the various fallacies on the list. The purpose of this article is only to explicate what the various fallacies are and organise them logically in relation to each other. The systematic account of probabilistic fallacies in legal fact-finding presented in this article is intended to be useful as a theoretical groundwork for empirical research on fact-finding by judges and jurors, and for teaching judges and jurors to avoid errors in probabilistic reasoning.
This article draws on research that I have conducted over the last 10 years in the Swedish legal system. Fact-finding in Swedish criminal trials is handled by judges (there is no jury), and judges explain quite specifically in their verdicts how they assessed various pieces of evidence and reached their conclusions. This makes the Swedish legal system an excellent place to study fallacies in legal fact-finding. In my research, I have read hundreds of Swedish cases and found many examples where probabilistic fallacies are openly articulated in the verdict. I will use some of these examples in this article to illustrate various fallacies. My findings on probabilistic fallacies in Swedish trials have been published in a book (Dahlman, 2018). Since the book is in Swedish, I am taking the opportunity in this article to share these findings (and further research) with a wider English-speaking audience.
The article starts in the next section with some important distinctions, and the section following that presents my systematic account of probabilistic fallacies in legal fact-finding with examples.
Some conceptual distinctions
In order to identify probabilistic fallacies in legal fact-finding we need to distinguish ‘probabilistic fallacies’ from other reasoning errors. A fallacy is a type of reasoning that can appear to be correct but is actually erroneous (Copi, 1961: 52; Hamblin, 1970: 12). There are many different fallacies, and they are traditionally grouped together by the kind of error they commit.
For an error to count as a ‘probabilistic fallacy’ it must be a formal violation of probability calculus, for example that a factor in the calculus is omitted or swapped with another factor (Base Rate Neglect is an example of a fallacy where a factor is omitted, and False Positive Inversion is an example of a fallacy where a factor is swapped). It should be noted that a factually incorrect numerical input into the probability calculus is not a formal error and therefore never counts as a ‘probabilistic fallacy’. Consider, for example, a fact-finder who incorrectly believes that mistakes in eyewitness identifications are extremely rare and therefore assigns a probability of one in a million to a false positive, which leads to an over-estimation of the probability that a suspect identified by an eyewitness is the perpetrator. This is a grave error, but it is an error of fact, not a formal error. A fact-finder who draws a mathematically correct conclusion from a factually incorrect assumption does not commit a probabilistic fallacy.
That a fallacy is a flawed type of reasoning ‘that can appear to be correct’ raises the question of whether it must be ‘deceptive’ to count as a fallacy. Is an error a fallacy if nobody commits it? There are various views on this in the literature—for an excellent overview, see Hansen (2002)—but I will leave this issue open in the present investigation. If an error in probabilistic reasoning must show a certain rate of recurrence to count as a fallacy, we need to know how often legal fact-finders commit a certain error to determine if it should be included on the list or not. As I made clear above (under ‘Introduction’), the purpose of this article is not to survey to what extent fact-finders actually commit the probabilistic fallacies that are discussed. That a fallacy is included in my list does not mean that it has a certain scientifically documented degree of deceptiveness.
In order to identify probabilistic fallacies in legal fact-finding we also need to distinguish ‘fact-finding’ from other activities. The term ‘fact-finding’ is used frequently in the theoretical literature on legal evidence but is rarely accompanied by an exact definition. Some scholars seem to use the term quite broadly, while others understand it more narrowly. I offer the following narrow definition, which will be used in the present investigation: fact-finding is what a judge or juror does when exercising the role in a legal trial of deciding (alone or together with others) what has been proven. According to this narrow definition, police investigators and prosecutors are not fact-finders. They assess evidence in their decisions to move an investigation in a certain direction and the decision to take a case to court, but they have no legal authority to decide what should be taken as proven in the application of the law. Forensic scientists and other expert witnesses are not fact-finders either. Their role in the legal system is not to decide, but to help fact-finders in their decision (Wahlberg and Dahlman, 2021). And a judge who presides over a jury trial is not a fact-finder. The judge decides what is admissible as evidence, but it is the jury that does the fact-finding in deciding what has been proven.
As I have already observed (under ‘Introduction’), Koehler's list of probabilistic fallacies ‘in the law’ does not differentiate between fallacies in fact-finding and fallacies in other activities related to legal fact-finding. Several fallacies on Koehler's list are errors in forensic reporting. The ‘Probability of Another Match Error’ 4 and the ‘Numerical Conversion Error’ 5 are statistically erroneous answers to questions that are sometimes answered by expert witnesses but are not the questions that are answered in fact-finding. The most obvious example on Koehler's list of a fallacy that is not a fallacy in fact-finding is the ‘Uniqueness Fallacy’. 6 In this fallacy the forensic scientist oversteps the role given by the law to the expert witness and usurps the role of the fact-finder by declaring what has been proven. This fallacy can by definition not be committed by someone who legitimately acts in the role of legal fact-finder. Moreover, it is not even a probabilistic fallacy. What is being violated here is not the probability calculus, but the rules of legal procedure. 7 Koehler's list also includes a fallacy in the reasons for admitting fingerprint evidence, the ‘Fingerprint Examiner's Fallacy’. 8 This is not a fallacy in legal fact-finding. As explained above, the admissibility of evidence is an issue that precedes fact-finding but does not belong to fact-finding. The list that I will present in the next section only contains probabilistic fallacies in fact-finding. 9 As we shall see, there are some fallacies that can be committed in fact-finding as well as in other activities, for example the ‘Prosecutor's Fallacy’. These are, of course, included in my list, to the extent that they occur in fact-finding. In some trials, one of the parties before the court presents an argument that commits a probabilistic fallacy. 
If the fact-finder picks up this argument, not realising that it is fallacious, and uses it in the assessment of the evidence, it becomes a fallacy ‘in fact-finding’ although it originated as a fallacy in advocacy.
A systematic account of probabilistic fallacies in legal fact-finding
The following subsections cover 12 fallacies organised in categories around the following elements in evidence assessment. The first deals with conditionalisation and covers several fallacies where conditional probabilities are inverted. The next subsection deals with prior probability and the fallacy of neglecting it, followed by a subsection dealing with competing hypotheses and the incorrect view that they always go up and down like scales. The fourth subsection deals with the reference class problem and the fallacy of neglecting mismatch. The next subsection covers fallacies of double counting or double omission. The following subsection deals with the integration of convergent evidence and covers several fallacies where convergent evidence is under-estimated or over-estimated. The final subsection deals with evidence chains and the fallacy of link-skipping.
Inversion fallacies
The fallacy of equating a conditional probability, P(A|B), with the inverted conditional probability, P(B|A), is listed by Koehler as the ‘Inverse Fallacy’. As I have already observed (under ‘Introduction’), this error can manifest itself in two different ways that should be distinguished from each other, depending on the conditional probability being inverted: True Positive Inversion and False Positive Inversion. I will therefore label this item on the list in the plural and refer to it as the Inversion Fallacies.
The probability for a ‘true positive’ outcome is the probability of seeing the evidence (E) if the hypothesis at issue (H) is true, P(E|H). In legal evidence, it could for example be the probability that an eyewitness would identify the suspect as the perpetrator at a line-up (E) if the suspect were the perpetrator (H), or the probability that the victim's DNA profile would ‘match’ a blood stain on the suspect's clothes (E) if the blood came from the victim (H). The probability for a true positive, P(E|H), has diagnostic value for the probability that the hypothesis is true given the evidence, P(H|E). An increase in the probability of the true positive leads ceteris paribus to an increase in the probability of the hypothesis and is described in statistics as an increase in ‘sensitivity’. However, it is a serious mistake to equate the probability for a true positive, P(E|H), with the probability of the hypothesis given the evidence, P(H|E), since the latter is an entirely different matter that also depends on other factors. Incorrectly equating P(E|H) with P(H|E) commits the fallacy of True Positive Inversion. A fact-finder who commits this fallacy thinks that since it is probable (to some degree) that we would see the evidence if the hypothesis is true, it is probable (to the same degree) that the hypothesis is true given this evidence. As an example, a fact-finder committing True Positive Inversion could reason as follows: ‘It is highly probable that the defendant is guilty because the evidence against him is exactly what we would expect to see in a case where the defendant is guilty’.
A fact-finder who commits this fallacy overlooks that the diagnostic value of the evidence not only depends on the probability for true positive, P(E|H), but also on the probability for false positive, P(E|¬H), i.e., the probability of seeing the evidence if the hypothesis is false. A fact-finder who commits the fallacy of True Positive Inversion can therefore also be described as committing ‘False Positive Neglect’. 10 Consider a piece of evidence that is highly probable if the defendant is guilty, but also highly probable if the defendant is innocent, for example finding the suspect's DNA on the victim when they are husband and wife. In such a case it is obviously erroneous to say that it is highly probable that the husband is guilty since his DNA on the wife is exactly what we expect to find if he killed her.
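The husband-and-wife example can be made concrete with a small sketch. The two probabilities below are hypothetical illustrative values, not figures from any case:

```python
# Hypothetical values: the evidence (husband's DNA on the wife) is nearly
# certain under either hypothesis, so it carries almost no diagnostic value.
p_e_given_h = 0.99      # P(DNA on wife | husband killed her) -- assumed
p_e_given_not_h = 0.95  # P(DNA on wife | husband is innocent) -- assumed

lr = p_e_given_h / p_e_given_not_h  # likelihood ratio, approx. 1.04
# A ratio this close to 1 barely shifts the odds on the hypothesis.
```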
The ratio between the probability for true positive and the probability for false positive, P(E|H)/P(E|¬H), is a factor in Bayesian probability calculus that can be used as a measure of how strongly the evidence supports the hypothesis. If, for example, 11 P(E|H) = 1 and P(E|¬H) = 0.1, the ratio is 10, meaning that it is 10 times more probable to see the evidence if the hypothesis is true than if it is false. If, instead, P(E|H) = 1 and P(E|¬H) = 0.01, the ratio is 100, meaning that it is 100 times more probable to see the evidence if the hypothesis is true than if it is false, and the evidence consequently lends stronger support to the hypothesis (by a factor of 10). As this example shows, the diagnostic value of the evidence depends strongly on the probability for a false positive.
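The likelihood-ratio arithmetic above can be sketched in a few lines, using the illustrative values from the text (the function name is mine):

```python
def likelihood_ratio(p_true_positive, p_false_positive):
    """P(E|H) / P(E|not-H): how many times more probable the evidence
    is if the hypothesis is true than if it is false."""
    return p_true_positive / p_false_positive

# The illustrative values from the text:
lr_weak = likelihood_ratio(1.0, 0.1)     # ratio of 10
lr_strong = likelihood_ratio(1.0, 0.01)  # ratio of 100
# Halving P(E|not-H) doubles the support; dividing it by 10 multiplies
# the support by 10, even though P(E|H) is unchanged.
```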
True Positive Inversion was committed in a famous Swedish case—the confessions of Thomas Quick—that is often described as the greatest scandal in Swedish legal history. 12 Thomas Quick was a patient at a mental institution who confessed in therapy that he was a serial killer. Quick claimed responsibility for over 30 unsolved murders in Sweden and Norway. When the police investigated the locations where he claimed to have killed and buried his victims, no bodies or forensic traces of any kind could be found, but he was nevertheless prosecuted and convicted for eight murders committed between 1994 and 2001. The verdicts stated that Quick was found guilty because his confessions contained details about the victims and the crime scenes that ‘only the perpetrator could know’. In 2008 Quick retracted his confessions and said that he had made the whole thing up to get attention. The case was re-opened and Quick was exonerated on all charges. The new investigation revealed that Quick had been able to figure out the details that ‘only the perpetrator could know’ by manipulating the police into asking leading questions. When I studied the case files from the trials where Quick was convicted, I noted that ‘leading questions in police hearings’ appeared in the original trials as an alternative explanation of Quick's detailed knowledge about the crimes. In fact, the prosecutor addressed this alternative explanation and sought to refute it by calling to the stand the police investigator who had conducted all the hearings with Quick. The police investigator said that he had not used leading questions. This would be revealed as completely untrue a decade later when Quick retracted his confessions and the case was re-opened, but at the time the court concluded that there had been no leading questions since the police investigator had said so in his testimony.
13 The probability for true positive was, of course, very high since the police investigator had testified just as a diligent officer who had not asked any leading questions would testify, but the fact-finders completely overlooked that an officer who had asked leading questions would most likely have testified in exactly the same way (False Positive Neglect). 14 If he was not aware that he had asked leading questions he would not say that he did, and if he was aware he would probably not have admitted it, since it was a serious breach of police protocol. The correct assessment of the testimony should have been that the police investigator was expected to deny that he had asked leading questions whether he had actually done so or not, and that this statement therefore had little evidential value and did not merit the conclusion that no leading questions were asked. To consider it proven that the police investigator did not ask leading questions merely because he testified exactly as an investigator who had not asked leading questions would be expected to testify is to commit the fallacy of True Positive Inversion.
Furthermore, committing True Positive Inversion does not only neglect the probability for a false positive. It also neglects that P(H|E) depends on the prior probability P(H) of the hypothesis before the evidence was considered. 15 In the literature on probabilistic fallacies, P(H) is often referred to as the ‘base rate’, and overlooking P(H) is therefore often described as ‘Base Rate Neglect’ or ‘Base Rate Fallacy’ (Bar-Hillel, 1980; Fenton and Neil, 2011; Gigerenzer, 1991; Koehler, 1996; Thompson and Schumann, 1987). As I have already mentioned (under ‘Introduction’), neglecting the base rate is sometimes a fallacy in its own right and sometimes the consequence of another fallacy. In relation to the inversion fallacies, Base Rate Neglect is a consequence caused by True Positive Inversion or False Positive Inversion. It should be noted that Base Rate Neglect is one of several consequences that follow from True Positive Inversion. As we have seen, False Positive Neglect is also a consequence of True Positive Inversion.
Moving on to False Positive Inversion, the probability for a ‘false positive’ outcome is the probability of seeing the evidence (E) if the hypothesis at issue is false (¬H), P(E|¬H). In legal evidence, it could for example be the probability that an eyewitness would incorrectly identify the suspect as the perpetrator at a line-up (E) in a case where the suspect is not the perpetrator (¬H), or the random match probability that the victim's DNA profile would match a blood stain on the suspect's clothes (E) if the blood stain comes from someone other than the victim (¬H).
Equating the probability for false positive, P(E|¬H), with the inverted probability that the hypothesis is false given the evidence, P(¬H|E), commits False Positive Inversion. A fact-finder who commits this fallacy thinks that the probability of seeing the evidence if the hypothesis is false is also the probability that the hypothesis is false given the evidence, for example ‘since the probability that an eyewitness would incorrectly identify the suspect if he isn’t the perpetrator is less than 10%, and the eyewitness made a positive identification, the probability that the suspect is not the perpetrator is less than 10%…’. This incorrect inference about P(¬H|E) leads to an incorrect conclusion about the probability that the hypothesis is true given the evidence, P(H|E). Continuing the example: ‘…and since the probability is less than 10% that the suspect isn’t the perpetrator, there is a probability over 90% that he is guilty’. In reality, the probability could be much lower. As we have seen in the previous section, P(H|E) also depends on the prior probability (‘base rate’), P(H). If P(H) is low, P(H|E) could be much lower than the 90% that the fact-finder in the example erroneously believes (Base Rate Neglect). If, for example, the prior probability that the suspect is the perpetrator, P(H), is 1%, the probability for a true positive, P(E|H), is 90% and the probability for a false positive, P(E|¬H), is 10%, the probability that the suspect is the perpetrator given the eyewitness identification, P(H|E), is only 8%. 16
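The 8% figure can be verified with a direct application of Bayes' rule, using the numbers given in the text (the function name is mine):

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H)P(H) / (P(E|H)P(H) + P(E|not-H)P(not-H))."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))

# The numbers from the text: prior 1%, true positive 90%, false positive 10%.
p = posterior(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.10)
# p is roughly 0.083, i.e. about 8%, nowhere near the 90% the fallacy suggests
```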
False Positive Inversion is the most notorious fallacy in legal fact-finding and has become the standard example of fact-finders' inability to handle probabilities (see, for example, Byron, 2012; Fenton and Neil, 2011; Jackson, 1996; Verheij et al., 2016). In the literature on legal evidence False Positive Inversion is often referred to as the ‘Prosecutor's Fallacy’ due to the influential article (Thompson and Schumann, 1987) that coined this label. When the hypothesis at issue is a forensic source-level hypothesis, for example that a blood stain on the suspect's clothes comes from the victim, False Positive Inversion is sometimes called the ‘Source Probability Error’ (Aitken and Taroni, 2004: 81–82). In my view, False Positive Inversion is a better label than ‘Prosecutor's Fallacy’ or ‘Source Probability Error’, since it captures what the fallacy is about.
False Positive Inversion was committed by the Swedish Supreme Court in a case about a burglary in a watch shop in 2001. 17 One of the burglars had cut himself on broken glass at the crime scene and left blood traces that matched the defendant's DNA profile. A forensic expert testified that the probability of a random match was negligibly small, and the Supreme Court ruled that since the probability for a false positive was negligible it could be ‘taken for certain’ that the defendant was one of the burglars.
As mentioned, the ‘likelihood ratio’ is the probability for a true positive divided by the probability for a false positive, P(E|H)/P(E|¬H). It is therefore possible that a fact-finder could commit a combination of True Positive Inversion and False Positive Inversion, leading to the likelihood ratio being incorrectly equated with the posterior odds, P(H|E)/P(¬H|E). It follows from Bayes' Rule that the likelihood ratio equals the posterior odds if the prior probability is 50-50 (see note 15), but in all other cases equating the likelihood ratio with the posterior odds is incorrect and has the consequence that the prior is neglected. Incorrectly equating the likelihood ratio with the posterior odds has an effect on the posterior probability that is similar, but not identical, to that of False Positive Inversion. Their posterior probability functions converge as the probability for a true positive approaches one and the probability for a false positive approaches zero.
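The odds form of Bayes' rule makes the point concrete: the likelihood ratio equals the posterior odds only when the prior is 50-50. A minimal sketch, with illustrative numbers of my own choosing:

```python
def posterior_odds(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    return prior_odds * likelihood_ratio

lr = 0.9 / 0.1  # P(E|H) / P(E|not-H) = 9 (illustrative values)

odds_even_prior = posterior_odds(0.5, lr)   # equals the LR only at a 50-50 prior
odds_low_prior = posterior_odds(0.01, lr)   # about 0.09: far below the LR
# Equating the LR with the posterior odds is only harmless in the first case;
# in the second it silently discards the low prior.
```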
Prior probability fallacy: Base Rate Neglect
As we have seen in the previous section, True Positive Inversion and False Positive Inversion cause Base Rate Neglect. With regard to these reasoning errors, Base Rate Neglect should be treated as an effect of the inversion fallacy, not as the reasoning error itself. Nevertheless, there are cases where disregard of the prior probability is the reasoning error being committed and should be classified as Base Rate Neglect in its own right. As an example, consider a murder trial where the cause of death is controversial—the prosecution claims that it was homicide while the defence argues that the cause of death was suicide. A legal fact-finder who disregards the known difference in prior probability distribution between these rival hypotheses commits Base Rate Neglect. Suicide is a more frequent cause of death than homicide. By neglecting this base rate, the fact-finder inadvertently favours the prosecution.
This fallacy was committed in a Swedish case from 2019 where a famous influencer, Abbe ‘Blattelito’ Alsaadi, was prosecuted for the murder of his girlfriend. She had been found dead in her bathtub with burned-out candles on the bathroom sink, and her death was initially treated as a suicide. The autopsy showed that she had overdosed on tramadol and drowned. The police started to suspect that Alsaadi was responsible for her death and had staged the scene to look like a suicide. Alsaadi was prosecuted for murder and convicted by the Municipal Court. In Sweden, suicide is 11 times more common than homicide, but this was not taken into account by the court, when it assessed the probability that it was a homicide rather than suicide. 18 The verdict was later overturned by the Appeals Court.
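A rough sketch of the arithmetic the court skipped: the 11:1 base rate is from the text, but the likelihood ratio for the case-specific evidence is a hypothetical value chosen for illustration.

```python
# Base rate from the text: in Sweden, suicide is 11 times more common than
# homicide, so the prior odds of homicide against suicide are 1:11.
prior_odds_homicide = 1.0 / 11.0

# Hypothetical likelihood ratio for case-specific evidence favouring homicide:
lr = 5.0

posterior_odds_homicide = prior_odds_homicide * lr  # about 0.45
# Odds below 1 mean suicide remains the more probable explanation even after
# evidence five times more likely under homicide -- a conclusion a fact-finder
# who neglects the base rate would miss.
```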
Zero-sum fallacies
An error of reasoning that is based on a misconception of the relation between competing propositions is categorised in argumentation theory as a False Dichotomy (Govier, 2001: 293). There are several variations of this fallacy, and they occur in the evaluation of legal evidence when the fact-finder misconceives the relation between competing hypotheses (Dahlman, 2020: 1120–1121). Applied to competing hypotheses, the symbolic representation of legal decision-making as balancing the Scales of Justice suggests that evidence pushing one hypothesis down makes the competing hypothesis go up by just as much. This is true if the competing hypotheses are exclusive and exhaustive, but in other situations it can happen that one goes down more than the other goes up, and it can also happen that the evidence makes both hypotheses go up. Toby Pilditch, Norman Fenton and David Lagnado have observed in several empirical studies that people tend to overlook this and assume that competing hypotheses always behave as scales (Pilditch et al., 2019). A fact-finder who commits this error reasons under the incorrect notion that the net effect of changes in competing hypotheses always adds up to zero. Pilditch and his co-authors have therefore labelled it the Zero-Sum Fallacy. Since it manifests itself in two variants, I have included it on my list as the category of Zero-Sum Fallacies.
As I have just mentioned, a piece of evidence can increase the probability of two competing hypotheses. This happens with hypotheses that are not exclusive when the likelihood of the evidence is maximised under the condition that both hypotheses are true, and it happens with hypotheses that are not exhaustive when the likelihood is minimised if both hypotheses are false. As an example, consider a case where a Griess Test has been used to detect traces of nitroglycerin on the hands of a suspect, and the test was positive. This supports the hypothesis (H1) that the suspect has recently handled explosives. The positive test result could also be caused by other things, for example touching playing cards that leave traces of nitrate (H2). The probative value is insufficient to prove that the suspect has been handling explosives, but it would be incorrect to conclude that it lacks probative value ‘since it just as well could be caused by the handling of playing cards’. A fact-finder who makes this assessment commits the fallacy of Mutual Increase Neglect. H1 and H2 must be assessed in relation to the hypothesis (H3) that the suspect has not recently handled explosives or playing cards or anything else that leaves traces of nitrate, and the fact-finder should have concluded that the positive test result increases the probability of H1 as well as H2.
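A toy Bayesian model can make the Griess-test example concrete. All priors and likelihoods below are invented for illustration; they are not drawn from any case. The point is only structural: because H1 (explosives) and H2 (playing cards) are not exclusive, a positive test raises the posterior probability of both.

```python
# Toy model of the Griess-test example. Four mutually exclusive states
# combine H1 (handled explosives) and H2 (handled playing cards).
# All numbers are illustrative assumptions.
priors = {
    ("E", "C"): 0.02,   # explosives and cards
    ("E", "-"): 0.08,   # explosives only
    ("-", "C"): 0.18,   # cards only
    ("-", "-"): 0.72,   # neither: no nitrate source (H3)
}
# Assumed likelihood of a positive Griess test in each state.
p_pos = {
    ("E", "C"): 0.95,
    ("E", "-"): 0.90,
    ("-", "C"): 0.60,
    ("-", "-"): 0.01,
}

p_evidence = sum(priors[s] * p_pos[s] for s in priors)
posterior = {s: priors[s] * p_pos[s] / p_evidence for s in priors}

p_h1_prior = priors[("E", "C")] + priors[("E", "-")]   # 0.10
p_h2_prior = priors[("E", "C")] + priors[("-", "C")]   # 0.20
p_h1_post = posterior[("E", "C")] + posterior[("E", "-")]
p_h2_post = posterior[("E", "C")] + posterior[("-", "C")]

# Both posteriors exceed their priors: the positive test supports
# H1 and H2 at the same time, at the expense of H3.
print(p_h1_post > p_h1_prior and p_h2_post > p_h2_prior)
```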
The other variant in the category of Zero-Sum Fallacies occurs in situations where a piece of evidence increases the probability of the hypothesis advocated by one party (H1) and decreases the probability of the competing hypothesis (H2), but one changes more than the other. This happens when the hypotheses are not exhaustive, and the evidence also changes the probability of some other hypothesis (H3), or several other hypotheses. This can lead to a situation where evidence in support of H1 decreases the probability of H2 to practically zero, but does not increase H1 to practically 100%, since the probability of H3 has not been reduced to practically zero. In such cases, a fact-finder who commits the Zero-Sum Fallacy, and overlooks H3, will incorrectly infer that the probability of H1 is practically 100%, ‘since the alternative has been eliminated’. I will refer to this variant of the Zero-Sum Fallacy as False Elimination. A fact-finder who commits False Elimination incorrectly thinks that P(H1) + P(H2) = 1, and therefore infers that P(H1) = 1 − P(H2) = 1 − 0 = 1.
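The arithmetic of False Elimination can be checked with a small Bayesian calculation over three exhaustive hypotheses. All priors and likelihoods below are invented for illustration and do not come from any case: H2 is practically eliminated by the evidence, yet H1 is far from practically 100% because H3 survives.

```python
# False Elimination with three exhaustive, mutually exclusive hypotheses.
# All priors and likelihoods are illustrative assumptions.
p_h = {"H1": 0.5, "H2": 0.3, "H3": 0.2}           # priors sum to 1
likelihood = {"H1": 0.9, "H2": 0.001, "H3": 0.8}  # P(evidence | hypothesis)

p_e = sum(p_h[h] * likelihood[h] for h in p_h)
posterior = {h: p_h[h] * likelihood[h] / p_e for h in p_h}

# H2 is practically eliminated, yet H1 is nowhere near 100%,
# because H3 has not been eliminated.
print({h: round(p, 4) for h, p in posterior.items()})
```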
The Swedish Supreme Court committed the fallacy of False Elimination in a case from 2013 where a man was standing trial for the attempted murder of his wife. 19 It was clear that the couple had been fighting and that she had threatened to leave him. On the night in question neighbours had heard loud noises and screaming from their apartment. The couple lived on the sixth floor and the fight ended with the wife falling from the balcony. She fell 13 meters and hit a stone-paved patio and miraculously survived, but her brain damage was so severe that she could not tell how she had come to fall from the balcony. According to the prosecution the husband had beaten her until she was more or less unconscious and had then lifted her up and carried her on his shoulder out on the balcony and thrown her over the railing. The husband claimed that he had not hit her and had not tried to kill her. According to his story, they had been shouting at each other and when the fight ended she was in a desperate state and tried to commit suicide by throwing herself over the railing. The forensic evidence was consistent with the prosecutor's hypothesis but completely at odds with the husband's story. Blood spatter on the walls and floor of the apartment clearly proved that he had beaten her, and a blood stain on the shoulder of his shirt supported the hypothesis that he had carried her on his shoulder when she was bleeding. The Supreme Court concluded that since the competing hypothesis offered by the husband had been eliminated, the prosecutor's hypothesis had been proved, and the husband was convicted of battery and attempted murder. The verdict overlooks a third possibility. She had threatened to leave him, and this was unacceptable and shameful according to the religious views of the husband. It could have been the case that the husband was not trying to kill her but trying to prevent her from leaving him.
In this scenario, the husband beats her up and threatens to kill her if she leaves him. To scare her he carries her out on the balcony and holds her over the railing and says ‘If you leave me, I will kill you!’ At this moment she tries to escape his grip and thereby comes to fall from the balcony. This hypothesis is possible and consistent with all the forensic evidence in the case. If this hypothesis is true, the husband assaulted his wife but did not commit attempted murder, since he did not intend that she should fall from the balcony.
Reference class fallacy: Mismatch neglect
To assign probabilities for true positive and false positive, the fact-finder draws on available reference data about frequencies in the reference class at issue. Assigning a probability to the false positive that an eyewitness incorrectly identifies the suspect as the perpetrator in a line-up can, for example, draw on data from scientific studies on errors in eyewitness identification. In some cases, such reference data is presented to the fact-finder during the trial. A shoeprint at the crime scene showing that the perpetrator wore the same brand of shoes as the defendant could, for example, be accompanied by reference data from footwear retailers about the frequency of this brand among shoes in general, as an indication of the random match probability that the defendant by coincidence would wear the same brand if he was not the perpetrator. In many situations, however, the fact-finder is not served with such reference data, and must instead use personal information about the world as reference data. For example, in a trial where the defendant is given an alibi by his mother, the fact-finder must assign a probability to the false positive that the mother would say that the defendant was with her at the time of the crime if this is not true, drawing on personal knowledge about the inclination of mothers to lie to protect their sons.
The fact-finder uses an estimate of a relative frequency in the world outside the case, and assigns the same value to a conditional probability in the case. For this transfer to be correct, the relative frequency should pertain to the same reference class as the hypothesis at issue. For example, the probability that a mother would lie in court to protect her son should not be informed by the frequency of mothers lying to protect their sons in other contexts, where they are not under oath and would not be committing perjury. How the reference class should be defined has been discussed in the literature as the ‘reference class problem’ (Allen and Pardo, 2007; Colyvan and Regan, 2007). As James Franklin has demonstrated, the proper reference class includes all features that are relevant to predict the event in question (Franklin, 2011). Unfortunately, there is not always available data for this reference class. It is often the case that the available data pertains to a reference class that includes some but not all features of the proper reference class. A fact-finder who neglects this mismatch between the available data and the proper reference class and incorrectly treats the available data as if it pertained to the proper reference class commits a Reference Class Fallacy that I will refer to as Mismatch Neglect. This fallacy is not included on Koehler's list.
Mismatch Neglect can be illustrated with a Swedish case about a brutal murder in Syria in 2013. 20 At a police raid in Sweden, a memory stick was found that contained an ISIS video documenting the execution of Syrian government employees captured at a power plant in Aleppo. The video shows a group of ISIS soldiers with scarves covering their faces who talk to each other in Swedish and Arabic and then decapitate the captured men. The owner of the memory stick admitted that he had made several trips to Syria and participated in ISIS activities but claimed that he had not taken part in the execution depicted in the video. He and another man were nevertheless brought to trial. According to the prosecution, the defendants could be positively identified as two of the executioners in the video by several physical features that were visible although their faces were covered. One of these features was a scar on the wrist of one of the executioners in the same position as a scar on one of the defendants. To assess this piece of evidence the fact-finders had to consider the probability for a false positive, i.e., the probability that the executioner in the video would have such a scar on his wrist if he were not the defendant. The fact-finders were helped in this task by a forensic expert who assessed the probability for a false positive at 1/500, drawing on data about the frequency of scars. The court based its verdict on this assessment and overlooked a mismatch between the reference data and the proper reference class. The reference data informed the court about the frequency of scars in the general Swedish population, but the proper reference class was men affiliated with ISIS, and the relevant data would have been the frequency of scars among such men. It is reasonable to assume that scars are more frequent in this reference class than in the general population.
A fact-finder who neglects this difference will under-estimate the probability of a random match, and will consequently over-estimate the value of the evidence (likelihood ratio).
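The effect on the likelihood ratio can be sketched numerically. The 1/500 false-positive figure for the general population is from the case; the true-positive probability of 0.99 and the 1/50 scar frequency in the proper reference class are illustrative assumptions.

```python
# Effect of the reference class on the likelihood ratio of the scar match.
# 1/500 is the expert's figure for the general population; 0.99 and 1/50
# are illustrative assumptions.
p_tp = 0.99                 # assumed P(matching scar | same person)

p_fp_general = 1 / 500      # scar frequency, general Swedish population
p_fp_proper = 1 / 50        # assumed scar frequency, proper reference class

lr_mismatched = p_tp / p_fp_general   # LR based on the mismatched class
lr_proper = p_tp / p_fp_proper        # LR based on the proper class

# Under these assumptions the mismatched class inflates the likelihood
# ratio tenfold.
print(lr_mismatched, lr_proper)
```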
Doubling errors
The fallacy of Double Counting is committed when a probability assessment has already taken account of a certain circumstance as evidence and the fact-finder then updates the probability in a way that counts the same circumstance as evidence a second time. This may seem to be a silly mistake that no careful fact-finder would commit, but there are situations where it is not readily obvious that a certain piece of evidence is already reflected in a probability assessment.
In a Norwegian handbook on legal evidence it is suggested that the prior probability in a criminal trial, before the evidence against the defendant has been presented, should be equated with the probability that a random defendant is guilty. If, for example, the fact-finder believes that 75% of the people who are brought to trial by the public prosecutor's office are guilty of the crime they are charged with, the fact-finder should set the prior probability at 75% (Eide, 2016: 104). This line of reasoning overlooks that the prosecutor's decision to select a person for trial is based on the evidence. If prosecutors picked defendants at random, with no consideration of the evidence, they would never manage to select guilty people 75% of the time, as the fact-finder in the example assumes. The probability assessment of 75% that a defendant is guilty has already taken account of the evidence that motivated the prosecution to select him for trial and make him a defendant. If that evidence, when it is presented at the trial, is used by the fact-finder to update that 75% probability further, the evidence is counted twice. In fact, this is one of the reasons why fact-finders are obligated to operate under a presumption of innocence that forbids them to think in this way.
Unclear expert testimony can also cause Double Counting. Trace evidence often displays several matching features and it is not always clear to the fact-finder which of these features are included in an expert's assessment of the random match probability. Consider, for example, a shoeprint at a crime scene that matches a shoe that belongs to the defendant with regard to three features: shoe size, sole pattern and wear marks. An expert on shoeprints explains in court how he compared the shoeprint with the shoe under a microscope to conclude that the wear marks match. The expert says that the random match probability is 1 in 10,000. Is this assessment only referring to the wear marks? Or is it the expert's assessment of the probability that a random shoe would have the same shoe size, sole pattern and wear marks? Suppose that the latter is what the expert intends to convey, but the fact-finder incorrectly interprets the 1/10,000 probability to include only the wear marks, and therefore adds the matching shoe size and sole pattern to the shoeprint evidence (‘since the probability that the wear marks would happen to match is already 1 in 10,000, the probability that the shoe size and sole pattern would also match if the shoeprint was not made with the defendant's shoe must be at least one in a million’). This line of reasoning counts the shoe size match and the sole pattern match twice.
The opposite error occurs when the expert's assessment does not include a certain matching feature but the fact-finder incorrectly gets the impression that the feature has been accounted for in the expert's assessment, and therefore does not add its probative value to the overall assessment of the case. Such Double Omission means that the probative value of the matching feature is never counted. This happened in a Swedish burglary case in 2017, where an expert testified that the wear marks on a shoeprint had no probative value. The shoeprint matched the defendant's shoe in size and sole pattern, and this clearly had some probative value, but the fact-finders misunderstood the expert testimony to mean that the shoeprint had no probative value whatsoever. 21
Convergent evidence fallacies
The evidence structure where several pieces of evidence support the same hypothesis has been treated in the literature on legal evidence under different names. Some scholars call it ‘corroborating evidence’ (Gardiner, 2023), while others prefer the term ‘convergent evidence’ (Cohen, 1977; Schum, 2009). I find the term ‘corroborating’ problematic since it may evoke the idea of a temporal ordering where one piece of evidence comes first and the other comes after and corroborates. I will therefore refer to evidence that supports the same hypothesis as ‘convergent’.
In cases with convergent evidence, the fact-finder needs to assess the combined support of the evidence, and this can go wrong in various ways. Convergent evidence is subject to several different fallacies with different effects. As we shall see in this section, some of these fallacies lead to an under-estimation of the combined support while others lead to an over-estimation.
Since the standard of proof is high in criminal cases it is hard for a single piece of evidence to carry sufficient probative value to fulfil the burden of proof by itself. In most cases, the prosecutor's hypothesis is supported by convergent evidence, where each piece of evidence is insufficient on its own, but the prosecutor argues that the combined probative value meets the standard of proof. In such cases, the defence attorney sometimes argues that each piece of evidence should be assessed separately vis-à-vis the standard of proof. If a piece of evidence does not meet the standard of proof, the fact-finder should discard it and move on to the next piece of evidence. This line of reasoning is, of course, incorrect. It means that convergent pieces of evidence are not integrated, and I will refer to this fallacy as Convergence Neglect. The inevitable effect of Convergence Neglect is that each piece of evidence is worthless if it does not fulfil the standard of proof by itself.
This fallacy was observed by Thompson and Schumann in their seminal article from 1987 and illustrated with the following example. Suppose, for example, that the defendant and perpetrator share a blood type possessed by only 1% of the population. Victims of the fallacy reason that in a city of 1 million there would be approximately 10,000 people with this blood type. They conclude there is little if any relevance in the fact that the defendant and perpetrator both belong to such a large group. What this reasoning fails to take into account, of course, is that the great majority of people with the relevant blood type are not suspects in the case at hand. The associative evidence drastically narrows the group of people who are or could have been suspects, while failing to exclude the defendant, and is therefore highly probative, as a Bayesian analysis shows. (Thompson and Schumann, 1987: 171)
Thompson and Schumann called this error ‘Defense Attorney's Fallacy’, and this label has caught on. However, just like ‘Prosecutor's Fallacy’, the label is not very informative about the nature of the fallacy. I will therefore refer to it as Convergence Neglect. In an article on fingerprint evidence, Koehler describes the ‘tendency to treat imperfect information as irrelevant’ as the ‘Imperfection Fallacy’ (Koehler, 2008: 1100). This seems to be yet another label for the fallacy that we are discussing here. As we have seen above, the literature on probabilistic fallacies confusingly uses different names for the same fallacy (or variations of the same fallacy) as if they referred to separate fallacies.
Convergence Neglect was committed in one of the most infamous cases in Swedish criminal history—the murder of Malin Lindström—and resulted in an incorrect acquittal that was overturned 24 years later. Malin Lindström was a 16-year-old girl who disappeared in 1996. When her body was found in the woods six months later it was clear that she had been kidnapped, sexually molested and murdered. Her arms and legs were tied with duct tape, her sweater and bra were cut to pieces, her jeans were stained with sperm, and she had been stabbed six times in the back with a knife. There were no usable traces of the perpetrator, as the forensic scientists were unable to extract a DNA profile from the sperm on Malin's jeans. The main suspect in the police investigation was a young man who was the last person to be seen with Malin before she disappeared. When the police searched his home they found textile fibres on a pair of scissors in his garage that matched Malin's blue sweater, a plastic bag hidden under an outhouse containing knives and two rolls of tape of the same two kinds as the tape Malin was tied with, and drawings of naked women tied up in the same way as Malin and wounded in the same places as Malin. The man was prosecuted in 1998 and convicted in the Municipal Court but acquitted in the Appeals Court. The defence attorney, Ulf Holst, argued that each piece of evidence should be assessed separately vis-à-vis the standard of proof, and the Appeals Court followed this line of reasoning and committed Convergence Neglect. 22 After the acquittal the murder of Malin Lindström rested in the police archives for over 20 years until a cold case unit re-opened the file and managed to extract a DNA profile from the sperm stain on Malin's jeans, with technology that was not available in the 1990s. The DNA profile matched the man who had been acquitted in 1998. A new trial was held in 2022, and he was found guilty.
In the new verdict, the court did not only point to the new evidence, but explicitly stressed that the fact-finders in 1998 had made the mistake of only looking at each piece of evidence separately, overlooking the combined support of the convergent evidence. 23
Fact-finders should integrate the probative value of convergent evidence, but an ambition to do so does not guarantee that they get it right. It can be difficult for someone who is not trained in probability calculus to integrate convergent evidence correctly. Let us for example consider a case where three pieces of evidence independently support the same hypothesis, and the probability for false positive is 1/100 for each piece of evidence. This might not appear to be very strong evidence, but it follows from the product rule of probability calculus (for independent evidence supporting the same hypothesis) that the probability of seeing all three pieces of evidence if the hypothesis is false is one in a million (1/100 × 1/100 × 1/100 = 1/1,000,000). If the probability for true positive for each piece of evidence is close to 100%, it is close to a million times more probable that we would see the three pieces of evidence if the hypothesis is true than if it is false. A fact-finder who fails to understand and apply the product rule, and under-estimates the combined probative value of the convergent evidence, commits a fallacy that I will refer to as the Product Fallacy. The effect of the Product Fallacy goes in the same direction as Convergence Neglect. The evidence is under-estimated, and this could lead to an incorrect acquittal.
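The arithmetic of the product rule can be reproduced directly. The 1/100 false-positive probabilities are the figures used in the text; the 0.99 true-positive probability is an illustrative assumption standing in for ‘close to 100%’.

```python
# Product rule for three independent pieces of convergent evidence.
# The 1/100 false-positive probabilities are from the text; the 0.99
# true-positive probability is an illustrative assumption.
p_false_positive = 1 / 100
p_true_positive = 0.99

p_evidence_if_false = p_false_positive ** 3   # one in a million
p_evidence_if_true = p_true_positive ** 3

# Seeing all three pieces of evidence is close to a million times more
# probable if the hypothesis is true than if it is false.
likelihood_ratio = p_evidence_if_true / p_evidence_if_false

print(f"P(all three | false) = {p_evidence_if_false:.2e}")
print(f"likelihood ratio     = {likelihood_ratio:.0f}")
```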
The Product Fallacy is not included on Koehler's list, but my impression from studying Swedish cases is that it merits a place. I have seen many cases where I suspect that the fact-finders have under-estimated what follows from their own assumptions according to the product rule, but it is often hard to say for sure that this is the case since fact-finders rarely express their assumptions in numbers. A case about an assassination at a bus stop in 2020 can be used as an illustration. A young man was waiting at a bus stop in Stockholm when a person came up from behind and shot him in the back. CCTV images showed that the perpetrator was wearing a green parka-style jacket. A man involved in organised crime was charged with the murder but claimed that he was innocent. The case against him was built on two pieces of evidence: gun cartridges from the crime scene and a jacket found in his apartment. The cartridges had DNA that matched the defendant. When he was asked to explain how his DNA was on the cartridges if he was innocent, he said that he belongs to a criminal gang where guns rotate. According to his explanation, he handled the gun and the ammunition at some prior occasion. A jacket of the same colour and type as the shooter was seen wearing in the CCTV images was found in the apartment where the defendant lived with his brother. Blood stains on the front of the jacket matched the victim and gunshot residue on the jacket matched the cartridges at the crime scene. DNA from several people, including the defendant, was found on the jacket. The defendant claimed that the jacket was not his. According to him, it must have belonged to one of his criminal associates. Several of them had lived in the apartment from time to time. The defendant was found guilty by the Municipal Court, but the verdict was overturned. According to the Appeals Court, it could not be ruled out that the perpetrator was one of the defendant's criminal associates, and he was acquitted. 
24 The conclusion comes across as surprising and leaves the reader of the verdict with the impression that the Appeals Court may have under-estimated the combined probative value of the cartridges and the jacket. 25 The ruling of the Appeals Court has recently been overturned by the Swedish Supreme Court. 26
As we have seen, the combined probative value of independent evidence that supports the same hypothesis is considerably greater than the value of each piece of evidence by itself. If, however, the evidence is not independent, the combined probative value could be reduced significantly by the probabilistic dependence between the evidence. Consider, for example, two eyewitnesses who have both identified the suspect as the perpetrator. These testimonies are two pieces of evidence (E1, E2) supporting the same hypothesis (H), that the suspect is the perpetrator. Let us first suppose that the convergent evidence is independent. The eyewitnesses have made their observations separately and have not been in contact with each other. In this case, the probability that both witnesses would incorrectly identify the suspect as the perpetrator (false positive) is P(E1|¬H) × P(E2|¬H). If, for example, the probability for an incorrect identification is 10% for each witness, the probability is 1 in 100 (0.1 × 0.1 = 0.01) that both witnesses would incorrectly say that the suspect is the perpetrator. Let us now, instead, consider a situation where the eyewitness identifications are not independent. The eyewitnesses sat in a car together when they saw the perpetrator and talked about their impressions of him afterwards. We must now factor in that they could have influenced each other by conditioning the probability for a false positive for one of them on the fact that the other has identified the suspect as the perpetrator. The probability that they would both make an incorrect identification is now P(E1|¬H) × P(E2|¬H,E1). Let us suppose that the probability that the second eyewitness would incorrectly identify the suspect given that the first eyewitness has incorrectly identified him as the perpetrator, P(E2|¬H,E1), is 50%. The probability that both eyewitnesses would make an incorrect identification is now 1 in 20 (0.1 × 0.5 = 0.05) instead of 1 in 100. 
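The two calculations in the eyewitness example can be reproduced directly; all figures are the ones used in the text.

```python
# Eyewitness example with the figures from the text: each witness makes
# an incorrect identification with probability 0.1; if the witnesses have
# influenced each other, P(E2 | not-H, E1) = 0.5.
p_fp_each = 0.1

# Independent identifications: P(E1|not-H) * P(E2|not-H)
p_both_fp_independent = p_fp_each * p_fp_each               # 1 in 100

# Dependent identifications: P(E1|not-H) * P(E2|not-H, E1)
p_fp_second_given_first = 0.5
p_both_fp_dependent = p_fp_each * p_fp_second_given_first   # 1 in 20

# The dependence makes a joint false positive five times more probable.
print(p_both_fp_independent, p_both_fp_dependent)
```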
A fact-finder who overlooks that convergent evidence is dependent commits the fallacy of Dependence Neglect. Incorrectly treating dependent evidence as if it were independent creates an over-estimation of the combined probative value of the evidence and could lead to a false conviction. In the literature on legal evidence, Dependence Neglect has been discussed in several articles on the infamous Collins case (Fairley and Mosteller, 1973: 246; Koehler, 1997: 215; Tribe, 1971: 1335).
It should be pointed out that the effects of dependencies vary considerably. Some dependencies reduce the combined strength of the evidence significantly, while others only have a marginal effect on the combined probative value. In the latter case, the fact-finder can make the probability calculus easier with an approximate assessment that does not factor in minor dependencies. A fact-finder who makes such a deliberate simplification does not commit the fallacy of Dependence Neglect. In fact, such simplifications are often necessary to make probability calculus practically possible. Cases with lots of evidence often entail a number of minor dependencies that need to be discounted to make the calculus feasible.
Dependence Neglect was committed in a Swedish case from the 1980s—the murder of Catrine da Costa—that has been the subject of several books and is considered by many as one of the greatest scandals in Swedish legal history. 27 In the summer of 1984 four plastic bags with body parts were found at different locations in Stockholm. They belonged to a prostitute, Catrine da Costa, and two men were prosecuted for killing and dismembering her. One of the defendants was a pathologist who worked at a forensic medicine facility, the other was a general physician. The prosecution presented a complex collection of circumstantial evidence against the two defendants, including, among other things, the testimonies of a married couple who ran a photography shop. They testified that a man had come into their shop and handed in a roll of film, and when they had it developed for him they noticed that it contained pictures of a dismembered body. When they read in the papers about the murder of Catrine da Costa, they realised that it was her body in the photos and contacted the police. They were asked by the police to watch a line-up video, and they both identified one of the eight men in the line-up as the man who had handed in the roll of film and picked up the developed photos. The man they identified was one of the two suspects (the general physician). At the trial the court found this to be sufficient proof that the general physician had taken part in the dismembering of the body. 28 The fact-finders stressed in the verdict that it was highly unlikely that both the husband and wife would have identified him if he was not the customer in question. The fact-finders treated the testimonies as convergent independent evidence although they were clearly dependent. The couple admitted in their testimony that they had talked a lot about the strange customer and his physical appearance after he had left. 
They watched the line-up video separately, but they viewed it several times before they were sure whom to identify, and they communicated with each other between the viewings. Dependence Neglect led the fact-finders to over-estimate the combined probative value of the testimonies.
Dependency between pieces of convergent evidence does not occur only when witnesses have been in contact with each other. There are probabilistic dependencies in many other situations as well. Consider, for example, a case where a burglar has been captured on a CCTV image wearing a black Adidas training jacket and white sneakers and the suspect was apprehended on the same night as the burglary wearing a black Adidas training jacket and white sneakers. Suppose that reference data on jackets says that approximately 1 person in 100 wears a black Adidas training jacket, while reference data on shoes says that approximately 1 person in 10 wears white sneakers. On the basis of this data a fact-finder might infer that the probability (false positive) that both the suspect's jacket and his shoes would match the perpetrator if he is innocent must be 1 in 1,000 (1/100 × 1/10 = 1/1,000). This would, however, overlook that these two items of clothing are often worn together. The probability that a random person wears a black Adidas jacket is 1/100, but the probability that a person wearing white sneakers is wearing such a jacket is higher, say 1/20. The probability of a double match is then 1 in 200 (1/10 × 1/20 = 1/200), five times higher than the 1 in 1,000 obtained under the assumption of independence. Neglecting this probabilistic dependency leads to an over-estimation of the probative value of the evidence.
In summation, the assessment of convergent evidence is susceptible to three different fallacies that pull in opposite directions. Convergence Neglect and the Product Fallacy make the fact-finder under-estimate the evidence, and could lead to a false acquittal. Dependence Neglect makes the fact-finder over-estimate the evidence, and could lead to a false conviction.
Link-skipping
When a piece of evidence is presented in support of a hypothesis it is often the case that the relation of support goes via a series of inferences. Evidence of this kind is often categorised as ‘indirect’ or ‘circumstantial’. Consider, for example, a burglary case where a shoeprint at the crime scene matching a shoe found in the defendant's apartment is presented by the prosecution as evidence that the defendant was the burglar. The matching features (E) support the hypothesis (H1) that the shoeprint was made with the defendant's shoe, which supports the hypothesis (H2) that the person wearing the shoe when it made the print was the defendant, which in turn supports the hypothesis (H3) that the defendant was the burglar. This is an evidence chain with uncertainty in each link.
E → H1 → H2 → H3
The match (E) supports that the shoeprint was made with the defendant's shoe (H1), but it could have been some other shoe with the same features. If it was the defendant's shoe (H1), that supports that the defendant made the shoeprint (H2), but it could have been someone else wearing the defendant's shoe. And if it was the defendant who made the shoeprint (H2), that supports that the defendant was the burglar (H3), but the defendant could have visited the crime scene before the burglary, and the burglar could be someone else. A fact-finder who assesses the probative value of the evidence (E) for the ultimate hypothesis in the evidence chain (H3) must take every alternative hypothesis and every possible combination of events into account. The probability for a false positive, P(E|¬H3), is the sum of the probabilities for all scenarios where the evidence is observed and the ultimate hypothesis is false (another shoe that also matches the shoeprint, someone else wearing the defendant's shoes, the defendant visiting the crime scene before the burglary).
An error occurs when the fact-finder incorrectly treats the support that the evidence (E) gives to the first hypothesis (H1) in the evidence chain as if it pertained to a hypothesis further down the chain, typically the ultimate hypothesis (H3). By doing so, the fact-finder skips one or more links in the chain. Since every link entails uncertainty, this fallacy of Link-Skipping discounts parts of the uncertainty in an evidence chain, and this inevitably leads to an under-estimation of the probability for false positive, which leads to an over-estimation of the probative value of the evidence chain. As an example, consider the burglary case above. Suppose that a shoeprint expert testifies and says that the probability (false positive) of the observed match if the shoeprint is not made with the defendant's shoe is 1 in 10,000, P(E|¬H1) = 0.0001. If the fact-finder takes this to mean that the probability (false positive) of the observed match given that the defendant is not the burglar is 1 in 10,000, P(E|¬H3) = 0.0001, the fact-finder commits the fallacy of Link-Skipping. The possibility that someone else could have worn the defendant's shoes and the possibility that the defendant could have visited the crime scene at some other time are disregarded, resulting in an under-estimation of the probability for a false positive.
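A sketch with illustrative numbers shows how much Link-Skipping can under-state the false-positive probability. Only the 1/10,000 random match probability comes from the example; the probabilities of the two other innocent scenarios are invented assumptions.

```python
# Link-Skipping sketch for the chain E -> H1 -> H2 -> H3.
# 1/10,000 is the expert's figure from the example; the other two
# scenario probabilities are invented assumptions.
p_random_shoe_match = 1 / 10_000   # a different, matching shoe made the print
p_other_wearer = 1 / 1_000         # assumed: someone else wore the defendant's shoe
p_earlier_visit = 1 / 1_000        # assumed: defendant made the print at another time

# P(E | not-H3) sums over all scenarios where E is seen but H3 is false.
p_e_given_not_h3 = p_random_shoe_match + p_other_wearer + p_earlier_visit

# The link-skipper instead uses P(E | not-H1) = 1/10,000.
p_e_given_not_h1 = p_random_shoe_match

# Under these assumptions the true false-positive probability is about
# 21 times larger than the figure the link-skipper uses.
print(p_e_given_not_h3 / p_e_given_not_h1)
```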
The three hypotheses in the burglary case can be categorised and distinguished from each other as a ‘source-hypothesis’ (H1), an ‘activity-hypothesis’ (H2) and an ‘offense-hypothesis’ (H3), using the terminology introduced by Rebecca Cook, Ian Evett, Graham Jackson, Paul Jones and Jim Lambert (Cook et al., 1998: 232–233). In this terminology, Link-Skipping incorrectly equates the source-hypothesis with the offense-hypothesis. The fallacy has been discussed in the literature as the ‘Ultimate Issue Error’ (Aitken and Taroni, 2004: 82; Koehler, 1993: 32), and is included on Koehler's list under this label. In my view, Link-Skipping is a name that better captures what the fallacy is about.
Conclusion
In this article I have presented a systematic account of probabilistic fallacies in legal fact-finding: a list of 12 fallacies organised into 7 categories. If we compare it with Koehler's list, a number of differences can be noted. Some of the fallacies on my list merge fallacies on Koehler's list that are actually the same fallacy. For example, False Positive Inversion merges ‘Prosecutor's Fallacy’ and ‘Source Probability Error’. Some fallacies have been re-named to better reflect what the fallacy is about. For example, the fallacy referred to by Koehler as ‘Ultimate Issue Error’ is called Link-Skipping on my list. Some of the fallacies on Koehler's list are not on my list since they are not, properly speaking, fallacies ‘in legal fact-finding’, for example ‘Fingerprint Examiner's Fallacy’. And my list introduces some fallacies that are not on Koehler's list, for example the Product Fallacy and Dependence Neglect.
The list that I have presented in this article covers 12 probabilistic reasoning errors that a legal fact-finder could commit. To what extent fact-finders are inclined to commit these fallacies is a question for a different paper. There are some empirical studies on this question (e.g., Dahlman et al., 2016; Guthrie et al., 2000; Schweizer, 2005), but they have not been reviewed here. An important issue that also needs more research is how fact-finders can be helped to avoid probabilistic fallacies.
Acknowledgements
Thanks to Colin Aitken, Frans Alkemade, Ronald Allen, Edward Cheng, Joseph Gastwirth, Richard Gill, Joseph Kadane, David Lagnado, Moa Lidén, Anne Ruth Mackor, Yvonne McDermott Rees, Anders Nordgaard, Henry Prakken, Paul Roberts, Emily Spottswood, William Thompson, William Twining and two anonymous reviewers for helpful suggestions on previous versions of this manuscript.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
