Abstract
Boxing has long grappled with the problem of biased or “bad” judging. At its worst, this leads to “robberies”, where boxers are widely seen as being denied rightful victories. Such incidents risk alienating fans and athletes. To address this problem, we propose a minimalist adjustment to the scoring system: the winner would be decided from the round-by-round scores of the judges, rather than relying on the judges’ overall bout scores. This approach, known as consensus scoring, is rooted in social choice theory and utilises majority rule alongside middlemost aggregation functions. We show that this scoring method creates a coordination problem for actively partisan judges and theoretically attenuates their influence on fight outcomes. Our analysis and simulations, using a stylised model of strategic judging behaviour, demonstrate the potential of consensus scoring to significantly decrease the likelihood of a single partisan judge swaying the result of a closely contested bout.
Introduction
Boxing has a reputation for partisan and corrupt judging. At the amateur level, some decisions in Olympic gold medal bouts have attracted criticism and ridicule, becoming boxing folklore, such as Roy Jones Jr.’s defeat in the 1988 (Seoul) light heavyweight final to a South Korean fighter (Ashdown, 2012), and Joe Joyce’s defeat in the 2016 (Rio de Janeiro) super heavyweight final (Ingle, 2021; Rumsby, 2021). In professional boxing, there is longstanding suspicion about the integrity of judges (e.g., US Senate, 2001). Recent perceived “robberies” include Haney vs. Lomachenko (Wainwright, 2023) and the first two editions of Alvarez vs. Golovkin (Reid, 2023).
The prevalence of probable judging bias in combat sports has also been documented in a growing empirical academic literature (e.g., Holmes et al., 2024; Lee et al., 2002). This behaviour was also described vividly in the McLaren (2022) report, a recent judge-led independent investigation that examined unethical conduct in Olympic boxing after being commissioned by the Association Internationale de Boxe Amateur (AIBA). While the report did propose improved appointment processes and training of judges, it did not explore how to make the incentives inherent in the judging process more resilient to biases and corruption.
This short paper models the decisions of boxing judges and proposes an alternative scoring method that has the potential to significantly attenuate judge bias. Currently, scoring at the elite level is on a per-judge basis, with three judges usually employed for elite professional bouts and five at the Olympic amateur level. Under this system, a judge individually and subjectively scores each round according to the “10-Points Must” rules, and their entire “vote” then goes to the boxer with the highest total score over all rounds. In most cases, this is the boxer they scored as winning a majority of rounds. After the scorecards of the judges are collected, victory in a bout is awarded to the boxer who receives the votes of a majority of judges. If neither boxer receives a majority of the judges’ votes, due to at least one tied scorecard among the judges, then the bout is a draw. Under this system, which we call “aggregation over rounds and then judges” or “majority judges rule”, it is relatively straightforward for a judge to ensure their vote goes to their favoured boxer. They just need to award them a majority of the rounds (i.e., 7 of 12 for a world championship level men’s professional bout). They can do this while minimising backlash, by choosing the best rounds for their favoured boxer.
The change to the scoring system that we propose, “aggregation over judges and then over rounds”, or “Consensus Scoring”, is for each round to be awarded based on the aggregate scores over all judges. Normally, this would lead to whoever wins the majority of rounds winning the bout, rather than whoever wins on a majority of the judges’ scorecards. This represents a minimalist change to the scoring system in the sport: judges’ scores are first aggregated across judges within each round, and then over rounds, rather than vice versa. Although minor, this change is sufficient to introduce a significant coordination problem for an actively partisan judge, and it may be acceptable among fans.
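The difference between the two orders of aggregation can be illustrated with a minimal sketch. It assumes a simplified bout in which each judge only records a round winner ("B" for Blue, "R" for Red), with no knockdowns or point deductions; the function names and the example scorecards are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority(labels):
    """Return the label held by a strict majority, or 'Draw' if there is none."""
    label, count = Counter(labels).most_common(1)[0]
    return label if 2 * count > len(labels) else "Draw"


def majority_judges_rule(cards):
    """Aggregate over rounds within each scorecard, then over judges."""
    return majority([majority(card) for card in cards])


def consensus_scoring(cards):
    """Aggregate over judges within each round, then over rounds."""
    return majority([majority(round_scores) for round_scores in zip(*cards)])


# Hypothetical 12-round scorecards: two fair judges and one judge favouring Blue.
cards = [
    "BBBBBBBRRRRR",  # judge 1: 7 rounds to Blue
    "BBBBBRRRRRRR",  # judge 2: 5 rounds to Blue
    "BBBBBRRRRRBB",  # judge 3 (partisan): 7 rounds to Blue, two of them unsupported
]
print(majority_judges_rule(cards))  # B
print(consensus_scoring(cards))     # R
```

In this example the third judge's two extra Blue rounds do not coincide with the rounds that the first judge gave to Blue, so they are outvoted round by round and only change the result under majority judges rule.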
In fact, this rule change has been considered before in the sport of boxing. It was proposed in 2000 by the National Association of Attorneys General Boxing Task Force in the United States, which was set up after the Professional Boxing Safety Act became law in 1997 (National Association of Attorneys General Boxing Task Force, 2000). Motivated by an especially contentious decision in the heavyweight world championship bout between Evander Holyfield and Lennox Lewis in 1999, the Task Force suggested Consensus Scoring, based on a recommendation from a mathematician at Stanford University, Dr. Ralph S. Levine. Subsequently, Algranati and Cork (2000) evaluated the potential effects of changing to the Consensus Scoring rule, using the judges’ scorecards from every professional world title bout between 1986 and 1999. They found that applying this scoring rule would have had minimal impacts on the outcome of these bouts. More recently, Berthet (2024) carried out a similar exercise with the same results by applying a consensus scoring rule on historical judging scorecards for thousands of mixed martial arts bouts, which have a similar scoring system to boxing.
Importantly though, while both of these ex post empirical studies recognised the potential benefits of Consensus Scoring in reducing the influence of anomalous judge scorecards on outcomes, neither could account for how a biased or partisan judge may have actually changed their scoring when faced with a Consensus Scoring system. We show that Consensus Scoring introduces a coordination problem for biased judges, which decreases their incentive to incorrectly award rounds towards a favoured boxer. This implies an inaccuracy in the counterfactual exercise of scoring bouts by consensus scoring when the observational data was generated by judges who were scoring under the majority judges system. We address this theoretically and with simulations, to demonstrate that the minimalist change to the status quo scoring system implied by Consensus Scoring could substantially alter the incentives and strategic behaviour of a partisan judge.
We focus on modelling the simplest practical case, with three judges for a bout, one of whom is biased in favour of one boxer. We also simplify our analysis by considering a hypothetical close fight, without knockdowns or point deductions, and where the rounds are all tight. In this case, under majority judges rule, the presence of a partisan or biased judge can substantially increase the probability of a boxer winning, even though that judge is outnumbered by unbiased judges. Under Consensus Scoring, even if the partisan judge awards a majority of rounds to a favoured boxer, this will have no impact on the final outcome unless those rounds align with the decisions of the other judges. This trivially mitigates the marginal influence of a single passively biased judge (e.g., a judge who has the same nationality as one of the fighters and displays nationality bias), whose anomalous scores become less relevant for the outcome of the bout. But more notably, the coordination problem implied by our proposed rule change also means that, to achieve a high probability of victory for their favoured boxer, the partisan judge would have to award more rounds to their favoured boxer than in the current system. This exposes them to scrutiny and potential backlash, as boxing pundits and fans will often criticise poorly awarded rounds on judges’ scorecards. Our analysis and simulations of the model demonstrate that the scoring rule change could be highly effective in diminishing the incentives for biased judging in boxing and its influence on the outcomes of bouts.
From a theoretical perspective, the proposed Consensus Scoring rule is an application of majority rule and the middlemost aggregation function from social choice theory, which minimise the effective manipulability of outcomes by graders (e.g., Arrow, 1963; Balinski & Laraki, 2007; Young, 1974a, 1974b). This principle is already applied to some extent in boxing scoring, since the decision of the middlemost judge currently determines the bout result. Under Consensus Scoring, we instead suggest awarding bouts based on the aggregated middlemost round-by-round votes.
Our goal of improving the incentives of judges to score fights fairly in boxing is closely related to the focus of Frederiksen and Machol (1988), who analysed the subjective judging in sports like figure skating and dance, where the judges need to decide between multiple competitors, a setting where Arrow’s (1963) theorem implies that all possible ways to combine judge preferences have some undesirable characteristics. Frederiksen & Machol proposed a new method for aggregating judge scores for such situations that attenuates some of these issues. Their context though faced the problem of the Arrow Impossibility Theorem (social choice paradox), given there were more than two alternative outcomes in the contest. That theorem does not apply here for a boxing bout since it consists of just two competitors, only one winner, and potentially biased judges.
In general, we contribute to the vast literature that either carries out post hoc analysis of changes to scoring rules and laws in sports or proposes new changes based on theory (for recent surveys see Kendall & Lenten, 2017; Wright, 2014). Our work falls into the latter type of study, particularly where minimalist changes have been proposed that could still in theory substantially improve the fairness of sports outcomes. For instance, in the world’s most popular sport, association football, recent contributions have used simulations to explore whether incentives and outcomes could be altered significantly under different tie-breaking rules in round-robin tournaments (Csató, 2023; Csató et al., 2024), whether dynamic sequences in penalty shootouts could be fairer (Csató & Petróczy, 2022), and whether the allocation system for the additional slots of the expanded FIFA World Cup could be improved according to the stated goals of the organisers (Krumer & Moreno-Ternero, 2023).
Finally, this paper builds on a growing literature studying various incentive issues in boxing and other combat sports (Akin et al., 2023; Amegashie & Kutsoati, 2005; Butler et al., 2023; Butler, 2023; Dietl et al., 2010; Duggan & Levitt, 2002; Tenorio, 2000). However, to the best of our knowledge, the incentives of boxing judges have not yet been studied directly, given the scoring rules they face, despite a well-developed literature on the influences and implications of biased decision making by the referees and judges in other sports (e.g., Bryson et al., 2021; Chowdhury et al., 2024; Dohmen & Sauermann, 2016; Reade et al., 2022; and, for other combat contests, Brunello & Yamamura, 2023).
The remainder of our short paper proceeds as follows. In Section “The Model - A Partisan Judge in a Boxing Bout”, we set up a stylised model of potentially biased judging and strategic behaviour in a boxing contest. Section “Analysis, Results, and Discussion” presents our analysis and discussion of the model. The detailed proofs of the main propositions regarding the scoring rules are presented in the Online Appendix, as are variations on the main results from simulating the model.
The Model - A Partisan Judge in a Boxing Bout
Consider a contest between two boxers of equal ability, in the Blue and Red corners. We assume each sequential round
Each judge,
Judges have a utility of:
where
We consider the case of two fair judges who have
Under majority judges rule, the middlemost judge scorecard determines the bout. Under Consensus Scoring, the middlemost judge determines each round’s winner, and then the middlemost round determines the bout. Judges award rounds separately and simultaneously.
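One way to write the two aggregation orders compactly is sketched below. The notation is an assumption introduced here for illustration and is not the paper's own: $s_{j,r}$, $v_j$, $m_r$ and $W$ are hypothetical symbols, the number of judges $J$ is taken to be odd, and every round is assumed to be scored for one boxer or the other without knockdowns or point deductions.

```latex
% Assumed notation (not the paper's): s_{j,r} = +1 if judge j awards round r to Blue
% and -1 if to Red, with J judges (J odd) and R rounds.
% W > 0 means Blue wins, W < 0 means Red wins, and W = 0 is a draw.
\begin{align*}
\text{Majority judges rule:}\quad & v_j = \operatorname{sgn}\Big(\sum_{r=1}^{R} s_{j,r}\Big), & W^{\mathrm{MJ}} &= \operatorname{med}(v_1,\dots,v_J),\\
\text{Consensus Scoring:}\quad & m_r = \operatorname{med}(s_{1,r},\dots,s_{J,r}), & W^{\mathrm{CS}} &= \operatorname{sgn}\Big(\sum_{r=1}^{R} m_r\Big).
\end{align*}
```

Under these assumptions, the median of the judges' votes equals $+1$ only when a majority of judges vote for Blue, and it equals $0$ when a tied scorecard leaves neither boxer with a majority, which matches the draw rule described in the Introduction.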
Analysis, Results, and Discussion
The partisan judge (
If
For three-round bouts, in which Red won a majority of rounds according to the true realisations,
We can calculate the probability of each fair judge awarding a round for Blue, denoted by
Proposition 1 establishes that Consensus Scoring is more robust to partisan judging than the majority judges rule for three-round bouts. We numerically solve the model to establish the robustness of this result in longer bouts.
We use a benchmark parametrisation of
To demonstrate a partisan judge’s decision making, Figure 1 shows the probability of Blue winning the bout, given they truly won 6 rounds, for each number of rounds the partisan judge awards them. Under majority judges rule, there is a sharp increase in the probability of Blue winning if the partisan judge awards them more than 6 rounds. If Blue truly deserved to win 4 or 5 rounds, then, to award Blue the win, the partisan judge only needs to risk the backlash associated with giving them 3 or 2 more rounds on their scorecard. In contrast, Figure 1 shows that under Consensus Scoring, a judge cannot secure a sharp increase in the probability of Blue winning by giving them a small number of extra rounds; more rounds only gradually increase Blue’s chances.

Figure 1. Simulated probability of Blue winning, when both boxers truly won 6 of the 12 rounds, and 1 of the 3 judges favours Blue.
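The qualitative pattern described for Figure 1 can be reproduced with a rough Monte Carlo sketch. This is not the paper's model or parametrisation: it omits the judges' utility and backlash terms, and it assumes that each fair judge independently scores a round for its true winner with a fixed probability (`ACCURACY`, a hypothetical value), while the partisan judge awards Blue exactly `k` rounds, starting with the rounds Blue truly won. All names below are illustrative.

```python
import random

ROUNDS = 12
ACCURACY = 0.75  # hypothetical: chance a fair judge scores a round for its true winner


def fair_card(truth, rng):
    """Fair judge: scores each round for its true winner with probability ACCURACY."""
    return [t if rng.random() < ACCURACY else not t for t in truth]


def partisan_card(truth, k):
    """Partisan judge: awards Blue exactly k rounds, rounds Blue truly won first."""
    order = sorted(range(len(truth)), key=lambda i: not truth[i])
    chosen = set(order[:k])
    return [i in chosen for i in range(len(truth))]


def blue_wins_majority_judges(cards):
    """Blue needs the vote (more than half the rounds) of a majority of judges."""
    votes = sum(2 * sum(card) > len(card) for card in cards)
    return 2 * votes > len(cards)


def blue_wins_consensus(cards):
    """Each round goes to the majority of judges; Blue needs a majority of rounds."""
    rounds_won = sum(2 * sum(scores) > len(cards) for scores in zip(*cards))
    return 2 * rounds_won > len(cards[0])


def simulate(k, blue_true_rounds=6, trials=4000, seed=0):
    """Share of simulated bouts won by Blue under each rule (True means Blue)."""
    rng = random.Random(seed)  # same seed for every k, so fair cards are comparable
    truth = [True] * blue_true_rounds + [False] * (ROUNDS - blue_true_rounds)
    wins_mj = wins_cs = 0
    for _ in range(trials):
        cards = [fair_card(truth, rng), fair_card(truth, rng), partisan_card(truth, k)]
        wins_mj += blue_wins_majority_judges(cards)
        wins_cs += blue_wins_consensus(cards)
    return wins_mj / trials, wins_cs / trials


results = {k: simulate(k) for k in range(ROUNDS + 1)}
for k, (mj, cs) in results.items():
    print(f"k={k:2d}  majority judges: {mj:.2f}  consensus: {cs:.2f}")
```

Because the same seed is reused for every `k`, the fair judges' cards are identical across values of `k`. In this sketch the majority judges probability is therefore flat below 7 rounds, jumps at 7, and is flat again afterwards, whereas the consensus probability can only stay the same or rise with each additional round awarded to Blue.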
Figure 2 shows the impact of these differing incentives for the partisan judge, from running a series of simulations and counting the proportion of times each boxer wins under the two scoring systems, conditional on the true number of rounds won by Blue. When deciding the contest by majority judges, there is a high probability of erroneous results when Blue truly won only 4-6 rounds. When Blue truly wins the most rounds, the partisan judge unduly helps to lock in a deserved victory, so there is not a large difference in the number of incorrectly awarded bouts.

Figure 2. Probability of a “correct” result depending on the number of rounds truly won by Blue and how judges’ scores are aggregated.
Finally, in the Consensus Scoring case, it can be noted from Figure 2 that the probability of the Blue boxer winning always increases when the biased judge awards them more rounds. This is in contrast to the majority judges case, when a judge ceases to impact the result at the point at which they award a majority of their card to a boxer. For instance, consider a bout where the fair judge sees 10 rounds with
This effect, however, does not tend to lead to a greater probability of an erroneous result under Consensus Scoring. The main reason for this is that the effect occurs in a context where Blue has likely won a large majority of rounds and is likely to win the bout. The more important case is when a bout is more even and there is a sharp increase, under the majority judges rule, in the winning probability at the 7 round level in Figure 2.
This point can be seen in Figure 3, which shows the probability of each possible outcome on the y-axis and the number of rounds Blue truly won (excluding noise) on the x-axis. Under majority judges rule (bottom panel), in evenly matched bouts, where the true result is a draw, Blue wins 47.0% of the time and Red wins 11.2%. When evenly matched bouts are awarded under Consensus Scoring (top panel), Blue wins 19.3% of the time and Red wins 13.4%.

Figure 3. Probability of each outcome depending on the number of rounds truly won by Blue and how judges’ scores are aggregated.
Figure 3 also shows the frequencies where one boxer wins despite the other deserving outright victory, e.g., the blue area to the left of the vertical black line. Under majority judges rule, it is more likely for an erroneous victory to be in favour of Blue than Red; in this parametrisation, a robbery in favour of Blue is 12.5 times more likely than a robbery in favour of Red. Under Consensus Scoring, the likelihood of a robbery is still in Blue’s favour, by a multiple of 1.99, because there is still some incentive for the partisan judge to favour Blue. But this scoring system can substantially attenuate Blue’s advantage from the presence of a partisan judge. There are also fewer robberies in absolute terms.
For robustness, the Online Appendices demonstrate extensions and checks on our analysis. Appendix C considers simulations with alternative parametrisations of the benchmark model, and, in Appendices D-F, we repeat the analysis for setups consistent with women’s professional, men’s Olympic, and women’s Olympic boxing, respectively (i.e., different numbers of rounds and judges). The results of all these extensions support our key findings: deciding bouts by Consensus Scoring, rather than by majority judges, makes it less likely that a partisan judge sways the outcome of a bout.
Supplemental Material
Supplemental material, sj-pdf-1-jse-10.1177_15270025251348186 for They were Robbed! Scoring by the Middlemost to Attenuate Biased Judging in Boxing by Stuart Baumann and Carl Singleton in Journal of Sports Economics
Footnotes
Acknowledgments
We are grateful for comments and advice from Anwesha Mukherjee.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental materials for this article are available online.
