
Abstract

Proportionality Analysis (PA) is usually perceived as applying a rationality-based formula to determine whether a legal act is (un)constitutional. However, behavioral economics suggests that decisionmakers—including judges—may be susceptible to various cognitive biases, which implies that PA might be similarly affected. Using a vignette experiment, we examine how different framings of legal cases influence PA judgments across three groups: administrative judges, law students, and non-law students. Results show that judges demonstrate minimal susceptibility to framing effects when conducting PA, suggesting that legal expertise and professional experience can provide significant protection against cognitive biases in judicial decision-making. These findings provide reassuring evidence for the rationality of PA as applied by professional judges, while demonstrating the debiasing impact of legal training and expertise. However, we also find that judges remain susceptible to other behavioral effects when making decisions that are unrelated to PA. We discuss the relevance of our findings for the current debate surrounding constitutional review, contrasting PA—used frequently around the globe—with the specific constitutional review process in the United States.

Keywords

proportionality analysis, framing, judges, constitutional law, behavioral law and economics

Introduction

Proportionality Analysis (PA) is a framework used by courts when reviewing clashes between rights and public interests. It embodies the old principle that one should not use a sledgehammer to crack a nut (Brems & Lavrysen, 2015).¹ Thus, if a public measure (e.g., a legal act or an administrative decision) infringes on a right, PA asks whether that infringement is proportional. Answering the question requires taking a set of analytical steps, such as placing weights on rights and applying an optimization formula to resolve clashes.

The logic underlying PA has had immense success in both international law and the national law of many countries (see Barak, 2012; Bulman-Pozen & Seifter, 2023; Collings & Barclay, 2022; Greene, 2018; Jackson, 2015; Popelier & Van De Heyning, 2013; Stone Sweet & Mathews, 2008). Scholars have, consequently, described PA as a “central feature of rights reasoning” (Klatt & Meister, 2012, p. 691), a “master concept of public law” (Mathews, 2017, p. 2), or even a “global constitutional principle” (cf. Kremnitzer et al., 2020, p. 9; Peters, 2017; Petersen, 2017, p. 6).² The United States is a notable exception: it does not formally engage in PA (Aleinikoff, 1986; Cohen-Eliya & Porat, 2011; Tebbe & Schwartzman, 2021) and relies on categorical definitions and tiered scrutiny rather than explicit proportionality balancing (see, e.g., Greene, 2018). However, scholars have noted that even U.S. constitutional law engages in balancing of rights and interests, at least implicitly (Waldron, 2003).³ This underscores the importance of understanding how judges engage in such practices, whether through formal PA or other balancing mechanisms.

The prevailing understanding of these practices, particularly PA, has been shaped by certain assumptions about judicial reasoning. Hitherto, PA has been largely seen as a process in which judges rationally apply formulas to determine the outcome of legal cases (see Alexy, 2017; Brems & Lavrysen, 2015; Möller, 2012; Popelier & Van De Heyning, 2013). But does PA truly lead to a more rational decision? From numerous experiments, we know that adjudicators, such as judges and arbitrators (Franck et al., 2016; Guthrie et al., 2001; Helm et al., 2016; Rachlinski & Wistrich, 2018; Spamann & Klöhn, 2016), might fall prey to various heuristics and biases. In other words, “judges are incurably human” (Frank, 1931, p. 24), just like everybody else (Posner, 1993).

However, there is only scant evidence on the role of such biases in PA and no direct evidence on the effects of well-established behavioral effects, such as framing or loss aversion. For example, Sulitzeanu-Kenan et al. (2016) ask Israeli legal experts to conduct a PA of targeted killing scenarios. They find that the evaluation is affected by the experts’ policy preferences and various dimensions of the case’s facts, but they do not test for specific biases and heuristics. Broude and Levy (2019) similarly use scenarios that arise in international humanitarian law to investigate how a specific bias (the “outcome bias”) affects PA.⁴ Their study contrasts laymen and experts, but does not include judges. Steiner et al. (2022) contrast different versions of PA (described in terms of either necessity or balancing) and find that a necessity test yields stronger protection of human rights compared to balancing. Their study, hence, also does not directly measure behavioral biases. Kantorowicz-Reznichenko et al. (2022) use a vignette study that describes a demonstration prohibited by the authorities. Varying the identity of the demonstrating organization, they elicit views about the legality of the prohibition using a simplified version of PA.⁵ Their study focuses on detecting an ideological (and not a behavioral) bias and uses a lay sample. Our study, thus, differs from all the existing studies in two main ways: (i) we test for behavioral biases in the form of framing effects and (ii) we contrast different subject groups with varying degrees of legal expertise: non-law students, law students, and judges.⁶

Specifically, we are interested in whether legal expertise can work as a debiasing mechanism, enabling subjects to overcome the (potential) influence of framing effects. While many studies generally find cognitive biases also among expert adjudicators (outside the context of PA), some studies do find evidence of diminished bias in experts, including legal experts, compared to other populations (see, e.g., Broude & Levy, 2019; Mizrahi, 2018; Shereshevsky & Noah, 2017).⁷ Our experiment tackles this issue directly, thereby contributing also to the literature comparing laymen and experts.

In our experiment, subjects conduct a full PA of different vignettes (legal cases). We elicit their answers to the traditional tests of proportionality, including whether the act in question falls within the scope of a right (and if so, whether it touches the core of the right), whether the purpose of the act is proper (or legitimate), whether the measures used are suitable and necessary, and whether the act adequately balances public interests and rights. Each vignette has two different versions (only one of which is presented to the subject), which differ only in their framing. Each difference in framing corresponds to a typical behavioral effect, mostly focusing on the role of loss aversion (we included one additional vignette unrelated to loss aversion; see Appendix B for details).

We find clear evidence of framing effects in PA in some (but not all) contexts, especially for non-law students. Namely, non-law students were more likely to classify an act as disproportional (i) if its goal was framed as achieving a gain (rather than averting a loss) and (ii) if the act was framed as converting a stochastic loss into a certain loss (rather than converting a stochastic gain into a certain gain).⁸ Law students, however, were only affected by the latter: they were unaffected by the framing of the goal, but were as affected as non-law students when the act was framed as converting a stochastic loss into a certain loss. Judges were only weakly affected by the framing in both of these vignettes. Specifically, framing had no effect on judges’ overall classification of the act as (dis)proportional; it only affected some of the interim stages of PA.

To test whether judges were able to avert the influence of framing due to their expertise, we presented them with standard problems from behavioral economics that focus on specific biases, such as overconfidence and the conjunction fallacy. Interestingly, judges did demonstrate susceptibility to these biases, suggesting that they are not generally immune to behavioral biases. Rather, once they step outside their domain of expertise, judges are still very much susceptible to standard behavioral effects. Overall, our findings suggest that framing effects do seem to matter in some contexts and that expertise may de-bias individuals in their professional context, but not in other contexts.

Importantly, we do not submit that PA is unhelpful for the process of legal reasoning. The need to structure one’s thought as a step-by-step analysis might even operate as a debiasing device, moving one’s thought process from System 1 (quick, emotional) to System 2 (slow, deliberate, rational).⁹ Yet, biases may still arise at each step of the way, possibly with spillover effects from one stage to the other. Whether this is the case is the object of this study.

The article’s contribution is fourfold. First, to our knowledge, we are the first to test whether framing affects PA. We do so using a unique subject group, consisting of administrative judges and students in Germany. The evidence suggests that professional judges conducting PA are largely protected from framing effects that significantly influence other decision-makers. This finding challenges some previous studies, which found that judges in the United States are susceptible to framing in a legal context (see Rachlinski & Wistrich, 2018, with eight such studies), but is otherwise in line with the aforementioned general literature that finds that expertise can mitigate bias. Second, we demonstrate a clear gradient where legal training progressively reduces susceptibility to framing effects, with judges showing minimal bias, law students showing intermediate effects, and non-law students showing more substantial bias. Third, we show that this protection is domain-specific: judges remain susceptible to biases outside their area of expertise. Fourth, we enrich the discussion on the rationality of PA, pointing out how additional biases may affect it more generally (notwithstanding the possibility that the discursive method underlying PA might lead to more rational decisions compared to other alternatives).

The remainder of the article is organized as follows. The second part (“The Theory of Proportionality and its Applications”) provides an introduction to PA, with specific attention placed on its different elements. The third part (“Biases and Heuristics Relevant to Proportionality Analysis”) gives a brief overview of behavioral effects, that is, the biases and heuristics identified by the literature on behavioral law and economics, particularly those that potentially apply to PA. The fourth part (“Hypotheses Development”) follows by connecting these behavioral effects to the distinct elements of PA and by developing hypotheses. The fifth part (“Experiment”) is dedicated to our experiment, describing its design and results. The sixth part (“Discussion”) discusses the general importance of our findings and elaborates on how and why they are relevant to the United States. The seventh and last part concludes.

The Theory of Proportionality and its Applications

PA is a structured judicial methodology employed in constitutional and administrative review to evaluate whether governmental actions that restrict individual rights or freedoms are justified under the circumstances (Barak, 2012; Bendor & Sela, 2015). This analytical framework systematically examines whether state interventions achieve a proper balance between legitimate governmental objectives and the degree of rights infringement imposed. The analysis typically proceeds through several sequential stages, looking at the aim of the government’s actions, the connection between the aim and the means, the availability of alternative measures, and a (normative) cost-benefit calculation.

Many scholars view PA as a rationality-based procedure (Brems & Lavrysen, 2015; Popelier & Van De Heyning, 2013), close to the reasoning of a cost-benefit analysis. Each step is designed to introduce rational scrutiny, to ensure the measure that infringes on human rights is not arbitrary or excessive. This view is reflected in the influential conceptualization of PA developed by Alexy (2000, 2003, 2014, 2017), which remains prominent in legal scholarship. However, critics argue that PA is not truly objective but rather masks moral and political decision-making behind formal legal reasoning (Greene, 2018; Webber, 2010).¹⁰ Waldron (2003) has raised similar concerns about judges who implicitly balance rights and public interests in the U.S. context, though he does not address behavioral biases in judicial decision-making.

Although the precise application and the steps taken in PA differ across jurisdictions (see, e.g., Andenas & Zleptnig, 2006), we will follow a stylized version, omitting some of the fine-grained details that comparative legal scholars discuss. In particular, we consider a two-stage model of PA for constitutional review, similar to the one used, for instance, in Germany, Canada, South Africa, and Israel (see Barak, 2012; Bendor & Sela, 2015; Cohen-Eliya & Porat, 2010; Greene, 2018; Jackson, 2004; Petersen, 2020). This model is based on the distinction between the scope of the right in question and the extent of its protection (Barak, 2012; Bulman-Pozen & Seifter, 2023). Thus, judges first consider aspects pertaining to the petitioner’s right itself (do the petitioner’s claims demonstrate that a protected right is being interfered with? If so, is it an interference with the core of the right?). Provided that an interference with the right was found, PA has to be applied to determine whether the interference was justified.

PA is composed of four elements (or “prongs”): (i) a proper purpose (if a statute is under consideration), (ii) suitability (or rational connection), (iii) necessary means (or least intrusive means), and (iv) proportionality strictu sensu (balancing).¹¹ Suitability and necessity (the second and third prongs) focus on instrumental rationality and empirical concerns, whereas a proper purpose and balancing (the first and fourth prongs) are partially normative: they “express the requirement that principles be realized to the greatest possible extent given countervailing normative concerns” (Kumm, 2007, p. 137). We discuss each prong in further detail below.

Proper Purpose

Let us turn first to the proper purpose of statutes. Deciding whether a purpose is “proper” can be challenging, as it is a value-laden component (see Schlink, 2012). Nonetheless, in some cases, it is relatively easy to satisfy the threshold requirement of a proper purpose, especially if the constitution does not explicitly restrict aims or impose special requirements (Kumm, 2007). For instance, Germany’s Basic Law does not have an explicit constitutional foundation for a “proper purpose”. The purpose has to be legitimate, but does not have to be compelling or enumerated. The German approach simply requires legality of an administrative act or general constitutionality of the purpose (Grimm, 2007; Schlink, 2012). Israel,¹² Canada,¹³ and South Africa¹⁴ have explicit foundations (see Barak, 2012). As an example from the international sphere, Article XX of the General Agreement on Tariffs and Trade and Article XIV of the General Agreement on Trade in Services of the World Trade Organization enumerate purposes allowed to be pursued by restrictive trade measures (Bartels, 2015), reflecting the idea of proper purpose as well.

Suitability and Necessity

The principle of suitability covers the question of whether the measure or the law is suitable to achieve its goal. The mode of reasoning is a ‘means-end’ relationship (see, e.g., Kretzmer, 2013). Alexy (2000) uses the following formalization: suppose that a measure M interferes with principle P₁ in order to (supposedly) promote principle P₂. If M does not actually promote P₂, then omitting it improves on P₁ at no cost. Hence, this step reflects the idea of Pareto-optimality (Alexy, 2014; Chang & Dai, 2021; Petersen, 2020; van Aaken, 2003): omitting M improves one position without detriment to the other.

Yet, as the measure applied by a sensible legislator will usually promote their aim at least to some degree, there is a next step: that of “necessity”. The necessity test covers whether there are other, less intrusive means, which are equally able to achieve the stated goal of the measure.¹⁵ This principle requires that if there are two measures, M₁ and M₂, which promote P₂ equally, then the less intrusive measure (i.e., the measure detracting least from P₁) is chosen. Again, this closely reflects Pareto-optimality (Alexy, 2000): if there exists a less intrusive measure that is equally suitable for achieving its goal, then switching to it improves on P₁ without any cost. To avoid over-reliance on intuitions, both the suitability and necessity tests would, in theory, require an empirical analysis to determine the relevant probabilities, benefits, and costs of each principle under each of the possible measures (see, e.g., Greene, 2018).
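The suitability and necessity tests can be read as a simple dominance filter over candidate measures. The following Python sketch is illustrative only: the `Measure` class, its fields, and all numeric scores are hypothetical and are not taken from Alexy or from our experiment. It assumes that the promotion of the pursued principle and the intrusion on the competing right can each be scored on a single scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """Hypothetical scoring of a public measure (illustrative values only)."""
    name: str
    promotion: float  # how much the measure advances the pursued principle
    intrusion: float  # how much the measure detracts from the competing right


def is_suitable(m: Measure) -> bool:
    # Suitability: the measure must promote its aim at least to some degree.
    return m.promotion > 0


def is_necessary(m: Measure, alternatives: list[Measure]) -> bool:
    # Necessity: no alternative is at least as effective and less intrusive.
    return not any(
        alt.promotion >= m.promotion and alt.intrusion < m.intrusion
        for alt in alternatives
    )


ban = Measure("blanket ban", promotion=0.8, intrusion=0.9)
permit = Measure("permit with conditions", promotion=0.8, intrusion=0.4)
notice = Measure("symbolic notice", promotion=0.0, intrusion=0.1)

print(is_suitable(notice))                 # False: it does not promote the aim
print(is_necessary(ban, [permit, notice]))  # False: the permit is equally effective and less intrusive
print(is_necessary(permit, [ban, notice]))  # True
```

A measure that fails either check is Pareto-dominated in the sense described above; measures that pass both still have to be balanced at the final stage.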

Proportionality Strictu Sensu or Balancing

The third principle, proportionality strictu sensu (balancing),¹⁶ captures a balance between the satisfaction of one principle and the detriment to another. To capture this balance, Alexy (2003) developed a formula that combines several elements: the weight of the colliding principles in the abstract, the weight of these principles in the concrete case, and the reliability of empirical assumptions.¹⁷ This formula establishes scales on the intensity of interference with rights (“light”, “moderate”, “serious”) and the importance of the right (e.g., the right to life has a greater abstract weight than freedom of expression, though the concrete weight has to be determined on a case-by-case basis). These scales are supposed to be justified by giving reasons in order to be contestable. Alexy’s formula assigns numerical values to each of the elements (Klatt & Meister, 2012),¹⁸ which should then enable the decision maker to determine whether there is “significant disproportionality between the marginal benefit to the government and the marginal cost to the rights bearer” (Greene, 2018, p. 59).¹⁹
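For orientation, the weight formula is commonly stated in roughly the following form. The notation is one common rendering rather than a quotation from Alexy: I stands for the intensity of interference in the concrete case, W for the abstract weight, and R for the reliability of the empirical assumptions, each for the two colliding principles P_i and P_j. The numbers in the worked example are purely illustrative assumptions, using the geometric scale often associated with the formula (light = 1, moderate = 2, serious = 4).

```latex
\[
  W_{i,j} \;=\; \frac{I_i \cdot W_i \cdot R_i}{I_j \cdot W_j \cdot R_j}
\]

% Illustrative example (assumed values): a serious interference with P_i
% (I_i = 4), a moderately important competing aim P_j (I_j = 2), equal
% abstract weights (W_i = W_j = 1), and empirical assumptions that are
% reliable for P_i (R_i = 1) but only plausible for P_j (R_j = 1/2).
\[
  W_{i,j} \;=\; \frac{4 \cdot 1 \cdot 1}{2 \cdot 1 \cdot \tfrac{1}{2}} \;=\; 4 \;>\; 1
\]
```

A quotient above 1 indicates that P_i takes precedence in the concrete case, a quotient below 1 that P_j does, and a quotient of exactly 1 indicates a stalemate.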

Differences Across Jurisdictions

While all jurisdictions that apply PA go through somewhat similar steps, there are some differences in the fine-grained details (Greene, 2018; Grimm, 2007; Kremnitzer et al., 2020; Petersen, 2017), some of which are more important than others. For instance, Canada considers whether the purpose of the act is of a “pressing and substantial concern”, whereas Germany only requires a “legitimate purpose” (Grimm, 2007, p. 388). However, Germany then considers whether the purpose is sufficiently important as part of proportionality strictu sensu, such that the final outcome of whether the act is struck down is still quite similar in both countries.²⁰ Thus, although each country takes into consideration different factors in a different order, the relevant factors can arguably be found somewhere along the prongs of PA, often leading to de facto identical decisions.²¹ In our experiment, we focus on the subjects’ stated overall conclusion of whether the law is proportional, but we also look at subjects’ responses to the questions about the steps (prongs) that precede the overall conclusion, thus making those fine-grained insights relevant for all jurisdictions, no matter how they use the prongs.

Furthermore, we submit that our study of PA with German judges offers relevant insights for American constitutional jurisprudence. At first glance, there are some methodological differences between U.S. judicial review and PA (see generally Cohen-Eliya & Porat, 2010, 2021; Greene, 2018; Lord, 2023).²² In particular, the American approach, often characterized by Dworkin’s idea of ‘rights as trumps,’ is sometimes seen as categorical—suggesting that litigants either possess rights that override governmental action or they do not (Greene, 2018, p. 65; Lord, 2023, pp. 13–14). American constitutional review also uses a seemingly different process, applying levels of scrutiny (see, e.g., Bulman-Pozen & Seifter, 2023): (i) “rational basis” (for non-fundamental, non-suspect or quasi-suspect rights) that examines only if laws rationally relate to their aims; (ii) “intermediate scrutiny” that requires important interests and substantial means-ends relationships (e.g. Klein, 1984);²³ and (iii) “strict scrutiny” (for fundamental rights) that demands compelling state interests with narrowly tailored implementation.²⁴

Nevertheless, the American approach is not fundamentally different from PA. Like PA, American constitutional analysis often requires identification of a governmental purpose (e.g., an important interest under intermediate scrutiny or a compelling interest under strict scrutiny) and an ends-means analysis (e.g., “narrow tailoring” under strict scrutiny, or a rational relationship between the challenged law and a legitimate government interest under rational-basis review). We submit that much of our discussion on framing is relevant to U.S. constitutional review as well, including the formulation of important interests.

More generally, the discussion of PA connects to current U.S. legal debate through three main points. First, prominent legal scholars have advocated for the Supreme Court to adopt PA as an alternative to its current approach, with notable articles in the Harvard Law Review (Greene, 2018), Columbia Law Review (Bulman-Pozen & Seifter, 2023), and the Yale Law Journal (Jackson, 2015) all supporting this shift. Critics of the Dobbs²⁵ decision have also suggested PA could have yielded different outcomes.

Second, some scholars argue that proportionality is already implicitly used by some Supreme Court justices (Cohen-Eliya & Porat, 2011) and increasingly applied in other contexts, including content moderation by social media platforms (Douek, 2021). More broadly, as the idea of proportionality is, to some extent, already recognized in U.S. constitutional law (Jackson, 2004),²⁶ judges may employ elements of proportionality reasoning,²⁷ making our findings applicable even without formal adoption of PA. Third, the United States is subject to international law and may be a party to litigation taking place in tribunals that implement PA explicitly. Therefore, our findings are informative for understanding whether heuristics and biases are likely to emerge within such litigation.

Biases and Heuristics Relevant to Proportionality Analysis

Legal institutions have built on particular assumptions of what may be termed ‘perfect’ rationality, roughly conforming to Rational Choice theory (see, e.g., Chapman, 1994; Jolls et al., 1998), which assumes the rationality of adjudicators and other actors applying the law (implicitly or explicitly). The rationality assumption has been called into question by cognitive psychologists and behavioral economists, such as Daniel Kahneman, Amos Tversky, and Gerd Gigerenzer.²⁸ These scholars explored systematic biases and heuristics—“mental shortcuts” or “rules of thumb” used in decision-making—that counter the rationality assumption, searching for a more realistic model of human behavior. Subsequently, the value and validity of applying a rationalist theory to legal questions has also been questioned by a movement often referred to as “behavioral law and economics” (see generally Zamir & Teichman, 2018). Behavioral law and economics focuses on systematic divergences from perfect rationality mainly using experimental studies in the lab under controlled conditions. Yet PA itself has largely been untouched by the behavioral analysis of law.

Kahneman and Tversky mainly dealt with facts and elementary logic, demonstrating that heuristics sometimes lead to errors. Kahneman (2011) differentiates between a fast and a slow system of human decision-making (so-called “dual system” theory).²⁹ The first system (System 1) is intuition, and the second (System 2) is logical thinking and reasoning. Intuitive decisions occur quickly, automatically, simultaneously, and without effort; they are associative and emotional. This system is prone to cognitive errors (see, e.g., Morewedge & Kahneman, 2010). The second system is, by contrast, slow, controlled, rule-governed, flexible, and non-emotional. It requires effort. Human beings often switch between these two systems when they have reason to do so; for instance, when they become aware of earlier failures of their own doing (Kahneman, 2011; 2013). Reliance on the two systems has mainly been tested with the Cognitive Reflection Test (“CRT”; Frederick, 2005),³⁰ which we also use in our experiment as a measure of individual proneness to fall prey to (System 1’s) intuitions. In the following paragraphs, we briefly explain the biases that might influence PA, with a special view to the type of subjects participating in our experiment.

In standard rational-choice models (“expected utility” concept), as in the weighing formula of Alexy in PA, the utility of each possible outcome is weighted by its (objective) probability. However, many experiments reveal that individuals may deviate from their expected utility, for instance, because their utility function is ‘rank-dependent’, that is, people assign different weights to different outcomes, which may diverge from the simple objective probabilities (for an overview, see Diecidue & Wakker, 2001). The most prominent version of such a function is so-called “Prospect theory” (Tversky & Kahneman, 1979; 1992).
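The contrast between the two models can be written compactly. The following is a textbook-style sketch rather than the notation of any of the cited works: a prospect yields outcomes x_1 ≤ … ≤ x_n with probabilities p_1, …, p_n, u is a utility function, and w is an increasing probability-weighting function with w(0) = 0 and w(1) = 1 (all symbols are introduced here for illustration).

```latex
% Expected utility: each outcome is weighted by its objective probability.
\[
  EU \;=\; \sum_{i=1}^{n} p_i \, u(x_i)
\]

% Rank-dependent utility: each outcome receives a decision weight \pi_i
% that depends on its rank among the outcomes.
\[
  RDU \;=\; \sum_{i=1}^{n} \pi_i \, u(x_i),
  \qquad
  \pi_i \;=\; w\Big(\sum_{j \ge i} p_j\Big) \;-\; w\Big(\sum_{j > i} p_j\Big)
\]
```

When w(p) = p, the decision weights reduce to the objective probabilities and the two expressions coincide; any other weighting function produces the divergence described above.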

In a nutshell, Prospect Theory identifies three key effects: (i) reference-dependent utility, (ii) loss aversion, and (iii) diminishing sensitivity. Let us briefly explain each in turn. The first effect implies that individuals do not think of payoffs in absolute terms, but rather compare them to a baseline, a “reference point”. Payoffs that are above the reference point are perceived as a gain, whereas those below the reference point are perceived as a loss (Kőszegi & Rabin, 2006; 2007; for a summary, see Feess & Sarel, 2022; Sarel, 2022). For example, an employee who was expecting a large bonus at the end of the year but received a small bonus gets a positive sum of money, but may still perceive it as a loss because it is less than expected.

The second effect, loss aversion, relates to how people weigh losses vs. gains: Prospect Theory predicts that individuals care more about incurring a loss than an equally sized gain. In other words, the increase in utility from a 100 Euro gain is less than the increase in utility when averting a 100 Euro loss. Most studies suggest that losses are approximately twice as powerful, psychologically, as gains (see Brown et al., 2024).

The third effect, diminishing sensitivity, describes how individuals become less sensitive to changes in value as the magnitude increases. This means that the psychological impact of an additional dollar gained or lost decreases as the total amount grows larger (in absolute terms). Prospect theory illustrates this with an S-shaped utility function that is concave in the domain of gains and convex in the domain of losses—changes close to the reference point have a large marginal effect on utility (as the slope is steeper), whereas changes far away from the reference point have a much smaller marginal effect. The most important implication of this effect is a divergence in risk attitudes across domains: individuals are risk-averse in the domain of gains but risk-seeking in the domain of losses (see, e.g., Sarel, 2022; Shefrin & Statman, 2003).³¹

Consequently, when the situation involves losses (compared to the reference point) individuals are more willing to take risky gambles and potentially incur large losses (to which they are marginally less sensitive), as long as they do not have to incur a loss with certainty.³²
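Loss aversion and the reversal of risk attitudes can be illustrated numerically. The sketch below is a minimal illustration, not part of our experimental materials: it assumes a power-shaped value function with the parameter values commonly attributed to Tversky and Kahneman (1992), namely a curvature of 0.88 and a loss-aversion coefficient of 2.25, and it deliberately ignores probability weighting. The function names and the 50/100 Euro payoffs are hypothetical.

```python
def value(x: float, alpha: float = 0.88, lam: float = 2.25) -> float:
    """Value of a payoff x measured relative to the reference point (x = 0)."""
    if x >= 0:
        return x ** alpha           # concave in the domain of gains
    return -lam * (-x) ** alpha     # convex and steeper in the domain of losses


def prospect_value(outcomes: list[tuple[float, float]]) -> float:
    """Probability-weighted value of (probability, payoff) pairs.

    Simplifying assumption: objective probabilities are used as weights.
    """
    return sum(p * value(x) for p, x in outcomes)


# Loss aversion: a 100 Euro loss looms larger than a 100 Euro gain.
print(abs(value(-100)) > value(100))   # True

# Gains: a certain 50 is preferred to a 50% chance of 100 (risk aversion).
sure_gain = prospect_value([(1.0, 50)])
risky_gain = prospect_value([(0.5, 100), (0.5, 0)])
print(sure_gain > risky_gain)          # True

# Losses: a 50% chance of losing 100 is preferred to a certain loss of 50
# (risk seeking).
sure_loss = prospect_value([(1.0, -50)])
risky_loss = prospect_value([(0.5, -100), (0.5, 0)])
print(risky_loss > sure_loss)          # True
```

Both gambles have the same expected monetary value as the certain option, so an expected-value maximizer would be indifferent in each pair; the preference reversal arises only from the curvature of the value function around the reference point.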

Prospect theory is closely related to framing (see, e.g., Rachlinski & Wistrich, 2018), as the same situation can be either framed as a gain or as a loss, subsequently affecting how people perceive the payoffs from their decisions. Unlike Rational Choice Theory, which assumes description-invariance (that is, equivalent formulations of a choice problem should give rise to the same preference order; see Arrow, 1982), framing effects imply that logically-equivalent presentations of a circumstance might nonetheless lead individuals to different choices. For instance, decisions may vary depending on whether circumstances are presented as positive or negative. Decisions about medical interventions are a typical example: A standard (rational choice) model would predict that patients would choose the most secure therapeutic method independent of how the choice is presented to them (for example, as death rates or survival rates) but framing effects imply that if a relatively safe therapeutic method is presented to a patient in terms of death rates (that is, potential loss), and an unsafe method is presented in terms of survival rate (that is, potential profit), then patients might suddenly prefer the unsafe over the safe method.³³ Interestingly, framing affects not only patients and their relatives, but also medical staff (i.e., experts; Druckman, 2004).³⁴ Thus, even expert decisions can be considerably influenced by factors that determine the manner in which a problem is presented, including the law (Rachlinski & Wistrich, 2018). In our context, this implies that adjudicators may reach different rulings, depending on whether different elements of the case are presented negatively or positively, as this can affect whether they perceive the legal outcome of the case as a gain or as a loss. These perceptions may be related either to the judge’s own ‘payoffs’ from adjudicating (for example, judges may perceive being overturned on appeal as a loss, causing them to be more careful to avoid errors)³⁵ or to the payoffs of the litigating parties (which the judge may care about). The important point is that framing can distort a rational decision, working via System 1.

There are many biases documented in the behavioral economics literature that could potentially affect PA, including the availability heuristic, representativeness heuristic, certainty effect, anchoring effects (Tversky & Kahneman, 1973, 1974, 1981),³⁶ and the hindsight bias (Guthrie, 2006, p. 432).³⁷ However, some biases seem especially likely to arise in PA due to its sequential multi-prong structure. For instance, a judge who decides that a petitioner’s claim resides within the scope of a right takes a sort of mental step in the direction of declaring the act as disproportional. The judge may then proceed down that path (irrespective of the facts) due to either a psychological need for maximal coherence³⁸ or a psychological need for consistency (for a meta-analysis, see Mullen & Monin, 2016). The judge might also perceive a deviation away from the direction of the initial step as a loss, echoing the “endowment effect” (Kahneman et al., 1990; Knetsch, 1989).³⁹ However, our experiment largely focuses on framing effects related to prospect theory, which we examine in detail in the following section.

Hypotheses Development

Behavioral Effects in Proportionality Analysis

If all human beings succumb to biases and heuristics, including—as hitherto found in research—judges and arbitrators, it is interesting to explore whether these also affect PA.⁴⁰ While judges may derive some direct utility from their PA decisions, the decision primarily affects the parties to the dispute. It is not directly obvious to which degree judges internalize the impact of their decisions on the parties: do judges care whether PA imposes a loss on the petitioner or on the defendant? There is some experimental evidence suggesting that biases such as the endowment effect may be weaker when decisions are taken by agents on behalf of third parties.⁴¹ When judges conduct a PA, their decision is not precisely “on behalf” of the parties, but the same mechanism that weakens the bias may nonetheless arise.

In previous studies on judges and arbitrators (not in the context of PA), some biases and heuristics were indeed found to be present (Franck et al., 2016; Guthrie et al., 2001; Helm et al., 2016; Rachlinski & Wistrich, 2018; Spamann & Klöhn, 2016). Subjects were specifically found to be at least somewhat susceptible to framing effects: one study found that judges reached different decisions (in an analysis unrelated to PA), depending on whether the defendant’s payoffs in a civil case were framed as a gain or a loss (Guthrie et al., 2001), but were less susceptible to these effects compared to other decision-makers (experts and laypeople). Another study found evidence of framing effects among U.S. judges in eight experiments on civil-dispute settings (product liability, contracts, bankruptcy, and others; Rachlinski & Wistrich, 2018).⁴²

Yet it is possible that a framing effect only arises for some judges and not others, due to differences in institutional, cultural, or personal attributes. For instance, unlike U.S. judges, the German administrative judges who participated in our experiment have been educated in the civil-law tradition and are used to operating under the constraints of the German court system. This might entail various biasing or debiasing mechanisms that are absent in the U.S.⁴³ As one example, a civil-law judge who is more used to methodically applying codified rules might react differently to how a case is framed compared to a common-law judge who is more used to applying case law.⁴⁴

Whether PA is more or less susceptible to behavioral biases is far from straightforward. On the one hand, the need to structure one’s thoughts in steps might operate as a debiasing device, forcing the decisionmaker to exert more cognitive effort and potentially switch from System 1 to System 2. On the other hand, each step may be subject to biases, which may even spill over from one step to another.

For instance, consider the issue of uncertainty in PA. Following Alexy, all steps of PA (except the proper purpose) are based on, or partially influenced by, factual uncertainty. Inter alia, uncertainty in PA translates into probabilities, for instance, the probability that the means chosen will achieve the purpose of the act (as part of suitability) or the probability that a less intrusive measure will be as effective (as part of necessity). Furthermore, in proportionality strictu sensu, all weights are multiplied by probabilities.

When dealing with probabilities as part of PA, several heuristics and biases discussed above may play a role. First, when the legislator claims that P₂ will be achieved with some probability of X%, this may already anchor the adjudicator to that X. Second, judges may misestimate or misperceive probabilities due to many of the aforementioned biases (availability heuristic, representativeness heuristic, and so on).⁴⁵ Third, judges may respond to how the facts or legal questions are framed. Henceforth, we will focus on the last of these, specifically on how framing connects with the effects identified by Prospect Theory (reference-dependence, loss aversion, and diminishing sensitivity).

Framing Uncertainty in PA: Reference Point and Diminishing Sensitivity

A legal case can be framed as involving either losses or gains, depending on which reference point it induces, and thereby influence whether the judge is more or less sensitive to the consequences of striking down a legal act. Specifically, judges would prefer a riskless prospect to a risky prospect of equal expected value in a gain frame, but prefer the opposite in a loss frame.

To understand how this applies to PA, consider that legal measures often involve uncertain outcomes—what we call “stochastic” effects, meaning the results involve some degree of randomness or probability rather than certainty. For instance, a security measure might reduce terrorist attacks by some unknown amount, or an environmental regulation might prevent an uncertain number of health problems. The question becomes: how do decision makers evaluate policies that convert these uncertain (stochastic) outcomes into more certain ones? We hypothesize that:

Decision makers tend to see an act as more proportional if it is framed as converting a stochastic gain into a certain gain rather than converting a stochastic loss into a certain loss, ceteris paribus.

In other words, when a policy is framed as securing uncertain benefits (like “this measure might prevent some attacks”), people are risk-averse and appreciate measures that make those benefits more certain. But when the same policy is framed in terms of uncertain losses (like “without this measure, some attacks might occur”), people become more accepting of that uncertainty—they are less motivated to eliminate the risk through government intervention. This difference in risk tolerance across gain and loss frames would cause decision makers to view the same policy as more or less proportional depending solely on how it is presented.
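The reversal of risk attitudes across frames can be illustrated with a stylized prospect-theory value function. The following is only a sketch under stated assumptions and is not part of our experiment: the power-function form, the curvature of 0.88, and the loss-aversion coefficient of 2.25 are commonly cited estimates from the prospect theory literature, probability weighting is ignored, and the stakes are arbitrary illustrative numbers.

```python
def value(x, alpha=0.88, loss_aversion=2.25):
    """Stylized value of outcome x relative to the reference point (assumed form)."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** alpha


def prospect_value(lottery):
    """Value of a lottery given as (probability, outcome) pairs.

    Probabilities enter linearly, i.e. without probability weighting (a simplification).
    """
    return sum(p * value(x) for p, x in lottery)


# Hypothetical stakes: a sure outcome versus a gamble of equal expected value.
sure_gain = [(1.0, 400)]
risky_gain = [(2 / 3, 600), (1 / 3, 0)]
sure_loss = [(1.0, -200)]
risky_loss = [(1 / 3, -600), (2 / 3, 0)]

# Concavity over gains favors the sure gain; convexity over losses favors the gamble.
prefers_sure_gain = prospect_value(sure_gain) > prospect_value(risky_gain)
prefers_risky_loss = prospect_value(risky_loss) > prospect_value(sure_loss)
print(prefers_sure_gain, prefers_risky_loss)
```

Under these assumed parameters, the same decision maker prefers certainty when outcomes are coded as gains and prefers the gamble when they are coded as losses, which is the pattern the hypothesis builds on.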

Framing the Legal Act’s Purpose: Reference Point and Loss Aversion

Whether a purpose of the challenged measure is perceived as “proper” may also depend on how it is framed. The formulation of governmental purposes can vary along several dimensions that may influence their perceived legitimacy and weight. Purposes can be presented abstractly (such as “environmental protection”) or concretely (such as “preventing 10,000 premature deaths annually”). They can focus on collective benefits or highlight the plurality of individuals who would gain from the measure. Additionally, purposes framed as addressing highly salient concerns like national security may naturally receive more weight from decision-makers.

Most relevant to our study is how purposes can be framed in terms of gains versus losses. The same governmental objective can be stated either as avoiding a negative outcome (highlighting what would be lost in the absence of the measure) or as securing a positive benefit (highlighting what would be gained from the measure). For instance, an anti-terrorism measure can be framed either as “averting losses from terrorist attacks” (loss frame) or as “promoting a more secure society” (gain frame). Similarly, a public health intervention can be presented as “avoiding the dangers of poor health and death” (loss frame) or as “promoting healthy lifestyles” (gain frame). Such framings can potentially affect the judges’ reference point, which again determines whether the outcomes are perceived as a loss or a gain.

Given people’s tendency toward loss aversion—caring more about preventing losses than achieving equivalent gains—we expect that purposes framed as preventing negative outcomes will seem more compelling and legitimate than those framed as achieving positive outcomes. We therefore hypothesize:

Decision-makers tend to see an act as more proportional if it is framed such that its purpose is to prevent a loss rather than to achieve a gain, ceteris paribus.

Individual Susceptibility to Framing: the Effect of Legal Training

In addition to the general influences of behavioral effects, our study also focuses on the role of experience and expertise. From an economic standpoint, specialization entails the exploitation of comparative advantages: those who become legal experts can use their time more effectively and incur lower effort costs while working on legal questions (see, e.g., Rachlinski et al., 2007). The traditional perspective in common law views legal expertise more as a superior ability to conduct analogical reasoning by “recognizing a similarity between the facts of some previous case and the facts of the instant case” (Schauer & Spellman, 2017, p. 249). Thus, the idea is that lawyers and judges are able to engage in analogical reasoning that differs from that of laymen. Schauer and Spellman (2017, p. 261) argue, however, that lawyers are not experts in analogical reasoning as such, but rather that they possess an ability to “see analogies that others do not and […] see structural and relational similarities (and differences) when others see only surface similarities and differences.” In other words, the expertise of lawyers lies in their ability to retrieve relevant sources of comparison within the legal domain and identify similarities (Schauer & Spellman, 2017, p. 263). This ability is generated through “immersion in legal categories – through study or practice or both” (Schauer & Spellman, 2017, p. 264).

But how does such legal expertise affect the susceptibility to cognitive biases? Does the ability to identify similarities mean that legal experts would be able to distinguish the facts of the case from how they are framed? Given the aforementioned findings in the literature that legal experts might be less susceptible to framing effects, we hypothesize that:

Legal expertise mitigates the framing effects.

Specifically, we expect that framing effects will be strongest among non-law students, weaker among law students, and even weaker (or non-existent) among judges. This reflects the notion that expertise, knowledge, and experience should help avoid the effects of biases due to framing.⁴⁶ To be clear, we do not claim that this reflects a cardinal scale: the difference in expertise between law students and judges may be drastically larger than the difference between law students and non-law students. We only assume that there is an ordinal ranking of expertise, such that judges have more expertise than law students, who in turn have more expertise than non-law students.

Experiment

We designed an online experiment to test our four hypotheses. Sub-part 5.1. describes the experimental design. Sub-part 5.2. outlines our procedures. Sub-part 5.3. presents descriptive statistics and our findings.

Experimental Design

Our experiment presents subjects with three vignettes, each containing a brief legal case, followed by a series of questions that capture the various prongs of PA. We developed two versions for each vignette that differ only in framing, each tailored to test (at least) one of the possible biases and heuristics reviewed above. As subjects only see one version of each vignette, we can attribute any differences in the decisions to a framing effect. Furthermore, all three vignettes are purposefully built around novel problems (mostly dealing with new technologies) to avoid a situation where a subject’s familiarity with some existing cases would potentially affect their decision.

While the choice to use novel problems may somewhat detract from the vignette’s realism (as subjects have not faced such scenarios before), we submit that this is irrelevant for our study, for multiple reasons. First, the exact degree of realism is generally irrelevant because it is completely orthogonal to our framing treatments (for a discussion of how orthogonality in vignettes ensures validity, see Su & Steiner, 2020). In other words, as the degree of realism remains fixed across our treatments—which differ only in framing—it does not matter how realistic the underlying scenario is. Second, many of the existing experiments on framing include scenarios that lack realism,⁴⁷ and our vignettes do not raise any unusual difficulties in that regard.⁴⁸ Third, what may seem implausible in other countries, may not be implausible in Germany or Europe.⁴⁹ The vignettes are presented to subjects in a randomized order, so that we can rule out any interference of ‘order effects’, but for ease of presentation, we will provide numbers for the vignettes.
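The between-subject logic described above (one version per vignette, randomized presentation order) can be sketched as follows. This is a hypothetical illustration only: the vignette labels, the version names "A" and "B", the seeding scheme, and the use of simple (unblocked) randomization are assumptions and do not describe the actual survey-software configuration.

```python
import random

VIGNETTES = ["crypto", "bees", "additional"]  # illustrative labels
VERSIONS = ["A", "B"]                         # the two framings of each vignette


def assign(subject_id, seed=0):
    """Draw one version per vignette and a random presentation order for one subject."""
    rng = random.Random(f"{seed}-{subject_id}")  # reproducible per-subject stream
    versions = {v: rng.choice(VERSIONS) for v in VIGNETTES}
    order = rng.sample(VIGNETTES, k=len(VIGNETTES))
    return {"subject": subject_id, "versions": versions, "order": order}


plan = [assign(s) for s in range(5)]
for row in plan:
    print(row)
```

Because each subject sees exactly one version of each vignette, differences between versions are identified from comparisons across subjects, and the shuffled order prevents any fixed position from coinciding with a particular vignette.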

The vignettes are all written in German, but a full-text English translation is provided in our supplemental materials.

Vignette 1: Cryptocurrencies. Our first vignette tests our H1 (recall: this hypothesis deals with diminishing sensitivity and risk attitudes, that is, with the framing of the consequences of the disputed law as either a certain loss or a certain gain). Subjects are presented with a scenario where a fictitious state requires individuals who trade in cryptocurrencies to acquire a “laymen’s certificate” as a pre-condition for selling. The claim of the state in this scenario is that this certificate helps sellers to avoid scams that otherwise would cause some people to lose their money. Our treatment then lies in the framing of the state of the world with this act versus without the act (that is, if it is nullified by the court). The treatment closely follows the approach of Tversky and Kahneman (1981) in their seminal article on the framing of decisions (see also Rachlinski & Wistrich, 2018). In their original article, subjects had to choose between two options that yield the same expected outcome, which is framed either as a gain (lives saved) or as a loss (deaths), where one option is stochastic whereas the other entails certainty. Following the same logic, our vignettes say that there are 600,000 market participants, and differ only in the description of the two options faced by the subjects. For instance, the gain frame says that:

“(a) Without the act that requires a laymen certificate, it is estimated that with 1/3 probability no market participant will keep their money, and with 2/3 probability all of the market participants will keep their money.

(b) With the act that requires a laymen certificate, 400,000 market participants will keep their money.”

Whereas the loss frame says that:

“(a) Without the act that requires a laymen certificate, it is estimated that with 1/3 probability all of the market participants will lose their money, and with 2/3 probability none of the market participants will lose their money.

(b) With the act that requires a laymen certificate, 200,000 market participants will lose their money.”

Note that the only difference is in the use of “keep” vs. “lose”, but numerically both versions are identical. Following H1, we expect that in the loss-frame treatment, subjects will tend to avoid option (b) because it refers to a certain loss, and prefer the stochastic option (a). Conversely, the opposite should occur in the gain-frame treatment.
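The numerical equivalence of the two frames can be checked directly from the figures quoted in the vignette. The short sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

N = 600_000  # market participants stated in the vignette

# Gain frame: expected number of participants who keep their money.
keep_without_act = Fraction(1, 3) * 0 + Fraction(2, 3) * N  # stochastic option (a)
keep_with_act = 400_000                                     # certain option (b)

# Loss frame: expected number of participants who lose their money.
lose_without_act = Fraction(1, 3) * N + Fraction(2, 3) * 0  # stochastic option (a)
lose_with_act = 200_000                                     # certain option (b)

# Within each frame, options (a) and (b) have the same expected value.
same_within_gain = keep_without_act == keep_with_act
same_within_loss = lose_without_act == lose_with_act

# Across frames, "keep" and "lose" describe the same state of the world.
same_across_frames = (N - lose_with_act == keep_with_act) and (
    N - lose_without_act == keep_without_act
)
print(same_within_gain, same_within_loss, same_across_frames)
```

Any difference in responses between the two versions therefore cannot be attributed to a difference in expected outcomes.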

Vignette 2: Bees. Our second vignette tests our H2 (recall: this hypothesis deals with the framing of the purpose of the act as preventing a loss or achieving a gain). Subjects are presented with a scenario where a fictitious state mandates the use of a special spray in gardens of private residences (house or apartment). In the gain frame treatment, the purpose of the act is framed as aiming to

“protect the life of bees and thus lead to a healthy and sustainable environment”,

whereas the loss frame treatment says instead that the purpose of the act is to

“prevent the death of bees and thus avoid an unhealthy and unsustainable environment”.

Thus, the only difference is in whether the purpose of the law is framed as acquiring a gain or preventing a loss. Following our H2, we expect that subjects will be more likely to classify the act as proportional in the loss frame treatment.

To avoid confusion, note the following key difference between Vignette 1 and Vignette 2: in the first vignette, the loss-frame treatment refers to a legal act that may induce a certain loss (instead of a stochastic one) whereas in the second vignette, the loss-frame treatment refers to a legal act that may avert a loss. That is why we anticipate judges in a loss-frame treatment to be more supportive of the legal act in Vignette 2 but less supportive in Vignette 1.

PA prongs. To facilitate a full-blown PA, we elicit subjects’ agreement (on a 7-point Likert scale) with the following statements:⁵⁰

1. The act meets the scope of the right.⁵¹

2. The act meets the core of the right.

3. The act promotes a proper purpose.

4. The act is suitable to achieve its purpose.

5. The act is necessary to achieve its purpose.⁵²

6. The act is proportional in the strict sense.⁵³

7. The act is overall proportional.

Our main variable of interest is the seventh item—an overall judgment of proportionality. However, we also measure the other prongs to get a better understanding of the mechanisms leading subjects to their judgment. Additional measurements are described below.

Procedures

Our experiment was conducted separately on two samples: students (law/non-law) and judges. For the student sample, we imposed a minimum sample size requirement to ensure sufficient statistical power. For the judges, we simply ran the experiment on all those who participated. Our supplemental materials contain a brief power analysis, showing that our sample is of sufficient size to detect even fairly small effects.

We ran the experiment first with the student sample, which included law students and non-law students. The experiment was conducted online during September and October 2021 using the standard software Qualtrics.⁵⁴ Subjects were invited to participate directly through the social science lab at the University of Hamburg, restricting the invitation to students only, and imposing a comparable group size of law students and non-law students. We asked subjects to complete the survey in one go and most answers (over 90%) were completed within 1 hour from the beginning of the survey.⁵⁵ Subjects were paid a fixed fee of 9 EUR for their participation. After completing a consent form, subjects proceeded to answer the three vignettes (our vignettes 1 and 2, and the additional vignette described in Appendix B, in a randomized order, as mentioned). Thereafter, subjects were asked to complete the aforementioned CRT⁵⁶ (in which the items were again presented in random order) and then filled out a brief questionnaire. The questionnaire measured variables such as basic demographics (for example, gender, age-cohort, mother tongue, education; see our supplemental materials for full details), whether the subject had previous acquaintance with behavioral economics, and the subject’s general tendency to take risks, elicited on a 10-point scale as a measure of risk aversion (Dohmen et al., 2010). In total, 110 law students and 121 non-law students participated.

Thereafter, we ran the experiment with German administrative judges. Procedures were similar to those used for the student sample, with a few exceptions: First, judges were invited to participate through the President of the highest administrative court of the state of Lower Saxony (Niedersachsen), in conjunction with an invitation to a judicial workshop, which took place in June 2022. All judges were from the same federal state within Germany: Lower Saxony. Second, for obvious reasons, judges did not receive any payment for their answers. Third, judges were asked a few additional questions related to their experience on the bench and several questions that tested whether they fall prey to biases outside of their domain of expertise. We describe these additional questions in more detail in the sub-part on “Are Judges Susceptible to Behavioral Effects Outside Their Domain of Expertise?” below. We also introduced minor changes for practical reasons.⁵⁷ Importantly, all of the additional questions were asked only after the vignette study was completed, so they could not confound our results. Answers from judges were gathered in the months leading up to the judicial workshop (February – April 2022), yielding 86 valid responses.⁵⁸

Descriptive Statistics and Findings

Our sample consists of 317 subjects: 121 non-law students, 110 law students, and 86 judges. Table 1 compares descriptive statistics between the three groups. We list the variables we compare in the first column and specify the p-value in the last column. The table delivers three main insights. First, law and non-law students in our sample do not differ much in their general attributes. Second, judges are older and more risk-averse. The share of females is also smaller among judges compared to our student sample. Third, there are no significant differences in performance in the CRT, but there are differences in prior knowledge of behavioral economics and whether German is the subject’s mother tongue. Table A1 in Appendix A further provides a breakdown of descriptive statistics across our framing treatments in order to check whether randomization worked well. The table reveals that the subjects’ features are overall well-balanced across treatments (cross-treatment differences are almost all insignificant at the 5% level),⁵⁹ suggesting that randomization indeed worked quite well. In any case, we account for any remaining differences later in our analysis by using control variables in our regressions.

Table 1.

Comparison of Descriptive Statistics

Factor	Non-law students	Law students	Judges	p-value
Number of subjects	121	110	86
Female, mean (SD)	0.744 (0.438)	0.709 (0.456)	0.581 (0.496)	.038
Age, mean (SD)	28.331 (5.087)	26.627 (3.771)	46.047 (10.178)	<.001
Risk aversion, mean (SD)	5.686 (1.962)	5.864 (1.879)	7.070 (1.532)	<.001
CRT: Correct ans, mean (SD)	1.868 (1.103)	1.80 (1.056)	2.023 (0.982)	.33
Knowledge of behavioral economics, mean (SD)	0.124 (0.331)	0.045 (0.209)	0.140 (0.349)	.054
German mother tongue, mean (SD)	0.727 (0.447)	0.909 (0.289)	1.00 (0.000)	<.001

Note. This table compares the descriptive statistics between the three subject groups. We used the following statistical tests: Pearson’s $χ^{2}$ was used to compare our binary variables (Female, Knowledge of Behav. Econ., German Mother Tongue); ANOVA was used for comparing the Age and the mean of the CRT. Results are qualitatively the same when using non-parametric tests.

Proportionality Assessment

Figure 1 compares the means of subjects’ choices for Overall Proportionality in the two main vignettes across the subject groups (for our additional vignette, see Appendix B). Table 2 complements the relevant information for this comparison, presenting a statistical comparison of all seven PA items.⁶⁰ This comparison yields several insights.

Figure 1.

Comparison of Mean Agreement Rates with Overall Proportionality. Note. This figure compares the mean rate of agreement with the statement that the act is overall proportional. The dark gray bars correspond to the loss frames, whereas the light gray bars correspond to the gain frames. The lines on top of the bars represent 95% confidence intervals

Table 2.

Descriptive Statistics of the Different Prongs of PA

	Non-law students			Law students			Judges
	Gain frame	Loss frame	p	Gain frame	Loss frame	p	Gain frame	Loss frame	p
Panel A: Crypto vignette
N	68	53		46	64		40	46
Scope of the right	5 (3, 6)	5 (3, 5)	.38	5 (3, 6)	5 (3, 6)	.66	6 (5, 7)	5.5 (3, 6)	.008
Core of the right	4 (3, 5)	3 (3, 5)	.12	3 (2, 5)	4 (3, 6)	.14	3 (2, 4)	3 (2, 5)	.92
Proper purpose	5 (4, 6)	6 (5, 6)	.015	6 (6, 7)	6 (5, 7)	.58	6 (6, 7)	6 (5, 6)	.042
Suitability	4 (3, 5)	5 (4, 6)	.002	6 (4, 7)	6 (5, 7)	.58	4 (2, 6)	5.5 (4, 6)	.008
Necessity	4 (3, 5)	5 (3, 6)	.086	4 (2, 5)	4 (3, 6)	.019	4 (2, 5)	5 (3, 5)	.034
Proportionality Strictu Sensu	4 (3, 5)	5 (4, 6)	.006	3 (2, 5)	4 (3, 5)	.052	3 (2, 5)	4.5 (3, 5)	.080
Overall proportionality	4 (3, 5)	5 (4, 6)	<.001	3 (2, 5)	4 (3, 5)	.023	3 (2, 5)	4 (3, 5)	.093
Panel B: Bees vignette
N	65	56		49	61		50	36
Scope of the right	5 (4, 6)	5 (3, 5.5)	.098	6 (6, 7)	6 (6, 7)	.44	7 (6, 7)	6 (6, 7)	.094
Core of the right	4 (3, 5)	4 (3, 5)	.66	5 (3, 6)	5 (3, 6)	.77	5 (3, 6)	3 (2, 5.5)	.078
Proper purpose	6 (5, 7)	6.5 (6, 7)	.056	7 (6, 7)	7 (6, 7)	.97	6 (6, 7)	6 (6, 6)	.28
Suitability	5 (4, 6)	6 (5, 6.5)	.010	6 (6, 7)	6 (6, 7)	.81	6 (6, 7)	6 (5, 6)	.05
Necessity	4 (3, 5)	5 (4, 6)	.002	5 (3, 6)	5 (3, 6)	.55	3.5 (3, 5)	3.5 (3, 5)	.89
Proportionality Strictu Sensu	5 (4, 6)	5 (4, 7)	.016	5 (4, 6)	5 (3, 6)	.36	4 (3, 5)	3.5 (3, 5.5)	.77
Overall proportionality	5 (3, 6)	6 (5, 6)	.003	5 (4, 6)	5 (3, 6)	.58	3 (3, 5)	4 (3, 5)	.58

Note. This table compares the agreement rates with the seven PA prongs. For each prong, the median (IQR) is presented for each of the treatments, separated by group. Within each group, a between-treatment comparison is conducted using a Wilcoxon rank-sum test, yielding the p-value listed in the table. Note that a higher IQR corresponds to the group whose values are (significantly or insignificantly, depending on the p-value) higher.
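For readers unfamiliar with the test named in the note, the following is a minimal, standard-library sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with midranks and a tie correction. It is an illustration under stated assumptions, not the routine used to produce Table 2: the function name and the Likert responses are hypothetical, no continuity correction is applied, and statistical packages may report slightly different p-values.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided rank-sum test (normal approximation, midranks, tie-corrected variance).

    Returns (U statistic for x, z score, p-value). Undefined if all values are equal.
    """
    x, y = list(x), list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x + y)

    # Midrank of each distinct value: the average of the positions it occupies.
    midrank, position = {}, 0
    for v in sorted(counts):
        t = counts[v]
        midrank[v] = position + (t + 1) / 2
        position += t

    r1 = sum(midrank[v] for v in x)        # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2            # Mann-Whitney U for the first sample
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in counts.values()) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return u1, z, p


# Hypothetical 7-point Likert responses for two framing treatments.
gain_frame = [3, 4, 4, 5, 2, 3]
loss_frame = [5, 4, 6, 5, 4, 6]
print(rank_sum_test(gain_frame, loss_frame))
```

A rank-based test is a natural choice for ordinal Likert responses because it compares the ordering of answers across treatments without assuming equal spacing between scale points.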

First, in the crypto vignette, both law and non-law students express a higher agreement rate with the statement that the act is overall proportional in the gain-frame treatment (light gray bars) compared to the loss-frame treatment (dark gray bars). This difference is statistically significant at the 1% level in the non-law group and at the 5% level in the law group.⁶¹ Judges demonstrate the opposite trend, but the difference becomes insignificant once control variables are accounted for (see below). Thus, for the students, but not the judges, the behavior is in line with our H1: the loss frame induces subjects to see the act as less proportional when it is framed as converting a stochastic loss to a certain loss.

Second, in the bees vignette, we find stark differences: we find a framing effect for non-law students, but no effect for law students or judges. Hence, our H2 seems to hold for non-law students, but not for law students and judges. The direction of the effect on the non-law students is as expected: these subjects tend to view the act as more proportional when its purpose is framed as preventing a loss. Turning to Table 2 reveals that, interestingly, the framing affected not only the proper purpose prong but almost all prongs: in the loss-frame treatment, non-law subjects were also more likely to view the act as meeting the scope of the right, and as unsuitable, unnecessary, and unbalanced. Furthermore, some of the judges’ prongs seem to be affected—those referring to the scope of the right and suitability (in the same direction as non-law students) and the prong concerning the core of the right (for which non-law students were unaffected).

As a robustness check, we also run linear (OLS) regressions, which enable us to compare the groups while also controlling for underlying differences between the subjects (for example, their previous knowledge of behavioral economics). Our regression model is then

$$
\mathit{propoverall}_{i} = \beta_{0} + \beta_{1,i}\,\mathit{Treatment}_{i} + \beta_{2}\,\mathit{Lawstud} + \beta_{3}\,\mathit{Judge} + \beta_{4,i}\,(\mathit{Treatment}_{i} \times \mathit{Lawstud}) + \beta_{5,i}\,(\mathit{Treatment}_{i} \times \mathit{Judge}) + \beta_{x,i}^{'} X + \epsilon,
$$

where $i \in \{1, 2\}$ is an indicator for the vignette,⁶² $\mathit{propoverall}$ is the measure of overall proportionality (same as the one used in Figure 1), $\mathit{Treatment}$ is the framing condition (depending on the vignette), and $\mathit{Lawstud}$ and $\mathit{Judge}$ are dummy variables assigning 1 for the law-student group and the judges group, respectively. The interaction terms capture whether the treatment effect differs across the three groups (law vs. non-law vs. judges), $X$ is a vector of control variables, and $\epsilon$ is the error term. Importantly, we run separate regressions, one for each vignette, because the vignettes do not actually share any common treatment (that is, the element that we frame differently is not the same across vignettes). Thus, we cannot, for instance, use a three-way interaction. Instead, we treat each vignette as an independent task.
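As a reading aid, and assuming that the treatment is coded as a binary indicator (the coding is an assumption here), the effect of the treatment implied by the specification above for each group in vignette $i$ follows directly from the coefficients:

$$
\frac{\Delta\,\mathit{propoverall}_{i}}{\Delta\,\mathit{Treatment}_{i}} = \beta_{1,i} + \beta_{4,i}\,\mathit{Lawstud} + \beta_{5,i}\,\mathit{Judge}
\quad\Longrightarrow\quad
\begin{cases}
\beta_{1,i} & \text{non-law students } (\mathit{Lawstud} = \mathit{Judge} = 0),\\
\beta_{1,i} + \beta_{4,i} & \text{law students } (\mathit{Lawstud} = 1),\\
\beta_{1,i} + \beta_{5,i} & \text{judges } (\mathit{Judge} = 1).
\end{cases}
$$

In this reading, $\beta_{4,i}$ and $\beta_{5,i}$ measure how much the framing effect for law students and judges, respectively, departs from the effect for non-law students.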

For the sake of brevity, we focus on the treatment’s average marginal effects, which capture how they affect the subjects on average, after accounting for the fact that each group may respond differently (see, e.g., Bea & Poppe, 2021). The results are provided in Table 3 (the OLS coefficients used to calculate the marginal effects are provided separately as Table A2 in Appendix A). Importantly, the average marginal effects are derived from regressions that include interaction terms. Panel A of the table lists the marginal effects for each vignette and subject group, whereas Panel B provides pairwise comparisons of these effects.⁶³

Table 3.

Effect of Framing (Average Marginal Effects)

	(1)	(2)
	Crypto gain frame	Bees gain frame
Panel A: Treatment effect
Non-law students	0.76^*** (0.25)	−0.98^*** (0.27)
Law students	0.84^** (0.33)	0.12 (0.37)
Judges	−0.53 (0.34)	−0.18 (0.34)
Controls	Yes	Yes
Observations	317	317
Panel B: Pairwise comparisons
Law stud. versus Non-law stud.	0.077	1.10^**
Judges versus Non-Law stud.	−1.29^***	0.79^*
Judges versus Law stud.	−1.37^***	−0.30

Note. This table presents (average) marginal effects of the framing treatment on each of the three subject groups: non-law students, law students, and judges. Robust standard errors are in parentheses. OLS coefficients can be found in Table A2 in Appendix A. Control variables are: Female dummy, age, German mother tongue, Knowledge of Behav. Econ., CRT: Correct answers, and order effects. *p < .1; **p < .05; ***p < .01.

Table 3 suggests that our findings are robust to the inclusion of controls (that is, even after we control for differences between subjects, the results persist). Starting from the crypto vignette (column 1), the effect of being assigned to the gain frame is positive and significant for both non-law (p < .01) and law (p < .05) students, but insignificant for judges. Panel B complements the information by showing that judges indeed differ from the students (as they are unaffected whereas students are affected) but also that law and non-law students are equally affected (the difference of 0.077 is insignificant). Next, in the bees vignette (column 2), the effect on non-law students is negative and significant (p < .001) but the effect on judges and law students is insignificant. Summing up, our OLS regressions reinforce all of the aforementioned findings.
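The relationship between the two panels of Table 3 can be made explicit: each pairwise comparison in Panel B is the difference between two group-specific marginal effects in Panel A. The sketch below recomputes these differences from the rounded values printed in the table; the dictionary keys and function name are illustrative, and because the inputs are rounded, the results may deviate from Panel B in the last digit. The significance levels in Panel B depend on standard errors of the differences, which are not reproduced here.

```python
# Average marginal effects as printed in Table 3, Panel A (rounded to two decimals).
effects = {
    "crypto": {"non_law": 0.76, "law": 0.84, "judges": -0.53},
    "bees": {"non_law": -0.98, "law": 0.12, "judges": -0.18},
}


def pairwise(e):
    """Differences between group-specific marginal effects within one vignette."""
    return {
        "law_vs_non_law": e["law"] - e["non_law"],
        "judges_vs_non_law": e["judges"] - e["non_law"],
        "judges_vs_law": e["judges"] - e["law"],
    }


for vignette, e in effects.items():
    print(vignette, {k: round(v, 2) for k, v in pairwise(e).items()})
```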

Are Judges Susceptible to Behavioral Effects Outside Their Domain of Expertise?

As mentioned, we also set out to measure whether the German administrative judges, who are experts on PA, fall prey to biases outside their domain of expertise. This measurement included three tasks, all of which were presented after all vignettes had been completed. Thus, these measurements could not confound our results.

The first task was the CRT, which we also presented to the other subjects in our sample. Recall from Table 1 that judges answered approximately 2 (out of 3) questions of the CRT correctly, which means they are at least somewhat susceptible to a behavioral bias (namely, they choose, on average, one answer that is incorrect). The second task relates to overconfidence in judicial decision-making outside the experiment and, therefore, could be presented to judges but not to the other subjects. The behavioral economics literature reveals that people are generally prone to overoptimism and overconfidence (Guthrie et al., 2001). In particular, people tend to believe that they are relatively better than others (a “better than average” effect; Alicke, 2005). The potential for overconfidence can have important implications for judges, as it could “prevent judges from maintaining an awareness of their limitations . . . [and] may make it hard for judges to recognize that they can and do make mistakes” (Guthrie et al., 2001, p. 811).

To measure overconfidence, we asked the judges to rate their relative performance across four judicial parameters: the rate of being overturned (OCapprate), their ability to assess witnesses (OCwitness), their procedural efficiency (OCefficiency), and the degree of justice that their decisions yield (OCjustice). Specifically, we asked them to place themselves in one of four quartiles: (1) the top 25%, (2) the second quartile (25%–50%), (3) the third quartile (50%–75%), or (4) the bottom 25%. The answers given by the judges are depicted in Figure 2. The figure shows that most judges placed their ability in the second quartile and over 13% placed themselves in the highest quartile (exact numbers are provided in Table A3 in Appendix A). Of course, by definition, only 50% of judges can actually belong in the two upper quartiles. Hence, this pattern points to overconfidence.

Figure 2.

Overconfidence Among Judges

To provide more context as to the degree of overconfidence, Tables A4 and A5 in Appendix A compare the judges in our experiment with other comparable subjects in the existing literature, such as international arbitrators and U.S. administrative law judges, in terms of performance in the CRT and overconfidence measures. These tables reveal that the overconfidence trend among the German judges seems overall similar to previous experiments, where the only conspicuous difference is that German judges tend to classify themselves as belonging to the second quartile (and less so to the first quartile) more often. Thus, although they are overconfident overall, they are more modest than international arbitrators and U.S. administrative law judges.

Lastly, we presented judges with a classical problem in behavioral economics: the “Linda problem” (Tversky & Kahneman, 1983). In a nutshell, subjects are presented with a brief description of a female student who was active in social causes during her studies and was also outspoken and very bright.⁶⁴ Subjects then need to evaluate what this person is more likely to be in the future. Among the options is one case that “sounds” correct because it is representative (for example, the woman will be a bank teller who is also active in the feminist movement) but actually captures a subset of another statement (for example, “Linda works at the bank and does activity X” is a subset of “Linda works at the bank”). In our experiment, subjects received eight statements about the woman and were asked to rank them from most likely to least likely.⁶⁵ Subjects who mark the special case as more likely than the general case in this task fall prey to the so-called “conjunction fallacy” (mistakenly thinking that the co-occurrence of two events is more likely than the occurrence of only one of them). Such a fallacy occurs because of a mental shortcut known as the “representativeness heuristic”: a tendency to make decisions based on a prototype or stereotype that seems representative, while ignoring statistics.
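The classification rule described above follows from the fact that a conjunction can never be more probable than either of its parts, P(A and B) ≤ P(A). A minimal sketch of how a single ranking could be coded is given below; the statement labels and the helper name are hypothetical illustrations, and the study's eight statements are not reproduced here.

```python
def commits_conjunction_fallacy(ranking, general, special):
    """ranking: statements ordered from most likely to least likely.

    Returns True if the conjunction (special case) is ranked as more
    likely than the single statement it implies (general case)."""
    return ranking.index(special) < ranking.index(general)


# Hypothetical labels, used only for illustration.
GENERAL = "bank teller"
SPECIAL = "bank teller and active in the feminist movement"

ranking = [
    "active in the feminist movement",
    SPECIAL,
    "teacher",
    GENERAL,
]

print(commits_conjunction_fallacy(ranking, GENERAL, SPECIAL))  # True
```

Only the relative position of the two linked statements matters for this check; the placement of the remaining statements is irrelevant.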

The judges in our experiment exhibited the conjunction fallacy to a great extent, with 79.2% falling prey to it. Again, to put this in context, one study found that 92% of arbitrators fall prey to the conjunction fallacy (Helm et al., 2016). However, another (earlier) study found that less than 42% of German students demonstrated the fallacy (Fiedler, 1988). In comparison, judges in our experiment demonstrate a moderate susceptibility to the conjunction fallacy. Nonetheless, the key insight is that we do find evidence of susceptibility to overconfidence and the conjunction fallacy outside of the judges’ domain of expertise.

Discussion

General

In this study, we set out to check whether PA, which is usually considered a rationality-based process by its proponents (Alexy, 2017; Brems & Lavrysen, 2015; Möller, 2012; Popelier & Van De Heyning, 2013), is influenced by the biases and heuristics established in the behavioral economics literature. Our experiment focused on framing effects, where we let subjects conduct a PA in three vignettes (the two main vignettes and the additional vignette described in Appendix B) in which we manipulated the framing of the case. We then compared the answers of our three groups of subjects: non-law students, law students, and judges.

The findings overall support the conjecture that framing effects can play a role in PA. In particular, we find strong evidence of a framing effect when the act was framed as inducing a certain loss rather than a certain gain. This effect was prevalent both for those with some degree of legal expertise (law students) and for laymen (non-law students), but not for judges. Yet, in our bee vignette, neither judges nor law students were influenced by the framing, whereas the non-law students were. This seems to suggest that legal expertise can mitigate (or even eliminate) framing effects in some contexts. The diminished framing effects among those with legal expertise are consistent with the findings of previous experiments that measured framing in other contexts (that is, not on PA; see, e.g., Broude & Levy, 2019; Mizrahi, 2018; Shereshevsky & Noah, 2017). Thus, we show that this insight extends to PA as well, which is especially interesting since one of the criticisms of PA has been that it opens a gate to subjective evaluations by the decision-maker (see, e.g., Gunn, 2005; Kaplow, 2019). We can thus show that certain behavioral biases can affect PA as a discursive tool for balancing decisions, but also that legal expertise can mitigate these biases. The fact that judges are found to be less biased is “good news” for the proponents of PA, as it suggests that framing is less likely to influence the judgments in actual court cases. For further insights arising from the additional vignette, see our Appendix B.

At the same time, the findings also suggest that the legal expertise of judges does not mitigate biases outside the domain of expertise. First, judges fared only slightly better than the students on the CRT, meaning they tended to answer intuitively and wrongly. This is in line with other findings on judges and international arbitrators (see Table A4 in Appendix A). Second, the classical conjunction fallacy was found among judges when tested in a neutral, non-judicial setting. Third, judges were overconfident in their abilities as judges on several variables. The latter finding is in line with other studies administered to U.S. judges and international arbitrators on overconfidence in their ability to assess witness credibility, make quality decisions, provide parties with procedural efficiency, and avoid being challenged on appeal (see Table A5 in Appendix A). The judges in our experiment demonstrated overconfidence, especially in whether they delivered justice, but to a lesser degree: they put themselves predominantly in the second highest quartile and not in the highest, as international arbitrators and U.S. judges did in previous studies.

Application to American Balancing

Recall that several existing studies indicated that U.S. judges do exhibit susceptibility to framing effects, but in experiments conducted only in non-administrative and non-constitutional legal contexts.⁶⁶ Thus, we do not know whether U.S. judges are susceptible to framing in a setting similar to our study. The uncertainty here is twofold: First, U.S. judges may be inherently different from German judges (e.g., due to differences in legal tradition or training) and may therefore differ in susceptibility. Second, the different structure of U.S. constitutional review, which focuses on categories and tiered scrutiny, may make any judge either more or less susceptible to biases compared to PA. Yet, our findings are still informative, with necessary modifications, for American judicial review despite these differences. Recall that scholars have noted at least three reasons why the U.S. should care: a growing call to adopt PA in the U.S., implicit application of PA-like reasoning (balancing) in existing court cases, and the application of PA in international law cases. Furthermore, given the claims that PA and U.S.-style review do not yield substantially different results anyway, the difference may be one of degree rather than kind.

In particular, we submit that our two main vignettes are also of direct relevance to the United States. First, the uncertainty element (as in the cryptocurrency vignette) and the idea of diminishing sensitivity are equally present in U.S.-style balancing. It is possible, for instance, that if upholding the act were framed as preventing a certain loss, judges would be more prone to applying a more lenient review standard (for example, rational basis instead of intermediate review) or that judges would apply the standard itself differently (for example, concluding that even under strict scrutiny, the law should be upheld).⁶⁷ Second, purpose framing (bee vignette) has particular significance, as scrutiny levels in the U.S. depend on legislative purpose.

Limitations

While our study arguably provides an important contribution, it is also subject to some limitations. First, we cannot formally distinguish between legal training and (self-)selection into the legal profession in general and into the judiciary in particular. Although law students and judges may, in theory, hold some “special traits” that make them less susceptible to framing compared to non-law students, those special traits would only be relevant under very specific conditions. In particular, such traits would have to explain why judges differ from the general pool of law students (some of whom would likely become judges) in a way that happens to be correlated with susceptibility to framing. A much simpler explanation for our finding, following the principle of Occam’s Razor, would be that legal expertise and experience do mitigate the effect of framing. Namely, because German administrative judges must engage in proportionality analysis on a regular basis, they seem highly likely to face different frames of similar facts in their line of work. Moreover, lawyers presumably try to present the version most consistent with their client’s benefit using, inter alia, framing. Thus, a good judge needs to “filter out” the description and focus on the case in order to avoid mistakes. This experience may help judges develop expertise that enables them to circumvent the effects of framing. The same might hold, to a lesser extent, for law students, who face case studies in their legal training.

Second, as our sample consists only of German subjects, we cannot rule out that the effects may differ in other jurisdictions, particularly in common-law countries. A recent cross-country experimental study found no effects between common law and civil law judges in sentencing decisions (Spamann et al., 2021), and a similar study on PA might need to be conducted to confirm whether our results are generalizable to other jurisdictions.

Third, our sample consists of three representative groups along a continuum of legal expertise: non-law students, law students, and (administrative) judges. There are, of course, other groups along the spectrum, including non-student laymen, lawyers, legal interns, paralegals, law professors, and other judges. However, these groups may differ from one another in many other features apart from expertise (for example, in the level of education, income, or analytical skills). While future studies may benefit from looking at such differences, we believe that our study design enables us to look at the most relevant control groups.

Finally, a general difficulty with framing is that it is not binary: one may frame the same set of facts in many different ways. Our ceteris paribus setting allows us to contrast two versions for each vignette, but of course one could introduce others as well (for example, by making some points more or less salient). We leave such endeavors for future research.

Conclusion

The rationality assumption in adjudication, whereby interpreters decide objectively and without cognitive errors, has long been challenged on numerous accounts, both theoretically and empirically. Legal theorists have often articulated their suspicion that interpretation is the result of the result: interpreters tend to confirm their initial intuition by interpretation. Cognitive studies can help illuminate that suspicion. Experiments on the psychology of judging demonstrate that cognitive biases and heuristics affect the decision-making of national court judges and international arbitrators, such that their decision-making process may deviate from perfect rationality.

We contribute to this discussion in several ways. First, we provide reassuring evidence that professional judges conducting PA are largely protected from framing effects that significantly influence other decision-makers. This finding challenges concerns about judicial susceptibility to cognitive biases in constitutional review and supports the practical rationality of PA when applied by trained legal experts. Our findings demonstrate that legal expertise serves as an effective debiasing mechanism within judges’ professional domain. While non-law students showed substantial framing effects and law students showed intermediate effects, judges remained largely unaffected by the same framings in their PA judgments. However, we also find that judges remained susceptible to biases outside their field of expertise. Thus, judges seem to be able to override their intuitions, but only when deciding in their professional context. Second, we enrich the discussion on the rationality of proportionality analysis. Although we show that proportionality analysis can be prone to framing among non-experts, these framing effects follow predictable patterns consistent with established behavioral theories. Third, we add to the more general discussion on whether experiments with students allow inferences for the decision-making of professionals. Although many studies have found cognitive biases in expert adjudicators, there have also been studies showing that biases may be generally diminished in experts compared to other populations (see, e.g., Broude & Levy, 2019; Mizrahi, 2018; Shereshevsky & Noah, 2017). Our results are in line with these findings, as we find that legal expertise mitigates framing effects. This seems important for our understanding of constitutional review as it is today: being aware of the role of framing effects on the one hand, but reducing the concern about their influence when judges are experts on the other. Furthermore, our findings suggest that constitutional cases should be allocated to specialized judges (as is often the case anyway, whenever cases are submitted to a specialized constitutional court) and that training and experience could potentially debias judges who are insufficiently specialized. Figuring out the costs and benefits of such interventions requires further study, but our findings are nonetheless important in identifying a potential debiasing mechanism. Overall, our findings provide empirical support for the rationality of PA as practiced by professional judges, while highlighting the importance of legal expertise in constitutional adjudication.

Supplemental Material

Supplemental Material for Framing Effects in Proportionality Analysis: Experimental Evidence by Anne van Aaken and Roee Sarel in Journal of Law & Empirical Analysis

Footnotes

Author Note

A theoretical precursor of this article with comparative constitutional law and more biases and heuristics covered can be found in van Aaken (2019). An older working paper version is available at . The experiment is registered on osf.io.

Acknowledgements

We are grateful for the invitation to an expert workshop in Paris in March 2016 organized by the Israel Democracy Institute with the support of the European Research Council and the helpful comments by the participants. On the theoretical part, we would also like to thank Mattias Kumm, Samuel Issacharoff, and Aharon Barak for helpful comments on the article as well as the participants of the Global and Comparative Public Law Colloquium at NYU (2016), a workshop at Humboldt University (2017), ILE Jour Fixe in Hamburg 2021, panels at ICON-S (2021 and 2022), European Association of Law & Economics Conference (2022), French Law & Economics Association Conference (2022), Conference on Empirical Legal Studies (2022), and FUELS seminar (2023). For the empirical part and other helpful comments, we would like to thank Christoph Engel, Eberhard Feess, Sven Hoeppner, Niels Petersen, Holger Spamann, Roseanna Sommers, Doron Teichman, and Justus Vasel. We would also like to thank the editor, Dan Klerman, and several anonymous referees for their helpful comments. We are grateful to the Administrative Judges Association in Lower Saxony (Niedersachsen), Germany, for permitting us to conduct the experiment. We gratefully acknowledge funding from the Alexander von Humboldt Foundation.

ORCID iDs

Anne van Aaken

Roee Sarel

Ethical Considerations

The experiment received an ethics approval from the ethics committee of the economics faculty (WISO research lab) of the University of Hamburg.

Consent to Participate

Written informed consent to participate was obtained, both generally by the WISO Lab and specifically for the experiment, before participation began.

Author Contributions

The theoretical motivation is based on a precursor working paper by Anne van Aaken. The authors extended these concepts to address the current paper’s specifics. Roee Sarel programmed the experiment and collected and analyzed the data. Experimental design and writing were joint work with equal contributions.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The experiment was funded by the Alexander von Humboldt Foundation.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data will be posted on a public repository (e.g., OpenICPSR) upon publication.

Supplemental Material

Supplemental material for this article is available online.

References

Aleinikoff, T. A. (1986). Constitutional law in the age of balancing. The Yale Law Journal, 96(5), 943–1005. https://doi.org/10.2307/796529

Alexy (2000). On the structure of legal principles. Ratio Juris, 13(3), 294–304. https://doi.org/10.1111/1467-9337.00157

Alexy (2003). On balancing and subsumption. A structural comparison. Ratio Juris, 16(4), 433–449. https://doi.org/10.1046/j.0952-1917.2003.00244.x

Alexy (2014). Constitutional rights and proportionality. Revus, 22, 51–65. https://doi.org/10.4000/revus.2783

Alexy (2017). Proportionality and rationality. In Vicki C. Jackson & Mark Tushnet (Eds.), Proportionality: New Frontiers, New Challenges (pp. 13–29). Cambridge, UK: Cambridge University Press.

Alicke, M. D. (2005). The better-than-average effect. In The self in social judgment (pp. 85–108). Psychology Press. https://api.taylorfrancis.com/content/chapters/edit/download?identifierName=doi&identifierValue=10.4324/9780203943250-8&type=chapterpdf

Andenas & Zleptnig (2006). Proportionality: WTO law: In comparative perspective. Texas International Law Journal, 42(3), 371–428.

Arlen & Tontrup (2015). Does the endowment effect justify legal intervention? The debiasing effect of institutions. The Journal of Legal Studies, 44(1), 143–182. https://doi.org/10.1086/680991

Arrow, K. J. (1982). Risk Perception in Psychology and Economics. Economic Inquiry, 20(1), 1–9. https://doi.org/10.1111/j.1465-7295.1982.tb01138.x

Barak (2012). Proportionality (2). In The Oxford handbook of comparative constitutional law (pp. 738–755). Oxford University Press.

Bartels (2015). The chapeau of the general exceptions in the WTO GATT and GATS agreements: A reconstruction. American Journal of International Law, 109(1), 95–125. https://doi.org/10.5305/amerjintelaw.109.1.0095

Bea, M. D., & Poppe, E. S. T. (2021). Marginalized legal categories: Social inequality, family structure, and the laws of intestacy. Law & Society Review, 55(2), 252–272. https://doi.org/10.1111/lasr.12553

Becher, S. I., Feess, & Sarel (2023). Regulating product return policies: The trade-off between efficiency and distribution. The Journal of Legal Studies, 52(1), 137–191. https://doi.org/10.1086/718911

Belton, I. K., Thomson, & Dhami, M. K. (2014). Lawyer and nonlawyer susceptibility to framing effects in out‐of‐court civil litigation settlement. Journal of Empirical Legal Studies, 11(3), 578–600. https://doi.org/10.1111/jels.12050

Bendor, A. L., & Sela (2015). How proportional is proportionality? International Journal of Constitutional Law, 13(2), 530–544. https://doi.org/10.1093/icon/mov028

Berthet (2022). The impact of cognitive biases on professionals’ decision-making: A review of four occupational areas. Frontiers in Psychology, 12, 802439. https://doi.org/10.3389/fpsyg.2021.802439

Brañas-Garza, Kujal, & Lenkei (2019). Cognitive reflection test: Whom, how, when. Journal of Behavioral and Experimental Economics, 82, 101455. https://doi.org/10.1016/j.socec.2019.101455

Brems & Lavrysen (2015). ‘Don’t use a sledgehammer to crack a nut’: Less restrictive means in the case law of the European Court of Human Rights. Human Rights Law Review, 15(1), 139–168. https://doi.org/10.1093/hrlr/ngu040

Broude & Levy (2019). Outcome bias and expertise in investigations under international humanitarian law. European Journal of International Law, 30(4), 1303–1318. https://doi.org/10.1093/ejil/chaa005

Brown, A. L., Imai, Vieider, F. M., & Camerer, C. F. (2024). Meta-analysis of empirical estimates of loss aversion. Journal of Economic Literature, 62(2), 485–516. https://doi.org/10.1257/jel.20221698

Bulman-Pozen & Seifter (2023). State constitutional rights and democratic proportionality. Columbia Law Review, 123(7), 1855–1928.

Chang & Dai (2021). The limited usefulness of the proportionality principle. International Journal of Constitutional Law, 19(3), 1110–1134. https://doi.org/10.1093/icon/moab068

Chapman (1994). The rational and the reasonable: Social choice theory and adjudication. University of Chicago Law Review, 61(1), 41–122. https://doi.org/10.2307/1600090

Chiu, Y.-C. (2003). Is there a framing effect? The asset effect in decision-making under risk. https://psycnet.apa.org/record/2004-10290-004

Cohen-Eliya & Porat (2010). American balancing and German proportionality: The historical origins. International Journal of Constitutional Law, 8(2), 263–286. https://doi.org/10.1093/icon/moq004

Cohen-Eliya & Porat (2011). Proportionality and the culture of justification. American Journal of Comparative Law, 59(2), 463–490. https://doi.org/10.5131/AJCL.2010.0018

Cohen-Eliya & Porat (2021). Proportionality in the age of populism. American Journal of Comparative Law, 69(3), 449–477. https://doi.org/10.1093/ajcl/avac005

Collings & Barclay, S. H. (2022). Taking justification seriously: Proportionality, strict scrutiny, and the substance of religious liberty. Boston College Law Review, 63(2), 453–520.

da Silva, V. A. (2022). Standing in the shadows of balancing: Proportionality and the necessity test. International Journal of Constitutional Law, 20(5), 1738–1767. https://doi.org/10.1093/icon/moac105

Diecidue & Wakker, P. P. (2001). On the intuition of rank-dependent utility. Journal of Risk and Uncertainty, 23(3), 281–298. https://doi.org/10.1023/A:1011877808366

Dohmen, Falk, Huffman, & Sunde (2010). Are risk aversion and impatience related to cognitive ability? The American Economic Review, 100(3), 1238–1260. https://doi.org/10.1257/aer.100.3.1238

Douek (2021). Governing online speech: From “posts-as-trumps” to proportionality and probability. Columbia Law Review, 121, 759.

Druckman, J. N. (2004). Political preference formation: Competition, deliberation, and the (ir)relevance of framing effects. American Political Science Review, 98(4), 671–686. https://doi.org/10.1017/S0003055404041413

Elangovan, A. R. (2005). Framing effects in managerial third‐party intervention: An exploratory study. The Leadership & Organization Development Journal, 26(7), 542–557. https://doi.org/10.1108/01437730510624584

Engel (2010). The behaviour of corporate actors: How much can we learn from the experimental literature? Journal of Institutional Economics, 6(4), 445–475. https://doi.org/10.1017/s1744137410000135

Englich (2006). Blind or biased? Justitia’s susceptibility to anchoring effects in the courtroom based on given numerical representations. Law & Policy, 28(4), 497–514. https://doi.org/10.1111/j.1467-9930.2006.00236.x

Englich, Mussweiler, & Strack (2006). Playing dice with criminal sentences: The influence of irrelevant anchors on experts’ judicial decision making. Personality and Social Psychology Bulletin, 32(2), 188–200. https://doi.org/10.1177/0146167205282152

Feess, Kerzenmacher, & Muehlheusser (2023). Morally questionable decisions by groups: Guilt sharing and its underlying motives. Games and Economic Behavior, 140, 380–400. https://doi.org/10.1016/j.geb.2023.04.005

Feess & Sarel (2018). Judicial effort and the appeal system: Theory and experiment. The Journal of Legal Studies, 47(2), 269–294. https://doi.org/10.1086/699391

Feess & Sarel (2022). Optimal fine reductions for self-reporting: The impact of loss aversion. International Review of Law and Economics, 70, 106067. https://doi.org/10.1016/j.irle.2022.106067

Fiedler (1988). The dependence of the conjunction fallacy on subtle linguistic factors. Psychological Research, 50(2), 123–129. https://doi.org/10.1007/BF00309212

Fleiner (1928). Institutionen des deutschen Verwaltungsrechts. Mohr.

Fox, Wingrove, & Pfeifer (2011). A comparison of students’ and jury panelists’ decision‐making in split recovery cases. Behavioral Sciences & the Law, 29(3), 358–375. https://doi.org/10.1002/bsl.968

Franck, S. D., Van Aaken, Freda, Guthrie, & Rachlinski, J. J. (2016). Inside the arbitrator’s mind. Emory Law Journal, 66(5), 1115–1174.

Frank (1931). Are judges human? Part one: The effect on legal thinking of the assumption that judges behave like human beings. University of Pennsylvania Law Review and American Law Register, 80(1), 17–53. https://doi.org/10.2307/3308020

Frederick (2005). Cognitive reflection and decision making. The Journal of Economic Perspectives, 19(4), 25–42. https://doi.org/10.1257/089533005775196732

German Federal Constitutional Court (2025a). Abstract judicial review of statutes. https://www.bundesverfassungsgericht.de/EN/Verfahren/Wichtige-Verfahrensarten/Abstrakte-Normenkontrolle/abstrakte-normenkontrolle_node.html;jsessionid=FA279761E34359A2D53D0FC89593F7A4.internet972

German Federal Constitutional Court (2025b). Constitutional complaints. https://www.bundesverfassungsgericht.de/EN/Verfahren/Wichtige-Verfahrensarten/Verfassungsbeschwerde/verfassungsbeschwerde_node.htm

Gigerenzer (2002). Bounded rationality: The adaptive toolbox. MIT press. https://www.academia.edu/download/80165528/GG_Adaptive_2001.pdf

Gigerenzer & Goldstein, D. G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103(4), 650–669. https://doi.org/10.1037/0033-295x.103.4.650

Gigerenzer & Selten (2001). Bounded rationality. CogNet. https://core.ac.uk/download/pdf/210844206.pdf

Gigerenzer & Todd, P. M. (1999). Simple heuristics that make us smart. Oxford University Press.

Gigone & Hastie (1997). Proper analysis of the accuracy of group judgments. Psychological Bulletin, 121(1), 149–167. https://doi.org/10.1037/0033-2909.121.1.149

Greene (2018). Rights as trumps. Harvard Law Review, 132(1), 28–132.

Grimm (2007). Proportionality in Canadian and German constitutional jurisprudence. University of Toronto Law Journal, 57(2), 383–398. https://doi.org/10.1353/tlj.2007.0014

Gunn, T. J. (2005). Deconstructing proportionality in limitations analysis. Emory International Law Review, 19(2), 465–498.

Guthrie (2003). Panacea or Pandora’s box: The costs of options in negotiation. Iowa Law Review, 88(3), 601–654.

Guthrie (2006). Misjudging. Nevada Law Journal, 7, 420–456.

Guthrie & Rachlinski, J. J. (2006). Insurers, illusions of judgment & litigation. Vanderbilt Law Review, 59(6), 2015–2050.

Guthrie, Rachlinski, J. J., & Wistrich, A. J. (2001). Inside the judicial mind. Cornell Law Review, 86(4), 777–830.

Guthrie, Rachlinski, J. J., & Wistrich, A. J. (2007). Blinking on the bench: How judges decide cases. Cornell Law Review, 93(1), 1–44.

Guthrie, Rachlinski, J. J., & Wistrich, A. J. (2009). The “hidden judiciary”: An empirical examination of executive branch justice. Duke Law Journal, 58(7), 1477–1530.

Heath, Larrick, R. P., & Klayman (1998). Cognitive repairs: How organizational practices can compensate for individual shortcomings. Research in Organizational Behavior, 20, 1–37.

Helm, R. K., Wistrich, A. J., & Rachlinski, J. J. (2016). Are arbitrators human? Journal of Empirical Legal Studies, 13(4), 666–692. https://doi.org/10.1111/jels.12129

Holste & Spamann (2025). Experimental investigations of judicial decision-making. In Tobia (Ed.), The Cambridge handbook of experimental jurisprudence. Cambridge University Press. https://doi.org/10.2139/ssrn.4375745

Howard (2019). Bandwagon effect and authority bias. In Howard (Ed.), Cognitive errors and diagnostic mistakes (pp. 21–56). Springer International Publishing. https://doi.org/10.1007/978-3-319-93224-8_3

Hsu, S.-L. (2007). The identifiability bias in environmental law. Florida State University Law Review, 35(2), 433–504.

Jabotinsky & Sarel (2022). How Crisis affects Crypto: Coronavirus as a Test Case. Hastings Law Journal, 74(2), 433–488. https://repository.uclawsf.edu/cgi/viewcontent.cgi?article=4015&context=hastings_law_journal.

Jackson, V. C. (2004). Being proportional about proportionality. Constitutional Commentary, 21(3), 803–859.

Jackson, V. C. (2015). Constitutional law in an age of proportionality. The Yale Law Journal, 124, 3094–3196.

Jenni & Loewenstein (1997). Explaining the identifiable victim effect. Journal of Risk and Uncertainty, 14(3), 235–257. https://doi.org/10.1023/A:1007740225484

Jolls, Sunstein, C. R., & Thaler (1998). Theories and tropes: A reply to Posner and Kelman. Stanford Law Review, 50(5), 1593–1608. https://doi.org/10.2307/1229307

Kahan, D. M., Hoffman, Evans, Devins, Lucci, & Cheng (2015). Ideology or situation sense: An experimental investigation of motivated reasoning and professional judgment. University of Pennsylvania Law Review, 164(2), 349–439.

Kahneman (2011). Thinking, fast and slow. Farrar.

Kahneman (2013). A perspective on judgment and choice: Mapping bounded rationality. In Jing, Rosenzweig, M. R., d’Ydewalle, Zhang, Chen, H.-C., & Zhang (Eds.), Progress in psychological science around the world. Volume 1 neural, cognitive and developmental issues (1st ed., pp. 1–47). Psychology Press. https://www.taylorfrancis.com/chapters/edit/10.4324/9780203783122-1/perspective-judgment-choice-daniel-kahneman

Kahneman, Knetsch, J. L., & Thaler, R. H. (1990). Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy, 98(6), 1325–1348. https://doi.org/10.1086/261737

Kantorowicz‐Reznichenko, Kantorowicz, & Weinshall (2022). Ideological bias in constitutional judgments: Experimental analysis and potential solutions. Journal of Empirical Legal Studies, 19(3), 716–757. https://doi.org/10.1111/jels.12323

Kaplow (2019). Balancing versus structured decision procedures: Antitrust, title VII disparate impact, and constitutional law strict scrutiny. University of Pennsylvania Law Review, 167(6), 1375–1462.

Kertzer, J. D., Holmes, LeVeck, B. L., & Wayne (2022). Hawkish biases and group decision making. International Organization, 76(3), 513–548. https://doi.org/10.1017/s0020818322000017

80.

Klatt

Meister

(2012). Proportionality—a benefit to human rights? Remarks on the I CON controversy. International Journal of Constitutional Law, 10(3), 687–708. https://doi.org/10.1093/icon/mos019

81.

Klein

B. F.

(1984). Rational basis? Strict scrutiny? Intermediate scrutiny? Judicial review in the abortion cases. Oklahoma City University Law Review, 9(2), 317–354.

82.

Knetsch

J. L.

(1989). The endowment effect and evidence of nonreversible indifference curves. The American Economic Review, 79(5), 1277–1284.

83.

Kommers

D. P.

(1994). The federal constitutional court in the German political system. Comparative Political Studies, 26(4), 470–491. https://doi.org/10.1177/0010414094026004004

84.

Korobkin

Guthrie

(1997). Psychology, economics, and settlement: A new look at the role of the lawyer. Texas Law Review, 76(1), 77.

85.

Kőszegi

Rabin

(2006). A model of reference-dependent preferences. Quarterly Journal of Economics, 121(4), 1133–1165. https://doi.org/10.1093/qje/121.4.1133

86.

Kőszegi

Rabin

(2007). Reference-dependent risk attitudes. The American Economic Review, 97(4), 1047–1073. https://doi.org/10.1257/aer.97.4.1047

87. Kremnitzer, Steiner, & Lang (2020). Proportionality in action: Comparative and empirical perspectives on the judicial practice (Vol. 22). Cambridge University Press. https://books.google.com/books?hl=en&lr=&id=2OfWDwAAQBAJ&oi=fnd&pg=PR10&dq=proportionality+in+action+kremnitzer&ots=IO6xFlXQCB&sig=LPMal3MNL78p1Y6Tuu1U1Q35sZs

88. Kretzmer (2013). The inherent right to self-defence and proportionality in jus ad bellum. European Journal of International Law, 24(1), 235–282. https://doi.org/10.1093/ejil/chs087

89. Kumm (2007). Political liberalism and the structure of rights: On the place and limits of the proportionality requirement. In Pavlakos (Ed.), Law, rights and discourse: The legal philosophy of Robert Alexy (pp. 131–166). Bloomsbury Publishing.

90. Lewinsohn-Zamir, Ritov, & Kogut (2016). Law and identifiability. Indiana Law Journal, 92(2), 505–556.

91. Li & Xie (2006). A new look at the “Asian disease” problem: A choice between the best possible outcomes or between the worst possible outcomes? Thinking & Reasoning, 12(2), 129–143. https://doi.org/10.1080/13546780500145652

92. Lidén, Gräns, & Juslin (2019). ‘Guilty, no doubt’: Detention provoking confirmation bias in judges’ guilt assessments and debiasing techniques. Psychology, Crime and Law, 25(3), 219–247. https://doi.org/10.1080/1068316X.2018.1511790

93. Lord (2023). Trumping Dobbs (pp. 12–21). University of Illinois Law Review Online.

94. Lubieniechi, Hesseln, Phillips, & Smyth (2016). Expert and lay public risk preferences regarding plants with novel traits. Canadian Journal of Agricultural Economics/Revue Canadienne d’agroeconomie, 64(4), 717–738. https://doi.org/10.1111/cjag.12110

95. Magalhães, P. C., Skiple, J. K., Pereira, M. M., Arnesen, & Bentsen, H. L. (2023). Beyond the myth of legality? Framing effects and public reactions to high court decisions in Europe. Comparative Political Studies, 56(10), 1537–1566. https://doi.org/10.1177/00104140231152769

96. Mathews (2017). Proportionality review in administrative law. In Comparative administrative law (pp. 405–419). Edward Elgar Publishing. https://www.elgaronline.com/abstract/edcoll/9781784718657/9781784718657.00034.xml

97. McAfee, R. P., Mialon, H. M., & Mialon, S. H. (2010). Do sunk costs matter? Economic Inquiry, 48(2), 323–336. https://doi.org/10.1111/j.1465-7295.2008.00184.x

98. Mizrahi (2018). Arguments from expert opinion and persistent bias. Argumentation, 32(2), 175–195. https://doi.org/10.1007/s10503-017-9434-x

99. Möller (2012). Proportionality: Challenging the critics. International Journal of Constitutional Law, 10(3), 709–731. https://doi.org/10.1093/icon/mos024

100. Morewedge, C. K., & Kahneman (2010). Associative processes in intuitive judgment. Trends in Cognitive Sciences, 14(10), 435–440. https://doi.org/10.1016/j.tics.2010.07.004

101. Mullen & Monin (2016). Consistency versus licensing effects of past moral behavior. Annual Review of Psychology, 67, 363–385. https://doi.org/10.1146/annurev-psych-010213-115120

102. Otero & Alonso (2023). Cognitive reflection test: The effects of the items sequence on scores and response time. PLoS One, 18(1), Article e0279982. https://doi.org/10.1371/journal.pone.0279982

103. Parasurama, Pr. (2017). Why overlapping confidence intervals mean nothing about statistical significance. Towards Data Science Blog. https://towardsdatascience.com/why-overlapping-confidence-intervals-mean-nothing-about-statistical-significance-48360559900a

104. Pecaric (2022). A Bayesian improvement of the proportionality principle. Ratio Juris, 35(4), 419–436. https://doi.org/10.1111/raju.12366

105. Peters (2017). Proportionality as a global constitutional principle. In Handbook on global constitutionalism (pp. 248–264). Edward Elgar Publishing. https://www.elgaronline.com/abstract/edcoll/9781783477258/9781783477258.00028.xml

106. Peters (2021). A plea for proportionality: A reply to Yun-chien Chang and Xin Dai. International Journal of Constitutional Law, 19(3), 1135–1145. https://doi.org/10.1093/icon/moab071

107. Petersen (2017). Proportionality and judicial activism: Fundamental rights adjudication in Canada, Germany and South Africa. Cambridge University Press. https://books.google.com/books?hl=en&lr=&id=i4Y7DgAAQBAJ&oi=fnd&pg=PR9&dq=Petersen,+Niels.+2017.+Proportionality+and+Judicial+Activism.+Cambridge:+Cambridge+University+Press.&ots=ELZwYhvO3h&sig=i9WPbSEh7_eev6e1Nq_eyX7sN_o

108. Petersen (2020). Alexy and the “German” model of proportionality: Why the theory of constitutional rights does not provide a representative reconstruction of the proportionality test. German Law Journal, 21(2), 163–173. https://doi.org/10.1017/glj.2020.9

109. Peterson, T. C., & Tollefson (2023). Asian disease problem applied to climate change: A study of the impact of framing risk preferences driven by socio-economic indicators for climate-change-related risks. Businesses, 3(1), 166–180. https://doi.org/10.3390/businesses3010012

110. Popelier & Van De Heyning (2013). Procedural rationality: Giving teeth to the proportionality analysis. European Constitutional Law Review, 9(2), 230–262. https://doi.org/10.1017/S1574019612001137

111. Posner, R. A. (1993). What do judges and justices maximize? (The same thing everybody else does). Supreme Court Economic Review, 3, 1–41. https://doi.org/10.1086/scer.3.114706

112. Rabin (1998). Psychology and economics. Journal of Economic Literature, 36(1), 11–46.

113. Rachlinski, J. J., Guthrie, & Wistrich, A. J. (2007). Heuristics and biases in bankruptcy judges. Journal of Institutional and Theoretical Economics, 163(1), 167–186. https://doi.org/10.1628/093245607780181865

114. Rachlinski, J. J., Guthrie, & Wistrich, A. J. (2011). Probable cause, probability, and hindsight. Journal of Empirical Legal Studies, 8(s1), 72–98. https://doi.org/10.1111/j.1740-1461.2011.01230.x

115. Rachlinski, J. J., & Wistrich, A. J. (2018). Gains, losses, and judges: Framing and the judiciary. The Notre Dame Law Review, 94(2), 521–582.

116. Rachlinski, J. J., Wistrich, A. J., & Guthrie (2015). Can judges make reliable numeric judgments: Distorted damages and skewed sentences. Indiana Law Journal, 90, 695.

117. Redding, R. E., & Reppucci, N. D. (1999). Effects of lawyers’ socio-political attitudes on their judgments of social science in legal decision making. Law and Human Behavior, 23(1), 31–54. https://doi.org/10.1023/A:1022322706533

118. Sarel (2022). Crime and punishment in times of pandemics. European Journal of Law and Economics, 54(2), 155–186. https://doi.org/10.1007/s10657-021-09720-7

119. Sarel & Demirtas (2021). Delegation in a multi-tier court system: Are remands in the U.S. federal courts driven by moral hazard? European Journal of Political Economy, 68, 101999. https://doi.org/10.1016/j.ejpoleco.2020.101999

120. Schauer & Spellman, B. A. (2017). Analogy, expertise, and experience. University of Chicago Law Review, 84(1), 249–268.

121. Schlink (2012). Proportionality (1). In Rosenfeld & Sajó (Eds.), The Oxford handbook of comparative constitutional law (pp. 718–737). Oxford, UK: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199578610.013.0035

122. Scholten & Read (2014). Prospect theory and the “forgotten” fourfold pattern of risk preferences. Journal of Risk and Uncertainty, 48(1), 67–83. https://doi.org/10.1007/s11166-014-9183-2

123. Shefrin & Statman (2003). The contributions of Daniel Kahneman and Amos Tversky. The Journal of Behavioral Finance, 4(2), 54–58. https://doi.org/10.1207/S15427579JPFM0402_01

124. Shereshevsky & Noah (2017). Does exposure to preparatory work affect treaty interpretation? An experimental study on international law students and experts. European Journal of International Law, 28(4), 1287–1316. https://doi.org/10.1093/ejil/chx069

125. Simon (1998). A psychological model of judicial decision making. Rutgers Law Journal, 30(1), 1–142.

126. Simon (2004). A third view of the black box: Cognitive coherence in legal decision making. University of Chicago Law Review, 71(2), 511–586.

127. Small, D. A., & Loewenstein (2003). Helping a victim or helping the victim: Altruism and identifiability. Journal of Risk and Uncertainty, 26(1), 5–16. https://doi.org/10.1023/a:1022299422219

128. Small, D. A., Loewenstein, & Slovic (2007). Sympathy and callousness: The impact of deliberative thought on donations to identifiable and statistical victims. Organizational Behavior and Human Decision Processes, 102(2), 143–153. https://doi.org/10.1016/j.obhdp.2006.01.005

129. Spamann & Klöhn (2016). Justice is less blind, and less legalistic, than we thought: Evidence from an experiment with real judges. The Journal of Legal Studies, 45(2), 255–280. https://doi.org/10.1086/688861

130. Spamann & Klöhn (2024). Can law students replace judges in experiments of judicial decision-making? Journal of Law and Empirical Analysis, 1(1), 149–161. https://doi.org/10.1177/2755323X231210467

131. Spamann, Klöhn, Jamin, Khanna, Liu, J. Z., Mamidi, Morell, & Reidel (2021). Judges in the lab: No precedent effects, no common/civil law differences. Journal of Legal Analysis, 13(1), 110–126. https://doi.org/10.1093/jla/laaa008

132. Staw, B. M. (1981). The escalation of commitment to a course of action. Academy of Management Review, 6(4), 577–587. https://doi.org/10.2307/257636

133. Steiner, Netzer, & Sulitzeanu-Kenan (2022). Necessity or balancing: The protection of rights under different proportionality tests—experimental evidence. International Journal of Constitutional Law, 20(2), 642–663. https://doi.org/10.1093/icon/moac036

134. Stone Sweet & Mathews (2008). Proportionality balancing and global constitutionalism. Columbia Journal of Transnational Law, 47(1), 72–164.

135. Su & Steiner, P. M. (2020). An evaluation of experimental designs for constructing vignette sets in factorial surveys. Sociological Methods & Research, 49(2), 455–497. https://doi.org/10.1177/0049124117746427

136. Sulitzeanu-Kenan, Kremnitzer, & Alon (2016). Facts, preferences, and doctrine: An empirical analysis of proportionality judgment. Law & Society Review, 50(2), 348–382. https://doi.org/10.1111/lasr.12203

137. Tebbe & Schwartzman (2021). The politics of proportionality. Michigan Law Review, 120(6), 1307–1335.

138. Teichman, Zamir, & Ritov (2023). Biases in legal decision‐making: Comparing prosecutors, defense attorneys, law students, and laypersons. Journal of Empirical Legal Studies, 20(4), 852–894. https://doi.org/10.1111/jels.12365

139. Thiemann, Schulz, Sunde, & Thöni (2022). Selection into experiments: New evidence on the role of preferences, cognition, and recruitment protocols. Journal of Behavioral and Experimental Economics, 98, 101871. https://doi.org/10.1016/j.socec.2022.101871

140. Tversky & Kahneman (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5(2), 207–232. https://doi.org/10.1016/0010-0285(73)90033-9

141. Tversky & Kahneman (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124

142. Tversky & Kahneman (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263. https://doi.org/10.2307/1914185

143. Tversky & Kahneman (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458. https://doi.org/10.1126/science.7455683

144. Tversky & Kahneman (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90(4), 293–315. https://doi.org/10.1037/0033-295X.90.4.293

145. Tversky & Kahneman (1986). Rational choice and the framing of decisions. Journal of Business, 59(4), S251–S278. https://doi.org/10.1086/296365

146. Tversky & Kahneman (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5(4), 297–323. https://doi.org/10.1007/BF00122574

147. van Aaken (2003). Rational Choice in der Rechtswissenschaft: Zum Stellenwert der ökonomischen Theorie im Recht (“Rational Choice Theory in Law: On the Significance of Economic Theory in Law”) (Reprinted 2009). Nomos Verlag. https://www.alexandria.unisg.ch/handle/20.500.14171/70821

148. van Aaken (2019). The decision architecture of proportionality analysis: Cognitive biases and heuristics. SSRN Electronic Journal. [Working Paper]. https://doi.org/10.2139/ssrn.3364553

149. Waldron (2003). Security and liberty: The image of balance. The Journal of Political Philosophy, 11(2), 191–210. https://doi.org/10.1111/1467-9760.00174

150. Webber, G. C. N. (2010). Proportionality, balancing, and the cult of constitutional rights scholarship. Canadian Journal of Law and Jurisprudence, 23(1), 179–202. https://doi.org/10.1017/S0841820900004860

151. Whittemore, L. A. (2016). Proportionality decision making in targeting: Heuristics, cognitive biases, and the law. Harvard National Security Journal, 7(2), 577–636.

152. Wistrich, A. J., & Rachlinski, J. J. (2013). How lawyers’ intuitions prolong litigation. Southern California Law Review, 86(3), 571–636.

153. Wojciechowski, B. W., & Pothos, E. M. (2018). Is there a conjunction fallacy in legal probabilistic decision making? Frontiers in Psychology, 9, 391. https://doi.org/10.3389/fpsyg.2018.00391

154. Yang, Liu, Deng, Huang, Luo, Y.-J., & Cui (2022). To blame or not? Modulating third-party punishment with the framing effect. Neuroscience Bulletin, 38(5), 533–547. https://doi.org/10.1007/s12264-021-00808-3

155. Yechiam & Zeif (2023). Revisiting the effect of incentivization on cognitive reflection: A meta‐analysis. Journal of Behavioral Decision Making, 36(1), Article e2286. https://doi.org/10.1002/bdm.2286

156. Zamir & Teichman (2018). Behavioral law and economics. Oxford University Press.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
