Abstract
The goal of psychological science is to discover truths about human nature, and the typical form of empirical insights is a simple statement of the form x relates to y. We suggest that such “one-liners” imply much larger x-y relationships than those we typically study. Given the multitude of factors that compete and interact to influence any human outcome, small effect sizes should not surprise us. And yet they do—as evidenced by the persistent and systematic underpowering of research studies in psychological science. We suggest an explanation. Effect size magnification is the tendency to exaggerate the importance of the variable under investigation because of the momentary neglect of others. Although problematic, this attentional focus serves a purpose akin to that of the eye’s fovea. We see a particular x-y relationship with greater acuity when it is the center of our attention. Debiasing remedies are not straightforward, but we recommend (a) recalibrating expectations about the effect sizes we study, (b) proactively exploring moderators and boundary conditions, and (c) periodically toggling our focus from the x variable we happen to study to the non-x variables we do not.
Looking across a room full of research psychologists at a professional meeting, it is possible to be struck by the thought that everyone there believes, usually with some justification, that what he or she is studying is important. (Funder & Ozer, 2019, p. 164)
The basic unit of much psychological inquiry is a statement linking a single variable x with a single outcome y. Producing such insights is a primary goal of our field’s empirical research, as evidenced by their common appearance in the abstracts of journal articles, in what we convey to journalists and the lay public, and in summaries of landmark findings for introductory psychology textbooks.
We suggest that x-y statements can be true yet misleading—and not just because of the familiar specters of reverse causality and unobserved confounds or more recently discussed problems such as mining data for statistically significant results, post hoc hypothesizing, and data faking. Our specific concern is that the implied magnitude of any x-y relationship, particularly when described without qualification or quantification, appears larger than it really is.
Why is this error so pervasive and persistent? Why, for example, is the one-liner “Grittier people are more successful” easily interpreted as an effect size larger than the observed r = .1 (Duckworth et al., 2019)? Why does it disappoint policymakers to learn that the average effect size of “nudges” (such as reminding patients that a flu shot is “waiting for them”; Milkman et al., 2022) has been estimated as, at most, r = .15 (Szaszi et al., 2022)? And why do we continue to power our studies as if x-y relationships were large enough to detect with just dozens of research participants?
On reflection, it is obvious that any psychological outcome y is influenced by many variables other than the one we happen to be thinking about. See Figure 1. Logic further dictates that the greater the number of such non-x variables, the less important, relatively speaking, the one x we are considering at the moment. For instance, human achievement is quite obviously influenced by values, goals, cognitive ability, physical health, socioeconomic advantage, years of education, social support, and personality traits other than grit. The same goes for behavioral-economics nudges. How important should we expect any given intervention to be when so many individual differences and situational factors are also at play?

Figure 1. Reality versus perception. (a) Any psychological outcome y is determined by multiple influences. In this greatly simplified example, 10 variables exert main effects on y (e.g., x1, x2, . . .) and/or act as moderators (e.g., x7 and x9). (b) When we think about the influence of a particular variable x on outcome y, we tend to ignore all others (x1–x9). Omitted from this figure are the 45 pairwise correlations among the 10 independent variables that further reduce the unique variance in y explained by our focal x.
It is equally plain that any variable x interacts with others, moderating the x-y relationship. The number of possible interactions of these variables with x and with one another increases exponentially (Turkheimer, 2000). An absurdly oversimplified universe of only 10 independent variables yields, for example, more than a thousand possible linear interactions (45 two-way interactions, 120 three-way interactions, and so on; see the brief calculation below), not to mention reciprocal effects, nonlinear relationships, and the added complexity of covarying xs (Götz et al., 2024; Szaszi et al., 2024; van Tilburg & van Tilburg, 2023). As a consequence, any one x-y relationship is unlikely to apply, to the same degree, across all individuals and contexts (Bryan et al., 2021; Gelman et al., 2023; Liou et al., 2023; Tosh et al., 2024). So it is all but certain, for example, that even if grit is a cause of achievement, it explains only a small fraction of variation for some people and under some circumstances.
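These counts are simple binomial coefficients: among 10 variables there are "10 choose k" possible k-way interaction terms, and summing across all orders gives 2^10 − 10 − 1 = 1,013. A minimal Python sketch verifying the arithmetic:

```python
from math import comb

# Number of possible k-way multiplicative interaction terms
# among 10 candidate predictor variables.
interactions = {k: comb(10, k) for k in range(2, 11)}

print(interactions[2])             # 45 two-way interactions
print(interactions[3])             # 120 three-way interactions
print(sum(interactions.values()))  # 1013 in total (= 2**10 - 10 - 1)
```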
We suggest that the act of studying the relationship between x and y inadvertently leads us to ignore the multitude of factors—other than x—that also influence y. Likewise, we tend to momentarily ignore the myriad contexts or individual differences that moderate the relationship between x and y, including those rendering it weaker, nonexistent, or even reversed in sign (Bryan et al., 2021). And by extension, we tend to overestimate how well the x-y relationship generalizes—because context is nothing more than a collection of xs at varying levels. In sum, we suffer from effect size magnification, a bias in which the importance of any x-y relationship is overestimated when x is considered in isolation from other (non-x) influences.
Most Effect Sizes in Psychological Science Are Surprisingly Small
Cohen (1992) suggested that an effect of r = .3 is both “visible to the naked eye of a careful observer” and “the average size of observed effects in various fields” (p. 156). More recently, the benchmarks of r = .1, .3, and .5 for small, medium, and large effects, respectively, have been recognized as overly optimistic (Funder & Ozer, 2019; Götz et al., 2024). Our field’s historical miscalibration is immediately evident in Table 1, which summarizes median published effect size estimates ranging from r = .13 to .19 from four large-scale reviews.
Table 1. Meta-Analytic Estimates of Typical Published Effect Sizes in Psychological Science
Note: Where necessary, we converted between Cohen’s d and Pearson’s r using r = d/√(d² + 4).
There is good reason to believe that unpublished effect sizes (i.e., results hidden in the proverbial file drawer) may be even smaller. For example, Polanin et al. (2016) estimated the average difference in effect size between published and unpublished studies in meta-analyses to be r = .09 (d = 0.18). When the Open Science Collaboration replicated 100 studies from three high-impact journals, the average replication effect size was r = .20—half that of the original (r = .40; Open Science Collaboration, 2015). Similarly, 11 years of student replication projects found effects (r = .14) that were half the size of those originally published (r = .30; Boyce et al., 2023). Multiple-lab replications of 15 meta-analyses yielded effects (r = .08) that were one-third the size of the originally published estimates (r = .21; Kvarven et al., 2020). And recent reviews have estimated the effect of nudges in practitioner reports to be about one-sixth the size of corresponding effects in the published literature (DellaVigna & Linos, 2022).
After accounting for publication bias, the evidence suggests that the median effect size of a finding in psychological science may be no larger than r = .1. Of course, the distribution contains larger and smaller effects. Effect sizes depend on idiosyncratic researcher decisions around which x-y relationship to study, particularities of the study design, specifications of the statistical analyses, and more—such that even their central tendencies may be misleading (Simonsohn et al., 2022). Regardless, we contend that the typical x-y relationship studied by psychological scientists is likely so slight that to detect it at all requires a tightly controlled research design—one that effectively minimizes the influence of non-x variables.
To be clear, our point is not that effect sizes need to be gigantic to matter. Effects on the order of r = .1, or even smaller, can be consequential when the x-y relationship is repeated many times, applied to many people, or low-cost to implement (on nudges, see Benartzi et al., 2017; on wise interventions, see Walton, 2014), or when y is especially important (Funder & Ozer, 2019). If anything, we are increasingly skeptical of eye-popping effect sizes, particularly when produced by subtle and brief manipulations.
What concerns us is that if effect sizes in psychological science are typically as small as we surmise, then simple x-y statements (e.g., “Grittier people are more successful,” “Nudges encourage healthy choices”), by failing to highlight the de minimis nature of the relationship, can inadvertently mislead their audience.
What’s more, we suspect that as researchers, we may also be misleading ourselves. The primary evidence for researcher overconfidence in x-y effects is the persistent norm of underpowering studies. More than 60 years ago, Cohen (1962) lamented that the typical study was underpowered. As shown in Table 2, subsequent surveys of statistical power do not suggest much progress. From Cohen’s day to the present, the median study in psychological science continues to have only a one in six chance of detecting an effect of r = .1. Although we tend to recruit dozens of subjects for a given study, what is required are samples in the hundreds or more (e.g., N > 780 to detect r = .1 using a two-tailed t test at a threshold of p < .05 and with 80% power).
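The arithmetic behind such sample size figures is easy to verify. A minimal Python sketch (ours, for illustration), using the standard Fisher z approximation for the power of a correlation test, reproduces the N > 780 figure:

```python
import numpy as np
from scipy.stats import norm

def n_to_detect_r(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a population correlation r
    with a two-tailed test, via the Fisher z transformation of r."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-tailed critical value
    z_beta = norm.ppf(power)          # quantile for the desired power
    return int(np.ceil(((z_alpha + z_beta) / np.arctanh(r)) ** 2 + 3))

print(n_to_detect_r(0.1))  # 783 -- consistent with N > 780 above
print(n_to_detect_r(0.3))  # 85 -- why r = .3 once seemed detectable with "dozens"
```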
Table 2. Median Statistical Power of Studies in Psychological Science Across Four Reviews
Note: Each numeric column represents the percentage of studies in the cited review that were powered to detect a small, medium, or large effect, respectively. The definition of each category varied slightly across reviews and is specified in the first column. When necessary, we transformed effect sizes into Pearson correlations.
Cohen (1962) chose small, medium, and large benchmark values for d (the standardized difference between two groups) that do not correspond exactly to his benchmarks for Pearson’s r. Cohen’s d values of 0.2, 0.5, and 0.8 convert to Pearson’s r values of approximately .100, .243, and .371, respectively. Cohen explained that the larger rules of thumb chosen for Pearson r correlations reflected a variety of considerations, including differences in the expected effects in correlational (e.g., validity coefficients) versus experimental research.
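These conversions can be checked against the formula given in the Table 1 note; a two-line Python check (our illustration, assuming two groups of equal size):

```python
def d_to_r(d):
    # Cohen's d to Pearson's r, assuming two groups of equal size
    return d / (d ** 2 + 4) ** 0.5

for d in (0.2, 0.5, 0.8):
    print(f"d = {d} -> r = {d_to_r(d):.3f}")  # .100, .243, .371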
How gloomy is the prospect of effects so small that they can be detected only with very large samples? It may help to recall that a common measure of effects, Pearson’s r, is a ratio: The numerator describes the covariance of x and y, and the denominator describes the variability (SD) of x and y. Likewise, Cohen’s d is a ratio: The numerator describes how y changes in response to a given change in x, and the denominator describes the variability of y. Small effect sizes in psychological science are not a symptom of small numerators so much as enormous denominators. In the form of a one-liner: People are complicated.
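In symbols (our shorthand): r = cov(x, y)/(SDx × SDy), and d = (M1 − M2)/SDpooled, where M1 and M2 are the means of y in the two groups being compared and SDpooled is their pooled standard deviation. Holding the numerator fixed, every additional non-x influence that adds variability to the denominator shrinks the resulting effect size.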
A Special Case of the Focusing Illusion
So why do small effects still surprise us? We suggest that effect size magnification is, in part, driven by a special case of the focusing illusion, a concept first introduced with an aphorism: “Nothing in life is as important as you think it is when you are thinking about it” (Kahneman, 2011, p. 402).
To date, happiness has been a primary outcome of interest in empirical demonstrations of the focusing illusion (Schkade & Kahneman, 1998). Like success, happiness depends on a multitude of factors, each of which can move happiness up or down, and which surely interact in uncountable ways—as shown in Figure 1a. Yet as suggested in Figure 1b, narrowing our attention to consider one particular factor—focusing in on it—momentarily isolates and magnifies its importance.
Consider whether someone would be happier living in the Midwest or Southern California. Their happiness would be driven by the weather, cost of living, industry mix, cultural influences, and more. Schkade and Kahneman (1998) found that self-reported overall life satisfaction was quite similar in the two regions, but—focusing on the salient factor of the weather—participants from both regions predicted that Midwesterners would be less happy than Californians. When we zoom out to consider the myriad determinants of how satisfied a person is with their life overall, it seems foolish to put much weight on the weather. And yet, to examine the potential influence of weather on happiness, we must, by necessity, zoom in.
What underlies the focusing illusion is the highly selective nature of attention. By analogy, the human eye has only one fovea, covering only about one degree of the visual field—approximately the size of your thumb when held at arm’s length. We see whatever is in our fovea more clearly than anything in the periphery. Such acuity is made possible by a far greater density of photoreceptors in the fovea compared with the rest of the retina—a prioritization necessitated by the fact that dramatically more visual information enters the retina than could possibly be processed by the brain (Luck & Ford, 1998). In other words, rather than see everything with equal acuity, we prioritize what is central.
We conjecture that simple x-y statements focus attention in a similar way. When we use a simple x-y statement to specify our hypotheses ex ante or to summarize our results ex post, what unfolds is a kind of mental movie. In the mind’s eye, we conjure an image of x changing y against an empty backdrop—or, at most, a backdrop of other non-x variables optimally favorable for this effect. What we tend not to do, either before or after collecting data, is conjure a messier yet more realistic narrative.
Are we wrong not to do so? Try telling or recalling a story without spotlighting the protagonist and the plotline, instead describing every ancillary detail of the setting, all minor characters, and every subplot. “Writing is selection,” says the writer John McPhee (2015). So, too, is thinking—and perhaps especially so for scientists. From our one-liners to our carefully controlled research designs, to make sense of x as it relates to y, we must only have eyes for x.
Implications
Experts learn more and more about less and less, goes the old joke, until they know everything about nothing. We likewise suggest an inherent tradeoff between depth and breadth that renders effect size magnification something of a “necessary evil.”
On one hand, it is a serious error to repeatedly expect dramatically larger x-y relationships than what we typically find. This bias introduces problems from the very beginning of a research program, when we decide how many participants to recruit, to the end, when we communicate our findings to fellow scientists and the public. Further, it compounds more systemic problems, such as the pressure to publish novel and counterintuitive findings at a galloping pace. When we expect implausibly large effects, we may find ourselves publishing spurious findings from underpowered studies (Gelman & Carlin, 2014)—exacerbating the already thorny collective-action challenges that pervade social science (Hoekstra & Vazire, 2021).
On the other hand, overestimating the importance of a given variable while we are thinking about it may be the only way we can think about it in the first place. Put differently, the mental acuity afforded by zooming in on one x-y relationship may be possible only while ignoring myriad other causes and moderators that likely diminish this relationship. In fact, deliberately minimizing the influence of non-x variables is the very essence of the experimental method. There is a seemingly inescapable tradeoff between ecological validity (real-world relevance) and internal validity (clean causal inference) embedded in our scientific methods (Campbell & Stanley, 1963/2015, p. 5).
Of course, effect size magnification is not a peccadillo unique to psychological scientists. No one can see or think about everything everywhere all at once. As a profession, however, we are responsible for reckoning with it.
Although remedies are not straightforward, as a start, a serious recalibration of expectations is in order (cf. Götz et al., 2022). There is simply no reason to expect effects large enough, as Cohen (1992) put it, to be “visible to the naked eye,” and no reason to be embarrassed about effects that are not. After all, the modest size of the typical x-y relationship in psychology does not undermine causal inference per se. Certainly, no one would doubt the causal influence of genes on behavior—but because outcomes are typically influenced by hundreds of genes, it is not surprising that a relevant single-nucleotide polymorphism x usually explains far less than 0.1% (an r of approximately .03; Chabris et al., 2015) of the variance in a given phenotype y (O’Connor, 2021). Further, just as polygenic scores, which aggregate the effects of numerous genes, often correlate more strongly with phenotypes than any single gene (Plomin & von Stumm, 2022), it may be that structural influences (e.g., growing up in Texas vs. Tunisia; the combined effects of cigarette taxes, antismoking media campaigns, and prohibitions against smoking in public spaces) that represent many, many x variables in concert exert a larger influence on our focal y outcomes than any single variable x (Chater & Loewenstein, 2023). Unfortunately, these complex influences are not easily studied using the scientific method.
Second, we can search for and report the moderating and boundary conditions of the x-y relationships we study (Götz et al., 2024; Krefeld-Schwalb et al., 2024), predicting effects out of sample (Salganik et al., 2020) and explicitly recognizing limits to generalizability (Simons et al., 2017).
Third, just as the eye rapidly shifts from one fixation point to another, we can periodically wrest attention away from the x variable we most often think about and instead consider a non-x influence on outcome y, and then another, and so on. We might even center our research on a given y rather than on a given x (Watts, 2017). And every so often, we might zoom out, take stock of the empirical x-y one-liners that we and other psychological scientists have accumulated, and weave them together into more holistic theories of the phenomenon of interest (e.g., Lewin, 1938).
We may not be able to escape the tendency to exaggerate the effect we’re thinking about while we’re thinking about it—but we can shift what we’re thinking about. Doing so may be our only hope of glimpsing the big picture.
Recommended Reading
Ahadi, S., & Diener, E. (1989). Multiple determinants and effect size. Journal of Personality and Social Psychology, 56(3), 398–406. Demonstrates the statistical implications of accepting that psychological outcomes are driven by multiple interacting factors.
Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). (See References). Argues for more systematic investigation into how psychological interventions can have smaller or larger effects as a result of heterogeneous populations and contexts.
Funder, D. C., & Ozer, D. J. (2019). (See References). Proposes best practices on how to interpret, evaluate, and report effect sizes in psychological research.
Götz, F. M., Gosling, S. D., & Rentfrow, P. J. (2022). (See References). Discusses the prevalence of small effects in psychological science and how they can be harnessed to further the field.
Schkade, D. A., & Kahneman, D. (1998). (See References). Provides the classic empirical demonstration of the “focusing illusion” as a cognitive bias.
Acknowledgements
We are deeply grateful to the late Daniel Kahneman for his extensive contributions to the conceptualization and refinement of this argument, as well as his seemingly boundless patience, humility, and good humor. We hope this article would make him proud. We also appreciate the thoughtful feedback we received from colleagues, including Abdullah Almaatouq, Frank Bosco, David Brainard, Christopher Bryan, Francisco Ceballos, Christopher Chabris, Stefano DellaVigna, David Funder, Andrew Gelman, Gilles Gignac, Samuel Gosling, Friedrich Götz, Elizabeth Linos, Brian Nosek, Daniel Ozer, Peter Rentfrow, Dan Richard, David Schkade, Daniel Simons, Barnabás Szászi, Elizabeth Tipton, Eric Turkheimer, Simine Vazire, Duncan Watts, and two reviewers.
Transparency
Action Editor: Robert L. Goldstone
Editor: Robert L. Goldstone
