A Power Primer Revisited

Abstract

Among Jacob Cohen’s many influential contributions, his 1992 paper, “A Power Primer,” stands as a seminal work that fundamentally reshaped how researchers approach data analysis and study design.¹ To fully appreciate Cohen’s impact, one must acknowledge not only the statistical tools he introduced but also the paradigm shifts in scientific thinking that his work inspired. “A Power Primer” provided the research community with accessible explanations of statistical power, effect sizes, and sample size planning, concepts that were once confined mainly to advanced statistical literature.^1,2 Cohen’s ability to distill complex ideas into practical guidance made “A Power Primer” one of the most widely cited methodological articles in the behavioral and biomedical sciences.^3–5

More than just a technical manual, “A Power Primer” was a call to elevate scientific rigor. Cohen urged researchers to design studies with sufficient statistical power, not only to detect meaningful effects but also to protect participants and ensure valid and reproducible conclusions. As he argued, a study without power analysis is essentially a study without a plan. While his suggested benchmarks for small, medium, and large effect sizes became widely adopted, Cohen cautioned against treating them as fixed rules. He emphasized that these values were intended as general guidelines and encouraged researchers to interpret effect sizes within their specific fields.^1,2 Over time, scholars have extended this view by promoting the use of discipline-specific thresholds, particularly in medicine, neuroscience, and social psychology, where the practical importance of an effect can differ substantially.^6–8 Truly appreciating Cohen’s work requires recognizing that his contributions went far beyond statistical formulas.

Cohen’s framework has been especially influential in mental health research, where outcomes are often complex, subjective, and sensitive to small but meaningful changes.⁹ In studies of depression, anxiety, schizophrenia, and quality of life, effect sizes and statistical power are crucial for distinguishing clinically important improvements from trivial statistical differences.¹⁰ Applying Cohen’s principles encourages mental health researchers to design adequately powered studies, justify sample sizes based on realistic expectations, and interpret findings beyond p values alone. For example, a modest effect size in symptom reduction may still represent a meaningful improvement in daily functioning or emotional well-being. By emphasizing effect size, power, and practical significance, Cohen’s framework helps ensure that mental health research produces results that are not only statistically defensible but also clinically and socially meaningful.

Moving forward, this article expands on Cohen’s seminal work by deepening the discussion of its underlying concepts and guiding researchers to appreciate the significance and continuing relevance of his contributions. The discussion is organized into two parts. The first, “Recognizing the Problem Cohen Addressed,” revisits the challenges in statistical power that motivated his work. The second, “Critical Evaluations,” offers reflections on its impact, limitations, and implications for modern research practice. The significance of this study lies in its effort to bridge the gap between Cohen’s foundational ideas and contemporary research applications, thereby helping current and future researchers design, interpret, and report studies with appropriate statistical power.

Discussion

Recognizing the Problem Cohen Addressed

Before Cohen’s intervention, the scientific community, particularly in psychology and the social sciences, was preoccupied mainly with p values and hypothesis testing.^11,12 Researchers often conducted studies with small sample sizes, resulting in underpowered research that was unlikely to detect true effects.⁵ Consequently, the scientific literature became filled with non-replicable findings and an overreliance on statistical significance, often at the expense of practical importance.¹³ Cohen boldly confronted this issue, urging the field to look beyond p values and recognize the importance of effect size and statistical power. These two concepts were underappreciated yet crucial for sound scientific inference.^1,2

Practical and Actionable Contributions

Cohen’s genius lay not only in identifying a problem but also in providing practical solutions. In “A Power Primer,” he offered clear guidelines for calculating and interpreting effect sizes, categorizing them as small, medium, or large across different statistical tests, such as independent-samples t-tests, one-way analysis of variance (ANOVA), and correlation coefficients.¹ These contributions were groundbreaking because they were accessible, intuitive, and immediately applicable. Researchers from diverse disciplines could readily adopt his guidelines to plan better studies, justify sample sizes, and interpret their findings with greater clarity and confidence. His work democratized power analysis and established it as a cornerstone of modern research methodology.^14,15

Inspiring a Paradigm Shift in Research Culture

Cohen’s influence extended far beyond statistical theory. His advocacy for effect size reporting and power analysis catalyzed a paradigm shift in the culture of scientific research. Journals began requiring the reporting of effect sizes alongside p values.^11,16–18 Ethics committees and funding bodies made power analysis a standard component of research proposals. As a result, study designs became more rigorous, findings more robust, and the body of scientific knowledge more cumulative. Importantly, Cohen’s work also laid the foundation for the replication movement in psychology and other fields. By emphasizing that statistical significance without power is meaningless, he indirectly championed the call for reproducibility and transparency in science, principles that have become central to contemporary research ethics.

Legacy in Modern Research Practices

Today, Cohen’s influence is visible in the tools we use, the language we speak, and the standards we uphold in research. Widely used software packages such as PASS (NCSS, LLC, Kaysville, Utah, USA) and G*Power are built on Cohen’s formulas and guidelines.¹⁹ Reporting standards such as the Standards for Reporting Diagnostic Accuracy Studies (STARD), Strengthening the Reporting of Observational Studies in Epidemiology (STROBE), and the Consolidated Standards of Reporting Trials (CONSORT) encourage or mandate the inclusion of effect sizes and power analyses.^16–18 Researchers routinely cite Cohen when discussing study design or interpreting findings, a testament to the enduring relevance of his work.

Critical Evaluations

Over-simplified Effect Size Benchmarks

One of the most widely criticized aspects of Cohen’s work is his establishment of arbitrary effect size benchmarks for different statistical tests. In his article, Cohen proposed that specific effect sizes, such as a Cohen’s d of 0.2, 0.5, and 0.8, should be considered small, medium, and large, respectively.¹ While this was useful for introducing the concept of effect size, these benchmarks have limitations when applied across diverse research fields and contexts.

In medical and clinical research, for example, the impact of a treatment may not align with Cohen’s proposed thresholds. A small effect size, as defined by Cohen, might have profound clinical relevance, especially when even minor changes in a health outcome can significantly improve patients’ quality of life. The clinical significance of an effect, rather than its statistical significance alone, should be considered, and Cohen’s paper does not fully address how to navigate this complexity.⁴ Researchers now understand that context plays a crucial role in interpreting effect sizes.
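To make the benchmarks discussed above concrete, the following minimal sketch computes Cohen’s d from two samples using the pooled standard deviation and then attaches the conventional label. The function names and the symptom scores are hypothetical and invented purely for illustration; the label is a convention, not a judgment of clinical importance.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def benchmark_label(d):
    """Map |d| onto the conventional 0.2 / 0.5 / 0.8 cut-offs."""
    size = abs(d)
    if size < 0.2:
        return "below small"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical symptom scores (lower is better), invented for illustration.
control = [22, 25, 19, 24, 27, 21, 23, 26]
treated = [20, 24, 18, 22, 25, 20, 21, 24]

d = cohens_d(control, treated)
print(f"d = {d:.2f} ({benchmark_label(d)})")
```

The label returned by `benchmark_label` says nothing about whether the difference matters to patients. That judgment still depends on the outcome and the field, which is the limitation described above.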

Reliance on Simple Statistical Models

Cohen’s approach assumes relatively simple statistical models, such as the two-sample t-test, which do not fully capture the complexity of modern medical research. Today, researchers often use more sophisticated statistical models, including multivariate analyses, longitudinal designs, and survival analysis, which were not covered in Cohen’s work.¹ The assumption that power analysis can be performed using basic, traditional tests such as independent-samples t-tests or ANOVA limits the applicability of the paper in more advanced fields, particularly those involving large datasets and complex interactions between variables. For example, in clinical trials with multiple interventions or time points, a more advanced multilevel or hierarchical approach is required, and these methods necessitate different power analysis strategies that Cohen’s paper does not address.^14,20

Underestimation of Sample Size Variability and Uncertainty

Cohen’s paper presents a relatively straightforward formula for calculating sample sizes and assumes that estimates of effect sizes and population variances are accurate and stable. However, this approach overlooks the uncertainty inherent in such estimates, especially in fields like medicine, where populations are diverse and true effect sizes are often unknown. Modern approaches to sample size planning have evolved to incorporate Bayesian methods and uncertainty modeling, enabling a more flexible and dynamic power analysis that accounts for variability in assumptions.^21,22
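The sensitivity of sample size planning to the assumed effect size can be illustrated with a short sketch. It uses the textbook normal approximation for a two-sided comparison of two means, n = 2(z₁₋α/₂ + z_power)² / d² per group, which is an assumption made here for simplicity and gives values slightly below exact t-based tables. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power with n per group (ignores the negligible opposite tail)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(abs(d) * (n / 2) ** 0.5 - z_alpha)


planned_d = 0.5
n = n_per_group(planned_d)
print(f"planned d = {planned_d}: n per group = {n}")

# If the true effect is smaller than assumed, the same n delivers less power.
for true_d in (0.5, 0.4, 0.3, 0.2):
    print(f"true d = {true_d}: power with n = {n} is {approx_power(true_d, n):.2f}")
```

A study planned for d = 0.5 falls well short of 80% power if the true effect is closer to 0.3. This is one reason planning around a range of plausible effect sizes, rather than a single point estimate, is often recommended.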

Applying Cohen’s Principles in Modern Research Practices

Cohen’s central message was never about formulas alone. It was about how researchers think when they analyze data. In modern statistical analysis, where software can run hundreds of models in seconds, his concept provides a much-needed intellectual discipline. A practical way forward is to shift from testing to estimation as the primary goal of analysis. Instead of asking, “Is there an effect?” modern analysis should ask, “How large is the effect, and how precise is our estimate?” This means structuring analysis plans around effect sizes, confidence intervals, and meaningful benchmarks such as clinical, educational, or social importance.

Additionally, Cohen’s concept guides the responsible use of big data and advanced analytics. Large samples can make trivial effects look “significant.” Applying his thinking means actively filtering results through a relevance lens. This includes setting minimum meaningful effect thresholds, focusing on patterns that matter, and resisting the temptation to report everything that crosses a statistical cutoff. Even in complex models, researchers should extract and report interpretable effect measures, such as marginal effects, predicted differences, or risk changes, and relate them to real-world meaning.
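The point that large samples can make trivial effects look “significant” can be sketched numerically. The example below assumes a standardized mean difference of d = 0.02, a large-sample normal approximation with standard error √(2/n) for n per group, and a hypothetical minimum meaningful effect of 0.2. All of these values are illustrative assumptions rather than recommendations.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Hypothetical relevance threshold; in practice it should come from the field.
MIN_MEANINGFUL_D = 0.2


def two_sample_summary(d, n, alpha=0.05):
    """Two-sided p value and confidence interval for d with n per group.

    Large-sample normal approximation; the standard error sqrt(2 / n)
    is adequate only when d is small.
    """
    se = sqrt(2 / n)
    z = d / se
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    half_width = _STD_NORMAL.inv_cdf(1 - alpha / 2) * se
    return p_value, (d - half_width, d + half_width)


for n in (100, 100_000):
    p, (low, high) = two_sample_summary(d=0.02, n=n)
    # Treat the effect as possibly meaningful only if the interval reaches the threshold.
    possibly_meaningful = high >= MIN_MEANINGFUL_D
    print(
        f"n per group = {n}: p = {p:.2g}, "
        f"95% CI for d = ({low:.3f}, {high:.3f}), "
        f"could reach threshold: {possibly_meaningful}"
    )
```

With 100,000 participants per group, the p value is very small, yet the entire confidence interval lies far below the threshold. Reporting the estimate and its interval, rather than the p value alone, makes that distinction visible.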

Finally, training in statistics should emphasize judgment rather than rituals. Students and researchers should learn to argue about effect size, relevance, and uncertainty, not just significance. Software, journals, and reviewers can reinforce this by expecting effect-focused reporting and power-aware interpretation as standard practice.¹³ In this sense, applying Cohen’s concept in modern statistical analysis is not about updating old rules. It is about preserving a mindset. Statistics should be a tool for understanding meaningful effects under conditions of uncertainty, guided by scientific reasoning rather than mechanical thresholds.

Conclusions

Jacob Cohen’s “A Power Primer” remains one of the most widely cited and respected works in statistical methodology. To truly honor his contribution, however, we must move beyond citation and into action. This means integrating power analysis into the core of research planning, interpreting effect sizes within meaningful contexts, fostering a culture of transparency and reproducibility, and adopting evolving methods that reflect the complexity of modern science. While defining a single effect size or power can be challenging in modern analyses, Cohen’s core message remains highly relevant. He provided the blueprint, but it is our responsibility to build upon it. As researchers, educators, and reviewers, we must continue to uphold his call for statistical integrity and, in doing so, ensure that the science we produce is not just statistically significant but genuinely impactful.

Footnotes

Acknowledgements

I would like to thank the Director General of Health Malaysia for his permission to publish this article.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Declaration Regarding the Use of Generative AI

The author wrote the article. ChatGPT was used to identify grammatical errors and suggest amendments, some of which were accepted after careful consideration. The author remains fully responsible for the entire content of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

References

1. Cohen. A power primer. Psychol Bull, 1992; 112: 155–159.

2. Cohen. The earth is round (p < .05). Am Psychol, 1994; 49: 997–1003.

3. Sink and Stroh. Practical significance: The use of effect sizes in school counseling research. Prof Sch Couns, 2006; 9: 401–411. DOI: 10.5330/prsc.9.4.283746k664204023.

4. Ferguson. An effect size primer: A guide for clinicians and researchers. Prof Psychol Res Pract, 2009; 40: 532–538. DOI: 10.1037/a0015808.

5. Bujang. A step-by-step process for sample size determination for medical research. Malays J Med Sci, 2021; 28: 15.

6. Kirk. Practical significance: A concept whose time has come. Educ Psychol Meas, 1996; 56: 746–759. DOI: 10.1177/0013164496056005002.

7. Kline. Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: APA, 2004. DOI: 10.1037/10693-000.

8. Kraemer. A simple effect size indicator for two-group comparisons: A comment on requivalent. Psychol Methods, 2005; 10: 413–419. DOI: 10.1037/1082-989X.10.4.413.

9. Eisen, Ranganathan, Seal, et al. Measuring clinically meaningful change following mental health treatment. J Behav Health Serv Res, 2007; 34(3): 272–289.

10. Krause, Hetrick, Courtney, et al. How much is enough? Considering minimally important change in youth mental health outcomes. Lancet Psychiatry, 2022; 9(12): 992–998.

11. Lykken. Statistical significance in psychological research. Psychol Bull, 1968; 70: 151–159. DOI: 10.1037/h0026141.

12. Loftus. Psychology will be a much better science when we change the way we analyze data. Curr Dir Psychol Sci, 1996; 5: 161–171. DOI: 10.1111/1467-8721.ep11512376.

13. Bujang. The dilemma and wisdom in translating p values: A collaborative approach to strengthening scientific validity. Biomed Res Int, 2025; 2025: 6703756. DOI: 10.1155/bmri/6703756.

14. Levine and Hullett. Eta squared, partial eta squared, and misreporting of effect size in communication research. Hum Commun Res, 2002; 28: 612–625. DOI: 10.1111/j.1468-2958.2002.tb00828.x.

15. Fidler, Cumming, Thomason, et al. Toward improved statistical reporting in the Journal of Consulting and Clinical Psychology. J Consult Clin Psychol, 2005; 73: 136–143. DOI: 10.1037/0022-006X.73.1.136.

16. Bossuyt, Reitsma, Bruns, et al. The STARD statement for reporting studies of diagnostic accuracy: Explanation and elaboration. Ann Intern Med, 2003; 138: W1–W12. DOI: 10.7326/0003-4819-138-1-200301070-00012-w1.

17. Ghaferi, Schwartz and Pawlik. STROBE reporting guidelines for observational studies. JAMA Surg, 2021; 156: 577–578. DOI: 10.1001/jamasurg.2021.0528.

18. Butcher, Monsour, Mew, et al. Guidelines for reporting outcomes in trial reports: The CONSORT-Outcomes 2022 Extension. JAMA, 2022; 328: 2252–2264. DOI: 10.1001/jama.2022.21022.

19. Faul, Erdfelder, Lang, et al. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods, 2007; 39(2): 175–191. DOI: 10.3758/bf03193146.

20. Olejnik and Algina. Generalized eta and omega squared: Measures of effect size for some common research designs. Psychol Methods, 2003; 8: 434–447. DOI: 10.1037/1082-989X.8.4.434.

21. Pek and Park. Complexities in power analysis: Quantifying uncertainties with a Bayesian-classical hybrid approach. Psychol Methods, 2019; 24: 590. DOI: 10.1037/met0000211.

22. Pawel and Held. Closed-form power and sample size calculations for Bayes factors. Am Stat, 2025; 79: 1–5. DOI: 10.1080/00031305.2024.2314396.