Sage Journals: Discover world-class research

Abstract

Testing and interpreting replications typically focus on whether they reproduce the sign (direction) and the statistical significance of the parameter estimates reported in a focal study. These side-by-side comparisons of replication and original study findings often yield a dichotomous “successful” or “unsuccessful” replication outcome. However, such practices can lead to incorrect conclusions, lose information by failing to reveal how and to what degree replication findings differ from those of the original research, and threaten replication’s role in safeguarding and advancing empirical literatures. This paper presents two approaches—a multi-staged set of confidence interval tests and Bayesian analyses—which use findings from original and replication studies as inputs for additional assessments of replicability. These methods overcome the limitations of side-by-side comparisons and provide more complete insights into the meaning and boundaries of the original study’s findings. Guidelines for their use are provided, and an empirical example is reported to provide concrete step-by-step illustrations.
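The contrast drawn above between a side-by-side comparison and a direct assessment of how far two estimates differ can be sketched in a few lines. This is only a generic illustration, not the article's multi-staged procedure, which is not reproduced on this page. It assumes the original and replication estimates are independent and approximately normal, and every name and number below is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def difference_ci(b_orig, se_orig, b_rep, se_rep, level=0.95):
    """Confidence interval for (replication - original).

    Assumes the two estimates are independent and approximately normal,
    so the variance of the difference is the sum of the two variances.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = b_rep - b_orig
    se_diff = sqrt(se_orig ** 2 + se_rep ** 2)
    return diff - z * se_diff, diff + z * se_diff


def compare(b_orig, se_orig, b_rep, se_rep, level=0.95):
    """Report the side-by-side checks next to a test of the difference."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    low, high = difference_ci(b_orig, se_orig, b_rep, se_rep, level)
    return {
        # Side-by-side criteria: same direction, replication significant.
        "same_sign": (b_orig > 0) == (b_rep > 0),
        "rep_significant": abs(b_rep / se_rep) > z,
        # Direct criterion: do the two estimates differ detectably?
        "difference_ci": (low, high),
        "difference_excludes_zero": low > 0 or high < 0,
    }


# Hypothetical estimates and standard errors, chosen only for illustration.
result = compare(b_orig=0.40, se_orig=0.15, b_rep=0.15, se_rep=0.12)
for key, value in result.items():
    print(key, value)
```

With these made-up inputs the replication estimate is not statistically significant on its own, so a side-by-side reading would label the replication "unsuccessful", yet the interval for the difference includes zero, so the two estimates cannot be distinguished from each other at this level. That is the kind of information a dichotomous outcome discards.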

Keywords

Bayesian methods, confidence intervals, replication, reproducibility, statistical significance

Testing and interpreting replication studies: Insights from confidence intervals and Bayesian analyses
