
Abstract

When data analyses produce encouraging but nonsignificant results, researchers often respond by collecting more data. This may transform a disappointing dataset into a publishable study, but it does so at the cost of increasing the Type I error rate. How big a problem is this, and what can we do about it? To answer the first question, we estimate the Type I error inflation from four quantities: the initial sample size, the number of participants added to augment the dataset, the critical value for determining significance (typically .05), and the maximum p value in the initial sample at which the dataset would still be augmented. With one round of augmentation, the inflated Type I error rate reaches a maximum of .0975, with typical values ranging from .0564 to .0883. To answer the second question, we review methods of adjusting the critical value to allow augmentation while maintaining p < .05, but we note that such methods must be applied a priori. For the common occurrence of post-hoc dataset augmentation, we develop a new statistic, p_augmented, that represents the magnitude of the resulting Type I error inflation. We argue that the disclosure of post-hoc dataset augmentation via p_augmented elevates such augmentation from a questionable research practice to an ethical research decision.
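The inflation described above can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure: it assumes a two-sided z test with known variance rather than whatever test the article uses, and the function names, the seed, and the default values (alpha = .05, maximum p for augmentation = .20) are hypothetical choices made for illustration. Under a true null hypothesis, the initial sample is tested once; if its p value falls between alpha and the augmentation threshold, more data are added and the pooled sample is tested again. The helper `independent_tests_bound` only shows the arithmetic 1 - (1 - .05)^2 = .0975, which agrees with the maximum quoted in the abstract.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for p values


def two_sided_p(z):
    """Two-sided p value for a standard-normal test statistic."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def simulate_augmented_error(n_initial, n_added, alpha=0.05, p_max=0.20,
                             n_sims=100_000, seed=1):
    """Estimate the Type I error rate with one round of augmentation.

    Assumption (not from the article): a z test with known variance, so the
    initial and added samples each contribute an independent N(0, 1)
    statistic under the null hypothesis.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_sims):
        z_first = rng.gauss(0.0, 1.0)
        p_first = two_sided_p(z_first)
        if p_first < alpha:
            # Significant on the initial sample: stop and report.
            false_positives += 1
            continue
        if p_first < p_max:
            # Encouraging but nonsignificant: add data and retest the pool.
            z_new = rng.gauss(0.0, 1.0)
            z_all = (math.sqrt(n_initial) * z_first
                     + math.sqrt(n_added) * z_new) / math.sqrt(n_initial + n_added)
            if two_sided_p(z_all) < alpha:
                false_positives += 1
    return false_positives / n_sims


def independent_tests_bound(alpha=0.05):
    """Error rate of two fully independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** 2


if __name__ == "__main__":
    rate = simulate_augmented_error(n_initial=20, n_added=20)
    print(f"simulated Type I error rate: {rate:.4f}")
    print(f"two-independent-tests bound: {independent_tests_bound():.4f}")
```

Raising `p_max` or making `n_added` large relative to `n_initial` should push the simulated rate toward the bound, while a strict augmentation threshold keeps it close to the nominal alpha.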

Keywords

significance testing, research methods, ethics


