Statistical significance is concerned with whether a research result is due to chance or sampling variability; practical significance is concerned with whether the result is useful in the real world. A growing awareness of the limitations of null hypothesis significance tests has led to a search for ways to supplement these procedures. A variety of supplementary measures of effect magnitude have been proposed. The use of these procedures in four APA journals is examined, and an approach to assessing the practical significance of data is described.
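The distinction between the two kinds of significance can be illustrated with a minimal sketch. It is an illustration only, not the approach described in the article: the summary statistics are hypothetical, the function names are made up for this example, Cohen's standardized mean difference d stands in for the broader family of effect magnitude measures, and the p-value uses a large-sample normal approximation rather than an exact t test.

```python
import math
from statistics import NormalDist


def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def large_sample_p_value(d, n1, n2):
    """Two-sided p-value from a normal approximation to the two-sample test."""
    z = d / math.sqrt(1 / n1 + 1 / n2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: two groups of 50,000 with means 100.3 and 100.0, SD 15.
d = cohens_d(100.3, 100.0, 15.0, 15.0, 50_000, 50_000)
p = large_sample_p_value(d, 50_000, 50_000)
print(f"d = {d:.3f}, p = {p:.4f}")
```

With these assumed numbers the difference is statistically significant (p below .01) even though the standardized effect is only about 0.02 standard deviations, which many readers would judge too small to matter in practice.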