Abstract

In this article, I discuss measures of effect size for two-group comparisons where data are not appropriately analyzed by least-squares methods. The Mann–Whitney test calculates a statistic that is a very useful measure of effect size, particularly suited to situations in which differences are measured on scales that either are ordinal or use arbitrary scale units. Both the difference in medians and the median difference between groups are also useful measures of effect size.
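As a rough illustration of the three measures named above, the sketch below computes each one directly from all cross-group pairs. It is a minimal Python example under stated assumptions, not the article's own procedure: the function names and the two small samples are hypothetical, and ties are counted as one half, which is one common convention for the Mann–Whitney statistic expressed as a probability.

```python
from itertools import product
from statistics import median


def mann_whitney_effect_size(x, y):
    """Estimate Pr(X > Y) + 0.5 * Pr(X = Y) over all cross-group pairs.

    Assumption: ties count as one half (a common convention).
    """
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(x, y)
    )
    return wins / (len(x) * len(y))


def difference_in_medians(x, y):
    """Median of the first group minus median of the second group."""
    return median(x) - median(y)


def median_difference(x, y):
    """Hodges-Lehmann style estimate: median of all pairwise differences."""
    return median(a - b for a, b in product(x, y))


# Hypothetical scores for illustration only; not data from the article.
treated = [3, 5, 6, 8, 9]
control = [1, 2, 4, 5, 7]

print(mann_whitney_effect_size(treated, control))
print(difference_in_medians(treated, control))
print(median_difference(treated, control))
```

The first quantity depends only on the ordering of the observations, which is why it remains interpretable on ordinal or arbitrary scales. The other two are expressed in the units of the measurement and need not agree with each other in general.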

Keywords

st0253, ranksum, Wilcoxon rank-sum test, Mann–Whitney statistic, Hodges–Lehmann median shift, effect size, qreg

What Hypotheses do “Nonparametric” Two-Group Tests Actually Test?
