In this article, I discuss measures of effect size for two-group comparisons where data are not appropriately analyzed by least-squares methods. The Mann–Whitney test calculates a statistic that is a very useful measure of effect size, particularly suited to situations in which differences are measured on scales that either are ordinal or use arbitrary scale units. Both the difference in medians and the median difference between groups are also useful measures of effect size.
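As a minimal sketch of the three measures named above, the following Python computes them for two small samples. The function names and the data are hypothetical, chosen only for illustration. The sketch assumes that the Mann–Whitney statistic is rescaled to the proportion of between-group pairs in which the first group's value is larger, with ties counted as one half, and that the median difference is the median of all pairwise between-group differences (the Hodges–Lehmann estimate).

```python
from itertools import product
from statistics import median


def probabilistic_index(x, y):
    """Estimate Pr(X > Y) + 0.5 * Pr(X = Y) over all between-group pairs.

    This equals the Mann-Whitney U statistic for x divided by len(x) * len(y).
    """
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(x, y)
    )
    return wins / (len(x) * len(y))


def difference_in_medians(x, y):
    """Median of x minus median of y."""
    return median(x) - median(y)


def median_difference(x, y):
    """Median of all pairwise differences a - b (Hodges-Lehmann estimate)."""
    return median(a - b for a, b in product(x, y))


# Hypothetical ordinal scores for two groups (illustration only).
treated = [3, 4, 4, 5, 9]
control = [1, 2, 4, 4, 6]

print("Probabilistic index:  ", probabilistic_index(treated, control))
print("Difference in medians:", difference_in_medians(treated, control))
print("Median difference:    ", median_difference(treated, control))
```

With these illustrative data, the two groups share the same median, so the difference in medians is 0, while the median of the pairwise differences is 1 and the probabilistic index is 0.68. The three measures answer different questions and need not agree.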