Abstract
In the course of screening a form of a medical licensing exam for items that exhibit differential item functioning (DIF) between men and women, the authors used the traditional Mantel-Haenszel (MH) statistic for initial screening and a Bayesian method for deeper analysis. For very easy items, the MH statistic flagged DIF where there was none unexpectedly often. The Bayesian method was not misled in the same way. In this article, the authors describe one possible Bayesian approach for the study of DIF, illustrate its use on this data set, demonstrate situations in which the MH test can be misleading, explore these issues through a sequence of simulation studies, and offer a plausible explanation and advice for those who wish to study DIF.
