Abstract
The Mantel-Haenszel chi-square (χ²MH) is widely used to detect differential item functioning (item bias) between ethnic and gender-based subgroups on educational and psychological tests. The empirical behavior of χ²MH is incompletely understood, and previous research is inconclusive. The present simulation study explored the effects of sample size, number of items, and trait distributions on the power of χ²MH to detect modeled differential item functioning. Sample size had a significant effect, with unacceptably low power when the focal and reference groups each contained 250 subjects. The discussion supports the 1990 recommendations of Swaminathan and Rogers and opposes the 1993 view of Zieky that a sample size of 250 for each group is adequate.
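For readers unfamiliar with the statistic, the following is a minimal sketch of the standard textbook form of the Mantel-Haenszel chi-square for a single item, computed over 2×2 tables (group by correct/incorrect) stratified by a matching score. It is not taken from the study itself; the function name `mantel_haenszel_chi2`, the tuple layout, and the use of the 0.5 continuity correction are illustrative assumptions.

```python
def mantel_haenszel_chi2(tables, continuity=True):
    """Mantel-Haenszel chi-square (1 df) for one item.

    tables: iterable of (a, b, c, d) counts, one tuple per matching-score level:
        a = reference group correct,  b = reference group incorrect,
        c = focal group correct,      d = focal group incorrect.
    The tuple layout and the function name are assumptions for illustration.
    """
    sum_a = sum_expected = sum_variance = 0.0
    for a, b, c, d in tables:
        total = a + b + c + d
        if total < 2:
            continue  # the variance term is undefined for fewer than 2 examinees
        n_ref, n_foc = a + b, c + d    # group sizes at this score level
        m_right, m_wrong = a + c, b + d  # correct / incorrect totals
        sum_a += a
        sum_expected += n_ref * m_right / total
        sum_variance += (n_ref * n_foc * m_right * m_wrong) / (total ** 2 * (total - 1))
    if sum_variance == 0:
        return 0.0  # no stratum carries information about group differences
    diff = abs(sum_a - sum_expected)
    if continuity:
        diff = max(diff - 0.5, 0.0)  # continuity correction, floored at zero
    return diff ** 2 / sum_variance


# Hypothetical counts for three score levels, for illustration only.
example_tables = [(30, 20, 22, 28), (40, 10, 33, 17), (45, 5, 41, 9)]
statistic = mantel_haenszel_chi2(example_tables)
# Compare against 3.84, the chi-square critical value for 1 df at alpha = .05.
flagged = statistic > 3.84
```

In a power study of the kind described above, a routine like this would be applied to each simulated data set, and power estimated as the proportion of replications in which the item with modeled differential functioning is flagged.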
