Abstract
The area between item response functions estimated in different samples is often used as a measure of differential item functioning (DIF). Under item response theory, this area should be 0, except for errors of measurement. This study examined the effectiveness of two statistical tests of this area (a Z test for exact signed area and a Z test for exact unsigned area) across different test lengths, sample sizes, proportions of DIF items on the test, and item parameter estimation conditions, using the two-parameter model. Errors in detection made using these two statistics were compared with errors made using Lord's χ². Differences among the three statistics were relatively small; however, the χ² statistic was more effective than either of the two Z tests at detecting simulated DIF. The Z test for the exact signed area was the least effective and was the most likely to result in false negative errors.
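To make the two area measures concrete, the following is a minimal sketch that approximates the signed and unsigned areas between two two-parameter logistic item response functions by numerical integration. It is not the study's procedure: the study tests exact (closed-form) areas, whereas this sketch uses a trapezoidal approximation. The function names `irf_2pl` and `areas`, the scaling constant `D = 1.7`, and the integration range are assumptions made for illustration only.

```python
import math

D = 1.7  # conventional logistic scaling constant (assumed, not taken from the abstract)


def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function (hypothetical helper)."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def areas(a1, b1, a2, b2, lo=-40.0, hi=40.0, n=80001):
    """Trapezoidal approximation of the signed and unsigned areas between
    two 2PL item response functions over the interval [lo, hi]."""
    step = (hi - lo) / (n - 1)
    signed = 0.0
    unsigned = 0.0
    for i in range(n):
        theta = lo + i * step
        diff = irf_2pl(theta, a1, b1) - irf_2pl(theta, a2, b2)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        signed += weight * diff * step
        unsigned += weight * abs(diff) * step
    return signed, unsigned


# Uniform DIF: equal discriminations, difficulties differ by 0.5.
print(areas(1.0, 0.0, 1.0, 0.5))

# Nonuniform DIF: equal difficulties, different discriminations.
print(areas(0.5, 0.0, 1.5, 0.0))
```

In the second call the two curves cross, so the positive and negative regions cancel and the signed area is approximately zero even though the unsigned area is not. This cancellation is one plausible reason a signed-area test might miss DIF, although the abstract itself does not attribute the false negative errors to it.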
