Abstract
Dorans and Holland (2000) and von Davier, Holland, and Thayer (2003) introduced measures of the degree to which an observed-score equating function is sensitive to the population on which it is computed. This article extends the findings of Dorans and Holland and of von Davier et al. to item response theory (IRT) true-score equating methods that are commonly used in the nonequivalent-groups anchor test (NEAT) design. Using data from the Advanced Placement Program Calculus AB exam, which contains multiple-choice (MC) and free-response (FR) sections, the authors investigate the population sensitivity of the IRT equating functions computed for the MC section only and for the MC and FR sections together. The degree of population sensitivity is also compared across three equating methods: the IRT true-score equating method and two observed-score equating methods, chained equipercentile and Tucker linear equating.
