Abstract
This study examined the performance of three equating methods: the presmoothed equipercentile method, the item response theory (IRT) true score method, and the IRT observed score method. The methods were evaluated against three equating criteria: the same distributions property, the first-order equity property, and the second-order equity property. The magnitude of the difficulty differences between the alternate forms was found to affect the extent to which the three equating properties hold. The Iowa Tests of Basic Skills (ITBS) standardization data and simulated data were analyzed in the study. The results showed that when the raw score distributions of alternate forms are similar, all three methods lead to adequate equating regardless of the criterion used. When the raw score distributions are dissimilar, the IRT true score method best preserves the first-order equity property, whereas the equipercentile method and the IRT observed score method both perform well in preserving the same distributions and second-order equity properties. The greater the difficulty difference between alternate forms, the less likely the first-order and second-order equity properties are to hold. These results are intended to inform equating practice by indicating, for actual equatings, the size of form-to-form differences at which each equating method performs well or poorly relative to the various criteria.
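For readers unfamiliar with the three criteria, they can be written compactly as follows. This is a sketch using the formulations common in the equating literature, not necessarily the exact operational definitions used in the study. The notation is an assumption introduced here for illustration: X and Y are raw scores on the new and old forms, e_Y(x) is the Form Y equivalent of a Form X score x, and θ is examinee proficiency.

```latex
% Assumed notation (not from the source):
%   X, Y      raw scores on the new and old forms
%   e_Y(x)    Form Y equivalent of Form X score x
%   \theta    examinee proficiency
\begin{align*}
\text{Same distributions:} \quad
  & \Pr\left[e_Y(X) \le y\right] = \Pr\left[Y \le y\right]
  && \text{for all } y \\
\text{First-order equity:} \quad
  & E\left[e_Y(X) \mid \theta\right] = E\left[Y \mid \theta\right]
  && \text{for all } \theta \\
\text{Second-order equity:} \quad
  & \operatorname{SD}\left[e_Y(X) \mid \theta\right] = \operatorname{SD}\left[Y \mid \theta\right]
  && \text{for all } \theta
\end{align*}
```

Under these formulations, the same distributions property concerns the population as a whole, while the two equity properties are conditional on proficiency. That distinction is consistent with a method satisfying one criterion well and another poorly when the forms differ in difficulty.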
