Abstract
The purpose of this paper was to examine alternative techniques for quantifying the errors associated with the criterion of equating a test to itself. Data for the study came from the national standardization of the 3-R's Achievement Test. The reading and mathematics subtests were analyzed using random samples from the Grade 4 norming group. Errors for two item response theory (IRT; three-parameter and Rasch) methods and the equipercentile equating method were investigated. A total of 45 error estimates from the sampling distribution were obtained for each combination of equating method and content area. Analysis of variance (ANOVA) procedures were also used to estimate the average error across methods for each content area. In addition, the results of the Phillips (1983a, 1983b) studies were reevaluated using the mean of the sampling distribution of equating errors for each of the methods from the present study and from the corresponding ANOVA error estimates. The results of this study suggest that single-replication error estimates may provide misleading assessments of the errors associated with equating a test to itself. The analysis of variance mean squares appeared somewhat promising as alternatives to error estimates by replication. Finally, the results of this study together with those of the Phillips (1983a) study suggest that the Rasch model may be more reliable than other IRT models for equating, but in some applications it is less valid.