Abstract
Model–data fit of item response theory (IRT) models is generally assessed by comparing examinees' observed performance on individual items with the performance predicted under the chosen IRT model. However, traditional chi-square methods for evaluating goodness-of-fit of IRT models are not appropriate when the underlying trait/ability is estimated imprecisely (e.g., on shorter assessments). This article describes a goodness-of-fit statistic that directly accounts for the uncertainty with which ability is estimated, together with a resampling-based hypothesis testing procedure. A simulation study was conducted to evaluate the empirical power and Type I error rates of the proposed procedure. Results indicated that the procedure should be useful for evaluating goodness-of-fit of IRT models in most testing applications where uncertainty in ability estimation is an issue.
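The general shape of a resampling-based item-fit test can be sketched as a parametric bootstrap under a 2PL model: compute a Pearson-type fit statistic from ability-grouped observed vs. predicted proportions correct, then build its null distribution by repeatedly simulating responses from the fitted model. This is a minimal illustrative sketch, not the article's actual statistic or procedure; the function names (`p_2pl`, `fit_statistic`), the grouping scheme, and the item parameters are all assumptions, and a fuller implementation would re-estimate abilities within each replication to propagate their uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_2pl(theta, a, b):
    # 2PL item response function: probability of a correct response
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fit_statistic(responses, theta_hat, a, b, n_groups=5):
    # Pearson-type item-fit statistic: group examinees by estimated ability,
    # then compare observed vs. model-predicted proportions correct per group
    order = np.argsort(theta_hat)
    stat = 0.0
    for g in np.array_split(order, n_groups):
        obs = responses[g].mean()
        exp = p_2pl(theta_hat[g], a, b).mean()
        exp = min(max(exp, 1e-6), 1.0 - 1e-6)  # guard against division by zero
        stat += len(g) * (obs - exp) ** 2 / (exp * (1.0 - exp))
    return stat

def resampling_p_value(responses, theta_hat, a, b, n_rep=500):
    # Parametric bootstrap: simulate responses from the fitted model,
    # recompute the statistic, and locate the observed value in the
    # resulting null distribution
    observed = fit_statistic(responses, theta_hat, a, b)
    null = np.empty(n_rep)
    for r in range(n_rep):
        sim = rng.binomial(1, p_2pl(theta_hat, a, b))
        null[r] = fit_statistic(sim, theta_hat, a, b)
    return observed, float((null >= observed).mean())

# Toy data: 400 examinees whose abilities are estimated with error,
# mimicking the imprecision of a short assessment
theta_true = rng.normal(size=400)
theta_hat = theta_true + rng.normal(scale=0.5, size=400)  # noisy estimates
a, b = 1.2, 0.3  # hypothetical item parameters
responses = rng.binomial(1, p_2pl(theta_true, a, b))

stat, p = resampling_p_value(responses, theta_hat, a, b)
print(f"fit statistic = {stat:.2f}, bootstrap p = {p:.3f}")
```

Because the data here are generated from the same 2PL model being tested, the bootstrap p-value should usually be non-significant; a misfitting item would push the observed statistic into the upper tail of the simulated null distribution.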