Abstract
Assessing the goodness of fit of item response theory models typically involves evaluating differences between observed and expected score response distributions using a chi-square test statistic. When these methods are applied to shorter assessments, the uncertainty with which ability is estimated greatly affects the approximation to the null chi-square distribution. Results from a Monte Carlo study indicated serious departures between the theoretical null distributions and the empirically derived sampling distributions of the chi-square statistic for tests with 8 and 16 constructed response items. This article also describes a fit statistic that attempts to account for the uncertainty in estimating ability and that could therefore be applied to testing situations in which ability is not precisely estimated. The method uses more of the information in the posterior distribution from which Bayesian point estimates of ability are obtained: it reflects the probabilities that examinees have ability at each of a range of values rather than restricting expectations to single values.
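The idea of spreading each examinee over a range of ability values can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the article's actual procedure. Every name here is hypothetical (`posterior_weights`, `pseudo_count_chi_square`, the 13-point ability grid, the standard normal prior), and the items are simulated as dichotomous Rasch-type items for brevity, although the study itself concerns constructed response items. Each examinee contributes to every ability level in proportion to their posterior probability at that level, instead of being assigned to a single point estimate.

```python
import numpy as np

def posterior_weights(responses, item_probs, prior):
    """Posterior probability of each grid ability value, per examinee.

    responses:  (N, J) integer category scores
    item_probs: (J, Q, K) model probability of category k at grid point q
    prior:      (Q,) prior weights over the ability grid (assumed known)
    """
    n_examinees, n_items = responses.shape
    likelihood = np.ones((n_examinees, prior.size))
    for j in range(n_items):
        # item_probs[j][:, x] has shape (Q, N); transpose to (N, Q)
        likelihood *= item_probs[j][:, responses[:, j]].T
    posterior = likelihood * prior
    return posterior / posterior.sum(axis=1, keepdims=True)

def pseudo_count_chi_square(item_responses, probs, posterior):
    """Pearson-type statistic built from posterior-weighted counts.

    item_responses: (N,) category scores on one item
    probs:          (Q, K) model probabilities for that item
    posterior:      (N, Q) weights from posterior_weights
    """
    n_categories = probs.shape[1]
    indicator = np.eye(n_categories)[item_responses]      # (N, K)
    observed = posterior.T @ indicator                    # (Q, K) pseudo-counts
    expected = posterior.sum(axis=0)[:, None] * probs     # (Q, K)
    return float(((observed - expected) ** 2 / expected).sum())

# Illustrative simulated data only; none of these values come from the article.
rng = np.random.default_rng(0)
theta_grid = np.linspace(-3, 3, 13)
prior = np.exp(-0.5 * theta_grid ** 2)
prior /= prior.sum()

difficulties = np.linspace(-1.5, 1.5, 8)                  # 8 hypothetical items
p_correct = 1 / (1 + np.exp(-(theta_grid[None, :] - difficulties[:, None])))
item_probs = np.stack([1 - p_correct, p_correct], axis=2) # (J, Q, K=2)

true_theta = rng.standard_normal(500)
p_true = 1 / (1 + np.exp(-(true_theta[:, None] - difficulties[None, :])))
responses = (rng.random(p_true.shape) < p_true).astype(int)

posterior = posterior_weights(responses, item_probs, prior)
statistic = pseudo_count_chi_square(responses[:, 0], item_probs[0], posterior)
print(statistic)
```

The sketch computes only the statistic. It makes no claim about the reference distribution against which such a statistic should be judged.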
