Abstract
The PARSCALE G² statistic is arguably the most popular item fit statistic in operational testing. For long tests, its Type I error rates have often been found to be satisfactory. However, these rates have been studied only for sample sizes of up to several thousand. The authors examined the Type I error rates of the PARSCALE G² statistic in a simulation study using sample sizes much larger than those considered in the literature. For any fixed test length, the Type I error rate of the statistic was found to increase to 1 as the sample size increased. These findings contradict the claim in the PARSCALE software manual that the PARSCALE G² statistic leads to a large-sample test, and they also contradict the common belief that the statistic has reasonable Type I error rates for long tests. Thus, this simulation study conveys the important practical message that the use of the PARSCALE G² statistic cannot always be recommended, even for long tests. In contrast, the Type I error rates of the item fit statistics of Orlando and Thissen were close to the nominal level for all simulation conditions considered here.
