Abstract
The validity of five methods of estimating the reliability of criterion-referenced tests was evaluated: one method based on the binomial expansion; two based on Kuder and Richardson's formulas 20 and 21 (KR-20 and KR-21); and two based on the analysis of variance. The methods were compared across nine conditions of variability among item means. Within each condition, the number of test items, the number of testees, the value of the criterion, the population mean, and the variance of true scores were varied to form 1024 cases. The results were analyzed by means of a conditions-by-methods analysis of variance, the Newman-Keuls test, and a nonparametric multiple comparison procedure. All of the methods tended to be conservative. The KR-21 method tended to be more valid given low variability among item means, and the KR-20 method given high variability.
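For orientation, the classical Kuder-Richardson formulas for dichotomously scored items are sketched below. These are the standard textbook forms, not necessarily the criterion-referenced adaptations evaluated in the study, which the abstract does not spell out. The notation is an assumption introduced here: k is the number of items, p_i is the proportion of testees passing item i, q_i = 1 - p_i, M is the mean total score, and sigma_X^2 is the variance of total scores.

```latex
% Classical KR-20 and KR-21 (standard forms; notation assumed, not from the abstract)
\begin{align}
  \text{KR-20} &= \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right) \\
  \text{KR-21} &= \frac{k}{k-1}\left(1 - \frac{M\,(k - M)}{k\,\sigma_X^2}\right)
\end{align}
% With \bar{p} = M/k, \bar{q} = 1 - \bar{p}, and \sigma_p^2 the variance of the item means p_i:
\begin{equation}
  \sum_{i=1}^{k} p_i q_i = k\,\bar{p}\,\bar{q} - k\,\sigma_p^2 ,
  \qquad
  \frac{M\,(k - M)}{k} = k\,\bar{p}\,\bar{q}
\end{equation}
```

Under these definitions the two formulas coincide when all item means are equal (sigma_p^2 = 0), and KR-21 falls further below KR-20 as the item means become more variable, which is the dimension along which the nine conditions differ.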
