Abstract
Pass/fail decisions for all two-method combinations of 10 standard-setting procedures were analyzed. The most reliable or equivalent pairs of methods were the practitioners–borderline group, Ebel–contrasting groups, and 33rd percentile–Angoff combinations. The least reliable or equivalent pairs were the chance/ideal mean–masters group and Nedelsky–masters group combinations. The most valid of the 10 methods, as evidenced by correlations with an external criterion, were the practitioner and borderline group approaches, while the least valid were the nonmasters, Nedelsky, and chance/ideal mean methods. Results of a more detailed study of the reliability and validity of the item-judgment methods are also reported.
