Abstract
In applied measurement, test scores are usually transformed to decisions. Analogous to classical test theory, the reliability of decisions has been defined as the consistency of decisions on a test and a retest or on two parallel tests. Coefficient kappa (Cohen, 1960) is used for assessing the consistency of decisions. This coefficient has been developed for assessing agreement between nominal scales. It is argued that the coefficient is not suited for assessing consistency of decisions. Moreover, it is argued that the concept consistency of decisions is not appropriate for assessing the quality of a decision procedure. It is proposed that the concept consistency of decisions be replaced by the concept optimality of the decision procedure. Two types of optimality are distinguished. The internal optimality is the risk of the decision procedure with respect to the true score the test is measuring. The external optimality is the risk of the decision procedure with respect to an external criterion. For assessing the optimality of a decision procedure, coefficient delta (van der Linden & Mellenbergh, 1978), which can be considered a standardization of the Bayes risk or expected loss, can be used. Two loss functions are dealt with: the threshold and the linear loss functions. Assuming psychometric theory, coefficient delta for internal optimality can be computed from empirical data for both the threshold and the linear loss functions. The computation of coefficient delta for external optimality needs no assumption of psychometric theory. For six tests coefficient delta as an index for internal optimality is computed for both loss functions; the results are compared with coefficient kappa for assessing the consistency of decisions with the same tests.
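As an illustration of the consistency index discussed above, the following minimal sketch computes coefficient kappa (Cohen, 1960) from a cross-table of decisions on a test and a retest, using the standard definition (observed agreement minus chance agreement, divided by one minus chance agreement). The function name `cohen_kappa` and the counts in the example table are hypothetical and are not taken from the six tests analyzed in the article. Coefficient delta is not sketched here, because its computation depends on the loss function and the psychometric assumptions set out in the article itself.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table of counts.

    table[i][j] is the number of examinees assigned to decision
    category i on the first administration and to category j on the
    second (e.g., test and retest). Assumes chance agreement is < 1.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed proportion of consistent decisions (main diagonal).
    p_o = sum(table[i][i] for i in range(k)) / n

    # Marginal proportions for each administration.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    # Agreement expected by chance if the two sets of decisions
    # were statistically independent.
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))

    return (p_o - p_e) / (1 - p_e)


# Hypothetical master/nonmaster decisions for 100 examinees.
example_table = [[40, 10],
                 [5, 45]]
kappa = cohen_kappa(example_table)
```

For the hypothetical table, the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is approximately 0.70.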
