Various indices for measuring agreement among several raters on the presence or absence of a trait can be interpreted as intraclass correlation coefficients. Such a reformulation clarifies the relationships among the measures, simplifies the computations involved, and permits simple significance tests to be carried out. An illustrative example is included.
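As a concrete illustration of the kind of reformulation the abstract describes, here is a minimal sketch, not the article's own worked example: it computes Fleiss' kappa for m raters judging a dichotomous trait (in its equal-raters form) alongside the one-way ANOVA intraclass correlation on the same 0/1 ratings. The two agree closely (they coincide up to a factor of n/(n-1) in the mean squares), and the ANOVA framing yields the simple F test the abstract alludes to. The simulated data and all variable names are illustrative assumptions.

```python
# Sketch (assumptions, not from the article): compare Fleiss' kappa for
# dichotomous ratings with the one-way ANOVA intraclass correlation (ICC)
# computed on the same 0/1 data.
import numpy as np

rng = np.random.default_rng(0)

n, m = 50, 4                            # subjects, raters per subject
# Simulate correlated binary ratings: each subject has a latent prevalence.
p_i = rng.beta(2, 2, size=n)
X = (rng.random((n, m)) < p_i[:, None]).astype(float)   # n x m matrix of 0/1 ratings

x = X.sum(axis=1)                       # positive ratings per subject
p_bar = x.sum() / (n * m)               # overall proportion of positive ratings

# Fleiss' kappa (dichotomous trait, equal number of raters per subject)
kappa = 1.0 - x @ (m - x) / (n * m * (m - 1) * p_bar * (1.0 - p_bar))

# One-way ANOVA intraclass correlation on the same 0/1 data
grand = X.mean()
bms = m * ((X.mean(axis=1) - grand) ** 2).sum() / (n - 1)              # between-subject MS
wms = ((X - X.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (m - 1)) # within-subject MS
icc = (bms - wms) / (bms + (m - 1) * wms)

print(f"Fleiss kappa: {kappa:.4f}")
print(f"One-way ICC : {icc:.4f}")       # agrees closely with kappa
# The ANOVA framing gives a simple significance test of the hypothesis of
# no agreement: F = BMS/WMS on (n - 1, n(m - 1)) degrees of freedom.
print(f"F statistic : {bms / wms:.2f}  (df = {n - 1}, {n * (m - 1)})")
```

Running the sketch shows kappa and the ICC differing only in the third or fourth decimal place for moderate n, which is the computational payoff: standard one-way ANOVA machinery, including its F test, can be applied directly to the binary ratings.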