Weighted kappa, described by Cohen in 1968, is widely used in psychological research to
measure agreement between two independent raters. Everitt subsequently provided the exact
variance of weighted kappa for two raters. In this paper, Everitt's exact variance is
extended to three or more raters.
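
For orientation, the following is a minimal sketch of the two-rater weighted kappa statistic of Cohen (1968) that the paper builds on. It is not the paper's own program (reference 15 describes a FORTRAN implementation); the function name, the quadratic disagreement weights, and the example table are illustrative assumptions only.

import numpy as np

def weighted_kappa(table, weights=None):
    # Cohen's (1968) weighted kappa for a k x k two-rater contingency table.
    # table[i, j] counts items placed in category i by rater 1 and j by rater 2.
    # If no weights are supplied, quadratic disagreement weights are assumed
    # (an illustrative default, not prescribed by the paper).
    table = np.asarray(table, dtype=float)
    k = table.shape[0]
    if weights is None:
        idx = np.arange(k)
        weights = (idx[:, None] - idx[None, :]) ** 2   # 0 on the diagonal
    p = table / table.sum()                  # observed cell proportions
    row = p.sum(axis=1)                      # rater 1 marginal proportions
    col = p.sum(axis=0)                      # rater 2 marginal proportions
    expected = np.outer(row, col)            # chance-expected proportions
    observed_dis = (weights * p).sum()       # weighted observed disagreement
    expected_dis = (weights * expected).sum()  # weighted chance disagreement
    return 1.0 - observed_dis / expected_dis

# Hypothetical example: two raters classifying 100 items into three ordered categories
table = [[30, 5, 2],
         [4, 25, 6],
         [1, 7, 20]]
print(round(weighted_kappa(table), 3))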
References
1. Andrés, A. M., & Marzo, P. F. (2004) Delta: a new measure of agreement between two raters. British Journal of Mathematical and Statistical Psychology, 57, 1–19.
2. Andrés, A. M., & Marzo, P. F. (2005) Chance-corrected measures of reliability and validity in K x K tables. Statistical Methods in Medical Research, 14, 473–492.
3. Banerjee, M., Capozzoli, M., McSweeney, L., & Sinha, D. (1999) Beyond kappa: a review of interrater agreement measures. The Canadian Journal of Statistics, 27, 3–23.
4. Brennan, R. L., & Prediger, D. J. (1981) Coefficient kappa: some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
5. Cohen, J. (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
6. Cohen, J. (1968) Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220.
7. De Mast, J. (2007) Agreement and kappa-type indices. The American Statistician, 61, 148–153.
8. Everitt, B. S. (1968) Moments of the statistics kappa and weighted kappa. British Journal of Mathematical and Statistical Psychology, 21, 97–103.
9. Fleiss, J. L., Cohen, J., & Everitt, B. S. (1969) Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323–327.
10. Graham, P., & Jackson, R. (1993) The analysis of ordinal agreement data: beyond weighted kappa. Journal of Clinical Epidemiology, 46, 1055–1062.
11. Hsu, L. M., & Field, R. (2003) Interrater agreement measures: comments on kappa_n, Cohen's kappa, Scott's π, and Aickin's α. Understanding Statistics, 2, 205–219.
12. Kundel, H. L., & Polansky, M. (2003) Measurement of observer agreement. Radiology, 228, 303–308.
13. Maclure, M., & Willett, W. C. (1987) Misinterpretation and misuse of the kappa statistic. American Journal of Epidemiology, 126, 161–169.
14. Mielke, P. W., & Berry, K. J. (1988) Cumulant methods for analyzing independence of r-way contingency tables and goodness-of-fit frequency data. Biometrika, 75, 790–793.
15. Mielke, P. W., Berry, K. J., & Johnston, J. E. (2005) A FORTRAN program for computing the exact variance of weighted kappa. Perceptual and Motor Skills, 101, 468–472.
16. Nelson, J. C., & Pepe, M. S. (2000) Statistical description of interrater variability in ordinal ratings. Statistical Methods in Medical Research, 9, 475–496.
17. Schuster, C. (2004) A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educational and Psychological Measurement, 64, 243–253.
18. Schuster, C., & Smith, D. A. (2005) Dispersion-weighted kappa: an integrative framework for metric and nominal scale agreement coefficients. Psychometrika, 70, 135–146.
19. Zwick, R. (1988) Another look at interrater agreement. Psychological Bulletin, 103, 374–378.