This program computes levels of multiple-judge reliability under the following conditions: different sets of judges perform the ratings; the number of judges is constant; and the scale of measurement is nominal.
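The conditions listed here (different sets of judges, a constant number of judges, nominal categories) are the setting for which Fleiss' kappa for many raters was developed. The abstract does not say which coefficient the program computes, so the following is only a minimal sketch under the assumption that a Fleiss-type kappa is the quantity of interest. The function name `fleiss_kappa` and the input layout (one row per rated subject, one column per category, each cell holding the number of judges who chose that category) are illustrative choices, not taken from the original program.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Chance-corrected agreement for nominal ratings (Fleiss-type kappa).

    counts[i][j] is the number of judges who assigned subject i to
    category j. Every row must sum to the same number of judges.
    (Hypothetical helper for illustration; not the published program.)
    """
    if not counts:
        raise ValueError("at least one subject is required")

    n_subjects = len(counts)
    n_judges = sum(counts[0])
    if n_judges < 2:
        raise ValueError("at least two judges per subject are required")
    if any(sum(row) != n_judges for row in counts):
        raise ValueError("the number of judges must be constant across subjects")

    n_categories = len(counts[0])
    if any(len(row) != n_categories for row in counts):
        raise ValueError("every subject needs the same set of categories")

    # Observed agreement: proportion of agreeing judge pairs per subject,
    # averaged over subjects.
    pairs = n_judges * (n_judges - 1)
    p_observed = sum(
        (sum(c * c for c in row) - n_judges) / pairs for row in counts
    ) / n_subjects

    # Chance agreement: sum of squared overall category proportions.
    total = n_subjects * n_judges
    p_chance = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(n_categories)
    )

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")

    return (p_observed - p_chance) / (1.0 - p_chance)


# Two subjects, three judges each, perfect agreement within each subject.
example = [[3, 0], [0, 3]]
print(fleiss_kappa(example))  # → 1.0
```

Because the statistic uses only per-subject category counts, it never needs to know which judge produced which rating, which is why it suits designs where the judges differ from one subject to the next.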