A Computer Program for Assessing the Reliability of Nominal Scales Using Varying Sets of Multiple Raters

Abstract

This program computes multiple-judge reliability levels under the following conditions: different sets of judges perform the ratings; the number of judges is constant; and the scale of measurement is nominal.
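
The abstract does not name the statistic the program computes. The three conditions it lists (varying sets of judges, a constant number of judges, a nominal scale) are the ones under which Fleiss's kappa is usually applied, so the following is a minimal sketch of that coefficient, offered as an assumption rather than as the authors' program. The function name `fleiss_kappa` and the count-table input layout are illustrative choices, not taken from the article.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Chance-corrected agreement for a constant number of judges per subject.

    counts[i][j] is the number of judges who assigned subject i to
    nominal category j. The judges may differ from subject to subject;
    only the number of ratings per subject must be the same.
    (Assumed input layout, chosen for this sketch.)
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("need at least one subject")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two ratings per subject")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must receive the same number of ratings")
    n_categories = len(counts[0])

    # Proportion of all assignments that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]

    # Per-subject agreement: share of judge pairs that chose the same category.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_i) / n_subjects
    p_expected = sum(p * p for p in p_j)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Two subjects, three judges each, two categories, complete agreement.
print(fleiss_kappa([[3, 0], [0, 3]]))
```

Working from per-subject category counts rather than from judge identities reflects the first condition in the abstract: when the judges change from subject to subject, only how many judges chose each category is comparable across subjects.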
