A Microcomputer Basic Program to Calculate the Level of Agreement Between Two Raters Using Nominal Scale Classification

Abstract

A simple percentage measure of agreement among raters using nominal scales may provide misleading reliability information. Scott's Pi and Cohen's Kappa are two chance-corrected statistics which have been widely utilized in assessing interrater agreement. This paper presents a BASIC microcomputer program which calculates these two chance-corrected measures of interrater reliability.
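To illustrate why raw percentage agreement can mislead, the following is a minimal Python sketch (not the BASIC program the paper presents) of the standard textbook definitions of the two statistics. Both take the form (P_o - P_e) / (1 - P_e), where P_o is the observed proportion of agreement and P_e is the proportion expected by chance. Cohen's Kappa estimates P_e from each rater's own marginal proportions, while Scott's Pi pools the two raters' marginals. The function name `agreement_stats` and the sample ratings are hypothetical, chosen only for illustration.

```python
from collections import Counter


def agreement_stats(rater_a, rater_b):
    """Return (percent agreement, Scott's pi, Cohen's kappa) for two raters.

    Illustrative sketch only: rater_a and rater_b are equal-length sequences
    of nominal category labels, one label per rated item.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)

    # Observed proportion of items on which the two raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Cohen: chance agreement from the product of each rater's own marginals.
    p_e_kappa = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)

    # Scott: chance agreement from the squared pooled marginals.
    p_e_pi = sum(((count_a[c] + count_b[c]) / (2 * n)) ** 2 for c in categories)

    # Both coefficients are undefined when chance agreement equals 1
    # (every rating falls in a single category).
    if p_e_kappa == 1 or p_e_pi == 1:
        raise ValueError("coefficients undefined when expected agreement is 1")

    kappa = (p_o - p_e_kappa) / (1 - p_e_kappa)
    pi = (p_o - p_e_pi) / (1 - p_e_pi)
    return p_o, pi, kappa


# Hypothetical ratings of ten items by two raters.
a = ["yes"] * 6 + ["no"] * 4
b = ["yes"] * 4 + ["no"] * 5 + ["yes"]
p_o, pi, kappa = agreement_stats(a, b)
print(round(p_o, 3), round(pi, 3), round(kappa, 3))  # prints 0.7 0.394 0.4
```

In this made-up example the raters agree on 70% of the items, yet both chance-corrected coefficients are near 0.4, because about half of that agreement would be expected by chance alone.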

References

Antonak, R. F. A computer program to compute measures of response agreement for nominal scale data obtained from two judges. Behavior Research Methods and Instrumentation, 1977, 9, 553.

Berk, R. A. and Campbell, K. L. A FORTRAN program for Cohen's kappa coefficient of observer agreement. Behavior Research Methods and Instrumentation , 1976, 8, 396.

Cohen, J. A coefficient of agreement for nominal scales. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1960, 20, 37-46.

Coons, D. F. A concise method for computing normal curve areas using a calculator. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1978, 38, 653-655.

Costello, A. J. The reliability of direct observations. British Psychological Society Bulletin, 1973, 26, 105-108.

Flanders, N. A. The problems of observer training and reliability. In E. J. Amidon and J. B. Hough (Eds.), Interaction Analysis: Theory, Research, and Application. Reading, Mass.: Addison-Wesley, 1967.

Hartmann, D. P. Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis, 1977, 10, 103-116.

Krippendorff, K. Bivariate agreement coefficients for reliability of data. In E. F. Borgatta and G. W. Bohrnstedt (Eds.), Sociological Methodology. San Francisco: Jossey-Bass, 1970.

Light, R. J. Issues in the analysis of qualitative data. In R. M. W. Travers (Ed.), Second Handbook of Research on Teaching. Chicago: Rand McNally, 1973.

Scott, W. A. Reliability of content analysis: the case of nominal scale coding. Public Opinion Quarterly, 1955, 19, 321-325.

Spitzer, R. L. and Fleiss, J. L. A re-analysis of the reliability of psychiatric diagnosis. British Journal of Psychiatry, 1974, 125, 341-347.

Thornton, B. W. and Croskey, F. L. A computer program for calculating an index of interobserver reliability from timeseries data. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1975, 35, 735-737.

Watkins, M. W. Chance and interrater agreement on manuscripts. American Psychologist, 1979, 34, 796-798.