Reminder: Reliability of Global Judgments

Abstract

Reliability, measured as the extent of agreement among raters, is often a problem for complex global judgments. Empirically, using multiple raters improved reliability in a manner consistent with predictions from the Spearman-Brown formula. Implications for the reliability of clinical diagnosis are suggested.
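As an illustration of the prediction the abstract refers to, the Spearman-Brown formula gives the expected reliability of a composite of k parallel raters from the reliability r of a single rater: r_k = k r / (1 + (k - 1) r). The sketch below is a minimal one, not the article's own analysis; the function names and the numeric inputs are hypothetical, chosen only to show the arithmetic.

```python
def spearman_brown(single_rater_reliability: float, n_raters: float) -> float:
    """Predicted reliability of the pooled judgment of n_raters raters.

    Assumes raters are interchangeable (parallel), which is the standard
    assumption behind the Spearman-Brown formula.
    """
    r = single_rater_reliability
    if n_raters <= 0:
        raise ValueError("n_raters must be positive")
    return n_raters * r / (1 + (n_raters - 1) * r)


def raters_needed(single_rater_reliability: float, target: float) -> float:
    """Number of raters (possibly fractional) needed to reach a target
    reliability, obtained by solving the formula above for k."""
    r = single_rater_reliability
    if not (0 < r < 1 and 0 < target < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1 - r) / (r * (1 - target))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the article.
    for k in (1, 2, 4, 8):
        print(k, round(spearman_brown(0.40, k), 3))
```

In practice the result of raters_needed would be rounded up to the next whole rater. Under these assumptions, gains from each added rater shrink as k grows, which is why pooling a few judges helps most when single-rater reliability is low.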
