Reliability as measured by the extent of agreement is often a problem for complex global judgments. Empirically, the use of multiple raters improved reliability consistent with predictions from the Spearman-Brown formula. Implications for the reliability of clinical diagnosis are suggested.
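The Spearman-Brown prediction mentioned above can be illustrated with a short calculation. This is a minimal sketch, not the article's own analysis: the function names `spearman_brown` and `raters_needed` are hypothetical, and the reliability values are illustrative numbers rather than results from the study. It assumes the standard form of the formula, in which the reliability of the average of k parallel raters is k·r / (1 + (k − 1)·r), where r is the reliability of a single rater.

```python
def spearman_brown(single_rater_reliability: float, n_raters: float) -> float:
    """Predicted reliability of the mean of n_raters parallel ratings.

    Standard Spearman-Brown prophecy formula: k*r / (1 + (k - 1)*r).
    """
    r = single_rater_reliability
    if n_raters <= 0:
        raise ValueError("n_raters must be positive")
    return n_raters * r / (1 + (n_raters - 1) * r)


def raters_needed(single_rater_reliability: float, target: float) -> float:
    """Number of raters needed to reach a target reliability.

    Obtained by solving the Spearman-Brown formula for k; the result may be
    fractional and would be rounded up in practice.
    """
    r = single_rater_reliability
    if not (0 < r < 1 and 0 < target < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1 - r) / (r * (1 - target))


if __name__ == "__main__":
    # Illustrative values only (not taken from the article).
    for k in (1, 2, 4, 8):
        print(k, round(spearman_brown(0.5, k), 3))
    print(round(raters_needed(0.5, 0.8), 3))
```

Under these assumptions, a single-rater reliability of 0.5 would be predicted to rise to 0.8 when four raters' judgments are averaged, which is the kind of gain from pooling raters that the abstract describes.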