Cronbach's alpha and Cohen's kappa were compared and found to differ along two major facets. A fourfold classification system based on these facets clarifies the double contrast and yields a common metric that allows the two coefficients to be compared directly. A new estimator, coefficient beta, is introduced in the process and presented as a complement to coefficient alpha for estimating the psychometric properties of test scores and ratings.
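For readers who want the two classical estimators at hand, the following is a minimal sketch (not taken from the article) of Cronbach's alpha and Cohen's kappa computed from raw item scores and nominal ratings, using the standard textbook formulas with NumPy. Coefficient beta, the new estimator, is defined in the full text and is not reproduced here; the data below are illustrative only.

import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_persons, k_items) array of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)        # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories to the same objects."""
    rater_a, rater_b = np.asarray(rater_a), np.asarray(rater_b)
    categories = np.union1d(rater_a, rater_b)
    # Observed proportion of agreement
    p_o = np.mean(rater_a == rater_b)
    # Chance agreement expected from the two raters' marginal distributions
    p_e = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Illustrative data: 5 persons x 4 items, and two raters' judgments of 6 objects
items = [[3, 4, 3, 4], [2, 2, 3, 2], [4, 5, 4, 5], [1, 2, 1, 2], [3, 3, 4, 3]]
print(round(cronbach_alpha(items), 3))
print(round(cohen_kappa([1, 2, 2, 3, 1, 3], [1, 2, 3, 3, 1, 2]), 3))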