Intercoder Reliability Estimation Approaches in Marketing: A Generalizability Theory Framework for Quantitative Data

Abstract

A review of estimation approaches for intercoder reliability reported in articles in leading marketing journals reveals that most marketing researchers are using inadequate measures. The authors recommend that marketing researchers report dependability indices based on generalizability theory for quantitative coding systems.
