The main purpose of this article is to demonstrate how halo effects may be detected and quantified using two independent ratings of the same person. A practical illustration is given to show how halo effects can be avoided.