Do We Know a Successful Teacher When We See One? Experiments in the Identification of Effective Teachers

Abstract

The authors report on three experiments designed to (a) test, under increasingly favorable conditions, whether judges can correctly rate teachers of known ability to raise student achievement; (b) inquire about what criteria judges use when making their evaluations; and (c) determine which criteria are most predictive of a teacher’s effectiveness. All three experiments resulted in high agreement among judges but low ability to identify effective teachers. Certain items on the established measure that are related to instructional behavior did reliably predict teacher effectiveness. The authors conclude that (a) judges, no matter how experienced, are unable to identify successful teachers; (b) certain cognitive operations may be contributing to this outcome; and (c) it is desirable and possible to develop a new measure that does produce accurate predictions of a teacher’s ability to raise student achievement test scores.

Keywords

teacher effectiveness, teacher evaluation, classroom observation, value-added
