Abstract
Evaluators “save the best for last” by systematically elevating the final year-end observation scores for novice or low-performing teachers. Examining 5,251 K–12 teachers, we document a striking reversal: evaluator scores typically fell below teacher self-assessments across initial and midyear observations but exceeded them by approximately 0.25 SD in final year-end observations. This pattern was robust to unobserved differences across annual teacher-evaluator pairings, the instructional domains observed, and observation timing. The data suggest that evaluation systems are not implemented uniformly across performance tiers, revealing tensions between accountability and developmental support that merit explicit acknowledgment in evaluation framework design.
