Abstract
A short training program for raters of an essay-writing task consisted of scoring 20 training essays with immediate feedback on the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session, and the 14 trainees who passed an accuracy threshold proceeded to score additional essays. Performance of the newly trained raters was compared to that of 16 expert raters with extensive experience in scoring responses to the writing task. Results showed that scores from the newly trained raters exhibited measurement properties (mean and variability of scores, reliability and various validity coefficients, and underlying factor structure) similar to those of the experienced raters. Implications of initial training and screening for rater performance are discussed.
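The group-level properties compared here (score means, variability, and inter-rater reliability) are standard descriptive and correlational statistics. The sketch below is purely illustrative and is not the authors' analysis: the data, array names, and the use of average pairwise Pearson correlation as a simple reliability estimate are all assumptions introduced for the example.

```python
import numpy as np

# Hypothetical scores: rows are essays, columns are raters within each group.
# All values here are invented purely for illustration.
rng = np.random.default_rng(0)
true_quality = rng.normal(4.0, 1.0, size=50)                      # latent essay quality
new_raters = true_quality[:, None] + rng.normal(0, 0.5, size=(50, 3))
expert_raters = true_quality[:, None] + rng.normal(0, 0.5, size=(50, 3))

def group_summary(scores: np.ndarray) -> dict:
    """Mean, variability, and a simple inter-rater reliability estimate
    (average pairwise Pearson correlation between raters)."""
    n_raters = scores.shape[1]
    corr = np.corrcoef(scores, rowvar=False)          # rater-by-rater correlation matrix
    pairwise = corr[np.triu_indices(n_raters, k=1)]   # off-diagonal correlations only
    return {
        "mean": scores.mean(),
        "sd": scores.std(ddof=1),
        "inter_rater_r": pairwise.mean(),
    }

print("Newly trained:", group_summary(new_raters))
print("Experienced:  ", group_summary(expert_raters))
```

Comparing the two dictionaries side by side mirrors, in miniature, the kind of group-level contrast the abstract describes; the actual study also examined validity coefficients and factor structure, which are beyond this sketch.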