Abstract
As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish rigorous procedures for monitoring the performance of both human and automated scoring processes during operational administrations. This paper provides an overview of the automated speech scoring system SpeechRaterSM and how to use charts and evaluation statistics to monitor and evaluate automated scores and human rater scores of spoken constructed responses.
Keywords
Get full access to this article
View all access options for this article.
