An Interesting Problem in the Estimation of Scoring Reliability

Abstract

A performance assessment consisting of 10 separate exercises was scored with a randomized scoring procedure. All responses to each exercise were rated once; in addition, a randomly selected subset of the responses to each exercise received an independent second rating. Each second rating was averaged with the corresponding first rating before the scores were computed. This article presents a method for estimating the scoring reliability (interrater reliability) coefficient and the standard error of scoring for the resulting scores. It concludes with numerical examples showing how the reliability estimation procedure can be used to estimate the effect of varying the proportion of responses that are double-scored.
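To make the role of the double-scoring proportion concrete, the following is a minimal sketch and not the article's own estimation procedure. It assumes that the total score is the sum of the exercise scores, that rater errors are independent across ratings and exercises and uncorrelated with true scores, and that responses are selected for double-scoring at random. Under those assumptions a single rating carries the full rater-error variance and an average of two ratings carries half of it. The function names and all numerical values are hypothetical.

```python
import math


def expected_error_variance(rater_var, p_double):
    """Expected rater-error variance of one exercise score.

    A response rated once has error variance rater_var; a response whose
    two independent ratings are averaged has rater_var / 2. With a
    fraction p_double of responses double-scored at random, the expected
    value is a weighted mix of the two (an assumption of this sketch).
    """
    if not 0.0 <= p_double <= 1.0:
        raise ValueError("p_double must lie between 0 and 1")
    return (1.0 - p_double) * rater_var + p_double * rater_var / 2.0


def scoring_reliability(true_var, rater_vars, p_doubles):
    """Return (reliability coefficient, standard error of scoring).

    true_var   : hypothetical variance of the total score free of rater error
    rater_vars : hypothetical rater-error variance for each exercise
    p_doubles  : proportion of responses double-scored for each exercise
    """
    if len(rater_vars) != len(p_doubles):
        raise ValueError("need one proportion per exercise")
    # Independent errors add across the exercises that make up the total.
    error_var = sum(
        expected_error_variance(v, p) for v, p in zip(rater_vars, p_doubles)
    )
    return true_var / (true_var + error_var), math.sqrt(error_var)


if __name__ == "__main__":
    # Illustrative values only: 10 exercises with equal rater-error variance.
    for p in (0.0, 0.25, 0.5, 1.0):
        rel, se = scoring_reliability(40.0, [1.0] * 10, [p] * 10)
        print(f"p = {p:.2f}  reliability = {rel:.3f}  SE of scoring = {se:.3f}")
```

In this simplified model, raising the double-scored proportion from 0 to 1 halves the rater-error variance, so the gain in reliability depends on how large that variance is relative to the true-score variance.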

Keywords

interrater reliability, performance assessment, scoring reliability
