Abstract
Language performance assessments typically require human raters, introducing possible error. In international examinations of English proficiency, rater language background is an especially salient factor that needs to be considered. The focus of this study is whether rater language background introduces bias into writing performance assessment. Data for this study are ratings assigned by Michigan English Language Assessment Battery (MELAB) raters to compositions written by examinees of various language backgrounds. While most of the raters are native speakers of English, four have first languages other than English: two Spanish, one Korean, and one bilingual speaker of Filipino and Chinese (Amoy). Examinees were divided into 21 language groups. The IRT application FACETS was used to estimate and control for rater severity when calculating the amount of bias reflected in each rater’s set of ratings for each language group. Results show that the magnitude of the bias terms for all raters across all language groups was minimal, with little effect on examinee scores, and that there is no pattern of language-related bias in the ratings.
