Abstract
The purpose of this study is twofold. First, using FACETS (Linacre, 1996), it investigates how the judgements of trained teacher raters are biased towards certain types of candidates and certain criteria in assessing Japanese second language (L2) writing. Previous studies that identified significantly biased rater-candidate interactions did not discuss who the candidates were; this study examines rater-candidate interactions in much greater detail. Second, since there is no established rating scale for assessing Japanese L2 writing, this study explores the potential of a modified version of Jacobs et al.'s (1981) rating scale for norm-referenced decisions about Japanese L2 writing ability. The participants comprised 234 university candidates and three trained teacher raters. The raters produced highly correlated scores and were self-consistent, but significant differences in overall severity surfaced. The raters scored certain candidates and criteria more leniently or harshly, and every rater's bias pattern was different. The highest percentage of significantly biased rater-candidate interactions was found among candidates whose ability was extremely high or low. This study suggests that the modified version of Jacobs et al.'s scale can be a reliable tool for assessing Japanese L2 writing in norm-referenced settings, but that multiple ratings are still necessary.