Abstract
Today’s language models can produce syntactically accurate and semantically coherent text. This capability presents new opportunities for generating content for language assessments, which has traditionally required intensive expert resources. However, these models are also known to generate biased text, leading to representational harms. Therefore, to use language models for language assessment content generation, it is crucial to address this bias so that all test takers have a fair and beneficial assessment experience. This paper proposes a novel method for generating language assessment content free from representational harms. Specifically, the method eliminates any systematic relationship between demographic groups and their attributes through a two-step process. Two case studies were conducted to illustrate and evaluate the method’s effectiveness. In both studies, the method produced language assessment content comparable to the respective targets and systematically prevented representational harms.