Abstract
The scoring scheme used in vocabulary translation tests remains a topic of debate among researchers. This study examines the impact of various scoring schemes on the reliability and validity of the Chinese Vocabulary Proficiency Test (CVPT), as well as on the scores and rankings of test-takers. A total of 170 Indonesian learners of Chinese as a second/foreign language (CS/FL) were recruited to complete the CVPT, which required them to translate 100 Chinese words into Indonesian or to explain their meanings in Indonesian. The translation responses were classified into four levels based on the semantic distance between the answers and the intended word meanings. Subsequently, five distinct scoring schemes were devised: strict and lenient 2-point schemes, strict and lenient 3-point schemes, and a 4-point scheme. The analysis revealed that the five scoring schemes exhibited comparable levels of reliability and validity across various metrics, including person- and item-level reliability and separation values and external validity, and produced comparable test-taker scores and rankings. However, the three polytomous scoring schemes tended to yield a greater degree of dimensionality than the two dichotomous scoring schemes. Implications for the development of scoring frameworks in vocabulary translation assessments and for vocabulary instruction are discussed based on the findings.
