
Abstract

The scoring scheme used in vocabulary translation tests remains a topic of debate among researchers. This study examines the impact of various scoring schemes on the reliability and validity of the Chinese Vocabulary Proficiency Test (CVPT), as well as on the scores and rankings of test-takers. A total of 170 Indonesian learners of Chinese as a second/foreign language (CS/FL) were recruited to complete the CVPT, which required them to translate 100 Chinese words into Indonesian or explain their meanings in Indonesian. The translation responses were classified into four levels based on the semantic distance between each answer and the intended word meaning. Subsequently, five distinct scoring schemes were devised: strict and lenient 2-point schemes, strict and lenient 3-point schemes, and a 4-point scheme. The analysis revealed that the five scoring schemes exhibited comparable levels of reliability and validity across various metrics, including person- and item-level reliability and separation values and external validity, and produced comparable test-taker scores and rankings. However, the three polychotomous scoring schemes tended to yield a greater degree of dimensionality than the two dichotomous scoring schemes. Implications for the development of scoring frameworks in vocabulary translation assessments and for vocabulary instruction are proposed based on the findings.
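As a rough illustration of how a four-level classification could feed five scoring schemes, the sketch below maps each response level to points under each scheme. The level codes (0 for the most distant answer, 3 for an answer matching the intended meaning) and every level-to-point mapping are assumptions made for illustration; the abstract does not specify which levels earn credit under the strict and lenient schemes, so this should not be read as the study's actual rubric.

```python
# Hypothetical level codes (an assumption, not taken from the study):
# 0 = semantically most distant ... 3 = matches the intended meaning.
# Each mapping below is likewise an illustrative guess at how "strict"
# and "lenient" variants might differ in awarding partial credit.
SCHEMES = {
    "strict_2pt":  {0: 0, 1: 0, 2: 0, 3: 1},
    "lenient_2pt": {0: 0, 1: 0, 2: 1, 3: 1},
    "strict_3pt":  {0: 0, 1: 0, 2: 1, 3: 2},
    "lenient_3pt": {0: 0, 1: 1, 2: 1, 3: 2},
    "four_pt":     {0: 0, 1: 1, 2: 2, 3: 3},
}


def score_responses(levels, scheme):
    """Sum the points a test-taker earns under one scoring scheme."""
    mapping = SCHEMES[scheme]
    return sum(mapping[level] for level in levels)


def score_all_schemes(levels):
    """Return the total score under every scheme for one test-taker."""
    return {name: score_responses(levels, name) for name in SCHEMES}


if __name__ == "__main__":
    # Five classified responses from one hypothetical test-taker.
    example_levels = [3, 2, 1, 0, 3]
    for name, total in score_all_schemes(example_levels).items():
        print(f"{name}: {total}")
```

Scoring the same classified responses under every scheme in this way is what makes it possible to compare totals and rankings across schemes, as the study reports doing.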

Keywords

Chinese as a second/foreign language; Chinese Vocabulary Proficiency Test; vocabulary assessment; vocabulary size test; vocabulary translation



What counts as an acceptable answer in a vocabulary translation test? A preliminary study on the scoring schemes for learners of Chinese as a second/foreign language
