American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (2014). Standards for Educational and Psychological Testing.
Bridgeman, B., Trapani, C., & Attali, Y. (2012). Comparison of human and machine scoring of essays: Differences by gender, ethnicity, and country. Applied Measurement in Education, 25, 27–40. https://doi.org/10.1080/08957347.2012.635502
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (2008). Building a validity argument for the Test of English as a Foreign Language™. Lawrence Erlbaum.
Chen, W., Lei, Y., Feng, K., Wu, S., & Li, L. (2019). Provincial emission accounting for CO2 mitigation in China: Insights from production, consumption and income perspectives. Applied Energy, 25, 51–10. https://doi.org/10.1016/j.apenergy.2019.113754
Cheng, F. (2011). Justifying the interpretations about a listening-to-retell task in CELST in NMET (GD) [Unpublished doctoral dissertation]. Guangdong University of Foreign Studies.
Deane, P. (2013). On the relation between automated essay scoring and modern views of the writing construct. Assessing Writing, 18, 7–24. https://doi.org/10.1016/j.asw.2012.10.002
Education Examinations Authority of Guangdong Province. (2016). Test syllabus and sample paper disk for Computer-based English Listening and Speaking Test (CELST) of National Matriculation English Test (Guangdong Version). Guangdong Pacific Electronic Press.
Hughes, A., & Hughes, J. (2020). Testing for language teachers (3rd ed.). Cambridge University Press.
Hymes, D. H. (1972). On communicative competence. In J. B. Pride & J. Holmes (Eds.), Sociolinguistics (pp. 269–293). Penguin.
Knoch, U., Huisman, A., Elder, C., Kong, X., & McKenna, A. (2020). Drawing on repeat test takers to study test preparation practices and their links to score gains. Language Testing, 37(4), 550–572. https://doi.org/10.1177/0265532220927407
Kunnan, A. J. (2014). Fairness and justice in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1098–1114). John Wiley & Sons.
Luoma, S. (2004). Assessing speaking. Cambridge University Press.
Ministry of Education of the People’s Republic of China. (2003). English curriculum for senior middle schools (Experimental). People’s Education Press.
O’Sullivan, B., & Cheng, L. (2022). Lessons from the Chinese imperial examination system. Language Testing in Asia, 12(1), Article 52. https://doi.org/10.1186/s40468-022-00201-5
Peregoy, S. F., & Boyle, O. F. (1997). Reading, writing, and learning in ESL. Longman.
Tao, B. (2018). Reflections and reform of Neo-Gaokao item writing and scoring. Examinations Research, 2, 94–100.
Woods, D. (1996). Teacher cognition in language teaching: Beliefs, decision-making and classroom practice. Cambridge University Press.
Xi, X. (2010). Automated scoring and feedback systems: Where are we and where are we heading? Language Testing, 27(3), 291–300. https://doi.org/10.1177/0265532210364643
Xu, W. (2021). Practice of speaking assessment in large-scale high-stake examination: A case study of Shanghai English Gaokao Listening and Speaking Test. Foreign Language Testing and Teaching, 1, 21–27.
Xu, Y. (2017). Evaluating the use of automated scoring in Computer-based English Listening and Speaking Test (CELST) of National Matriculation English Test (Guangdong Version). Foreign Language Teaching in Schools, 40(2), 47–53.
Xu, Y., & Zeng, Y. (2015). An analysis of a mock test’s ratings of Computer-based English Listening and Speaking Test (CELST) of National Matriculation English Test (Guangdong Version) based on G-theory and Multi-facet Rasch Model. E-Education Research, 36(3), 100–106.
Zhang, F. (2014). An investigation into the washback effect of Computer-based English Listening and Speaking Test of National Matriculation English Test (Guangdong Version): Teachers’ cognition system and test-preparation behaviours within the framework of Beliefs, Assumptions and Knowledge (BAK). Foreign Language Testing and Teaching, 4(3), 44–49.
Zhang, F. (2015). The variability and mechanism of washback: Investigating the washback of NMET CELST through teachers’ test preparations [Unpublished doctoral dissertation]. Guangdong University of Foreign Studies.
Zhang, H., & Bournot-Trites, M. (2021). The long-term washback effects of the National Matriculation English Test on college English learning in China: Tertiary student perspectives. Studies in Educational Evaluation, 68, 1–21. https://doi.org/10.1016/j.stueduc.2021.100977
Zhou, Y., & Zeng, Y. (2016). A Multi-facet Rasch analysis of automated scoring of Computer-based English Listening and Speaking Test (CELST) of National Matriculation English Test (Guangdong Version). Foreign Language Testing and Teaching, 6(1), 22–31.