Abstract
This paper presents evidence that supports the valid use of scores from fully automated tests of spoken language ability to indicate a person’s effectiveness in spoken communication. The paper reviews the constructs, scoring, and concurrent validity evidence of ‘facility-in-L2’ tests, a family of automated spoken language tests in Spanish, Dutch, Arabic, and English. The facility-in-L2 tests are designed to measure receptive and productive language ability as test-takers engage in a succession of tasks with meaningful language. Concurrent validity studies indicate that scores from the automated tests are strongly correlated with scores from oral proficiency interviews. In separate studies with learners of each of the four languages, the automated tests predict scores on the live interview tests as well as those tests predict themselves in a test-retest protocol (r = 0.77 to 0.92). Although it might be assumed that the interactive nature of the oral interview elicits performances that manifest a distinct construct, the closeness of the results suggests that the constructs underlying the two approaches to oral assessment have a stable relationship across languages.
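The comparison the abstract makes can be illustrated numerically: a concurrent-validity correlation between automated scores and interview scores is set against the interview’s own test-retest reliability. The sketch below is a minimal, hypothetical example of that comparison; the score vectors are invented for illustration and are not data from the studies reviewed in the paper.

```python
"""Illustrative sketch (not from the paper): comparing a concurrent-validity
correlation with the interview's test-retest reliability."""

import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores for the same learners (invented values, 0-100 scale).
automated  = np.array([62, 75, 81, 58, 90, 70, 66, 84])  # facility-in-L2 test
opi_first  = np.array([60, 78, 80, 55, 92, 72, 64, 85])  # oral proficiency interview, session 1
opi_retest = np.array([63, 74, 82, 57, 89, 71, 67, 83])  # oral proficiency interview, session 2

# Concurrent validity: automated scores vs. live interview scores.
r_concurrent, _ = pearsonr(automated, opi_first)

# Test-retest reliability of the interview itself.
r_retest, _ = pearsonr(opi_first, opi_retest)

print(f"concurrent validity r   = {r_concurrent:.2f}")
print(f"interview test-retest r = {r_retest:.2f}")
# The abstract's claim is that the first correlation is about as high as the
# second (reported range r = 0.77 to 0.92): the automated test predicts
# interview scores roughly as well as the interview predicts itself.
```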
