Abstract
This article investigates a group oral test administered at a university in Japan to determine whether its scores are appropriate for higher-stakes decision making. The test is one component of an in-house English proficiency test used for placing students, evaluating their progress, and informing development of the English language curriculum. The recent proposal of a cut-score that students must meet to advance through the university system has brought the group oral test component under increased scrutiny. On two successive occasions, 113 participants sat the oral test in groups composed of different interlocutors each time. Rasch analysis shows rater fit within acceptable levels given the length and nature of the test; however, inter-rater agreement, at a correlation of .74, is lower than has been reported in research on commercially available interview tests. Candidates’ scores on the two test occasions correlate at .61. A generalizability study shows that the greatest systematic variation in test scores comes from the person-by-occasion interaction; topic, or prompt, was not a significant factor. Candidates’ performances, or how raters perceive an individual candidate’s ability, could be affected to a large degree by the characteristics of the interlocutors and the interaction dynamics within the group.