Abstract
This corpus-based investigation addresses a research gap: the lack of systematic comparison of how multiword units (MWUs) are distributed across input modalities in Chinese university English as a foreign language (EFL) textbooks. In EFL contexts, where textbooks constitute the primary source of language input and authentic exposure remains limited, understanding how different input channels contribute to the acquisition of formulaic language is essential for effective materials design and instruction. The investigation analysed two specialized English textbook corpora, a Reading Input Corpus (185,000 tokens) and a Listening Input Corpus (311,127 tokens), compiled from widely adopted Chinese university EFL textbook series. The study employed a two-phase computational and manual extraction procedure, using R cross-validated with LancsBox X and applying frequency thresholds, mutual information scores, and distribution criteria. Systematic manual refinement by three trained researchers yielded strong inter-rater reliability (Cohen’s κ = 0.87). The analysis identified 301 unique MWUs across both corpora and examined their frequency distributions, structural-functional patterns, and differences in the representation of functional categories. Results revealed substantial modality-specific differences with important pedagogical implications. Listening input showed significantly higher MWU coverage (70.56%) than reading input (40.83%). Critical acquisition gaps emerged in repetition patterns: while 77.54% of MWUs in listening materials reached the 6–12 encounter threshold established for effective acquisition, only 39.10% of those in reading materials received adequate repetition. Functional analysis confirmed appropriate register sensitivity, with conversational functions doubling from reading to listening materials.
These findings provide empirical evidence for corpus-informed materials development in EFL contexts, where making the most of limited input exposure is crucial for learner success. The study lays a foundation for improved textbook development and for targeted MWU instruction through modality-specific strategies, both in Chinese tertiary English education and in comparable EFL contexts.
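As an illustration only, two of the quantitative criteria named in the abstract can be sketched in a few lines of Python. This is not the authors' R or LancsBox X pipeline. The sketch assumes the standard corpus-linguistic definition of mutual information for a two-word sequence (log2 of observed over expected frequency) and assumes that an MWU "reaches" the encounter threshold once it occurs at least six times, the lower bound of the 6–12 band. The function names, parameters, and the default of six are hypothetical choices made for this example.

```python
import math


def mutual_information(pair_freq, w1_freq, w2_freq, corpus_size):
    """MI score for a two-word sequence: log2(observed / expected).

    Expected frequency assumes the two words occur independently:
    expected = f(w1) * f(w2) / N.
    """
    expected = w1_freq * w2_freq / corpus_size
    return math.log2(pair_freq / expected)


def share_meeting_threshold(encounters, minimum=6):
    """Proportion of MWUs encountered at least `minimum` times.

    `encounters` maps each MWU to its number of occurrences in a corpus.
    The default of 6 is an assumption: the lower bound of the 6-12 band.
    """
    if not encounters:
        return 0.0
    hits = sum(1 for count in encounters.values() if count >= minimum)
    return hits / len(encounters)
```

Applying `share_meeting_threshold` to the reading counts and the listening counts separately would give per-modality proportions comparable in kind to the repetition figures reported above. The exact values depend on how the encounter band is operationalized in the full study.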
