Abstract
Aims and objectives:
This introductory article aims to set the scene for the special issue by discussing existing and future research directions on multiword units (MWUs) in multilingual speakers. It also outlines the purpose and structure of the special issue and presents the individual contributions.
Design, data and analysis:
The introductory article reviews the most relevant theoretical and methodological issues as well as the main research gaps related to the processing, learning, and use of MWUs in both mono- and multilingual speakers. In addition, it introduces the contributions to this volume and briefly presents the types of MWUs, the types of multilingual speakers and the data and methodologies that are in focus.
Conclusions, originality and implications:
The contributions in this special issue on different types of MWUs in different groups of multilingual speakers using different methodologies shed new light on open questions in various areas of multilingualism research, including psycholinguistic approaches to second-language learning and processing, contact linguistics as well as research on heritage speakers and language attrition. In this regard, the special issue contributes to a more complete and differentiated picture of the role of MWUs in multilingual speakers but also in language processing, learning, and use in general.
Introduction
Multiword units (henceforth MWUs), also referred to as formulaic sequences or formulaic language (for a terminological discussion see Wray, 2002), is a cover term for different kinds of word combinations, in which two or more words jointly refer to a single conventionalized concept and which are likely to be mentally stored and used as unanalyzed wholes. For many decades, formulaic language has been a research object for scholars from theoretical and applied linguistics and from a variety of linguistic subfields such as computer-, corpus-, psycho-, neuro-, and sociolinguistics. There are several monographs (e.g., Granger & Meunier, 2008; Nattinger & DeCarrico, 1992; Schmitt, 2004; Wood, 2015; Wray, 2002; Wulff, 2008) and special issues (e.g., Wood, 2017; Wulff & Titone, 2014) evidencing a large body of accumulated knowledge about how MWUs are acquired, used, represented, and processed in the brain.
However, Wray (2012, p. 231) questions several assumptions of previous research about the coherence of MWUs as a phenomenon and formulates eight open questions, arguing that “we do not yet have the full measure of how different features associated with formulaicity fit together.” Remarkably, only two types of language users, native speakers and second/foreign language learners, figure in her account as well as in research on MWUs in general. This is unfortunate given two facts. First, formulaic language has been considered “a linguistic solution to the problem of how to promote our own survival interests” (Wray, 2012, p. 231) that characterizes the linguistic knowledge of all types of speakers (Wray, 1999, p. 216). Second, most language users in the world are multilingual, exhibiting complex and diverse trajectories of multilingual language acquisition, loss, and use that do not fit neatly into the categories of native speaker and second language learner (Baker & Prys Jones, 1998; Crystal, 1987; Hamers & Blanc, 2000; Dewaele et al., 2003). It is increasingly recognized that “the functioning of every language system is based on a potential multilingual competence” (Franceschini, 2011, p. 344) and that speakers differ not in the number of languages they know but in the “amount and diversity of experiences and use” (Hall et al., 2006, p. 229). Although all language development and use is variable and dynamic, multilingualism “shows it in a way one cannot easily elude” (Franceschini, 2011, p. 352). For these reasons, knowledge about MWUs gained by scholars working on multilingualism should be integrated into general research on formulaicity. We believe that this could help to address a main challenge facing current research on MWUs, that is, to explain the facts about MWUs with the underlying motivations that determine usage (Wray, 2012, p. 231).
The aim of obtaining a more unified picture of MWUs in multilingualism research motivated us, a team of researchers from various disciplines at the University of Tübingen, to organize a workshop bringing together scholars working on different types of MWUs in different types of multilingual speakers. The purpose of the workshop was to uncover patterns in the acquisition, use, and processing of MWUs in multilinguals. The workshop united scholars from the fields of second and foreign language acquisition, heritage languages, language attrition, language contact, and language teaching who were eager to explore similarities and differences in various types of multilinguals. The present special issue results from the original contributions of the participants as well as from the fruitful discussions during the workshop. In this introductory article, we provide an overview of previous research on MWUs in general and in different fields of multilingualism, and highlight how the articles of this volume contribute to the open questions in research on formulaic language.
We will start by briefly outlining the concept and general classification of MWUs that have been suggested in previous research. After that, we will summarize findings on MWUs from the fields of second and foreign language learning and teaching, language contact, heritage languages, and language attrition, before introducing the individual contributions to the special issue.
MWUs and their types
There is a range of approaches to and conceptualizations of MWUs (mostly in research on monolingual/L1 speakers), all of them roughly falling into three main strands of research.
According to the traditional phraseological approach, a MWU has to fulfill three criteria: polylexicality (more than one word), idiomaticity (semantic and/or syntactic irregularity), and conventionality understood as fixedness (stability of form) (e.g., Burger et al., 2007; Cowie, 1998/2001). Although, as pointed out by Buerki (2020), the operationalization of a MWU according to these three criteria has the advantage of being concrete and relatively easy to apply to actual data, most studies from a phraseological perspective have focused on the idiomaticity feature and on semantic non-compositionality in particular.
The second, distributional or usage-based approach (e.g., Evert, 2004; Granger & Paquot, 2008; Nesselhauf, 2005), centers on MWUs as conventionalized word combinations that are preferred and frequently used ways of expressing concepts in a community (Bybee, 2010, p. 35; Pawley, 2001, p. 122). Conventionality has been most often operationalized through frequency-based measures of word association automatically extracted from corpora (Biber et al., 2012), and MWUs have been identified in language data according to these measures. As a result, these approaches also include semantically compositional (i.e., “non-idiomatic”) MWUs, which have been shown to be more frequent in language than idioms and proverbs and have not received much attention in more phraseologically oriented accounts. Within the usage-based approach, there is a line of thinking about MWUs in terms of constructions within the realm of cognitively oriented and usage-based Construction Grammar (Croft, 2001; Goldberg, 2006). The notion of “construction” is broader than the traditional definition of MWUs, since it includes not only lexically specific sequences, but also partially or fully schematic units, which may have lexical, phraseological or more abstract grammatical meaning as well as open slots that may be more flexibly filled (Croft, 2001; Fillmore et al., 1988; Goldberg, 2006). Although constructions are defined as conventionalized form-meaning pairings and most studies rely on corpus-based frequency measures, Construction Grammar claims to focus more on the cognitive entrenchment of constructions in the mind of individual speakers than on the processes of conventionalization in a speaker community (Bybee, 2010; Langacker, 1987; see H.-J. Schmid, 2020, for critical discussion).
However, the construction grammar approach has to date not sufficiently integrated the social and psychological motivations of human behavior (such as perceptual or social salience) that are needed to explain many findings, for example, why certain MWUs are learned more easily than others in second language acquisition and why some of them are more important identity markers than others in specific sociolinguistic contexts (Auer, 2014; H.-J. Schmid & Günther, 2016; Wray, 2012, p. 245).
The third research strand, namely psycholinguistic approaches, has mostly focused on processing features of MWUs in individuals, considering them more a feature of idiolects than a part of the linguistic system shared by a speaker community. The processing of MWUs, especially compounds, played an important part in the development of models of speech processing in general and of morphological processing in particular. In psycholinguistic research on MWUs in monolingual speakers, the question of the processing differences between syntactic (compositional) and lexical (non-compositional) MWUs has been crucial. Traditional models of morphological processing can generally be divided into two approaches: dual-system theories (e.g., Chomsky, 1965; Ullman, 2004) and single-system theories (e.g., Bates & MacWhinney, 1989; Rumelhart & McClelland, 1986; for a discussion of both approaches, see Snider & Arnon, 2012). Most traditional dual-system theories assume that compositional and non-compositional MWUs are processed differently in the course of speech processing. According to this approach, forms that are stored in the lexicon and forms that are computed by grammar are learned differently and are even governed by different neural substrates (Ullman, 2001). By contrast, single-system theories do not assume a distinction between lexically stored and grammatically computed forms: grammatical and lexical items are learned and processed in the same way, so that all linguistic units, regardless of their length and semantic complexity, are subject to one and the same cognitive mechanism. However, recent computational models rely on computed representations without considering the lexicon as a single system (see Baayen et al., 2019).
A number of studies have shown that monolingual adults and children performing a range of psycho- and neurolinguistic tasks are sensitive to distributional properties of at least some types of MWUs in language processing and learning. In psycholinguistic research, phrasal frequency or n-gram effects were found for lexicalized compounds as well as for above-word-level sequences in language processing (e.g., Arnon & Snider, 2010; Hennecke & Baayen, 2021). These effects may be interpreted in terms of the speakers’ probabilistic knowledge about co-occurrence frequencies (see Kuperman et al., 2008), which speeds up processing. However, it is not clear whether faster processing also indicates holistic storage or rather the faster mapping or (de)composition of components, with the possibility of both options representing two sides of the same coin (Wray, 2012, pp. 233–234).
Having sketched the main theoretical approaches to MWUs, we will now turn to the commonly accepted types of MWUs. There have been numerous attempts to classify MWUs from a phraseological perspective (for overviews of existing classifications, see Cowie, 1998/2001; Fleischer, 1982; Granger & Paquot, 2008; Sag et al., 2002; Wood, 2015; for different classifications, see Buerki, 2020, pp. 7–13; Friedrich, 2006, pp. 23–44; Wood, 2019, pp. 31–36, among others). Buerki (2020, pp. 7–13) proposes a working typology of six types identified according to multiple criteria and representing prototype categories of formulaic MWUs with better and worse examples:
Formulas are defined as sequences serving a particular pragmatic function, for example, greetings (How are you?), apologies (I’m sorry), or forms of address (Dear sir or madam).
Idioms (spill the beans, pull someone’s leg) are characterized by semantic non-compositionality and structural fixedness.
Proverbs are defined as non-compositional and structurally fixed sentence-like items (out of sight, out of mind, look before you leap).
Multiword terms are semantically non-compositional and highly fixed word equivalents covering compounds (type-token-ratio, real estate), phrasal verbs (put up with, bump into), and periphrastic constructions expressing tense, aspect, and voice (has been done, is going to do something).
Collocations represent an extremely diverse type of MWU. Some phraseologists (Burger, 2010; Cowie, 1994) use the term collocation for all non-idiomatic, semantically transparent combinations. However, more common is a narrower understanding of a collocation in terms of a “preferred syntagmatic relation between two lexemes in a specific syntactic pattern” (Granger & Paquot, 2008, p. 16), such as brush teeth, severely hampered, full agreement. In corpus-based frequency-driven approaches, collocations are identified by the ratio of the frequency with which a word appears in a certain lexical context to its frequency in the language as a whole (Hausmann et al., 1989/1991; Jones & Sinclair, 1974).
Usual sequences are defined as largely compositional word sequences with usually adjacent constituents that, however, can include variable slots or extensions and can be quite long (unlike collocations). Examples are from time to time, for the first time in . . . years, cannot afford to, when it comes to. Usual sequences cover MWUs known as lexical bundles (Biber et al., 1999; Cortes, 2004) and congrams (Cheng et al., 2009), recurrent word combinations (B. Altenberg, 1998), chunks (DeCock et al., 1998), or clusters (McCarthy & Carter, 2002).
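The frequency-ratio view of collocations described above can be illustrated with a minimal sketch. It is not the procedure of any of the studies cited here; it simply assumes adjacent word pairs in a tokenized toy corpus and compares how often the second word follows the first with how often it occurs overall. The base-2 logarithm of that ratio is the familiar pointwise mutual information (MI) score. The function name `association_scores`, the `min_count` parameter, and the toy corpus are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def association_scores(tokens, min_count=1):
    """Score adjacent word pairs by an observed/expected frequency ratio.

    Illustrative assumption: a "lexical context" is simply the immediately
    preceding word. Returns {(w1, w2): (ratio, pmi)}, where
    ratio = P(w1, w2) / (P(w1) * P(w2)) and pmi = log2(ratio).
    """
    n_tokens = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_bigrams = max(n_tokens - 1, 1)  # avoid division by zero on tiny input

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs that are too rare to score
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        ratio = p_pair / (p_w1 * p_w2)
        scores[(w1, w2)] = (ratio, math.log2(ratio))
    return scores


if __name__ == "__main__":
    # Toy corpus, invented for illustration.
    corpus = "we brush teeth daily and they brush teeth too".split()
    ranked = sorted(association_scores(corpus).items(),
                    key=lambda item: item[1][1], reverse=True)
    for pair, (ratio, pmi) in ranked:
        print(pair, round(ratio, 2), round(pmi, 2))
```

On corpora this small the scores are not meaningful; in particular, MI-type scores are known to favor rare pairs, which is one reason why corpus studies typically combine them with a minimum frequency threshold such as the `min_count` parameter assumed here.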
In addition, recent research at the interface between Construction Grammar and more traditional phraseology has drawn attention to the existence of syntactic MWUs with open slots, non-compositional meanings and (often) pragmatic (e.g., intensifying) functions, which are also called “phraseological templates” (Phraseoschablone) or “constructional idioms,” for example, So ein(e) N! (e.g., So eine Überraschung! “What a surprise!”), N1 hin N1 her (e.g., Krise hin, Krise her “crisis or no crisis”) (cf. Dobrovol’skij, 2016; Fleischer, 1997; Mellado Blanco, 2022; Mellado Blanco & Gutiérrez Rubio, 2020).
As we will discuss in the following, the discovery of MWUs in language has raised important theoretical questions about how language is structured, used, processed, and learned not only in monolingual but also in multilingual speakers and communities.
MWUs in different subfields of multilingualism research
MWUs have played different roles in different fields of multilingualism research. They have been a focus of intensive research in (mostly) psycholinguistically oriented work on second and foreign language acquisition and learning (see overviews in Cowie, 1998/2001; Gries & Wulff, 2005; Meunier & Granger, 2008; Robinson & Ellis, 2008; Schmitt, 2004; Wray, 2002 as well as Erman et al., 2016 and Gablasova et al., 2017). By contrast, they have often been treated together with other types of cross-linguistic influence in other fields, including contact linguistics and research on heritage speakers (HSs) and language attrition.
Psycholinguistic research on second/foreign language acquisition and learning
In psycholinguistic research on multilingualism, there is no general agreement on how the multilingual mental lexicon is organized. The most controversial discussions concern the issue of independence or interdependence of the languages and of selectivity or non-selectivity in lexical access. According to the separate or independent hypothesis, different languages are stored and accessed separately in different memory stores. By contrast, the shared or interdependent hypothesis claims that the languages are stored in one memory store and that words are tagged as belonging to one language in word retrieval (for an overview of the experimental evidence for both hypotheses see, for example, Heredia, 2008). In recent years, there has been growing evidence in favor of the interdependence of languages in the multilingual mental lexicon as well as for a shared conceptual system and non-language-specific lexical selection. Evidence for this claim comes from different experimental studies on cross-linguistic influence, for instance, in cross-linguistic priming and experimental studies on code-switching (for an overview, see Kootstra & Muysken, 2017). Moreover, cross-linguistic influence and interconnectivity between the languages have been found at the lexical as well as at the grammatical level (e.g., Bernolet et al., 2007).
Psycholinguistic research on learning and processing in multilingual speakers has mostly been conducted for MWUs that are fixed in form and meaning (Hernández et al., 2016, p. 3). While some studies have proposed that (especially classroom) L2 learners have more difficulty in detecting and learning larger distributional patterns (Ellis, 2006; Wray, 2002), others point to a facilitative effect of MWUs in (both classroom and immersive) L2 learning and processing (Arnon & Ramscar, 2012; see Hernández et al., 2016, for an overview). For example, it has been shown that both L1 and L2 speakers recognize high-frequency MWUs faster in lexical decision tasks (e.g., Arnon & Snider, 2010; Hernández et al., 2016), remember them better (Tremblay et al., 2011), and are also sensitive to the frequency of exposure to MWUs in eye-tracking (Siyanova-Chanturia et al., 2011). Moreover, in several studies, collocational and idiom priming effects have been found for L1 speakers as well as for L2 speakers (see, for example, Carrol et al., 2016; Wolter & Gyllstad, 2011). While the processing of MWUs in L2 learners, and possibly also their higher entrenchment in memory, seem to be affected by their distributional features, most often operationalized in terms of frequency-based measures, the effect of their type and associated properties (familiarity and decomposability for idioms, predictability and semantic association for compounds, mutual information [MI] for collocations) has only been investigated in L1 adults so far (e.g., Carrol & Conklin, 2020). Drawing on psycholinguistic, developmental, and computational findings, Arnon and Christiansen (2017) suggest that adult L2 learners are generally less likely to extract (e.g., grammatical) regularities from MWUs in the process of learning.
In this regard, Kessler and Beck (2022) have found evidence that different acquisition mechanisms for first and second language acquisition might be mirrored in different processing mechanisms: L1 children tend to perceive a MWU as a whole, whereas L2 adults tend to perceive the parts of a MWU separately.
As for the production of MWUs, there is also no consensus on whether L2 learners are indistinguishable from native speakers with regard to the number and type of MWUs they use (Ellis et al., 2015). Some corpus studies have shown that advanced L2 learners use the same quantity of MWUs as native speakers (Forsberg, 2008; Nesselhauf, 2003; Siyanova-Chanturia & Schmitt, 2008), while others have found that learners under- or overuse them (DeCock et al., 1998; Durrant & Schmitt, 2009; Hickey, 1993; Laufer & Waldman, 2011). However, all studies (e.g., Ellis et al., 2015; Gilquin, 2015; Granger & Bestgen, 2014; Schmitt, 2012) confirm that L2 learners use a considerable number of MWUs that are different from those produced by L1 speakers. According to some studies, L2 learners mainly produce MWUs that are frequent in the input and strongly associated on the MI score, whereas L1 users also produce more infrequent combinations, whose constituents may also be infrequent as individual words (see Erman et al., 2016, and Schmitt, 2012, for an overview). It is unclear in this regard whether the avoidance of infrequent collocations results from the learners’ lack of the corresponding words in their vocabulary (Gablasova et al., 2017; Nguyen & Webb, 2017).
Apart from linguistic and distributional features, the use of MWUs in L2 learners also seems to be influenced by their L1 (Gablasova et al., 2017; Kellerman, 1978; Laufer & Waldman, 2011). As shown for situations of intense language contact in heritage and contact linguistics (see “Contact linguistics”), this influence might also be bidirectional, that is, the MWUs of the L2 may also change the way speakers use MWUs in their L1 (Doğruöz & Backus, 2009; Rakhilina et al., 2016; Treffers-Daller et al., 2016). How and when this happens is as yet poorly understood from a language representation and processing perspective; individual, cognitive, linguistic, and sociolinguistic variables of the speaker may also play a role (Dörnyei et al., 2004; Treffers-Daller, 2012).
Contact linguistics
The more general “paradigm gap” (Sridhar & Sridhar, 1986) between research on second language acquisition and contact linguistics is also evident in the investigation of MWUs. Whereas MWUs are central in psycholinguistically oriented research on second and foreign language learning (cf. “Psycholinguistic research on second/foreign language acquisition and learning”), the notion of MWU is largely absent in traditional language contact research. The latter commonly relies on distinctions between “lexical” and “grammatical/structural” transfer or interference phenomena in the form of borrowing, code-switching, and loan translation/calquing, which are not always consistent in themselves. For example, US-Spanish viaje redondo (calqued on English roundtrip), días de semana (English weekdays), perder peso (English lose weight) or cambiar de mente (English change one’s mind) have been classified as lexical calques, whereas correr para presidente (English run for president), llamar para atrás (English call back), or esperar por (English wait for) have been considered syntactic or grammatical calques (cf. Escobar & Potowski, 2015, p. 135, and Wiesinger, in press, for critical discussion with regard to code-switching, calquing and structural transfer). Although the traditional notions have been specified by more fine-grained classifications such as “global” versus “partial copying” (Johanson, 2008) or “matter” versus “pattern replication” (Matras & Sakel, 2007) in more recent contact linguistic works, they still rely on a distinction between words and structure (Backus, 2019, p. 205).
However, this line of research has moved away from the traditional view of two separated linguistic systems that are in contact and aligns with a more holistic and cognitive understanding of multilingualism that also acknowledges the individual speaker’s mind as the locus of contact (Grosjean, 1989; Jarvis & Pavlenko, 2008; Matras, 2009; Matras & Sakel, 2007; Palacios & Pfänder, 2014; Zenner, 2013 and the contributions in Zenner et al., 2019). These and other works also make reference to more cognitive aspects such as the role of “perceived similarity,” the cost of bilingual processing, the degree of executive control as well as frequency effects. With this new wave of language contact research, the concept of MWUs has also come more into focus, especially in usage-based contact linguistics (e.g., Backus, 2003, 2015, 2019; Backus & Dorleijn, 2009; Backus & Verschik, 2012; Hakimov, 2016; Hakimov & Backus, 2021; Treffers-Daller, 2005). These studies have shown for various language pairs that contact phenomena such as code-switching or calquing may be tied to various types of MWUs, including compounds, verb-object collocations, phrasal verbs, idioms as well as prepositional or noun phrases. A similar perspective is found in usage-based constructionist approaches to language contact, which is understood as a cover term for speaker-internal restructurings in their multilingual constructional network, involving constructions at different levels of schematicity, complexity, and abstractness (e.g., Dux, 2020 and the contributions in Erfurt & De Knop, 2019 and Boas & Höder, 2021; Höder, 2012, 2014; Wasserscheidt, 2016). Under this account, code-switching, calquing as well as structural transfer can be reconceptualized in terms of a dynamic and creative combination and embedding of (lexically specific, partially filled or fully schematic) constructions as well as of emerging generalizations in the speaker’s multilingual “constructicon” (cf. Wiesinger, in press, for an overview).
So far, most of these accounts rely on bilingual corpus data that are much more restricted in size than “monolingual” corpora, which considerably limits the possibilities of frequency-based statistical investigation necessary for a truly usage-based approach; moreover, they have rarely reached out toward experimental multi-method approaches (Backus, 2019, pp. 199–201) or longitudinal studies (e.g., Endesfelder Quick et al., 2006). In general, it is an important desideratum for usage-based and cognitively oriented research on language contact to forge stronger links with more psycholinguistically oriented research on language processing and learning.
Heritage languages
A special branch of multilingualism research has been concerned with the study of so-called heritage speakers (HSs). HSs are individuals who were raised in families where a language other than the dominant language of the community was spoken and who therefore began to learn the heritage language before, or simultaneously with, the community language, which would eventually become their stronger language (Benmamoun et al., 2013; Polinsky, 2015). Although HSs’ language biographies and the levels of proficiency are extremely variable, they all share two features: (1) their knowledge of the heritage language is acquired from early oral exposure in a family, and (2) they shifted from one language (their heritage language) to their dominant language (the language of their speech community) at some point during childhood. These characteristics shape their intermediary status between L1 and L2 speakers, that is, like L1 speakers, HSs were exposed to language during childhood but, like L2 learners, they experience transfer from their dominant language. Due to this status, the study of HSs can help linguists to determine which language structures are vulnerable and require extensive input and which are completely acquired and immune to attrition (cf. “Language attrition”) or “naturally resilient without extensive input and use” (Benmamoun et al., 2013, p. 172).
“Vulnerable” structures have so far been identified mostly in HSs’ morphology and syntax (for an overview, see Montrul, 2016; Montrul & Polinsky, 2021; Polinsky, 2018) and less so in their lexicon (e.g., Chappell, 2018; Fridman & Meir, 2023; Garcia & Gollan, 2022; Montanari et al., 2018; Montrul & Polinsky, 2021). However, Kopotev et al. (2020, p. 1) hypothesize that “heritage speakers deploy fewer probabilistic strategies in language production compared with native speakers and that their active knowledge of and access to ready-to-use MWUs are restricted compared with native speakers.”
Research specifically targeting MWUs in HSs falls into two categories, namely, corpus-based and experimental. Corpus-based studies (Karl, 2012; Kopotev et al., 2020; Rakhilina et al., 2016; Vyrenkova et al., 2014) aim at identifying MWUs in HSs’ oral or written production that deviate from corresponding monolingual patterns. These studies are not limited to any specific type of MWUs, although non-monolingual-like patterns used by HSs often fall in the broad category of “usual sequences” (see “MWUs and their types”). Due to the small size of available bilingual corpora, MWUs in HSs’ production have so far been identified based on negative evidence, that is, on the violation of monolingual MWUs (Kopotev et al., 2020) rather than according to statistical criteria. The emergence of HSs’ MWUs that do not conform to the monolingual MWUs has been explained by three processes (Rakhilina et al., 2016). First, the copying of a MWU from the dominant language into the heritage language results in multiword calques. Second, creating a MWU out of elements of the corresponding MWUs in the heritage and dominant language results in semi-calques. Finally, semantic decomposition of the underlying monolingual MWU and the reassembling of its meaning out of semantic primitives results in decomposed structures. This kind of corpus-based research on MWUs in HSs’ corpora is very valuable but, similar to other contact-linguistic studies (cf. “Contact linguistics”), it is largely limited to individual examples produced by individual HSs. Therefore, many questions remain open. First, it is not clear whether HSs’ MWUs are just random one-time occurrences rather than systematic features of a HS’s idiolect. Second, it has not been systematically investigated whether HSs’ MWUs are typical for most HSs with a given dominant language versus HSs with another dominant language. Third, it needs to be investigated which HSs’ MWUs were already present in the input HSs received, and were therefore learned as MWUs, and which MWUs were created by HSs themselves.
The experimental line of research is so far represented by only a few studies focusing on HSs’ receptive knowledge of MWUs of the monolingual variety. The question asked here is to what extent HSs possess receptive knowledge of monolingual MWUs that they do not use in production. The existing studies have provided different answers to this question. While Zyzik (2021) considers HSs’ recognition of MWUs almost native-like, Karl (2012) argues that HSs’ receptive skills in monolingual MWUs are very limited. More research is needed to show which monolingual MWUs HSs are more or less likely to have in their receptive knowledge and how this is determined by the characteristics of the input they received.
An emerging direction of research is the perception of HSs’ MWUs by monolingual speakers. The only available study here is Karl (2012). She showed that native speakers’ perception of non-typical multiword calques produced by HSs depends on the type of MWU. Whereas non-typical collocations are less noticeable for native speakers, polysemy, word formation, and phraseological calques are evaluated as not acceptable. Moreover, Karl (2012) finds variation in acceptability even within one type of MWU, that is, some collocations are judged more acceptable than others. It remains to be uncovered what exactly makes a HS’s MWU acceptable or not acceptable for native speakers and to what extent these HSs’ MWUs disrupt communication.
In sum, the exploration of MWUs in HSs has only just begun but has already provided some valuable observations about the use of MWUs by HSs in spoken and written corpora. The generalizability of available corpus findings remains to be tested on larger corpora (which are often lacking) and by means of experimental methodology. The question to what extent HSs have receptive knowledge of monolingual MWUs remains open, as does the question of the perception of HSs’ MWUs by monolingual speakers. Finally, the developmental aspect of MWUs in HSs is still uncharted territory and requires longitudinal research. Overall, the existing studies have focused on providing insights into HSs’ production and perception of MWUs and have not aimed at integrating their findings into more general models of multilingualism. An important task of future work is thus to incorporate a growing body of empirical evidence about the HSs’ MWUs into theories of multilingual language storage, perception, and production.
Language attrition
A special phenomenon that is linked to language shift is language attrition. It has been defined as a non-pathological decline in the L1 of immigrants or HSs affecting the storage or the retrieval of the L1 structures and resulting from a reduced use of L1 and interference from an L2 (E. P. Altenberg & Vago, 2004; Köpke, 2004; M. S. Schmid, 2004). Vocabulary is considered to be affected by attrition earlier and to a larger extent than other linguistic levels, partially because it is more susceptible to cross-linguistic influence (M. Schmid & Köpke, 2009). Along with borrowings, loan translations, semantic convergence, and restructuring of individual lexical items, “destroyed” or “unconventional” collocations are usually mentioned among lexical changes occurring in spontaneous speech of attriters (e.g., Besters-Dilger, 2013; De Bot & Clyne, 1994; Isurin, 2007; Jarvis, 2003; Marian & Kaushanskaya, 2007; Negrisanu, 2008; Pavlenko, 2000). Jarvis (2019, p. 243) states that syntagmatic problems are frequent among attriters in compounds, collocational idioms, and cliches and underscores the importance of investigating MWUs in attrition, as already pointed out by Bardovi-Harlig & Stringer (2013) for L2 attrition.
Despite the recognized importance of MWUs for understanding processes of language attrition, there are very few studies that specifically focus on this topic (Doğruöz & Backus, 2009; Karl, 2012). Doğruöz and Backus (2009) provide a very detailed investigation of MWUs in spontaneous speech of Turkish first-generation immigrants in the Netherlands. Taking a construction grammar perspective, they demonstrate that these speakers use many MWUs that are unconventional in the monolingual variety, most of them being lexically specific constructions. The unconventionality in such items frequently (but not always) results from the process whereby an attriting speaker literally translates a Dutch word that is part of a Dutch collocation and adjusts it to the corresponding construction in Turkish. Those Dutch words that were copied into the Turkish construction (e.g., nemen in de trein nemen “take the train” copied into Turkish as tren almak “take train” instead of the conventional tren binmek “get on train”) tend to have a non-literal extended semantic meaning in Dutch, whereas their equivalents in Turkish have a narrower literal meaning. This process accounts for many unconventional constructions in the data, but there were also unconventional MWUs that could not be explained by Dutch influence. In some of them, two competing variants already exist in monolingual Turkish, while in others, processes of simplification might have been at work. Overall, unconventional MWUs that cannot be attributed to overt transfer from the dominant language have not been given much attention and represent a promising direction for further research. Another important finding by Doğruöz and Backus (2009) is that none of the unconventional constructions was found to completely replace the conventional equivalent. This means that attriters may use both conventional and unconventional constructions at the same time, and future research should uncover the factors underlying the use of each variant.
The contributions to the special issue
Taking up different aspects of the outlined research strands on MWUs, the contributions to this special issue focus on a wide variety of multilingual speakers and language combinations, different types of MWUs as well as on different methodological approaches to the analysis of MWUs in multilingual speakers.
Some contributions investigate MWUs in HSs, focusing on different language combinations such as Turkish-German (Treffers-Daller, this volume), Russian-German (Perevozchikova & Kessler, this volume) as well as Spanish-English (Hennecke & Wiesinger, this volume). Several contributions to this volume focus on MWUs in foreign language learners of different languages, such as French (Wolf, this volume), Dutch (Barking, Backus & Mos, this volume) and English (Gilquin, this volume). Other contributions investigate the use of MWUs in second language learners of Russian (Kopotev, Kisselev & Klimov, this volume) and English (Gilquin, this volume). While some studies analyze MWUs in a specific speaker group, other studies take a comparative perspective between HSs and monolingual speakers (Perevozchikova & Kessler, this volume), foreign language learners and monolingual speakers (Wolf, this volume; Barking, Backus & Mos, this volume) or between foreign language and second language learners (Gilquin, this volume).
From a methodological perspective, the special issue unites various approaches to research on MWUs in multilingual speakers. Several contributions approach MWUs using qualitative and/or quantitative corpus analyses (Gilquin; Hennecke & Wiesinger; Treffers-Daller; Wolf, all this volume). The paper by Kopotev, Kisselev & Klimov (this volume) further adds a quantitative perspective of automatic extraction of corpus data. Other contributions rely on experimental methods such as experimental production tasks (Barking, Backus & Mos, this volume) and acceptability judgment tasks (Perevozchikova & Kessler, this volume).
In what follows, the contributions to this special issue are presented in more detail in their order of appearance:
In her contribution “The Simple View of borrowing and code-switching,” Jeanine Treffers-Daller introduces an innovative approach for the distinction between borrowing and code-switching, called the “Simple View.” This approach relies on the criterion of listedness, which is operationalized as a mutual information (MI) score. Treffers-Daller tests the Simple View using code-switching data of single words and MWUs from a Turkish-German corpus. The findings demonstrate that the Simple View offers a unified approach to the borrowing of lexical items and function words.
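For readers unfamiliar with MI scores, the following minimal sketch illustrates how such a score is commonly computed in corpus linguistics for a two-word sequence, namely as the base-2 logarithm of the observed co-occurrence frequency divided by the frequency expected if the two words were independent. It is an illustration under stated assumptions and not necessarily the exact procedure used in the contribution: the function name `mi_score`, its parameters, and all frequency counts are hypothetical.

```python
import math


def mi_score(pair_freq: int, w1_freq: int, w2_freq: int, corpus_size: int) -> float:
    """Return the MI score of a two-word sequence: log2(observed / expected).

    Hypothetical helper for illustration; all arguments are raw corpus counts.
    """
    if min(pair_freq, w1_freq, w2_freq, corpus_size) <= 0:
        raise ValueError("all frequencies must be positive")
    # Co-occurrence frequency expected if the two words were independent.
    expected = w1_freq * w2_freq / corpus_size
    return math.log2(pair_freq / expected)


# Invented counts: the pair occurs 50 times in a corpus of one million tokens,
# the first word 200 times and the second word 100 times.
score = mi_score(pair_freq=50, w1_freq=200, w2_freq=100, corpus_size=1_000_000)

# A pair that co-occurs exactly as often as chance predicts has a score of zero.
baseline = mi_score(pair_freq=1, w1_freq=1000, w2_freq=1000, corpus_size=1_000_000)
```

A higher score indicates that the two words co-occur more often than chance would predict, which is the property that makes the measure usable as a proxy for the conventionalization of a word combination.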
In the contribution “Language contact phenomena in multi-word units: The code-switching—calquing continuum,” Inga Hennecke and Evelyn Wiesinger aim at developing a more differentiated account of code-switching and calquing of and within MWUs and constructions, bringing together recent psycholinguistic and Construction Grammar theories and modeling with cognitively oriented approaches to language contact and bilingualism. They analyze and discuss data from the CESA Corpus from first-, second-, and third-generation bilingual Spanish-English speakers in Arizona. The analysis highlights the importance of both lexically specific MWUs and partially and fully schematic constructions in code-switching and calquing. The authors further argue that code-switching and calquing may ultimately be viewed as a continuum.
In their contribution “Now they accept it, now they don’t. Acceptability judgments of non-typical MWUs in Russian as a native and a heritage language,” Tatiana Perevozchikova and Ruth Kessler investigate how Russian-German HSs and Russian native speakers judge the acceptability of non-typical MWUs and their typical monolingual equivalents. They present the results of an acceptability judgment task that support the idea of a unified multilingual “constructicon,” suggesting that HSs do have some receptive knowledge of monolingual MWUs, although this knowledge differs from that of monolingual speakers.
In their contribution “Investigating language transfer from a usage-based perspective,” Marie Barking, Ad Backus and Maria Mos study language contact from a usage-based perspective by testing the explanatory power of a schematicity continuum. They report the results of a production task with native German speakers living in the Netherlands, who experience language transfer between German and Dutch. The results of the production task, which are compared with data from native Dutch speakers, confirm that transfer in language contact depends on schematicity.
The contribution “Exploring collocational complexity in L2 Russian: A corpus-driven contrastive analysis” by Mikhail Kopotev, Olesya Kisselev, and Alexander Klimov focuses on collocations in corpus data of L2 Russian learners. It presents a novel approach to the automatic extraction and assessment of collocations in texts created by L2 learners and examines the role of collocations in placing learner texts on a proficiency scale.
In her contribution “Second and foreign language learners: The effect of language exposure on the use of English phrasal verbs,” Gaëtanelle Gilquin investigates the possible effect of language exposure on the use of MWUs. Gilquin analyzes corpus data to compare the use of phrasal verbs with the particle up among learners of English as a second language and as a foreign language. The results show some similarities but also some differences between the two groups of learners, reflecting their type and degree of exposure to English.
The contribution “In chunks we trust ... The problem of gender assignment in foreign language learning of French” by Johanna Wolf investigates gender assignment in nominal, verbal and adjectival phrases in foreign language learners of French. On the basis of a corpus of authentic productions by learners with Romance languages as L3 and German as L1, the article studies the particular problems of foreign language learners compared with data from L1 acquisition. It discusses an explanatory model for the observed difficulties with gender assignment, based on the model of feature-driven acquisition as well as on the processing of MWUs, which seems to facilitate the filtering of morphosyntactic rules out of the input.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
