Abstract
In the introduction to this special issue on articulatory approaches to the study of L2 speech, we first highlight the interest and unique contributions of such methods to the investigation of speech production among second language speakers. This is followed by a brief overview of the four articulatory methods (electropalatography, nasometry, magnetic resonance imaging, and ultrasound) featured in the experimental studies presented in the seven articles that constitute the issue. We then turn to an overview of the speech phenomena investigated, namely consonants (laterals, rhotics), vowels (individual as well as entire inventories), and sequences (both phonemic vowel-nasal sequences and coarticulation in phonetic sequences), as produced by L2 learners with various first and target languages (L1s: English[-Croatian], Czech, French, Japanese, Mandarin, Spanish; target languages: English, French, Swedish). This introduction concludes with a summary of recurring acquisition themes (L1-based crosslinguistic influence, relative difficulty and target-likeness, inter-learner variability including as conditioned by individual differences) and the general speech phenomena studied (articulatory settings, gestural timing/coarticulation, effects of phonetic context).
1 Introduction
To date, the majority of instrumental studies of second language (L2) speech production have involved acoustic analyses, likely due to a number of factors including the ready availability of acoustic analysis software such as Praat, the larger number of L2 speech perception theories having acoustic primitives (SLM[-R]; Flege, 1995; Flege & Bohn, 2021; L2LP, Escudero, 2005, 2009; Escudero & Boersma, 2004), and the resource-intensive nature of articulatory methodologies in terms of both set-up and researcher training. However, the growing availability of articulatory methodologies as well as their potential unique contributions to a variety of research questions and theories make them of great interest to researchers. For example, certain speech production properties are difficult or impossible to measure acoustically (e.g., tongue-tip position and degree of constriction, degree of nasalization) and, given that different articulations may map to the same acoustic output (e.g., Ananthakrishnan & Engwall, 2011; Stevens, 2002), it is not always clear what it means for L2 learners to be target-like and/or whether they may achieve target-like acoustic realizations via vocal tract configurations that differ from those of native speakers. It is also the case that methodologies such as ultrasound and magnetic resonance imaging (MRI) allow for imaging not only isolated aspects of vowel and consonant articulation but also the more global configuration of the vocal tract, permitting a more holistic view of learners’ production. As such, the greater integration of articulatory methods promises important new insights into L2 speech learning.
This special issue presents current research using a variety of articulatory methods, in particular electropalatography (EPG), MRI, nasometry, and ultrasound, to study the L2 production of consonants, vowels, and segmental sequences among learners of English, French, and Swedish of various first languages (English[-Croatian], Czech, French, Japanese, Mandarin, Spanish). The research questions explored relate to a range of phenomena central to L2 speech learning, such as crosslinguistic influence (including as conditioned by target language proficiency), relative difficulty, inter-learner variability, target-likeness, and the effects of training, as well as core aspects of speech production, including articulatory settings, gestural timing/coarticulation, and the effects of phonetic context. Given the dearth of published articulatory research on L2 speech, this special issue of Language and Speech provides readers with new empirical data on various phenomena, including data from an understudied target language (Swedish), an introduction to theoretical frameworks less often encountered in L2 speech research (e.g., Articulatory Settings; Honikman, 1964), and a detailed overview of the various articulatory methods employed, including their practicalities.
2 Overview of the contributions: articulatory methodologies, research themes, and recurring findings
Rather than presenting summaries of each of the seven articles—these can be found in the individual abstracts—here, we will review the articulatory methodologies used, the research questions and themes investigated, and recurring findings across the studies.
2.1 Articulatory methodologies
As highlighted above, four different articulatory methods are presented in this special issue and illustrated with experimental studies. Four of the contributions involve ultrasound, which uses ultra-high-frequency sound waves emitted from a probe held below the speaker’s jaw to visualize tongue shape and movement, making it of interest for the study of both consonants and vowels. Using this method, Chen, Whalen, and Mok study L1 Mandarin speakers’ English /ɹ/ production; Kocjančič Antolík, Bořil, and Hoffman investigate the effects of articulatory and acoustic training on the production of Swedish vowels by native Czech speakers; Oakley examines the French rounded vowels /y/ and /ø/ as realized by L1 English speakers; and Wilson, Perkins, Sato, and Ishii measure the similarity of the L1 and L2 inter-speech postures for vowel production among Japanese-speaking English learners.
Each of the other three methodologies is represented by a single contribution. Colantoni, Kochetov, and Steele study the acquisition of English /l/ allophony using EPG. In this technique, participants speak while wearing a custom-made acrylic palate embedded with a series of electrodes, which allows for the measurement of tongue constriction location and amount of linguopalatal contact. First and foremost, EPG lends itself to the study of consonants articulated from the (denti-)alveolar to velar regions. The two remaining techniques are the least frequently found in published L2 speech research. Beristain’s study of nasal coarticulation in vowel-nasal consonant sequences, as produced by L1 Spanish-L2 English speakers, illustrates the use of nasometry, which lends itself to the study of both nasal consonants and vowels. In the particular methodology used in this study, a mask covering the speaker’s mouth and nose allows for the measurement of oral and nasal airflow. Finally, Badin, Sawallis, Tabain, and Lamalle demonstrate the use of static MRI, which uses magnetic fields and radio waves to produce high-resolution images of a single articulator or of the entire vocal tract configuration. These authors compare the L1 and L2 vowel articulation of two English(-Croatian)-speaking learners of French.
These four techniques differ in several principal ways. The first concerns the type(s) of segmental phenomena that can be investigated. As highlighted above, whereas MRI can be used to study the production of any consonant or vowel, ultrasound is limited to examining lingual consonant or vowel articulations (unless combined with lip video), EPG to most lingual consonants (especially coronals), and nasometry to oral and nasal airflow in vowels and consonants (e.g., degree of vowel nasalization; amount of aspiration of stops). Moreover, ultrasound and, particularly, MRI allow for the study of not only individual segments but also larger articulatory aspects, including the movement of a particular articulator or, in the case of MRI, the configuration of the entire vocal tract from “lips to larynx,” as Badin et al. describe it. The techniques also differ in their degree of invasiveness: EPG contrasts with the three other methodologies in being somewhat invasive, as it requires an artificial palate to be worn in the oral cavity during speech production. Finally, in terms of the resources required, while all four techniques are more cost-intensive than acoustic research, which requires only a microphone and a laptop loaded with freeware such as Praat, these articulatory methodologies differ in their costs. For example, while portable ultrasound machines are becoming relatively affordable and do not require a dedicated research space, researchers wishing to conduct an MRI study typically need access to expensive machines available in a restricted set of locations, including hospitals. Nasometry is much like ultrasound in that the equipment is portable and increasingly affordable. EPG falls somewhere between these methods: it involves a certain expense, including for the preparation of individual palates, and, like MRI, requires a dedicated research space.
These differences in the resources required for each methodology have consequences for the participant sample sizes found in articulatory research, which tend to be larger in ultrasound and nasometry studies than in those using EPG and MRI.
2.2 Research themes and recurring findings
The seven contributions to this issue investigate a range of phenomena. In their MRI study of two L2 French learners (one an L1 English monolingual, the other a simultaneous English-Croatian bilingual), Badin et al. explore the extent to which the production of analogous L1-L2 phones is similar in terms of both specific and overall vocal tract configuration, the degree of resistance to coarticulation in liquids, and the articulatory gestures used by speakers to produce the French front rounded vowels /y ø œ/. Beristain, using nasometry, examines how Spanish-speaking English learners adjust the coarticulatory timing in vowel-nasal sequences, in particular the onset and proportion of nasality, including the degree to which target-like production correlates with speakers’ L2 English accentedness (i.e., global oral proficiency). The four ultrasound studies target a range of phenomena. Chen et al. study L1 Mandarin-L2 English speakers’ /ɹ/ production, focusing on tongue shapes (bunched or retroflex) as conditioned by the position of the rhotic in the syllable and by L2 English proficiency. In the only training study in this issue, Kocjančič Antolík et al. test whether real-time visual ultrasound (and acoustic) feedback can assist L1 Czech learners in improving their production of the Swedish vowels /yː, iː, ɛ, ɛː, oː, ʉː, ɵ, ʊ, uː, ʉː/. Oakley investigates English-speaking learners’ use of L1 articulatory and acoustic categories in producing French vowels that are both similar to and different from those of their L1 (/i, u, e, o/ and /y, ø/, respectively). Wilson et al. evaluate the existence of distinct L1 and L2 articulatory settings among L1 Japanese speakers of English via a comparison of the production of the vowel inventories of the two languages. Finally, in their EPG study, Colantoni et al. examine the acquisition of the light and dark (i.e., velarized) English /l/ allophones among native French and Spanish speakers.
Although there are different segmental phenomena investigated, articulatory techniques employed, and L1–L2 pairings, a number of recurrent themes are observed across the seven studies. These include:
L1-based crosslinguistic influence: as in L2 speech research in general, persistent L1-based influence on production of varying strength is observed in all studies. Moreover, the type and/or extent of L1-based crosslinguistic influence may vary across target language (TL) phonemes (Badin et al., Oakley) and L1-TL production similarity may differ for speakers of different varieties of the L1 (American versus Australian English in Badin et al.).
Relative difficulty and target-likeness: various studies demonstrate that learners may vary in their mastery of different structures or of the phonetic parameters used to realize phonological contrasts, including the extent to which learners match native speaker production. Differences in relative difficulty conditioned by L1-TL similarity (Oakley) including its interaction with syllable position (Chen et al.; Colantoni et al.) and specific vowel quality (Kocjančič Antolík et al.) are explored. To varying degrees, native-like behaviors are observed in several studies (Beristain; Chen et al.). Moreover, even when not target-like, there is often evidence that learners have moved away from purely L1-based articulations (Chen et al.; Colantoni et al.; Kocjančič Antolík et al.; Oakley).
Inter-learner variability and individual differences: as is the case with phonetic research in general, all studies observe differences among at least some speakers. Moreover, this inter-learner variability can sometimes be explained, in part, by individual differences including target language proficiency (Beristain; Chen et al.; Wilson et al.) and experience (Beristain) as well as first language or dialect (Badin et al.; Colantoni et al.).
Alongside these acquisitional themes, two or more of the studies touch on themes of interest to speech production research more generally including:
Articulatory settings: the notion of language-specific articulatory settings (ASs; Honikman, 1964; Laver, 1978), that is, the overall configuration of the articulators for speech production, has a long history in speech research. The role of ASs in L2 speech learning either serves as the theoretical framework adopted in the study (Wilson et al.) or is discussed in various contributions (Badin et al.; Beristain; Oakley).
Gestural timing/coarticulation: the language-specific ways in which consecutive segments are coarticulated have been well studied and are a core feature of theories such as Articulatory Phonology (Browman & Goldstein, 1992). Very little is known, however, about the L2 acquisition of inter-gestural timing and coarticulatory patterns. Badin et al. and Beristain investigate, among other things, the extent to which L2 learners are able to adjust L1 timing/coarticulatory patterns to approximate those of the target language, potentially in a native-like manner (Beristain).
The effects of phonetic context: related to the theme of coarticulation, multiple studies investigate how phonetic context (i.e., flanking segments) and syllable position shape the realization of a given phoneme. For example, Chen et al. find that while syllable position conditions their L1 Mandarin learners’ production of English /ɹ/, the distribution of their tongue shapes by syllable position differs from that of native speakers.
Relation between speech production and perception in L2 acquisition: the learners in Kocjančič Antolík et al. are relatively good at perceiving non-native vowel contrasts, while showing considerable difficulties in production. As is often found, for the acquisition of at least some L2 sounds, production lags behind perception. This, according to the authors, points to some challenges for current theories of L2 learning.
3 Areas for future research
Our intention when putting together this special issue was both to disseminate research that has already been done and to encourage colleagues working in L2 speech to integrate one or more of these articulatory techniques into their research methodologies. As highlighted in the first section, specific techniques and research questions go hand in hand, in the sense that one needs to understand which articulatory method is appropriate for addressing a given speech learning problem. We believe that the techniques discussed here are particularly important for addressing two issues in the area. First, techniques that allow for the mapping of the whole vocal tract, such as MRI, will allow us either to strengthen existing theories, such as Articulatory Settings, or to develop new theories of second language speech production. Second, the articulatory techniques discussed (e.g., nasometry, EPG) as well as other techniques not illustrated here (e.g., electromagnetic articulography [EMA]) will expand our understanding of the acquisition of coarticulatory processes and allow us to establish the complexity of what is involved in acquiring the segments of a second language.
The contributions included here also highlight the fact that recent technical developments have made some of these techniques (e.g., ultrasound and nasometry) more accessible and more portable. We encourage the reader to consult Appendix A in Badin et al., which offers an excellent summary of the latest technological developments in MRI research and includes a list of research centers around the world where this technique is available. We also invite readers to explore an online database containing videos of L1–L2 EPG data (Kochetov et al., 2017). Post-pandemic research practices may have introduced an additional challenge to the expansion of articulatory methods, namely that articulatory techniques still require the researcher and the participant to interact in person, rather than making use of online platforms for participant recruitment and testing. We are confident, though, that the variety of research topics and techniques discussed in this special issue will convince the reader of the richness and the availability of articulatory techniques for answering a variety of questions in the field of L2 speech.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
