Abstract
The present study aims to measure the effects of the teaching of second language (L2) phonological forms on L2 receptive vocabulary learning. Two teaching methods were compared in a pre- and delayed post-test to evaluate their impact on L2 word learning. Participants (n = 127; mean age = 12;6, i.e. 12 years and 6 months) were randomly divided into two groups that followed either an explicit teaching method focused on L2 phonological forms, or a communicative teaching method focused on meaning, in which L2 phonological forms were taught implicitly. The teaching methods in the two groups aimed to foster the skills and the learning of phonological forms involved in the development of receptive vocabulary. The two teaching methods trained the same skills and relied on the same vocabulary. They both targeted the phonological forms of two difficult phonemic contrasts in French as a foreign language. The two teaching sequences took place during mandatory lessons in French as a foreign language for six weeks (12 lessons), in a Swiss state school. Generalized mixed models were fitted to the data to test for differences across teaching methods in their impact on L2 word learning. Overall, the results indicate that participants made significant progress in word learning, with no significant differences between the two teaching methods. Pronunciation, discrimination, retention in verbal working memory, and the mastery of phoneme–grapheme correspondences are significant factors of vocabulary learning in French as a foreign language. The teaching of L2 phonological representations and the training of their processing facilitated the learning of words in L2 French. However, the teaching of vocabulary in French as a foreign language rarely involves a focus on phonological representations.
Keywords
I Introduction
The goal of the present study is to measure the impact of teaching second language (L2) phonological forms on L2 receptive vocabulary learning and to compare the effectiveness of two different teaching methods. L2 phonological forms were taught either by an explicit teaching method or by a communicative teaching method in which L2 phonological forms were taught implicitly. The two teaching methods, explicit and implicit, both aim at improving listening and pronunciation skills and at developing the link between L2 orthographic and phonological forms.
1 Relationships between L2 vocabulary and L2 phonological learning
The development of L2 vocabulary knowledge and L2 phonological forms seems to be interdependent. Many empirical studies, reviews and meta-analyses have found a correlation between L2 vocabulary knowledge and L2 listening comprehension (e.g. Burgoyne et al., 2011; Jeon & Yamashita, 2014; S. Zhang & Zhang, 2022). A larger L2 vocabulary correlates with better L2 listening comprehension. Thus, L2 vocabulary knowledge supports listening comprehension, while in turn listening comprehension is key for learning L2 vocabulary, namely for establishing new connections between sound and meaning.
Listening abilities influence the success of establishing new sound–meaning connections: L2 listening ability moderates the effect of repetition on L2 word learning, whereas previous L2 vocabulary knowledge does not (P. Zhang, 2022). The number of repeated presentations of an L2 word necessary for its learning is smaller for a learner with good listening abilities. Poor listening abilities lead to the formation of imprecise and L1-like phonological representations, which hampers the learner’s ability to establish new sound–meaning connections.
Forming new sound–meaning connections is hindered by imprecise phonological representations (e.g. Cook et al., 2016; Gor et al., 2021; Perfetti & Hart, 2002). Llompart (2021) compared two groups of first language (L1) German speakers learning English and differing in English proficiency. Participants had to complete a lexical decision task (distinguishing words from non-words), a phonemic categorization task involving the English phonemic contrast /ε/–/æ/, and an English vocabulary test. For intermediate learners, accuracy in non-word rejection was predicted by accuracy in the phonemic categorization task, but for advanced learners it was predicted by the vocabulary size. Imprecise and less target-like representations limit word learning by less proficient L2 learners. Even when L2 learners encode differently two difficult contrastive sounds, it takes time to refine this encoding into precise and non-L1-like phonological representations (Darcy et al., 2013). The ability to learn new L2 words depends on the quality (i.e. precision and L2 target-like parameters) of their L2 phonological representations (van de Ven et al., 2019).
The acquisition of high-quality L2 phonological representations is subordinate to the relationship between the L1–L2 auditory and phonological systems. In an L2 word learning paradigm, Tuninetti et al. (2020) compared the ease of matching sound to meaning when the two presented spoken words formed a non-minimal or a vowel minimal pair. Words presented in non-minimal pairs proved to be the easiest to learn, while within minimal pairs, those with greater perceptual difficulty posed a greater challenge for learning. The difficulty of learning an L2 word is commensurate with the difficulty of perceiving an L2 phonological contrast, which is accounted for by the pronounceableness of a word.
Pronounceableness of words partly explains learners’ difficulty in learning an L2 word. N. Ellis and Beaton (1993) defined the pronounceableness of words as the ability of learners to perceive and pronounce L2 sounds. They measured pronounceableness of twelve L2 words in a repetition task by novice L2 speakers. Their participants, English native speakers (N = 47) who were novice learners of German, had to complete a translation task after a learning phase. During the learning phase, the twelve German words were presented with their translation, their written forms, and their spoken forms. The authors observed a strong effect of pronounceableness in the learning of L2 words, and this effect is proportional to the conformity of the word to the phonological and orthographic patterns of the L1 of the learner. Mismatches between L1 and L2 limited word learning.
Mismatches in phonological and orthographic representations between L1 and L2 lead to lexical confusion in word production and recognition. Lexical confusion is defined as observed difficulties in linking the meaning of a novel word to its accurate written and oral forms. Several studies that aimed at categorizing errors in vocabulary learning have claimed that phonological and orthographic confusions are the main source of errors (Gu & Leung, 2002; Laufer, 1988, 1990). L2 learners who have difficulties perceiving and pronouncing L2 phonological contrasts also have difficulties in identifying these contrasts in L2 words (Ota et al., 2009), even when the learner can accurately categorize the two contrastive sounds (Díaz et al., 2012). Lexical confusion can occur when auditory and phonological representations of the L2 are not precisely defined and poorly linked to the other aspects of word knowledge (Perfetti & Hart, 2002; Read, 2000; Schmitt & Schmitt, 2020). Imprecise representations and unreliable links between different aspects of word knowledge lead to lexical confusion.
Lexical confusion is hardly overcome by increasing the amount of input. Simply increasing exposure to L2 phonological forms is not sufficient to link high quality phonological forms to meaning. The amount of exposure to L2 phonological forms does not seem to influence the knowledge of high frequency L2 words (Lu & Dang, 2023). Through the assessment of form–meaning connections, Lu and Dang measured the learning of L2 English words at the first three 1,000-word frequency levels by Chinese students (N = 201). Exposure to L2 oral English in hours per week and length of studying English were measured through a questionnaire. These two measures of the amount of exposure had no impact on L2 word learning. Implicit learning of L2 words through exposure seems to be slowed down by the influence of the L1 phonological system (e.g. N. Ellis, 2015). The link between L2 phonological forms and meanings seems to be the most difficult association to learn (Gu & Leung, 2002; Hulstijn, 2013) without dedicated instruction.
Instruction has the potential to prioritize the link between L2 phonological forms and meaning over direct associations between orthographic form and meaning, thereby promoting the learning of L2 novel words (e.g. Bürki et al., 2019; Krepel et al., 2021; Uchihara et al., 2023). Uchihara et al. (2023) compared the learning of form–meaning connection and pronunciation of L2 low-frequency words in three teaching conditions. Japanese learners of English (N = 75) completed a picture naming test in a pre-, post- and delayed post-test. The results indicated that the learning of L2 spoken word forms is superior when teaching proposed reading-while-listening in place of reading or listening only (see also Bürki et al., 2019; Krepel et al., 2021). The simultaneous activation of L2 phonological and orthographic forms seems to facilitate their mapping to meaning.
2 Instruction in phonological skills to enhance L2 vocabulary learning
Teaching of L2 phonological forms should pursue two objectives when seeking to favour L2 word learning. First, teaching should aim to facilitate the pronunciation and the perception of difficult L2 phonological contrasts to overcome the mismatches between L1–L2 phonological systems. Second, teaching should aim to reduce lexical confusion by:
• developing precise and L2-like lexical phonological representations; and
• establishing reliable links between L2 orthographic, phonological, and semantic representations.
Teaching pronunciation and discrimination of a difficult L2 phonological contrast will limit L1 phonological influences and reduce the risk of lexical confusion. L1 auditory and phonological knowledge influences the processing of L2 sounds leading to potential difficulties in perceiving and producing an L2 phonological contrast (Best, 1995; Flege, 1995; Flege et al., 1997; McAllister et al., 2002; Tyler, 2019). Identifying two L2 contrastive sounds as similar to an L1 sound results in a linkage between the two different L2 auditory representations to the same L1 phonological representation, which can impede a further phonological distinction between the two correspondent L2 words and their meanings (Flege, 1995). Lexical confusion is the result of linking two different L2 words to the same auditory representation. Lexical confusion may hinder the learning of phonological forms, and conversely, imprecise L2 or L1-like phonological representations may hamper L2 word learning. L2 learners experiencing challenges in perceiving and pronouncing L2 contrasts similarly encounter difficulties in recognizing these contrasts within L2 words (Ota et al., 2009).
L2 pronunciation teaching should be integrated in the teaching of L2 novel words (Bürki et al., 2019) to develop more precise L2 phonological representations that can be effectively mapped to orthographic representations. Mappings from sound patterns to orthographic representations are not automatically acquired during L2 instruction when the L1 and L2 share the same alphabetic writing system but differ in terms of phonological contrasts (Bassetti & Atkinson, 2015; Dherbey-Chapuis & Berthele, 2020; Kaushanskaya & Marian, 2008). Enhancing the instruction of L2 pronunciation and discrimination of phonological forms may contribute to the development of more precise L2 auditory and phonological representations. These refined representations could then be more easily linked to orthographic or semantic representations, particularly when teaching incorporates a comprehensive approach to all aspects of L2 word knowledge (Mora & Levkina, 2017).
Teaching phoneme–grapheme correspondences may enhance a more accurate mapping between L2 orthographic and L2 phonological representations during L2 speech perception and during L2 word reading. Lexical confusion occurs when orthographic and phonological representations of the L2 are not accurately and precisely linked to each other. Incongruences between L1 and L2 phoneme–grapheme correspondences favour L1-like phonological representations in L2 words (Welby et al., 2022) that promote lexical confusion.
L1–L2 incongruencies enhance the learning of incorrect pronunciations in L2 even when L2 words are presented conjointly in oral and written forms (Hayes-Harb et al., 2010; Welby et al., 2022). Bassetti (2006, 2007) analysed the pronunciation of vowels in rhymes of L2 words by beginner learners of Chinese. In a read-aloud task in a Romanized alphabet (pinyin) and in a phoneme segmentation task, L2 learners of Chinese omitted the vowel /e/ when it was not present in the written form (e.g. gui = /guei/) and pronounced it when it was present in the written form (e.g. wei = /wei/). In another study, Bassetti (2017) compared the pronunciation of native and L2 speakers of English in a read-aloud task. Participants were English monolingual native speakers and English L2 speakers who have Italian as an L1, a language in which phonotactic rules indicate that double consonants should be pronounced longer than single consonants, in contrast to English phonotactics. Spelling affected only the pronunciation of the L2 speakers and not of the native English speakers, and these results were confirmed in a delayed word repetition task. Other studies have confirmed how L1 phonotactic rules and phoneme–grapheme correspondences interfere in the pronunciation of L2 words (Bassetti et al., 2018; Sokolović-Perović et al., 2020). Imprecise and non-target like pronunciation of L2 words may hinder their retention in verbal working memory.
Training retention of verbal information in working memory may enhance L2 word learning. Learning of L2 phonological forms of words is correlated with the ability to repeat phonological forms of either a word or a non-word (for a review, see Gathercole, 2006). Non-word repetition and word learning seem to rely both on common procedural processing and shared lexical properties such as phonological and orthographic representations. Vocabulary learning may benefit from training retention in working memory of auditory, phonological, and visual orthographic representations.
3 Implicit or explicit teaching of phonological and orthographic forms
To limit lexical confusion, the four skills (discrimination, pronunciation, phoneme–grapheme correspondences, and retention of verbal information in working memory) should be integrated into the teaching of vocabulary. These skills can be taught either explicitly or implicitly.
Following Norris and Ortega (2000), explicit teaching is focused on forms, relying on explanations, descriptions, rules and metalanguage use, and conscious attention of the learner is drawn by teaching to the targeted forms; conversely, implicit teaching is focused on meaning, and conscious attention of the learner is not drawn by teaching to the targeted forms.
Explicit teaching is generally more efficient than implicit teaching to enhance the learning of L2 phonological knowledge and to overcome the influence of L1 knowledge (De Keyser, 2003; Saito, 2012; Norris & Ortega, 2000). The explicit teaching of L2 phonological forms aims to retune the perceptual system and to enhance noticing of L2 phonological regularities in words (e.g. N. Ellis, 2015; Llompart & Reinisch, 2021).
Implicit teaching may induce only limited noticing of L2 phonological regularities in words. Incidental encounters with L2 forms that have no counterpart in the L1 of the learners may not lead to the learning of these forms. Their specific L2 characteristics remain undetected or neglected by the perceptual system of the learner. Integrated in a communicative teaching approach, implicit teaching of L2 phonological forms can draw learners’ attention to L2 specificities to enhance the learning of the L2 target through focusing on meaning.
Teaching can focus on meaning, as in communicative teaching, and can trigger at the same time incidental learning of forms. Increasing the salience of L2 contrasts and limiting the number of cues in competition may enhance their recognition by the verbal working memory during incidental learning (N. Ellis, 2006). Salience of a linguistic unit depends on some of its characteristics such as frequency, intrinsic perceptual salience and extrinsic saliency (for a review, see Boswijk & Coler, 2020). Linguistic units are more salient when the context in which they are included puts them in evidence. This extrinsic salience can be manipulated in the teaching materials to draw the attention of the learner toward the target in an implicit focus on form.
In implicit teaching, the learner’s attention can be drawn unconsciously toward the targeted forms in a communicative activity (N. Ellis, 2005; R. Ellis et al., 2009). For example, to promote the discrimination of a difficult contrast, the communicative requirements can oblige the learner to use the two words including the two members of a minimal pair in the same sentence (e.g. Donne-moi le papillon). The proximity of the two contrastive phonemes may promote their salience and may consequently enhance the learning of two distinct and separate representations for the two phonemes of the minimal pair (N. Ellis, 2015). Once the learner has noticed the specificity of the L2 forms, an increase in their frequency in the input can accelerate their learning.
Feedback, whether implicit or explicit, is an effective tool to promote the learning of L2 phonological forms when individually addressed (Lyster, 2017; Saito, 2021). Although both types of feedback are efficient, implicit feedback seems to be more durable, and explicit feedback seems to be more efficient in the short term (Li, 2009).
Giving immediate corrective feedback has been shown to be very powerful for enhancing L2 learning (Lee et al., 2015; Li, 2009; Saito, 2021). Recasts can be either explicit, completed by explanations, descriptions, and metalinguistic knowledge, or more implicit, with repetitions of the target included in the conversation (Li, 2009; Lyster et al., 2013; Sheen & Ellis, 2011). Recasts are the most frequent type of feedback at school (Lyster et al., 2013). They have been shown to be, in general, the most efficient type of feedback (for a review, see Saito, 2021) and to be well suited to communicative teaching (Lyster et al., 2013) in foreign language lessons. Both types of recasts, explicit and implicit, enhance the learning of L2 phonological forms (Lyster, 2017; Saito, 2021).
Younger learners may be more responsive than adult learners to implicit teaching and feedback. Effects of explicit and implicit teaching of L2 phonological forms have mostly been compared for adult learners. Implicit teaching can be as efficient as explicit teaching for young learners when salience is enhanced (R. Ellis et al., 2009). When the salience of the targets is enhanced in recasts, both implicit and explicit teaching may result in the same improvement of L2 phonological skills and knowledge. Furthermore, children may learn more implicitly than explicitly (De Keyser, 2003).
Implicit teaching may provide a solution to the concerns of teachers who must teach L2 phonology in compulsory schooling. Teachers often give up teaching pronunciation and discrimination because they lack support in teaching methods and lack training (Darcy, 2018; Géron & Billerey, 2020). The implicit teaching of L2 phonological forms can be integrated in communicative teaching through manipulations of the teaching method, unlike explicit teaching, which requires the teacher to be prepared.
Implicit teaching of L2 phonological forms may be as efficient as explicit teaching to enhance L2 vocabulary learning by young L2 learners at school. First, even if explicit teaching of L2 phonological representations has been shown to be more efficient than no instruction of pronunciation and perception (e.g. Lee et al., 2015), it has not been compared to implicit teaching of L2 phonological representations whose salience was increased. Second, an explicit or an implicit focus on L2 phonological representations has only very occasionally been integrated in the teaching of L2 vocabulary.
Based on the hypothesis that teaching L2 phonological forms enhances the learning of L2 receptive vocabulary, we compared the effects of phonological form-focused teaching and communicative teaching (where L2 phonological forms are implicitly taught) on L2 receptive vocabulary learning in a longitudinal study.
II Research questions
In the present study, we compared the effects of two teaching methods of L2 phonological forms on vocabulary learning by early teenagers (L1 German) learning French as a foreign language at school. The compared teaching methods aimed to enhance skills related to phonological forms, listening processing and pronunciation, with the goal of fostering the link between phonological and orthographic forms in L2. Both teaching methods targeted the same two French phonemes and frequent French words that incorporated them. In L2 French, these two phonemes – a vowel (/ɔ̃/) and a consonant (/ʒ/) – are known to be easily confused in a lexical contrast with the other member of their respective minimal pair. These two phonemes with a high functional load (Derwing & Munro, 2005) may induce important lexical confusion and hence limited learning of novel L2 words.
• Research question 1: Can teaching of L2 phonological forms enhance L2 vocabulary learning?
• Research question 2: Does integrating implicit teaching of L2 phonological forms within a communicative, action-based approach yield a similar improvement in L2 receptive vocabulary learning compared to explicit teaching of L2 phonological representations?
For research question 1, we hypothesize that both teaching methods might favour L2 receptive vocabulary development, as imprecise and L1-like phonological representations were shown to limit L2 novel word learning. For research question 2, given that young learners tend to learn better implicitly compared to older learners, we hypothesize that our participants, who are young learners, may benefit from the teaching of L2 phonological forms regardless of whether the teaching method is explicit or implicit, provided that the salience of the targets is reinforced.
Furthermore, we performed an exploratory analysis to investigate the links between the development of receptive vocabulary and the four targeted skills: discrimination, pronunciation, phoneme–grapheme correspondences, and the capacity to retain a phonological form in verbal working memory.
III The study: A pedagogical field experiment
1 Participants
Informed consent was obtained from all relevant parties before data collection. Participants were 127 Swiss German teenagers (mean age 12;6) learning French as a foreign language in the compulsory state school system in Switzerland. Participants were 78 girls and 49 boys belonging to six classes, three of which were in the last year of primary education and the other three in the first year of secondary education. These low-proficiency learners (level A1–A2 in French) had already received between 312 and 390 hours of instruction in French as a foreign language when the experiment started.
All students enrolled in the standard French as a foreign language curriculum across the six classes were included in the experiment, and nobody was excluded. Three participants had been diagnosed as dyslexic. All students were bilingual in Standard High German and Swiss German, and 39% of them fluently spoke a third language at least once per week.
French and German share the same alphabetic code but differ in their phonological repertoire. Unlike in French, nasal vowels and the voiced post-alveolar fricative do not support lexical contrasts in German. In French, nasal and oral vowels, and voiced and unvoiced post-alveolar fricatives, determine numerous minimal pairs (e.g. pot–pont, dos–don, rot–rond; haché–âgé; char–jars; acheté–à jeter).
2 Teaching
The two teaching methods share identical learning objectives concerning targeted phonological and phonetic representations, vocabulary, and the skills targeted.
a Targeted skills
The two teaching methods aimed at enhancing pronunciation, discrimination, knowledge of phoneme–grapheme correspondences and reading and writing of words containing the two targeted French phonemes.
The intervention targeted different levels of phonetic, phonological, and orthographic representations. Starting with perception and moving on to production within each lesson, each level of phonetic or phonological representation was reinvested in more complex tasks from lesson to lesson. After the first two lessons, which were dedicated to oral teaching, each lesson involved the five targeted skills.
b Phonological targets
The teaching objectives were to develop the knowledge and the use of the phonetic, phonological and orthographic forms of words containing two phonemes, the nasal vowel /ɔ̃/ and the voiced fricative /ʒ/. These two phonemes are known to be difficult to learn in French as a foreign language. For many learners of French as a foreign language, the phonemes /ɔ̃/ and /ʒ/ are difficult to distinguish in speech flow and to pronounce in words (e.g. Detey et al., 2016). According to N. Ellis (2006), associative learning factors may explain these difficulties.
Perceptual salience is one of the learning factors that may explain phonological difficulties whatever the first language of the learner (N. Ellis, 2006). Perceptual salience of a phoneme is closely related to its sonority, with low vowels (e.g. /ɔ̃/) having a higher sonority than fricatives (e.g. /ʒ/) (Selkirk, 1984, cited in Goldschneider & de Keyser, 2001, p. 25).
Two other important learning factors that are independent from the first language of the learner are contingency, which is the regularity of an association between a cue and an outcome, and the number of cues in competition. Regarding these two factors, the correspondences of the targeted phonemes and their graphemes are far more complex for /ʒ/ than for /ɔ̃/.
c Targeted words
The targeted French words (N = 80) were selected from the (compulsory) teaching materials used by the teachers. Each selected word contains one of the two targeted phonemes. In the selected words, the targeted phonemes are equally present in three different positions: at onset, in the middle or at the end. The two groups of words containing either /ʒ/ or /ɔ̃/ were matched by corpus frequency (i.e. the frequency in the teaching materials) and grammatical categories. Cognate words were avoided as much as possible. Half of the selected words were used in teaching, and the other half were reserved to serve as a control condition in testing.
d Teaching methods
The two teaching methods targeted the development of the same skills, the same words, and the same phonological representations, but the teaching of phonological and orthographic forms was either explicit or implicit. In group ‘E’, explicit teaching of the phonological forms focused on forms without any meaning support. The learners were conscious of the targeted forms. Salience of the targets was enhanced by developing the metalinguistic awareness of the learner. The number of cues in competition was limited by restraining lexical diversity. Pronunciation of the phonological targets was taught by the articulatory and the verbo-tonal methods using recasts and immediate repetitions, and discrimination by AX and ABX games. Orthographic representations were taught by collective elaboration of grapheme–phoneme rules, dictation games and exercises developing metalinguistic awareness (blending, segmentation, elision). The retention in verbal working memory was improved through repetitions of songs and tongue-twisters.
In group ‘C’, teaching was organized by communicative action-oriented activities focused on meaning. In this group, the preparation of the teaching materials was aimed at developing an implicit teaching of the phonological forms. The teaching material was prepared to enhance the salience of the targets, favour a better contingency between forms and limit the number of cues in competition. First, the frequency of the targeted phonemes was increased in input by choosing themes in which they were overrepresented (e.g. Les marmott
3 Instruments
Five tasks were administered in a repeated-measures design. They tapped into receptive vocabulary, pronunciation, discrimination, retention in verbal working memory and phoneme–grapheme correspondences.
The receptive vocabulary task was a Yes/No test in which the written stimuli were the 80 selected words and 20 non-words (see supplemental material 1). Half of the words and non-words were related to /ʒ/ and the other half to /ɔ̃/. Half of the words, equally distributed between the two phonemes, were used in teaching and the other half were reserved to serve as a control condition in testing. The logic of the control condition is that if taught words are learnt and untaught words are not, learning can be attributed to teaching. Non-words were disyllabic and made from French consonant-vowel syllables. Each non-word recognized as a word was counted as a false alarm, a recognized word as 1 point and no recognition as zero points. Non-words are used to control for guessing and to avoid overestimating the number of known words (Pellicer-Sánchez & Schmitt, 2012). Two lists with different orders of words and non-words were randomly distributed to the participants.
Pronunciation was assessed by an imitation task. After some training items, participants had to repeat a stimulus item immediately after hearing it. The stimuli were a balanced set of 24 words and non-words containing one of the two targeted phonemes. Individual recordings of the task were evaluated by three non-musician coders (an L1 French speaker, an experienced L1 French linguist, and a C1-level L2 French speaker). The comprehensibility of each repeated word was evaluated on a five-point Likert scale. The score per participant is the percentage of words that were, on average, judged comprehensible without doubt by the coders (comprehensibility per word > 3).
Discrimination was assessed by an AX task. After some training items, participants had to decide if the two stimuli they just heard were the same or not. Individual answers were collected on paper. The targeted phonemes and their counterparts in their minimal pairs composed 10 identical pairs and 20 different pairs of stimuli. An accurate answer was scored 1 point. Following signal detection theory (Stanislaw & Todorov, 1999), the scores of participants were expressed as d prime (d′).
Verbal working memory was assessed by a non-word repetition task. After a training phase, four groups of eight lists of monosyllabic non-words were presented to the participants with a 3-minute pause between trials. Each of the four groups contained eight lists that increased linearly from two to five non-words for a total of 68 phonemes. For the present analysis, only results from the group of lists that contained the two targeted phonemes among shared L1–L2 phonemes are considered. Individual recordings of participants’ recalled non-words were manually transcribed by two L1 French speakers (not linguists). The phonemic transcriptions were then scored automatically, with 1 point awarded per repeated phoneme that was present in the stimulus. The score per participant is the percentage of correct phonemes repeated in the group of eight lists that contain the targeted phonemes.
The knowledge of phoneme–grapheme correspondences was measured by a newly developed dedicated task (Dherbey-Chapuis & Berthele, 2020). In this task, participants had to identify the written form of dictated non-words among four written propositions: three lures and the correct answer, each lure corresponding to an error type (phonological, orthographic or phonotactic). For example, for /ɔ̃/ and the stimulus /bɔ̃tile/, the written propositions are ‘boutilé’ (phonological lure), ‘bonetilé’ (orthographic lure), ‘bonntilé’ (phonotactic lure), and ‘bontilé’ (the correct answer). The correct answer was scored 1 point and the score per participant is the percentage of correct answers.
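As an illustration of the scoring rule described above, the following minimal Python sketch counts recognized words and false alarms for one participant. The function name, the input layout (pairs of "is a real word" and "answered yes") and the returned keys are hypothetical and are not taken from the study's materials.

```python
def score_yes_no(responses):
    """Score one participant's Yes/No vocabulary test.

    `responses` is a list of (is_real_word, said_yes) pairs; this data
    layout is an assumption made for illustration only.
    """
    word_answers = [said_yes for is_word, said_yes in responses if is_word]
    nonword_answers = [said_yes for is_word, said_yes in responses if not is_word]

    word_points = sum(word_answers)      # 1 point per recognized word, 0 otherwise
    false_alarms = sum(nonword_answers)  # non-words accepted as words
    fa_rate = false_alarms / len(nonword_answers) if nonword_answers else 0.0

    return {
        "word_points": word_points,
        "false_alarms": false_alarms,
        "false_alarm_rate": fa_rate,
    }


# Invented example: 80 words (50 recognized) and 20 non-words (4 accepted).
example = (
    [(True, True)] * 50 + [(True, False)] * 30
    + [(False, True)] * 4 + [(False, False)] * 16
)
result = score_yes_no(example)
```

Keeping the false-alarm proportion separate from the word score, rather than subtracting it, leaves it available as a covariate in later analyses.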
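The pronunciation score can be sketched as follows, assuming that the ratings of the three coders are averaged per word and that a word counts as comprehensible when this mean exceeds 3. The function name and the data layout are hypothetical, and the ratings are invented.

```python
from statistics import mean


def pronunciation_score(ratings_by_word, threshold=3.0):
    """Percentage of words whose mean comprehensibility rating exceeds `threshold`.

    `ratings_by_word` holds one list of coder ratings (1 to 5) per repeated
    word; this layout is assumed for illustration.
    """
    comprehensible = [mean(ratings) > threshold for ratings in ratings_by_word]
    return 100.0 * sum(comprehensible) / len(comprehensible)


# Invented ratings for four words, three coders each.
ratings = [[4, 4, 5], [3, 3, 3], [2, 4, 4], [1, 2, 2]]
score = pronunciation_score(ratings)
```

A word rated exactly 3 on average does not pass the threshold, which matches the strict inequality in the scoring rule.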
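A minimal sketch of a d prime computation for such an AX task is given below. It assumes that a "different" answer to a different pair counts as a hit and a "different" answer to an identical pair counts as a false alarm, and it uses the simple yes/no formula d′ = z(H) − z(F). The log-linear correction (adding 0.5 to each count and 1 to each number of trials) is one of the options discussed in the signal detection literature for avoiding infinite values; the source does not state which correction, if any, was applied, so this choice is an assumption.

```python
from statistics import NormalDist


def d_prime(hits, n_different, false_alarms, n_same):
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear correction.

    The correction is an assumption for illustration; it keeps the rates
    strictly between 0 and 1 so that the z-transform stays finite.
    """
    hit_rate = (hits + 0.5) / (n_different + 1)
    fa_rate = (false_alarms + 0.5) / (n_same + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Invented example: 18 of 20 different pairs detected, 2 of 10 identical
# pairs wrongly judged "different".
sensitivity = d_prime(hits=18, n_different=20, false_alarms=2, n_same=10)
```

Models specific to same-different designs exist and yield different d′ values; the formula above is only the simplest variant.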
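The automatic phoneme count can be approximated by the sketch below. It assumes a position-independent match: within each list, a repeated phoneme earns a point if it occurs in the stimulus, and each stimulus phoneme can be credited only once. The exact matching rule used in the study is not specified, so this rule, the function name and the data layout are assumptions.

```python
from collections import Counter


def phoneme_score(trials):
    """Percentage of stimulus phonemes that were repeated.

    `trials` is a list of (stimulus_phonemes, repeated_phonemes) pairs, each
    a list of phoneme symbols; matching ignores position (an assumption).
    """
    matched = 0
    total = 0
    for stimulus, repeated in trials:
        # Multiset intersection: each stimulus phoneme is credited at most once.
        overlap = Counter(stimulus) & Counter(repeated)
        matched += sum(overlap.values())
        total += len(stimulus)
    return 100.0 * matched / total


# Invented example with two short lists.
trials = [
    (["b", "ɔ̃", "ʒ", "a"], ["b", "ɔ̃", "ʒ", "a"]),
    (["ʒ", "i", "p", "ɔ̃"], ["z", "i", "p", "o"]),
]
score = phoneme_score(trials)
```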
4 Procedure
In each class, participants were randomly assigned to two groups that followed one or the other teaching method for six weeks (two lessons per week; total duration of the instruction 4 hours). Experimental teaching took place during the normal time schedule. During the 45 minutes dedicated to French as a foreign language, half of the students received regular French instruction, while the other half received experimental teaching. At mid-time, the students were exchanged between the experimenter and the regular teacher.
The effects of the two teaching methods were evaluated in a pre-, immediate post- and delayed post-test design (Table 1). At pre-test, all five tests were administered. At immediate post-test, only the four skill tasks were administered; the vocabulary test was omitted so that participants would not learn the targeted words from the test. Three months after the last day of the experimentation, the delayed post-test included the receptive vocabulary task, the phoneme–grapheme task, and the discrimination task.
Timetable of tests.
5 Data analysis
Generalized mixed models were fitted to the data to test for differences across teaching methods (lme4 package, Bates et al., 2015; R version 4.2.2, R Core Team, 2022). Models were evaluated using the lmerTest package (Kuznetsova et al., 2017), and effect plots were obtained using the sjPlot package (Lüdecke, 2023). The models aim to measure the probability of correct recognition of a word.
Accuracy in word recognition (correct vs. incorrect) was modelled in logistic mixed-effects models with fixed effects of teaching method (explicit or implicit) in interaction with time of test (pre- or post-test), status of the word (taught or non-taught) in interaction with time of test, and proportion of false alarms (FA; non-words that were identified as words in the vocabulary task). The model included random slopes of test within participants to account for individual differences in the learning of receptive vocabulary, and random slopes of test within stimuli to account for differences in the learnability of the words.
The independent variable teaching method (explicit or implicit) was coded as a dummy variable centred around zero (named n.group with values = {−.5; +.5}). The independent variables time of test (=Test {Pre-test; Post-test}) and status of word (=Status {Untaught = 0, Taught = 1}) were coded as {0; 1} contrasts. The estimates for the other predictors are hence calculated for the case where the dummy variable n.group is equal to zero. In practice, this means that each of these estimates was calculated for the mean of the two teaching methods.
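The consequence of this centred coding can be checked with a small numerical sketch. The coefficients below are hypothetical and chosen only for illustration, and the function name is an assumption. The point is that, with codes of −.5 and +.5, the linear predictor evaluated at n.group = 0 equals the average of the two groups' predictors on the logit scale.

```python
def linear_predictor(intercept, b_group, b_test, n_group, test):
    """Log-odds of a correct answer for the given predictor codes."""
    return intercept + b_group * n_group + b_test * test


# Hypothetical coefficients (not estimates from the study).
coefs = dict(intercept=-0.4, b_group=0.2, b_test=1.0)

# Post-test (test = 1) predictions for each coded group and for n_group = 0.
eta_group_a = linear_predictor(n_group=-0.5, test=1, **coefs)
eta_group_b = linear_predictor(n_group=+0.5, test=1, **coefs)
eta_at_zero = linear_predictor(n_group=0.0, test=1, **coefs)

average_of_groups = (eta_group_a + eta_group_b) / 2
print(eta_at_zero, average_of_groups)
```

Note that this equality holds on the logit scale; after transformation to probabilities, the value at zero is not exactly the mean of the two group probabilities.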
Model selection was based on the Akaike information criterion (AIC). The residuals of the selected model have a normal distribution after excluding three outliers. More details on the statistical analyses are given in supplemental material 2.
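For readers unfamiliar with AIC-based selection, the sketch below applies the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood, and retains the candidate with the lowest value. The model names, log-likelihoods and parameter counts are hypothetical placeholders, not values from the study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "with_interaction": (-1850.2, 9),
    "without_interaction": (-1850.3, 8),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In this made-up example the extra parameter barely improves the likelihood, so the simpler model obtains the lower AIC.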
For the exploratory analysis we fitted a series of linear models with receptive vocabulary as the dependent variable and each of the four different skills as predictor.
IV Results
1 Effects of the two teaching methods on receptive vocabulary knowledge
Two words related to /ɔ̃/ and matched by frequency (ça with Status = untaught; nous with Status = taught) were already known at pre-test by all the students. They were excluded from the analyses.
The descriptive analysis of the data suggests a progression in receptive vocabulary knowledge after teaching, regardless of the method (Table 2).
Mean and SD of receptive vocabulary test scores (%).
Results of the full model indicate that the interaction between teaching methods and time of test is non-significant (−.03; 95% CI [−.41; .35], Z = −.163, p > .05). The model was therefore refitted without the interaction between teaching methods and time of test, while keeping the effect of the teaching methods through the dummy variable n.group.
Figure 1 presents the fixed effects of the model with a total of 123 participants from whom we obtained usable data (3785 observations). The two teaching methods (n.group) do not show significant differences in their effect on receptive vocabulary when the other variables are held constant (see also ‘n.group effect plot’ in Figure 3). We observed a significant interaction between the status of words (untaught vs. taught) and time of test (pre-test vs. post-test) indicating that only taught words were learnt at post-test.

Effects plots for teaching methods, time of test, word status, and false alarm.
We explored the interaction, observed in the visualization provided in Figure 1, between the status of words (untaught vs. taught) and time of test (pre-test vs. post-test) using the emmeans package (Russell, 2023). Averaged over the two teaching methods, results expressed in log odds ratio indicate that pre-test and post-test significantly differ only for taught words (1.08; SE = .2 (95% CI), Z = 5.41, p < .0001), and not for untaught words (.1; SE = .17 (95% CI), Z = .55, p > .05). At pre-test, the differences between words of different status are non-significant (.34; SE = .38 (95% CI), Z = .9, p > .05). Conversely, at post-test, the differences between words of different status are significant (1.32; SE = .47 (95% CI), Z = 2.82, p < .05).
The untaught words constitute a control condition indicating that the observed increase of receptive vocabulary is a consequence of the teaching. The two teaching methods induced learning of the taught words (Figures 2 and 3 ‘Status*Test effect plot’). Word guessing was controlled in the model by including false alarms as a fixed effect (FA, non-words recognized as words; see ‘FA effect plot’ in Figure 2) (Pellicer-Sánchez & Schmitt, 2012).
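To make the log odds ratios easier to interpret, the following sketch recomputes a Wald Z statistic and the corresponding odds ratio from the rounded estimate and standard error reported above for taught words (1.08 and .2). The helper names are assumptions, and the two-sided normal p-value is a generic approximation; it is not necessarily the exact procedure applied by emmeans.

```python
import math
from statistics import NormalDist


def wald_z(estimate, standard_error):
    """Wald statistic: estimate divided by its standard error."""
    return estimate / standard_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def odds_ratio(log_odds_ratio):
    """Back-transform a log odds ratio to an odds ratio."""
    return math.exp(log_odds_ratio)


# Rounded values reported for taught words (pre-test vs. post-test).
estimate, se = 1.08, 0.2
z = wald_z(estimate, se)
print(round(z, 2), two_sided_p(z) < 0.0001, round(odds_ratio(estimate), 2))
```

The recomputed Z is close to, but not identical with, the reported value because the inputs are rounded. On the odds scale, a log odds ratio of 1.08 corresponds to odds of correct recognition that are roughly three times higher.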

Interaction between time of test and status of words.

Observation of each fixed effect when the other fixed effects are maintained constant.
Estimates calculated in logit and transformed into probabilities indicate that a word has 20% more chance of being learned after teaching for both teaching methods (see Table 3). However, confidence intervals overlap between pre- and post-test, which may indicate that substantial variation exists between words.
Estimates calculated from the model output (logit) and transformed into probabilities.
Note. [. . .] indicates confidence intervals at 95%.
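The transformation from logits to probabilities used for these estimates is the inverse logit, p = 1 / (1 + exp(−x)). The sketch below is illustrative only: the baseline logits are hypothetical, and only the increment of 1.08 log odds is taken from the contrasts reported above. It shows that the same change in log odds corresponds to different changes in probability depending on the starting point.

```python
import math


def inv_logit(x):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


increment = 1.08  # log odds ratio reported for taught words

# Hypothetical baseline logits, chosen only to illustrate the non-linearity.
for baseline in (0.0, 2.0):
    before = inv_logit(baseline)
    after = inv_logit(baseline + increment)
    print(f"baseline logit {baseline}: {before:.3f} -> {after:.3f} "
          f"(difference {after - before:.3f})")
```

Because the transformation is non-linear, differences expressed in probabilities always depend on the values at which the other predictors are held.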
2 Exploratory analysis of the influence of the targeted skills on receptive vocabulary
For the exploratory analysis, the vocabulary score is the percentage of recognized words minus the percentage of false alarms (Pellicer-Sánchez & Schmitt, 2012).
Considering our theoretical framework, we expect a correlation among the four specified skills. Significant correlations were observed, varying with the specific skills, test timing, and teaching methods under consideration (see supplemental material 2). To avoid multicollinearity, we ran individual models for each skill.
We fitted a series of linear models with receptive vocabulary as the dependent variable, and teaching methods, time of test, targeted phonemes and the skill of interest (i.e. discrimination, pronunciation, phoneme–grapheme correspondences, or retention in working memory) as independent variables. For each of the four skills, the model incorporated the score from the pre-test and one score from the post-test (preferably the delayed post-test). Teaching methods and targeted phonemes were coded as dummy variables (i.e. teaching method = n.group {Explicit = −.05; Communicative = .05}; phoneme = n.phone {/ʒ/ = −.05; /ɔ̃/ = .05}), in order to average the model on the two teaching methods and on the two phonemes and to centre the influence of these predictors around zero. More information is available in supplemental material 2.
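This correction for guessing can be written out as a short sketch. The function name and the item counts in the example are hypothetical; only the formula (percentage of recognized words minus percentage of false alarms) comes from the text.

```python
def corrected_vocabulary_score(n_recognized, n_words, n_false_alarms, n_nonwords):
    """Percentage of real words recognized minus percentage of non-words accepted."""
    pct_recognized = 100.0 * n_recognized / n_words
    pct_false_alarms = 100.0 * n_false_alarms / n_nonwords
    return pct_recognized - pct_false_alarms


# Hypothetical participant: 24 of 40 words recognized, 2 of 20 non-words accepted.
score = corrected_vocabulary_score(24, 40, 2, 20)
print(score)
```

A participant who says "yes" to every item would score 0 under this formula, which is how the subtraction penalizes guessing.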
a Influence of discrimination
Receptive vocabulary was the dependent variable, and discrimination, targeted phonemes, teaching methods and time of test were the independent variables. Residuals of the model have a normal distribution after removing two outliers. The fitted model shows that discrimination is a significant predictor of receptive vocabulary learning (p < .001). The better students could discriminate the targeted phonemes, the more they learnt the words that contain them (Figure 4). No significant differences were observed between teaching methods.

Predicted effects of discrimination on receptive vocabulary learning.
b Influence of pronunciation
Receptive vocabulary was the dependent variable, and comprehensibility of pronunciation, targeted phonemes, teaching methods and time of test were the independent variables. Residuals of the model have a normal distribution after removing three outliers. The fitted model shows that pronunciation is a significant predictor of receptive vocabulary learning (p < .05). The more students mastered the pronunciation of the targeted phonemes, the more they learnt the words that contain them (Figure 5). No significant differences were observed between teaching methods.

Predicted effects of pronunciation of the targeted phonemes on receptive vocabulary learning.
c Influence of phoneme–grapheme correspondences
Receptive vocabulary was the dependent variable, and the score in phoneme–grapheme correspondences, targeted phonemes, teaching methods and time of test were the independent variables. Residuals of the model have a normal distribution after removing one outlier. The fitted model shows that the mastery of phoneme–grapheme correspondences is a significant predictor of receptive vocabulary learning (p < .001). The more students mastered L2 phoneme–grapheme correspondences, the more they learnt L2 words (Figure 6). No significant differences were observed between teaching methods.

Predicted effects of the mastery of phoneme–grapheme correspondences on receptive vocabulary learning.
d Influence of retention in working memory
Receptive vocabulary was the dependent variable, and retention in working memory, targeted phonemes, teaching methods, and time of test were the independent variables. Residuals of the model have a normal distribution after removing one outlier. The fitted model shows that retention in working memory is a significant predictor of receptive vocabulary learning (p < .001). The more students could retain phonemes, the more they learnt L2 words (Figure 7). No significant differences were observed between teaching methods.

Predicted effects of phoneme retention in working memory on receptive vocabulary learning.
V Discussion
1 Findings for language learning
Overall, the results indicate that participants made significant progress in receptive vocabulary, with no major differences between the two teaching methods. Three months after instruction, both teaching methods yielded comparable progress in receptive vocabulary learning. The action-based communicative teaching approach included implicit instruction of L2 phonological forms with a focus on meaning, whereas the explicit teaching solely emphasized L2 phonological forms without incorporating meaning. The taught words were learnt, whereas the untaught words were not, which indicates that the observed increase in receptive vocabulary is a consequence of the teaching. This result suggests that teaching L2 phonological forms is crucial for receptive vocabulary learning (Janssen et al., 2015; Llompart & Reinisch, 2021; van de Ven et al., 2019), and confirms the importance of teaching the link between phonological and orthographic forms of a word (Hulstijn, 2013).
Previous studies have given contradictory results about the effects of teaching pronunciation alongside L2 orthographic forms on the learning of form–meaning connections. Some have concluded that orthographic forms help form–meaning mapping (e.g. Bürki et al., 2019), while others have shown no effects (e.g. Simon et al., 2010). The effect of co-instruction of orthographic and phonological forms on form–meaning connections may be moderated by mismatches in phoneme–grapheme correspondences between the learner’s languages (Escudero et al., 2014). In the present study, the simultaneous incorporation of pronunciation instruction with the teaching of orthographic forms fostered L2 word learning. These results suggest that an integrated teaching of vocabulary, designed to address the specific difficulties of words, whether orthographic or phonological, may lead to more complete and precise representations of a word. Our hypothesis is that the observed positive effect of instruction on written word recognition is the result of a synergistic development of the four taught skills.
Both teaching methods exhibited a positive effect, and the exploratory analysis shed light on how superior proficiency in the four targeted skills contributes to enhancing receptive vocabulary learning. The mastery of the four targeted skills (discrimination, retention in verbal working memory, pronunciation and command of grapheme–phoneme correspondences) necessitates the learning of precise and redundant phonological representations for optimal efficiency.
Our exploratory analysis suggests that each of the four targeted skills contributes to the learning of L2 vocabulary. Discrimination, which is the ability to discriminate the target phonemes from their counterparts in minimal pairs, is a significant predictor of the learning of novel words containing them (Figure 4). The better students can discriminate phonemes, the more they can learn the words that contain them. This result indicates that the teaching of phonemic discrimination should be integrated with vocabulary teaching (S. Zhang & Zhang, 2022).
The development of receptive vocabulary in L2 has been facilitated through the training of retention of L2 phonological forms in verbal working memory. The more the student can successfully repeat non-words, the more they can learn L2 novel words. Numerous studies have shown that retention in verbal working memory, as assessed through the repetition of non-words, reflects the capacity to learn new phonological forms, a crucial aspect for learning novel words (e.g. Gathercole, 2006).
The two methods of teaching pronunciation enhanced L2 receptive vocabulary learning, with no significant differences between them. In the explicit teaching group, the teaching of pronunciation was based on two explicit methods (i.e. articulatory and verbo-tonal) and occurred during the activities aimed at developing discrimination, phoneme–grapheme correspondences, and retention in working memory, in addition to the activities dedicated to pronunciation training. In the communicative teaching method, the teaching of pronunciation was based on conversational recasts, which were given to the learners during activities focused on meaning (e.g. role-play, reading aloud). Our hypothesis is that pronunciation instruction has played a role in refining L2 phonological representations.
More precise and reliable links between orthographic and phonological forms seem to help the mapping of forms onto meanings (e.g. Hayes-Harb & Masuda, 2008; Perfetti & Hart, 2002; van de Ven et al., 2019). In the present study, the exploratory analyses have shown that teaching L2 phoneme–grapheme correspondences alongside L2 phonological forms enhanced the learning of L2 receptive vocabulary. The better students master phoneme–grapheme correspondences, the more they can learn L2 novel words. Instruction of precise phonological forms, in perception and production, may have facilitated their link to orthographic forms, and in doing so increased the reliability of the links between the two forms.
The points raised by our exploratory analysis need to be confirmed in further studies. However, they seem consistent with results from other studies, which have shown the influence of orthography on pronunciation and perception, and their links to L2 word learning.
2 Pedagogical implications
Teaching the four skills (pronunciation, retention of L2 phonological forms in verbal working memory, discrimination, and mastery of grapheme–phoneme correspondences) alongside vocabulary may help to reduce lexical confusion. These four skills are almost never taught in L2 classrooms, and certainly not in an explicit way. The results of the present study suggest that building the skills that the intervention focused on enhanced receptive vocabulary learning in L2 low-level learners (A1–A2). Teaching L2 vocabulary can be more efficient when the phonological forms are taught and when their related skills are trained. Our intervention study shows that these four skills can be taught, either explicitly or implicitly, in an integrated fashion to young learners, and that this teaching contributes significantly to vocabulary learning.
The lack of significant differences between the two teaching methods shows that there are at least two (but certainly more) possible methods for the integrated teaching of vocabulary. The efficiency of the implicit method may be explained by the salience of the targets, which were reinforced in the materials and promoted by the designed activities and by limiting the number of competing cues. These improvements of the teaching materials can be prepared before classes and do not require any special training. Increasing the extrinsic salience of phonological forms (Boswijk & Coler, 2020) may be a good option for implicitly teaching L2 phonological forms. This way of focusing on forms may overcome teachers’ reluctance to teach phonological skills.
In the present study, the type of acquired knowledge, whether explicit or implicit, was not assessed. It remains a possibility that some participants in the communicative teaching group may have identified the targeted forms. Nevertheless, during the debriefing discussions conducted after the experiment in each class, no student in the communicative teaching group asserted that they had been taught about the targeted phones.
3 Limitations
This study has several limitations. First, as in all small-scale research, statistical power is limited. Empirically testing different teaching methods is time-consuming and complicated in naturalistic school settings, as the number of enrolled students and the number of tests are limited by external constraints. Second, only two phonemes were explored, and it would be pertinent to test similar teaching sequences with other targets. Third, the experimenter taught all the lessons herself, and the same sequences taught by another teacher may lead to different results. Fourth, even if yes/no tests have been shown to be highly correlated with form–meaning connection tests (as reported by Chenu & Jisa, 2009), we did not verify whether the meaning attributed to an orthographic form by our participants was accurate. The assessment of word meaning knowledge was not included because word meaning was not trained in the explicit group.
VI Conclusions
The results of this study underscore the significance of incorporating L2 phonological forms into vocabulary teaching to enhance L2 vocabulary learning by young L2 learners at school. The research indicates that, regardless of whether the teaching method is explicit or implicit, the effectiveness of teaching L2 receptive vocabulary is optimized when the instruction of L2 phonological forms and the development of associated skills are integrated comprehensively in the teaching process.
Supplemental Material
Supplemental material for Teaching methods emphasizing phonological forms enhance L2 vocabulary learning by Nathalie Dherbey Chapuis and Raphaël Berthele in Language Teaching Research:
sj-docx-1-ltr-10.1177_13621688241270803
sj-docx-2-ltr-10.1177_13621688241270803
sj-docx-3-ltr-10.1177_13621688241270803
sj-docx-4-ltr-10.1177_13621688241270803
sj-docx-5-ltr-10.1177_13621688241270803
sj-docx-6-ltr-10.1177_13621688241270803
sj-docx-7-ltr-10.1177_13621688241270803
Footnotes
Acknowledgements
We thank the team of teachers at Steffisburg, and in particular Mrs Gfeller, for their active support and participation in this study. We also sincerely thank the reviewers for their insightful comments and valuable suggestions, which have significantly improved the quality of our manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has received financial support from Swiss Universities, the Haute Ecole Pédagogique de Fribourg and the University of Fribourg, Switzerland.