Abstract
We investigated the effect on oral proficiency of two teaching methods, each based on a different theoretical view of how languages are learned, after three years of L2 French instruction. The first method is commonly used in the Netherlands and is in line with structure-based (SB) principles, viewing language as a set of grammar rules that need to be explained, usually in the L1, to achieve accuracy. The second method aligns with dynamic-usage-based (DUB) principles in that language is viewed as a set of conventionalized routines that are learned through frequent exposure, and the L2 is spoken exclusively in class. In a large study (Rousse-Malpat et al., 2019), the DUB method proved more effective, but the effects of method and L2 exposure could not be separated as the amount of L2 exposure is a crucial difference between the methods. However, one SB teacher spoke French almost exclusively, comparable to what happens in a DUB classroom. In this study, we compared this SB group with a DUB group matched in scholastic aptitude. The free oral L2 French production of 41 Dutch participants was measured in terms of holistic and analytical scores. The DUB method was more effective in terms of general proficiency, fluency, grammatical complexity, accuracy of the present tense, and overall L2 use. Our findings suggest that a teaching method in line with DUB principles is more beneficial in achieving overall oral proficiency and that explicit grammar is not needed to achieve accuracy.
I Introduction
If the past 50 years of research in L2 instruction have taught us anything, it is that the modern language classroom should be a place where learners are surrounded by meaningful L2 input and engage in motivating communicative activities in order to enhance their general proficiency level (Dörnyei, 2002; Krashen, 1981; Lantolf et al., 2015; Verspoor, 2017). However, in the Netherlands, as in Austria and France, most teachers still use methods that are based on structure-based (SB) views, where a focus on grammatical practice to achieve grammatical accuracy is viewed as important (Graus & Coppen, 2018; Schurz & Coumel, 2020; West & Verspoor, 2016; see also Lightbown & Spada, 2013). This is not so surprising, as much L2 instructional research also points to the effectiveness of some explicit focus on form (Goo et al., 2015; Norris & Ortega, 2000; Spada & Tomita, 2010). However, effectiveness is usually measured in short-term experiments and usually only in terms of achieving grammatical accuracy in writing, not in terms of overall proficiency or oral proficiency.
In the Netherlands, teachers are free to choose their methods as long as students meet CEFR (Common European Framework of Reference) related learning outcomes in their final year, but most teachers use one of the standard textbooks available. According to Haijma (2013), these standard foreign language methods to teach German, French and, to a certain extent, English fail to achieve a satisfying level of general L2 proficiency, particularly oral proficiency. However, one small group of teachers, dissatisfied especially with the oral proficiency outcomes, introduced the Accelerative Integrated Method (AIM) developed by Maxwell (2001) for teaching French to young children in Canada. It is based on story scripts, and the target language is spoken exclusively. Because the method involves a great deal of L2 exposure and frequent repetition of meaningful utterances, the method is very much in line with dynamic-usage-based (DUB) principles (Verspoor, 2017), a combination of complex dynamic systems theory (CDST) and usage-based (UB) theories on language and language acquisition. Teaching methods in line with these principles do not see language as a set of rules but as conventionalized routines, where non-linear learning emerges from the dynamic interaction between input and output.
In a previous study (Rousse-Malpat et al., 2019) with 229 learners, the DUB-inspired method (AIM) was found to be more effective after three years of instruction than the standard SB method in both writing and speaking. However, it was not possible to tease apart the effects of exposure and method because on the whole SB teachers, with one clear exception, used the L1 much more than the DUB teachers. One SB teacher spoke a great deal of French in class, and we will compare the oral proficiency outcomes of her students with those of a DUB group that was similar in scholastic aptitude after three years of instruction. The aim of the present study is to see if the DUB method is more effective than the SB method when L2 exposure is more tightly controlled.
II Structure-based versus dynamic-usage-based principles
In their review of the most commonly known foreign language teaching approaches in the world, Lightbown and Spada (2013) focused on differences in how much attention is paid to grammar, meaningful input, and meaningful interaction. They conclude that a communicative approach has the best chance of being effective in language teaching when language is used meaningfully, is taught with a large amount of input – preferably as authentic as possible – and some attention to grammar is given. However, despite the evidence of the efficacy of such approaches, Lightbown and Spada (2013) point out that the use of true communicative approaches remains rare in the foreign language classroom, while the use of structure-based teaching methods – those that focus on grammatical issues – remains widespread. This is very much true in the Netherlands, as a classroom observation study by West and Verspoor (2016) showed. We will use the term structure-based slightly differently in that it involves not only how language is taught, but also how language is viewed implicitly or explicitly: as a system of predictable morphological or syntactic rules.
Many studies have been conducted to investigate the effects of methods on second language proficiency. Long (2000) made a distinction between focus-on-forms approaches and focus-on-form approaches, where grammar instruction is provided within a meaningful communicative setting, and claimed that focus-on-form instruction is more effective than focus-on-forms. Several meta-analyses and studies support this claim and find a beneficial effect for focus-on-form instruction (Doughty, 2003; Norris & Ortega, 2000; Spada & Tomita, 2010). It should be noted, though, that studies of the effects of different types of instruction almost always focus on the mastery of grammatical forms only and thus reveal an implicit structure-based view of language. It is true that evidence in general points to a beneficial effect of grammar instruction on the acquisition of grammatical rules, but much of the evidence is based on short-term interventions, mainly involving written products, that test mainly for accuracy on the grammatical rules (Goo et al., 2015; Norris & Ortega, 2000; Spada & Tomita, 2010). Very few studies, if any, are long-term and have looked at overall proficiency gains, especially in oral production. Finally, as Doughty (2003) points out, in the meta-analysis by Norris and Ortega (2000) many issues may have been confounded, and as Andringa and Schultz (2016) have argued, Spada and Tomita (2010) did not control adequately for the amount of exposure.
In Foreign Language (FL) teaching, where the target language is not the ambient language and students have very little extra-mural exposure, structure-based approaches are still the norm, at least in the Netherlands. The standard FL textbooks used in the Netherlands and Finland, whether they be labeled communicative or task-based, are often implemented in a structure-based way (for German textbooks, see Maijala & Tammenga-Helmantel, 2019) and teachers implicitly or explicitly build on the premise that grammar forms the core of the language to be learnt. Standard L2 French textbooks in Europe also contain relatively little authentic language and are especially lacking in lexical bundles or formulaic sequences (for French textbooks, see Vandeweerd & Keijzer, 2018). Usually, chapters include a few pages on grammatical forms, presented from simple (such as the simple present tense) to complex (such as a subjunctive form in French); other aspects of language such as vocabulary, formulaic phrases, pronunciation, intonation or pragmatic use are learned separately. Unfortunately, most Dutch teachers still prefer to spend a relatively large amount of class time explaining the grammar sections in the L1, limiting the amount of L2 exposure and interaction, so their teaching is basically in line with what Long called a focus-on-forms approach (West & Verspoor, 2016) with very limited L2 exposure. (Note that the teacher in the current study, however, was an exception and spoke much French to her students.)
Communicative or content-based approaches are in principle based on the idea that meaningful exposure and interaction are key to successful L2 learning, but as Long (1996) points out, they are not necessarily based on any particular linguistic theory, which leaves room for misinterpretation or reverting to traditional structure-based methods. Therefore, even though some methods may be intended as communicative, they are implicitly based on structure-based views, as the acquisition of correct grammar and morphology is still considered paramount to avoid fossilization. Other communicative methods such as the comprehension approach, the lexical approach, the Accelerative Integrated Method (AIM), total physical response and storytelling (TPRS), and CLIL (content and language integrated learning) are not explicitly related to usage-based linguistic theory, but their practices are very much in line with current DUB views.
A usage-based perspective on L2 learning would predict that language emerges from repeated exposure to meaningful input and language use (Langacker, 2000; Tomasello, 2003). Linguistic constructions (pairings of form and meaning within a pragmatic context, which we call use) are learned through association as they are ‘heard and used frequently and therefore entrenched, which is the result of habit formation, routinization or automatization’ (Verspoor & Schmitt, 2013, p. 354). The term dynamic-usage-based (DUB) is inspired by the title of one of Langacker’s articles (2000) in which he argues that a usage-based view is by definition a complex dynamic system theory view. However, in our own use of the term, we would also like to accentuate the fact that language development is by definition non-linear in that some sub-systems need to be learned before others and that variability in the use of structures (which includes making errors) is needed to progress (Verspoor, 2017).
The key difference between an SB and a DUB view is that a DUB view assumes no priority for grammar or syntax in language. Language is not driven by rules. Instead, language forms – from concrete morphemes, words, phrases, clauses, sentences, and discourse sequences to abstract lexical categories and morphological and syntactic patterns – are all fundamentally similar as they all bear meaning to different degrees and form a continuum (Goldberg, 2006; Verspoor, 2017).
A teaching setting in a DUB perspective would entail that learners are exposed frequently to examples of authentic language from the target language community and that learners discover the form–use–meaning mappings (FUMMs) at all levels from pronunciation, to morphology, words, short phrases, formulaic sequences and syntax implicitly or inductively (Verspoor, 2017). Language features are never presented in isolation but in meaningful utterances; speaking, listening, reading and writing are presented as integrated skills. The target language is learned in interactive communicative activities. Negotiation for meaning (Long, 1996) is encouraged and language is used meaningfully with a positive attitude of the teacher towards unpredictability, risk-taking and choice-making (Verspoor, 2017).
DUB foreign language instruction – in line with true communicative methods – should thus provide learners with meaningful L2 input made comprehensible with visuals, gestures, paraphrases, and when needed translations. Learners may be asked for output, but early on should focus mainly on meaningful repetitions and guided interactions. Grammar is always a by-product of the process of language learning and not the primary focus. In fact, this is the original idea behind Long’s notion of focus on form (2000) as L2 learning occurs during a meaning-based activity where attention to form (not only in the morphological sense) can be drawn inductively when the learner needs it. It also aligns with the definition of meaning-focused instruction (Norris & Ortega, 2003), which provides exposure to rich input and meaningful use of the target language in a context where grammatical acquisition is not the main goal but the incidental by-product of language acquisition. In a way, this approach also reminds us very much of input-processing (VanPatten & Cadierno, 1993), except for the fact that the goal is not to acquire only the grammatical morphemes, but also the lexical items and the words associated with them. It also reminds us of audio-lingual methods, in which new habits are built through repetition, but the main difference is that the constructions taught are meaningful and geared towards lexical associations, rather than morphological patterns.
To summarize, the key differences in the two theoretical views of language – structure-based versus dynamic-usage-based – are not necessarily aligned with the distinction between structure-based and communicative approaches, or between focus on form(s) or focus on meaning approaches as the main difference is concerned with the (implicit) way language itself is viewed. An SB view considers language structure as primary: the rules drive the system and need to be acquired. A DUB view considers form–use–meaning mappings (FUMMs) as primary: language is a large array of constructions that need to be learned one by one. The regular patterns can be discovered without too much attention. The main ideas and implications for language teaching are summarized in Table 1.
Summary of structure-based (SB) and dynamic-usage-based (DUB) views of language and implications for instruction.
III Previous studies
There are probably many examples of effective teaching approaches that are in line with DUB principles in that they focus on meaningful exposure, such as the comprehension approach, total physical response and storytelling (TPRS), the movie method (Hong, 2013; Irshad et al., 2019; Koster, 2015) and AIM (Maxwell, 2001). However, as far as we know, there is very little empirical evidence, especially from long-term studies. Thus far only a few studies have systematically compared an SB approach to a DUB approach over a longer period of time. Those that have done so have found the DUB methods to be more effective than traditional semi-communicative L2 teaching (Rousse-Malpat et al., 2019; Verspoor & Hong, 2013) on general proficiency skills. For the AIM method developed in Canada, Maxwell (2001) and Michels (2008) investigated oral fluency and reported that the AIM learners were better than the non-AIM learners. However, these two were very small-scale qualitative studies. Studies with more participants and statistical analyses did not seem to reveal any significant differences between AIM and non-AIM (Mady et al., 2009). However, qualitative findings showed that AIM teachers spoke more French in the classroom and that AIM learners reported feeling more at ease in listening and speaking. Bourdages and Vignola (2009) looked at the oral communication skills of two groups of third grade learners in Canada (AIM vs. non-AIM) by means of interviews. They found some qualitative differences between the AIM and non-AIM learners in that AIM learners code-switched less with English than the other learners and seemed more willing to communicate in French. AIM learners also produced more incomplete sentences, suggesting that they dared to take more risks but either could not maintain sentence-level speech or were more similar to fluent target language speakers, who do not necessarily use full sentences (Salaberry & Kunitz, 2019).
They concluded, however, that there were no significant differences between the groups with regard to proficiency and grammatical accuracy. But as Cummins (2014) pointed out later, their general conclusions were based on traditional morphological accuracy measures only and failed to take communicative fluency into account, which was clearly different as was evident from one of the tables: Specifically, the AIM students produced 1,751 utterances compared to 811 for the non-AIM students – more than twice as much. The AIM students also produced 1,662 utterances completely in French (95%) compared to 306 for the non-AIM group (38%). (p. 3)
In the Netherlands, where extra-mural French is almost non-existent, a few studies have looked at the effects of AIM on oral proficiency using free-production data. Rousse-Malpat et al. (2019) presented evidence that AIM was more effective on oral skills after one year and two years of instruction compared to a communicative SB method. As far as accuracy for specific grammatical constructions such as the present tense and negation was concerned, there were no differences. For grammatical gender, however, the SB group was more accurate after the first year, but this difference disappeared after the second year, suggesting that the DUB learners took longer to acquire these constructions. Both groups made the same number of errors on all constructions, both together and separately. However, there were some qualitative differences. The SB group used the same prefabricated chunks learned in class for negation (je n’aime pas, je ne comprends pas and je ne sais pas) whereas the AIM students showed more creative variation in their expressions of negation with different verbs and subjects (je n’ai pas d’amis, il ne vait pas avec on, il ne pas gentil), though not always accurately. The same difference was observed for the present tense. Another qualitative difference was the use of Dutch during the oral interviews. SB students tended to fall back on Dutch when they did not know a word in French or when they wanted to indicate that they did not understand. The AIM students kept speaking French. The same differences were found for writing skills (Rousse-Malpat et al., 2012). This qualitative difference is in line with Bourdages and Vignola’s (2009) findings and Cummins’ (2014) reinterpretation of their results.
In the current article, we again compare the speaking proficiency of two groups of students after three years of instruction, but this time controlling for L2 exposure to a greater extent. Based on findings in the previous studies, we hypothesize the following:
The DUB students will score better than the SB students on general oral proficiency as they will have had more practice.
The DUB students will score better than the SB students on analytical complexity and fluency measures as they have access to whole phrases and chunks of language.
The DUB students will score the same or better than the SB students on accuracy measures as they have had time to discover the patterns.
IV Method
1 Participants
The current study included 41 participants similar in gender, age, and scholastic aptitude and controlled for L1 Dutch, amount of extramural or informal exposure to the L2 outside of the classroom, and beginning level of French. The SB instructional group consisted of 21 students (10 male, 11 female) with a mean age of 14.29 (range 13 to 15). The DUB instructional group consisted of 20 students (9 male, 11 female) with a mean age of 14.45 (range 14 to 15). The SB group had two different teachers over the course of three years. The DUB group kept the same teacher throughout the study.
All pupils had a high scholastic aptitude level as measured by the Cito test (a general scholastic aptitude test taken in the last year of primary education in the Netherlands), which was shown by Verspoor, de Bot, and Xu (2015) to be a strong predictor of L2 development of English at the Dutch high school level. An independent-samples t-test showed that the SB group (M = 547.35, SD = 2.70) and the DUB group (M = 547.95, SD = 2.54) did not differ with regard to scholastic aptitude, t(38) = −0.72, p = 0.474.
2 Teaching methods
Grandes lignes (Noordhoff Uitgevers, 2014) is the best-selling textbook for high-school French in the Netherlands. It is supposed to be a communicative task-based method composed of a textbook and an exercise book, but each chapter contains a section on a grammatical topic. The chapters are organized around topics such as family, school, sports and holidays. They start with a reading text containing target grammatical rules and vocabulary followed by questions about the meaning of the text. Our classroom observations showed that the teacher in the current study spoke mostly French, even to give instructions and explain rules. Vocabulary in the book is given in the form of a word list or a chunk list called phrases clés (‘key sentences’) with their translation into Dutch. These sentences are often literally the expected response to most of the productive exercises. The interaction between the teacher and the learners was rather spontaneous in the L2.
The DUB method in our study is operationalized as the Accelerative Integrated Method (AIM) (Maxwell, 2001). Teachers and learners use only the target language in the classroom from day one, both in speaking and writing. Teachers use stories to provide learners with meaningful utterances in meaningful contexts. Each story provides the learners with lots of exemplars of similar constructions as the first few lines of the very first story in the method illustrate: Il joue de la guitare et Il travaille un peu et il aime la musique. [He works a little and he likes music.] Il danse et chante et
Learners first listen to the story and repeat the story lines chunked by the teacher, imitating the teacher’s words and gestures. (For an example of gesture use, see https://vimeo.com/58676240.) In other words, especially early on, a great deal of repetition and imitation is built into the method. Each lesson (usually 50 minutes) consists of several blocks of 10 minutes. The primary focus is the playful revisiting of chunks to understand the meaning of the story. An implicit focus on form also occurs, albeit inductively, as the method is built to create associations between form, use and meaning. For example, teachers use gestures and songs to draw attention to some forms of the language (e.g. grammatical gender or present/future tense) in the meaningful context of the story.
3 L2 exposure in class
The teachers were asked to self-evaluate the amount of L2 exposure that the learners received and we confirmed their score by timing the actual amount of L2 exposure during the lessons that we observed (one per teacher). Then, we estimated the number of hours of L2 exposure by multiplying the total number of hours of French instruction with the percentage of L2 exposure learners received on average in the classroom (see Table 2). Although this score is a rough estimate, we believe the groups were reasonably comparable in terms of L2 exposure.
Overview of total hours of instruction and second language (L2) exposure after three years.
Notes. SB = structure-based. DUB = dynamic-usage-based. AIM = Accelerative Integrated Method.
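The exposure estimate described above is a simple product of instruction hours and the observed share of class time spent in the L2. A minimal sketch of that arithmetic follows; the function name and the example figures are hypothetical illustrations and are not the values from Table 2.

```python
def estimated_l2_hours(total_instruction_hours: float, l2_share: float) -> float:
    """Estimate L2 exposure as total instruction hours times the L2 share.

    l2_share is the proportion (0 to 1) of class time spent in the L2,
    e.g. a teacher self-report checked against a timed observation.
    """
    if total_instruction_hours < 0:
        raise ValueError("total_instruction_hours must be non-negative")
    if not 0.0 <= l2_share <= 1.0:
        raise ValueError("l2_share must be a proportion between 0 and 1")
    return total_instruction_hours * l2_share


# Hypothetical figures for illustration only (not taken from Table 2).
example_hours = estimated_l2_hours(total_instruction_hours=300.0, l2_share=0.8)
print(f"Estimated L2 exposure: {example_hours:.1f} hours")
```

Because the share comes from a single observed lesson per teacher, the result is best read as a rough order-of-magnitude figure, in line with the caveat in the text.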
4 The oral test
In this study, a standardized, validated oral proficiency test called the Student Oral Proficiency Assessment (SOPA) developed by the Center for Applied Linguistics (Thompson et al., 2002) was used (http://www.cal.org/ela/sopaellopa). The rating rubric was developed based on the American Council on the Teaching of Foreign Languages (ACTFL) Proficiency Guidelines and the scores refer to ACTFL Proficiency Levels.
The SOPA can be used independently of the mode of language instruction. The interviewer asks questions based on a script (see extra materials) with a positive, complimentary attitude and the rater (a trained research assistant) takes notes. To help learners feel at ease, the interview was conducted in pairs, selected by the teacher to make sure the pupils got along and were willing to cooperate.
The test consists of three tasks that increase in their demands for productive language, and it continues until the interviewer senses that the task or the question exceeds the current proficiency level of the learner. The first task, called ‘Fruits and vegetables’, starts by asking learners to point at fruits or colored blocks, then to move them according to commands, and finally to identify and name things using single words. The second task is called ‘All about you’ and is a semi-structured interview about topics that are familiar to the learners (e.g. school, holidays, sports). The last task is called ‘The farm’ and is meant to create an environment where more complex commands can be given and participants can tell a story (see Appendix 1 for the detailed script).
The testing sessions were video-taped so scores could be double-checked and transcribed for further quantitative analysis. Each interview lasted 15 minutes on average ranging from 13 to 20 minutes.
5 Holistic measures
Immediately after the testing session, the interviewer and the rater agreed on a score for the learners on the SOPA rubrics on fluency, grammar, vocabulary and oral comprehension.
6 Analytic measures
The interviews were transcribed and coded. One data file per participant was created and words prompted by the interviewer were removed. The samples were analysed on the same constructs as the holistic SOPA rating: fluency, grammar and vocabulary.
Fluency was operationalized as speech rate and the use of filled pauses. Speech rate was calculated by dividing the total number of words produced in a sample by the total amount of time, expressed in seconds, required for a particular activity or the whole interview (including pause time). Speech rate was based on the total duration of the interview because the exact speaking time of each student could not be calculated, as they were interviewed in pairs. Studies that used computer technology and included large samples affirm that speech rate and mean length of runs are strongly related to holistic ratings of fluency (Ginther et al., 2010; Kormos & Dénes, 2004). Fulcher (2015) concludes that speech rate seems the most successful temporal measure in predicting raters’ perceptions of fluency, with correlations ranging between 0.30 and 0.89 across studies. The filled-pause ratio was calculated by dividing the total number of filled pauses by the total number of French words. The research findings concerning frequency of filled pauses are ambiguous. Some researchers found that frequency of filled pauses distinguished between fluent and non-fluent speakers (Lennon, 1990; Riggenbach, 1991), but these studies were based on a small number of participants. The conclusions from research projects with a higher number of participants were that filled pauses did not correlate with fluency (Kormos & Dénes, 2004; Rekhart & Dunkel, 1992; van Gelderen, 1994). Although speech rate and filled pauses have been found to be influenced by patterns in the L1 (Peltonen, 2018), we assumed that in these beginners, speech rate would be affected by L2 word finding problems as L1 and scholastic aptitude were controlled for.
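The two fluency measures defined above reduce to simple ratios. The sketch below illustrates them under stated assumptions: the function names and the example counts are hypothetical, and the duration is taken to be the total interview (or task) time in seconds, including pauses.

```python
def speech_rate(total_words: int, duration_seconds: float) -> float:
    """Words produced per second of total task or interview time (pauses included)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return total_words / duration_seconds


def filled_pause_ratio(filled_pauses: int, french_words: int) -> float:
    """Number of filled pauses divided by the number of French words produced."""
    if french_words <= 0:
        raise ValueError("french_words must be positive")
    return filled_pauses / french_words


# Hypothetical learner: 250 French words and 40 filled pauses in a 15-minute interview.
rate = speech_rate(total_words=250, duration_seconds=15 * 60)
pauses = filled_pause_ratio(filled_pauses=40, french_words=250)
print(f"speech rate: {rate:.3f} words/s, filled-pause ratio: {pauses:.2f}")
```

Since learners were interviewed in pairs, a rate computed this way reflects each learner's output over the shared interview time rather than over individual speaking time.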
Grammar was operationalized as sentence type diversity (STD) and accuracy. STD was the number of complete sentences (with at least a subject and a verb regardless of how accurate they were) categorized into four sentence types as defined in Verspoor and Sauter (2000): simple, compound, complex and compound-complex. Compound sentences were included because Bardovi-Harlig (1992) points out that the amount of coordination seems to be a more sensitive measure than subordination measures at initial levels of L2 development. For each sentence type, the ratio of the number of occurrences of the sentence type against the total number of complete sentences was calculated. Because these categories are mutually exclusive and therefore complementary, the absolute number of occurrences of each sentence type was converted to percentages.
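As an illustration of the sentence type diversity measure, the sketch below converts a list of coded sentences into percentages per type. The label strings, the function name and the example coding are assumptions made for this example; the coding itself (deciding which type a sentence belongs to) is done by hand and is not automated here.

```python
from collections import Counter

SENTENCE_TYPES = ("simple", "compound", "complex", "compound-complex")


def sentence_type_percentages(labels):
    """Return the percentage of complete sentences that fall into each type."""
    unknown = set(labels) - set(SENTENCE_TYPES)
    if unknown:
        raise ValueError(f"unexpected sentence type(s): {sorted(unknown)}")
    total = len(labels)
    if total == 0:
        return {sentence_type: 0.0 for sentence_type in SENTENCE_TYPES}
    counts = Counter(labels)
    return {
        sentence_type: 100.0 * counts[sentence_type] / total
        for sentence_type in SENTENCE_TYPES
    }


# Hypothetical coding of one learner's 20 complete sentences.
coded = ["simple"] * 15 + ["compound"] * 3 + ["complex"] * 2
shares = sentence_type_percentages(coded)
non_simple = 100.0 - shares["simple"]
print(shares, f"non-simple: {non_simple:.1f}%")
```

Because the four categories are mutually exclusive, the percentages sum to 100, so the share of non-simple sentences can be read off as the complement of the simple-sentence share.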
Grammatical accuracy was operationalized as the correct use of three specific grammatical features that have been dealt with in both classes: gender, negation and tense. Accuracy was calculated as the ratio of the correct use of the grammatical construction against the total number of occurrences of that construction. French has two genders, feminine (articles: la/une) and masculine (le/l’/un). In Dutch, nouns can be masculine, feminine or neuter. Dutch beginners must first learn the specific gender of each word in French even though studies on the predictability in French gender attribution and the acquisition of grammatical gender argue that a noun’s ending and its grammatical gender are related (Lyster, 2004, 2006; Lyster & Izquierdo, 2009).
Negation in French has two elements: ne and pas, which are placed before and after the conjugated verb, respectively, as in Je ne sais pas (‘I don’t know’). The ne can be left out in spoken French, but the pas must always occur after the verb. For Dutch learners, this construction is particularly difficult because in Dutch there is only one negator (niet), which occurs after the verb. Dutch learners often leave off the pas or put the negative elements in the wrong place.
For tense, the use of present tense was investigated. It is the first tense that learners hear in the AIM method and it is explicitly taught in the SB method. The past tense (imparfait and passé composé), the near future (futur proche), and the conditional (conditionnel) also occurred in the data, but in very low numbers.
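The accuracy measure for gender, negation and present tense is the same ratio applied to each construction. A minimal sketch follows; the function name and the counts are hypothetical, and returning `None` when a construction never occurs is a design choice of this example, not something stated in the study.

```python
from typing import Optional


def accuracy_ratio(correct: int, total: int) -> Optional[float]:
    """Correct uses of a construction divided by all its occurrences.

    Returns None when the construction does not occur, so that an absent
    construction is not mistaken for 0% accuracy.
    """
    if correct < 0 or total < 0 or correct > total:
        raise ValueError("need 0 <= correct <= total")
    if total == 0:
        return None
    return correct / total


# Hypothetical counts (correct, total) for one learner.
counts = {"gender": (18, 24), "negation": (5, 6), "present tense": (27, 30)}
ratios = {name: accuracy_ratio(c, t) for name, (c, t) in counts.items()}
print(ratios)
```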
For lexical diversity, Guiraud’s index was used. We also looked at lexical access, operationalized by the total number of French tokens and the number of non-French tokens. Lexical access shows a certain degree of automaticity and entrenchment and is likely to vary according to the level of proficiency (Segalowitz, 2010).
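Guiraud's index is conventionally computed as the number of word types divided by the square root of the number of tokens, which dampens the text-length sensitivity of a plain type-token ratio. The sketch below assumes whitespace-tokenized, case-insensitive word forms; how the study itself tokenized or lemmatized the transcripts is not specified, so those choices are assumptions of this example.

```python
import math


def guiraud_index(tokens):
    """Guiraud's index: distinct word types divided by the square root of tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    normalized = [token.lower() for token in tokens]  # assumption: case-insensitive types
    return len(set(normalized)) / math.sqrt(len(normalized))


# Toy sample: 8 tokens, 5 types (je, ne, sais, pas, comprends).
sample = "je ne sais pas je ne comprends pas".split()
g = guiraud_index(sample)
print(f"Guiraud's index: {g:.2f}")
```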
7 Statistical analyses
To analyse the relationship between our measures, two different kinds of correlations were used: a two-tailed Pearson’s r correlation when the data was normally distributed and a two-tailed Spearman’s rho correlation when the data was not normally distributed (α = .05). The strength of the r values was interpreted in line with Plonsky and Oswald (2014): small = .25, medium = .40, large = .60.
Independent-samples t-tests were used as the data was normally distributed (α = .05). Effect sizes were calculated as Cohen’s d and interpreted using the benchmarks suggested by Cohen (1988): small (d = 0.2; η² = 0.01), medium (d = 0.5; η² = 0.06), and large (d = 0.8; η² = 0.14).
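For readers who want to check an effect size from reported summary statistics, Cohen's d for two independent groups can be sketched as the mean difference divided by the pooled standard deviation. The pooled-SD formula is one common variant and is an assumption here, since the text does not state which formula was used; the function names are illustrative. The example uses the summary statistics reported in the Results for non-simple sentences and assumes the full group sizes of 20 (DUB) and 21 (SB).

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d: difference in means expressed in pooled-SD units."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Proportion of non-simple sentences: DUB (M = 0.25, SD = 0.16), SB (M = 0.07, SD = 0.07).
# Group sizes of 20 and 21 are assumed from the participant description.
d = cohens_d(0.25, 0.16, 20, 0.07, 0.07, 21)
print(f"Cohen's d = {d:.2f}")
```

Under these assumptions the result is about 1.47, which agrees with the value reported for that comparison; small discrepancies for other comparisons would be expected because the published means and standard deviations are rounded.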
V Results
1 Holistic scores
Participants received a score from 1 to 9 for fluency, vocabulary, grammar and comprehension. To evaluate whether the variables were measuring the same construct a correlation analysis was performed. Because the variables were not normally distributed, a two-tailed Spearman’s Rho correlation (α = .05) was performed. The four variables were highly and significantly correlated with each other (see Table 3). Thus, they were averaged into one variable representing general oral proficiency.
Correlations between variables.
Note. **significant at p < .01.
A Kolmogorov–Smirnov test and a Levene’s test revealed that the scores for general proficiency were normally distributed and that variances were equal across groups. Therefore, the DUB group and the SB group could be compared with an independent-samples t-test (α = .05) (see Figure 1). As Figure 1 shows, the SB group scored lower (M = 3.24, SD = 0.71) on general oral proficiency than the DUB group (M = 4.29, SD = 0.91). The t-test indicated that this difference was significant (t(39) = −3.72, p = .001) with a large effect size (Cohen’s d = 1.16).

Figure 1. Boxplot comparing the structure-based (SB) and dynamic-usage-based (DUB) groups on general oral proficiency.
2 Analytical scores
For fluency the speech rate and filled pauses were analysed. The speech rate of the DUB group was significantly higher than the speech rate of the SB group in each separate task and all tasks combined, indicating that the DUB group was more fluent (see Table 4).
Table 4. Results of the fluency analysis per task.
Besides the speech rate shown in Table 4, a ratio of filled pauses was calculated. There was no significant difference between the DUB group (M = 0.16, SD = 0.07) and the SB group (M = 0.18, SD = 0.06) in the ratio of filled pauses, t(39) = 0.99, p = .333.
For grammar, sentence type diversity (STD) and accuracy were analysed. For STD, the results show that both groups used mostly simple sentences, but the DUB group in particular used other types of sentences as well (see Table 5). An independent-samples t-test revealed that the DUB group (M = 0.25, SD = 0.16) used significantly more non-simple sentence constructions than the SB group (M = 0.07, SD = 0.07), t(24.77) = −4.58, p < .001, with a large effect (Cohen’s d = 1.47).
Table 5. Number of occurrences according to sentence type (percentages in parentheses).
For grammatical accuracy, independent-samples t-tests showed a significant difference between the groups only in the correct use of the present tense, with the DUB group (M = 0.96, SD = 0.03) outperforming the SB group (M = 0.90, SD = 0.09), t(23.96) = −2.99, p = .006. The effect size of this difference was large (d = 0.92). Gender and negation were used equally accurately by both groups (see Figure 2).
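The fractional degrees of freedom reported for sentence type diversity and present-tense accuracy are what one would expect from Welch’s unequal-variance t-test; this is our inference, as the text does not name the correction. The sketch below computes Welch’s t and the Welch–Satterthwaite degrees of freedom from summary statistics. The function name is hypothetical, and the group sizes of 21 (SB) and 20 (DUB) are an assumption inferred from the token totals and group means reported in the vocabulary results.

```python
import math


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for Welch's unequal-variance t-test from summary statistics."""
    var_term_a = sd_a ** 2 / n_a
    var_term_b = sd_b ** 2 / n_b
    t_value = (mean_a - mean_b) / math.sqrt(var_term_a + var_term_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_term_a + var_term_b) ** 2 / (
        var_term_a ** 2 / (n_a - 1) + var_term_b ** 2 / (n_b - 1)
    )
    return t_value, df


# Reported STD means and standard deviations (SB first, DUB second).
# Assumption: n = 21 for SB and n = 20 for DUB.
t_value, df = welch_t_from_summary(0.07, 0.07, 21, 0.25, 0.16, 20)
print(f"t({df:.2f}) = {t_value:.2f}")
```

Because the inputs are rounded to two decimals, the output should be close to, but not identical with, the reported t(24.77) = −4.58.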

Figure 2. Ratio of correct occurrences of gender, negation and present tense.
For vocabulary, Guiraud’s index and L2 lexical access were analysed. Lexical diversity as measured with Guiraud’s index was not significantly different (t(39) = −1.35, p = .186) between the DUB group (M = 6.73, SD = 0.50) and the SB group (M = 6.51, SD = 0.58). A word count analysis of all interviews showed that the DUB group used a total of 5,035 French tokens (M = 251.75, SD = 88.01) and the SB group a total of 3,951 French tokens (M = 188.14, SD = 50.41). An independent-samples t-test showed that this difference was significant, t(39) = −2.85, p = .007.
We also counted to what extent the students used languages other than French and in which situations these languages were used (Table 6). Participants mainly fell back on Dutch and English during the interview when they could not use French. They usually used single words when they lacked vocabulary – e.g. ‘party’, ‘homework’, spinazie (‘spinach’) and wandelen (‘to walk’) – or code-switched between French and Dutch in one sentence; e.g. c’est le mama de petite vogel (‘It’s the mother of the small bird’). Sometimes they also used entire Dutch sentences; e.g. Ik weet het niet (‘I don’t know’).
Table 6. Absolute number of words in a language other than French during the interviews.
3 Holistic versus analytical scores
To find out which analytical measures contributed to our perception of general proficiency, a correlation analysis was conducted. Because our variables were normally distributed, a two-tailed Pearson’s r correlation analysis (α = .05) was performed to assess the relationship between the holistic and the analytical scores.
General oral proficiency correlated significantly with almost all the analytical measures. Strong positive correlations were found between general oral proficiency and speech rate (r = .829, p < .001), general oral proficiency and STD (r = .720, p < .001) and general oral proficiency and the number of French words (r = .822, p < .001). A moderate positive correlation was found between general oral proficiency and lexical diversity measured by Guiraud’s index (r = .483, p = .001). A weak positive correlation was found between general oral proficiency and grammatical accuracy on three morphemes (r = .370, p = .017). A moderate negative correlation was found between general oral proficiency and the number of filled pauses (r = −.489, p = .001) and a weak negative correlation was found between general oral proficiency and the amount of non-French vocabulary (r = −.355, p = .023).
VI Discussion and conclusions
In our larger study (Rousse-Malpat et al., 2019), we compared the effectiveness of an SB and a DUB approach across groups. The overall finding was that the DUB approach was significantly more effective than the SB approach in both speaking and writing, but as acknowledged in that study, one strong confounding issue was the amount of L2 exposure. In the Dutch SB classes, the teachers spoke the L2 about 40% of the time; in the DUB classes, the teachers spoke the L2 about 95% of the time. Therefore, it was not possible to separate the effects of method and exposure. However, there was one exception. One of the SB teachers, who taught a group of learners with a high scholastic aptitude, spoke mostly French. In this study, we compared this SB group with a DUB group matched in age and scholastic aptitude.
We compared the L2 French oral proficiency of two intact high school classes (n = 41) with relatively similar amounts of L2 exposure after three years of instruction. One group used a standard textbook with reading, writing, listening and speaking activities, some explicit grammar explanation and an implicit SB view of language. The teacher spoke French a great deal of the time and there was also spontaneous interaction in the L2 between the teacher and students. Still, the book and exercises dealt with grammar and vocabulary quite separately. The other group used the Accelerated Integrative Method (AIM), which is in line with DUB principles in that language is seen as a network of conventionalized routines, and which focuses on the use of frequently repeated linguistic routines at the phrase, chunk and clause level. The teacher used a story script and spoke French exclusively from the first day on. To scaffold for meaning, all words were accompanied by a specific gesture. Classes consisted of various activities, including drills, to memorize the phrases, chunks and clauses used in the stories.
To compare oral proficiency after three years of instruction, we used the standardized and validated SOPA test, which seeks to elicit as much free oral production as the learners can provide. Oral proficiency was measured in terms of holistic scores using a rubric based on ACTFL (which worked quite well for our young students), and the students’ speech was transcribed and analysed for several grammar, fluency and vocabulary measures. We hypothesized that the DUB method would be more effective for oral proficiency in terms of holistic scores, but we expected the SB method to be more effective on some grammatical measures, especially accuracy in verb forms, as these had been a major focus of the method.
The results showed that the DUB learners scored significantly higher on the holistic scores of general oral proficiency. In the analytical measures, the most striking findings were that the DUB learners showed a higher speech rate, had more diversity in sentence types, and had more consistent L2 use with less non-target language use. Speech rate, sentence type diversity and the number of French words correlated strongly with general oral proficiency, suggesting that the more French words the participants used, the faster they spoke, and the more complex sentence types they used, the more proficient they sounded overall. Previous research has also shown such strong relationships between general proficiency and speech rate (e.g. Lennon, 1990; Riggenbach, 1991; Towell et al., 1996), and between holistic ratings and syntactic complexity (Verspoor et al., 2012). Quite unexpectedly, the DUB group also outperformed the SB group in the correct use of the present tense. There were no differences for filled pauses, the other two accuracy measures (gender and negation), or lexical diversity (Guiraud’s index).
The overall findings were surprising because both groups received a relatively large amount of L2 exposure in class, which is generally recognized as one of the best predictors of L2 acquisition (Ellis, 2002). The differences in outcome might thus be sought in the way that the L2 was presented. In the SB approach there were reading texts, listening exercises, writing exercises, conversation practice and grammar exercises. Although we did not analyse the French textbook in detail, it is clear that repetition of vocabulary, phrases, and formulaic sequences is much less built into the SB method than into the DUB method. The same holds for the use of common lexical bundles (Vandeweerd & Keijzer, 2018). Often the texts to be read and listened to are on different topics. Each chapter concerns a different topic and there is not enough review or repetition of the vocabulary, phrases or chunks dealt with previously.
In the DUB class, in contrast, two thirds of an entire academic year was spent listening to and reading one simple story. For example, the first story the students learn is Les trois petits cochons. As the story line is already familiar and predictable to most learners, it is comprehensible for students learning a new language. They can focus on the words and utterances rather than on deciphering the meaning. The story has a great deal of built-in repetition and students can easily recognize the most salient words, such as the pigs or the wolf, even upon first presentation. Moreover, each word is presented with a gesture, aiding multi-modal learning. Furthermore, each exercise or activity is built on repeating the phrases. One question not answered by this study, however, is whether spending so much time on the same story is motivating enough for the learners.
We believe that intensive, meaningful, repeated exposure helped the learners to form all types of linguistic associations and to be flexible in their L2 use. Presenting the learners with input made of conventionalized routines from these stories and pushing them to produce output gave them the experience and confidence they needed in the L2 to perform well. We think that all this time spent on drilling routines and producing as much language as possible made the participants speak faster and use more types of sentences. We can illustrate this with an example from the interviews. In all interviews we asked the question Qu’est-ce que tu aimes manger? (‘What do you like to eat?’). The less proficient students often replied in single words, such as La pomme (‘The apple’), while more proficient students used complete sentences and added more information on their own initiative, as in J’aime les pommes et j’ai une pomme dans mon sac (‘I love apples and I have an apple in my bag’).
The finding that high-exposure learners are more fluent and use different types of sentences is not new. Studies of immersion programs in Canada have already shown that these types of methods work well for these aspects (Swain, 1991). As far as fluency is concerned, the groups were similar in their use of filled pauses. Our results suggest that counting filled pauses at the beginning stages of acquisition might not be a valid way to distinguish fluent from non-fluent speakers. The findings in fluency research to date have also been ambiguous with regard to the utility of filled pauses as a measure of fluency (e.g. Bosker et al., 2013; Cucchiarini et al., 2002; Kormos & Dénes, 2004; Rekhart & Dunkel, 1992). It might be more interesting to investigate the distribution of pauses in foreign language learners’ oral production, as it has been suggested that their pausing pattern is influenced by their L1 (Raupach, 1987).
As far as grammar is concerned, it is striking to find an advantage for the DUB method on some grammatical accuracy measures, especially considering that the SB method spent a significant amount of time explicitly presenting and practising the grammatical rules. We had expected the SB learners to perform at least as well as the DUB learners in this respect. The lack of such a finding could stem from the fact that the three grammatical constructions analysed for this study may be categorized as ‘simple’ constructions and are therefore perfectly suitable for learning through hearing and producing many examples. However, the DUB students also used a wider range of tenses, which they apparently picked up from the input. Our results on the accuracy of gender in an oral task are in line with Lyster (2004), who found that form-focused instruction on gender paired with recasts, prompts or no feedback was equally effective in the context of a lab setting. In Lyster’s experimental groups, form-focused instruction was implemented with several activities in the form of inductive rule discovery and metalinguistic explanations. In our DUB group, there was only awareness raising of gender by means of specific gestures for feminine and masculine determiners. It could be that the gesture associated with the article helped in forming strong associations or in giving corrective feedback, but it is also possible that the frequent repetition of the noun phrases (article and noun) as a unit helped learners memorize them as FUMMs.
As far as vocabulary is concerned, the study shows that there were no differences in lexical diversity. The SB learners learned lists of vocabulary whereas the DUB learners were asked to repeat words accompanied by a gesture. Both techniques seem to be effective. However, the DUB group had significantly greater L2 lexical access in that they consistently used more L2 in their interviews. They used significantly fewer non-French vocabulary items and did so in more different situations. We assume this was due to the chunking technique, which helped form strong associations between words, entrenching complete sequences. The DUB learners usually communicated in French and none of them used complete Dutch utterances in their speech. Our findings with regard to the use of non-French vocabulary are comparable to the findings of AIM research in Canada and in the Netherlands. Bourdages and Vignola (2009) found that AIM students used significantly fewer bilingual and English utterances than the non-AIM students. In the Netherlands, Rousse-Malpat and Verspoor (2012) also found that the non-AIM students used more Dutch during the interview and in more situations than the AIM students after one and two years of instruction.
In our opinion, the AIM method was especially effective because it allowed the teachers to speak the target language almost exclusively from the first day on with these young absolute beginners and helped the learners remember whole phrases and chunks. However, we believe that our findings also strongly support Lightbown and Spada (2013): the most effective approaches in language teaching are communicative approaches in which language is used meaningfully, preferably as authentically as possible, and taught with a large amount of input. Unfortunately, in the Netherlands, there is still a very strong belief that rules drive the language system and that grammatical accuracy (often explained in the L1) needs to be achieved to avoid fossilization in the L2 (Graus & Coppen, 2018). Such beliefs are probably widespread: they are found, for example, in L2 English in Vietnam (see Verspoor & Hong, 2013), and even L2 Spanish textbooks in the US seem to have a strong focus on grammar (Fernández, 2011). Hopefully, an awareness of usage-based linguistics and dynamic-usage-based approaches to teaching an L2 will help the teaching field realize that there should be a much stronger focus on the lexicon and especially on multi-word chunks and formulaic sequences, which should be repeated enough to become entrenched, and that even though accuracy is not achieved immediately, it will come given enough guided exposure and interaction.
There are a number of limitations to this study that need to be acknowledged and addressed in further research. First of all, we were limited in controlling for some known confounding variables in classroom-based research, such as the teachers and the attitudes and motivation of the learners. The groups differed in the number of hours of French lessons and the amount of L2 exposure, so even though they were similar, they were not the same. The DUB group had relatively more exposure, which could have influenced the results. Furthermore, the generalizability of our findings is limited, as our study concerned two groups of learners with a high scholastic aptitude in Dutch secondary education. The findings can thus not be generalized to all learners and all contexts. We have suggested that the differences between the two groups could be explained by the underlying assumptions each method makes about how languages are learned, but we cannot point to single variables that influence particular elements of L2 development because many variables dynamically interact over time and are impossible to tease apart. Thus, we need more detailed longitudinal studies of various DUB approaches to see what contributes relatively more to effectiveness: the amount of authentic input, the frequency of exposure, the repetition of chunks, the enhanced output or the scaffolding in the form of gestures or visuals. However, such effect studies need to use free response data in which learners can show their true productive ability, and learners should be gauged on overall proficiency rather than on isolated grammatical issues.
Appendix 1
Script test oral: SOPA non-immersion.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author(s) received financial support from the NWO (Dutch Research Council) for this research.
