Abstract
This article reports a pre–post comparative study investigating whether the data-driven learning (DDL) approach has different pedagogical effects on grammar students of English as a foreign language (EFL) with different levels of English proficiency. The study entailed a treatment group (TG) of 95 first-year undergraduates who learned grammar using DDL and a control group (CG) of 84 students who received no grammar treatment. Most of the participants were 18 or 19 years old, with only a few outliers, aged 17 or 20. The grammar performance and learning attitudes in both groups (their motivation and self-efficacy) were quantitatively examined through grammar achievement tests and a questionnaire. The data obtained from the groups were then compared at three proficiency levels: high, intermediate, and low. The results of an analysis of covariance show that in grammar performance, the proficiency levels in all the TG students rose significantly and in the posttest they outperformed their counterparts in the CG. However, neither the members of the TG nor those of the CG made any statistically significant improvement in their learning attitudes; no significant differences were found between the groups at any proficiency level. The mixed findings make an important contribution to the field, confirming that DDL is pedagogically suitable for enhancing the linguistic knowledge of university-level grammar learners, regardless of their proficiency, but warning that practitioners who treat the development of learner attitudes (e.g., motivation and self-efficacy) as important should be cautious with this approach.
Introduction
The integration of corpus use into language teaching and learning contexts has received considerable attention and interest since Johns (1991) introduced the data-driven learning (DDL) approach. It teaches students no explicit rules but gives them corpus-based material to analyze so that they can generalize contextualized linguistic features/patterns for themselves. This rule-inferencing model has hitherto enjoyed support from important learning theories (e.g., discovery learning and the noticing hypothesis) and has capitalized on much empirical evidence. For example, many scholars have found DDL effective for learning collocations (Daskalovska, 2015; Saeedakhtar et al., 2020; Uçar & Yükselir, 2015; Wu et al., 2019), writing (M. Chen et al., 2015; Crosthwaite, 2020; Mizumoto et al., 2017; Poole, 2016; Sun & Hu, 2020), and vocabulary (Karras, 2016; H. Lee et al., 2017, 2019, 2020; P. Lee & Lin, 2019; Tsai, 2019). Some researchers have also reported that their students react positively to the DDL approach (M. Chen et al., 2015; Mizumoto et al., 2017; Sun & Hu, 2020). Even so, Kılıçkaya (2015) warned that such a discovery approach may be inappropriate in educational contexts where learners are used to learning from teachers who offer didactic/explanatory instruction. In addition, DDL’s pedagogical suitability also seems limited in the context of adverse learner feedback (Hirata & Hirata, 2013; Kennedy & Miceli, 2001), and its effects cannot be judged superior to those of a traditional deductive approach (TDA) in certain areas of grammar (Smart, 2012). Still, these concerns have been mitigated by the meta-analytical study by Boulton and Cobb (2017), which showed overall consistent evidence in favor of the efficacy of DDL in almost every language learning context.
DDL does indeed have strong theoretical and empirical underpinning, but one critical area calling for an empirical investigation is comparing the efficacy of DDL on learners with different levels of language proficiency. This issue, which has so far been under-investigated in the field, is particularly important, as proficiency has been deemed by some to have an influential impact on either the affective learning experience of DDL students or their language performance (Hirata et al., 2013; Hirata & Hirata, 2013; H. Lee et al., 2019; Liu & Jiang, 2009). The issue is also worthy of discussion with regard to the current uncertainty about DDL use for lower-level language learners (Aston, 2001; Boulton & Cobb, 2017; Lin, 2016). An educational context that may seem especially suitable for examining this issue is Taiwan, where teacher-centered instruction is still dominant, as Lin and Lee (2019), Meyer (1988), and Wang and Tsai (2012) described it. Investigating whether or not DDL would successfully work in the Taiwanese context should thus shed further light in the field. This being the case, a comparative experiment was conducted on two groups of grammar students from Taiwan where English is mostly studied as a foreign language (EFL). Both groups had students of different proficiencies. One group was learning grammar with the DDL approach (namely, the treatment group, TG) and the other was receiving no grammar treatment (the control group, CG). To judge the pedagogic effects of DDL adequately, the grammar performance, learning motivation, and self-efficacy of both groups were examined and compared before and after the experiment. These two attitude variables are particularly important and should be examined together with grammar performance because motivation and self-efficacy have long been considered critical indications of students’ learning attitudes (Bandura, 1986, 1997; Dörnyei & Ushioda, 2013; Pajares, 2003). 
They have also been found to correlate strongly to learners’ performance (Zuffianò et al., 2013), in such areas as grammar (Lin, 2016) and writing (Lin, 2014). Considering all three variables should generate much evidence from which to judge DDL’s effects.
Following this line of discussion, the overarching research question of this study is whether EFL grammar students of different proficiencies would benefit significantly from the DDL treatment. To answer this question properly, two practical research questions were formulated:
After the treatment, to what extent do the TG participants with different grammar proficiency levels improve their grammar performance and learning attitudes?
After the treatment, to what extent do the TG participants with different grammar proficiency levels outperform the CG participants in terms of grammar performance and learning attitudes?
Theoretical Support for DDL
The nature of the DDL approach reflects the characteristics of the inductive approach, discovery learning, and the noticing hypothesis. First, an inductive approach has two major components: “1) the students’ attention is focused on the structure being learned; 2) and the students are required to formulate for themselves and then verbalize the underlying pattern” (Shaffer, 1989, p. 396). These operational aspects are consistent with those of DDL-centered treatments, where learners begin by focusing on target linguistic data and then generalize the observed grammatical rules/features on their own. Induction as such is further believed to be capable of reducing students’ cognitive workload, which is often associated with traditional didactic lectures (Sweller et al., 2011), and thus helps learners to concentrate on developing the meaning and structure of language use (Boulton & Cobb, 2017). That learners’ ability to use induction is nurtured by using DDL is generally accepted (L. S. Huang, 2017), but some scholars (H. J. H. Chen, 2011; Kirschner et al., 2006) tend not to recommend the inductive-based approach for low-level language learners, deeming it difficult for them.
Being inductive, DDL also involves discovery learning. The process of discovery learning, as Richards and Schmidt (2002) elaborate, comprises five major steps: observing, inferring, formulating, predicting, and communicating. Essentially, DDL also embraces these constituents. Specifically, it “entails encouraging learners to take the role of language researchers by systematically engaging in discovery learning (Gavioli, 2001) and in learning how to learn through observations, analyses, interpretations, and presentations of language-use patterns in corpus data” (L. S. Huang, 2011, p. 482). The correspondences between discovery learning and DDL have hitherto been widely acknowledged in the field (Boulton & Cobb, 2017; Frankenberg-Garcia, 2016; Liu & Jiang, 2009; Vyatkina, 2013, 2016), although it has also been advised that students participating in discovery learning tasks should have attained a certain level of linguistic knowledge (Johns, 1991; H. Lee et al., 2019).
Furthermore, when involved in DDL-centered activities, learners are focused on observing and analyzing recurring linguistic features of study, so their language awareness is enhanced (Boulton & Cobb, 2017; Flowerdew, 2015; Hadley, 2002; L. S. Huang, 2011; Timmis, 2015), leading to possible gains in language use (L. S. Huang, 2017). This method and its anticipated outcome recall the noticing hypothesis proposed by Schmidt (1990, 2001). As he asserts, increasing language awareness is a key element whereby learners consciously notice the linguistic features of interest; this turns input into effective intake and thus generates successful output. Such an innate quality of noticing in DDL has been added to experimental assessments, and much positive evidence in its favor has been produced (Hadley, 2002; Hong, 2010; Liu & Jiang, 2009; Mizumoto et al., 2017; Moon & Oh, 2018; Sealey & Thompson, 2007).
DDL Effects in Grammar Classes
While ample evidence has been found in favor of DDL treatments for the learning of different linguistic skills (e.g., vocabulary, collocation, and writing), this section focuses entirely on addressing DDL’s effects on the learning of grammar, which refers to the principles by which a person puts words together to form phrases, clauses, or sentences. Johns (1994) was one of the earliest scholars to advocate the instructional effects of DDL on grammar students. He submits that allowing learners to analyze a sufficient amount of organized input (i.e., concordance lines) will effectively facilitate their grammar acquisition. Johns’ views are endorsed not only by early scholars (Conrad, 2000) but also by later experimental findings. For example, Hong (2010) empirically examined two groups of grammar students and found that the group taught with corpus-aided instruction developed a stronger consciousness at the level of noticing, that is, the cognitive ability for learning the use of determiners (e.g., zero article and countability). Smart (2014) reported similar effects, writing that inductive DDL treatments resulted in significantly more gains in learning the passive voice than did traditional deductive treatments. Z. Huang’s (2014) comparative study further revealed that referencing a corpus helped students create more accurate lexico-grammatical patterns than those of students who consulted dictionaries alone. In addition, Lin (2016) and Lin and Lee (2019) found that treatments blending DDL and the TDA led to enhanced grammar skills and learning attitudes, although the improvement was mainly nonsignificant when compared with that of a pure TDA. Recently, Moon and Oh (2018) also revealed that secondary students benefited more from DDL grammar than from traditional grammar teaching, specifically in the use of the verb to be.
The above discussion seems to support the use of DDL in EFL grammar classes. However, such approval is mostly for higher-proficiency language learners rather than for lower-level students. In fact, few studies have empirically verified whether DDL suits this group. To begin with, Aston (2001) warned that corpus-based activities may not be appropriate for beginning-level students. This is because DDL tasks, being inductive and based on discovery learning, may require a certain level of linguistic knowledge on the part of students to cope with them (H. J. H. Chen, 2011; Johns, 1991; Kirschner et al., 2006; H. Lee et al., 2019). This concern is endorsed by Liu and Jiang (2009), who found empirically counterproductive cases with students who either had low levels of language proficiency or lacked vocabulary knowledge. Some recent works (Boulton & Cobb, 2017; Lin, 2016) also suggest that DDL may benefit intermediate-to-advanced-level learners but not low-level students. Nonetheless, one study by Boulton (2010) showed encouraging results with beginning grammar students: they significantly improved their grammar performance and showed a preference for DDL. In light of the scarce and inconsistent empirical evidence, further empirical experiments are urgently needed to confirm whether or not DDL is pedagogically suitable for EFL students with different proficiencies, particularly those at lower levels.
The Present Study
The general research design and procedure for the current study are presented in Figure 1 and then discussed.

Figure 1. Flowchart of the research design.
Sample
The study, based at a single university in Taiwan, recruited a convenience sample of 179 first-year non-English-major college students from four general English classes (Classes A, B, C, and D), which aimed at developing students’ general English skills, with a particular focus on reading, vocabulary, and grammar. Class A (52 students) and Class B (43), forming the TG (95), were taught by the present researcher, who had taught English skills using DDL in higher education for several years. Class C (41) and Class D (43) formed the CG (84 students) and were taught by another teacher, who had several years’ experience of teaching English skills to undergraduates. The gender distribution in the TG was 43 males and 52 females; the CG had 20 males and 64 females. In both groups, most of the participants were 18 or 19 years old, with only a few outliers, aged 17 or 20. Before taking part, all the participants had studied English for approximately 10 years in Taiwan’s educational system. The general English proficiency levels in both groups were mostly at the A2 (TG: 65 students; CG: 63) and B1 levels (TG: 28; CG: 18) in the Common European Framework of Reference for Languages. Only a few reached the B2 level (TG: 2 students; CG: 3). Once enrolled in the experiment, the students’ specific grammar abilities were explicitly examined by means of a grammar achievement pretest (described below in the “Data Collection Instruments” section) and, for analysis and comparison purposes, set at one of three distinct levels: high, intermediate, and low (described in the “Data Analysis” and “Results” sections).
The Treatment
The TG treatment
The TG had one 90-min DDL grammar lesson per week for 3 weeks. The grammar items for each week comprised the language use of agree and deprive (Lesson 1), adjective clauses introduced by who/whom/whose (Lesson 2), and adjective/noun clauses introduced by that (Lesson 3). These items were purposely selected because, in his years of teaching experience, the current researcher had found that many of his previous EFL grammar students were confused by them and therefore deemed them suitable material to test the effects of DDL on grammar learners at different levels of attainment.
Following the instructional model of Lin (2016), the TG was first shown concordance lines containing the node word (the key words in context) for observation (see Figure 2 for illustration). The material, which the researcher compiled and delivered to students in the form of printed handouts, comprised authentic examples from the Corpus of Contemporary American English (corpus.byu.edu/coca). In reading the concordance, the TG was given several minutes to answer a general question (e.g., Observation in Figure 2); they could either work it out on their own or discuss it with peers and share their findings. When they were unable to analyze the concordance or offered inaccurate inferences, more guidance or questions were given, such as “Please observe the preposition phrases or patterns after the key words,” “Please judge the part-of-speech of to in the first two sentences and that of to in the following sentences,” or “Can you detect any differences between the functionality of who and whom in these sentences?” After the students had shared their answers, their understanding was checked by having them create sentences, answer multiple-choice questions, and judge whether sentences were grammatically correct or incorrect (see Figure 2 for illustration).

Figure 2. Example of data-driven learning material.
The CG treatment
In contrast, over the 3 weeks, the CG was neither given a DDL treatment nor taught the specific grammar items chosen for the TG. Instead, they focused primarily on learning about general reading skills and appreciating textbook articles. For example, reading strategies were explained to them, such as skimming/scanning articles and inferring their main ideas. They were also taught the concepts of topic sentences and concluding sentences, knowledge of which would be advantageous when the students read for main ideas. The teacher also demonstrated how the skills/strategies were acquired by using the actual reading material. Afterwards, the teacher asked the class if they had any questions regarding the knowledge or application of these reading skills/strategies and gave a full answer to every question. In addition, the CG also learned about appreciating different genres of writing and discussed them. For instance, some articles concentrated more on presenting opinions about social events; some were more intent on reporting scientific findings; while others told interesting historical stories. The classes learned how to interpret or understand these different genres more effectively. They also discussed the ideas of each article after they had finished reading. The teacher raised questions, invited answers, and gave feedback. Finally, the meaning of some vocabulary items was explained when necessary, but explanations of language use and grammar rules relating to those designated for the TG were avoided.
Data Collection Instruments
Grammar achievement tests
Two sets of self-created grammar achievement tests were administered as pre/posttests for both groups. Specifically, they comprised concordance lines from the COCA. Each set had 30 questions, with every 10 items (six 4-option multiple-choice questions and four correct-incorrect grammar items) focusing on one distinct grammar lesson. Three other senior teachers teaching the same courses at the experimental site were invited to evaluate the quality of the test questions. They agreed with the researcher that the items adequately reflected the target linguistic features. Furthermore, a pilot study involving 117 other undergraduate students was conducted to verify the validity and reliability of the grammar tests. The pilot study results show that the whole test was valid because all its items had good item difficulty (all difficulty values between .40 and .80) (Chase, 1978) and item discrimination (all discrimination values higher than .25) (Noll et al., 1979). In addition, test–retest reliability was also obtained, with Pearson’s r indicating strong correlation (r = .91, p < .001) between the test scores that the pilot study participants produced on two different occasions. These results suggest that the tests were suitable test instruments for this study. The sets were then randomly assigned for use in a pre- or posttest, in which each correct answer counted as 1 point, 0 being the minimum and 30 the maximum.
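The item screening described above can be illustrated with a minimal sketch. It assumes that difficulty is the proportion of correct answers and that discrimination is the upper-lower index computed on the top and bottom 27% of scorers; the article reports only the thresholds, so the 27% split, the function names, and the data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)


def item_discrimination(item_responses, total_scores, fraction=0.27):
    """Upper-lower index: proportion correct in the top group minus the bottom group.

    The 27% group size is an assumption; the article does not state the method used.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))
    # Rank test takers by total test score, highest first.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_responses[i] for i in upper) / k
    p_lower = sum(item_responses[i] for i in lower) / k
    return p_upper - p_lower


def item_is_acceptable(difficulty, discrimination):
    """Thresholds reported in the text: difficulty .40 to .80, discrimination above .25."""
    return 0.40 <= difficulty <= 0.80 and discrimination > 0.25


# Hypothetical data: one item answered by ten test takers, plus their total scores.
item = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]
totals = [28, 27, 25, 24, 20, 19, 15, 14, 10, 8]

p = item_difficulty(item)
d = item_discrimination(item, totals)
print(p, d, item_is_acceptable(p, d))
```

An item that fails either threshold would normally be revised or dropped before the test is used.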
Motivation and self-efficacy questionnaire
To understand learner attitudes, the researcher adapted the 5-point Likert-type-scaled questionnaire designed by Lin (2016), which also examined learner motivation and self-efficacy in regard to grammar learning. The changes made to his version comprised minor rewording of the grammar items examined, to make it fit the grammar points of the present study. For example, some original items asked about learner confidence in using grammar passives, relative clauses, or phrases to express purpose. These were changed to noun clauses introduced by that, adjective clauses introduced by that, or agree/deprive and their phrases. The quality of the revision was then examined in a pilot study involving another sample of 173 participants. The results of factor analysis show that the 15-item revision had overall strong validity (67.23% variance explained) and reliability (Cronbach’s α = .918). It also comprised three valid and reliable components: (1) self-efficacy in learning and using grammar (Items 1–7) (27.26% variance explained; Cronbach’s α = .915), (2) self-efficacy in identifying learned grammar (Items 8–11) (22.47% variance explained; Cronbach’s α = .848), and (3) motivation to learn grammar (Items 12–15) (17.51% variance explained; Cronbach’s α = .779).
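For readers unfamiliar with the reliability coefficient reported above, the following sketch computes Cronbach’s α from raw item scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the small data set are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondents' scores."""
    k = len(item_scores)
    n = len(item_scores[0])
    # Total questionnaire score for each respondent.
    totals = [sum(item[r] for item in item_scores) for r in range(n)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point Likert responses: three items, five respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 3],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items of a component vary together, which is how the three subscale coefficients in the text are usually read.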
Data Analysis
The data from the grammar tests and questionnaires were analyzed using several statistical methods. First, descriptive statistics presented the grammar pretest results, which served as criteria for dividing all the participants into three levels of grammar proficiency. The top 1/3 of the scorers were assigned to the high level, the bottom 1/3 to the low level, and the remainder to the intermediate level. Second, independent t-tests examined whether any significant differences existed between the groups at each level, in terms of both grammar and questionnaire outcomes. Third, paired-sample t-tests examined whether, after the experiment, each group at each level had improved its grammar performance and learning attitudes. All the t-test results were reported with effect sizes using Cohen’s d (Plonsky & Oswald, 2014). Fourth, a set of ANCOVAs (analyses of covariance) examined whether, after the experiment, the groups at different levels differed from each other in their grammar posttest results. Likewise, several MANCOVAs (multivariate analyses of covariance) examined whether the groups at different proficiency levels differed from each other in the exit questionnaire results. The effect sizes reported for both ANCOVA and MANCOVA results were partial η².
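As an illustration of the effect size mentioned above, the sketch below computes Cohen’s d for two independent groups using the pooled standard deviation. The article does not state which variant of d was used (pooled, paired, or otherwise), so the pooled form, the function name, and the scores are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical posttest scores for a treatment and a control group.
treatment_scores = [2, 4, 6]
control_scores = [1, 3, 5]
print(cohens_d(treatment_scores, control_scores))
```

A paired-samples variant would divide the mean pre-post difference by the standard deviation of the differences instead, which can yield noticeably different values from the same data.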
Results
This section is divided into three parts. The first part presents the results of both groups’ entry behaviors in terms of grammar performance and learning attitudes. The second part shows the results of the participants’ changes (if any) after the experiment, to answer Research Question 1. The last part presents the results of the comparisons between the groups, which answers Research Question 2.
Results of Entry Behaviors
Entry grammar proficiency
Table 1 presents the descriptive statistics of the grammar pretest results of both groups. Each group’s top 1/3 scorers, namely, those who obtained 17 points or above, were labeled high proficiency (TG: 44 students; CG: 22); the bottom 1/3, with 14 points or below, were labeled low proficiency (TG: 17 students; CG: 22); and the remaining students, with either 15 or 16 points, were labeled intermediate (TG: 34 students; CG: 40). In addition, independent t-tests revealed no statistically significant differences (p > .05) between the groups as a whole or between the groups at each level. These results suggest that, either as whole groups or at different proficiency levels, the groups had similar grammar proficiency at entry, which made them suitable for purposes of comparison and further analysis.
Table 1. Descriptive Statistics for the Pretest Grammar of Both Groups.
Note. TG = treatment group; CG = control group.
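The level assignment described above can be sketched as a simple rule. The cutoffs (17 points or above for high, 14 or below for low, 15 or 16 for intermediate) are those reported in the text; the function name and the scores are hypothetical.

```python
from collections import Counter


def proficiency_level(pretest_score, high_cutoff=17, low_cutoff=14):
    """Map a grammar pretest score (0 to 30) to a level using the reported cutoffs."""
    if pretest_score >= high_cutoff:
        return "high"
    if pretest_score <= low_cutoff:
        return "low"
    return "intermediate"


# Hypothetical pretest scores, not taken from the study.
pretest_scores = [19, 17, 16, 15, 14, 12, 18, 15, 9]
levels = [proficiency_level(score) for score in pretest_scores]
counts = Counter(levels)
print(counts)
```

Because the cutoffs are fixed score values, tied scores at a boundary all fall into the same level, so the three groups need not be exactly equal in size.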
Entry learning attitudes
A set of independent t-tests further showed that both groups also had similar attitudes (motivation and self-efficacy) to learning grammar, whether compared as intact groups or at each level, since none of the examined items yielded a statistically significant t-value (p > .05).
Results of Improvements After the Experiment: Answers to RQ1
Grammar improvements after the experiment
Table 2 presents the pre–post comparisons of both groups’ grammar performance. As the table shows, after the treatments, the TG participants at all levels significantly improved their grammar performance, Whole: t(94) = −10.51, p < .001; High: t(43) = −4.57, p < .001; Intermediate: t(16) = −6.45, p < .001; Low: t(33) = −9.47, p < .001, with nearly large to very large effects (d-values between 1.28 and 10.84). In addition, the low level TG had the largest gain (pre–post mean difference: 6.5), the intermediate level had the second largest (5.53), and the high level had the smallest (2.91). In contrast, in the posttest grammar, the CG as a whole had no statistically significant gain, t(83) = 0.43, p > .05, and neither did the high level CG, t(22) = 1.11, p > .05. The intermediate level CG was even found to deteriorate to a statistically significant degree, t(22) = 2.50, p < .05, d = 3.31. The only statistically significant gain for the CG was found in the low-level participants, t(40) = −2.04, p < .05, although its effect size was rather small (d = 0.65). Overall, this suggests that the TG’s improvement in grammar was not random but attributable to the DDL treatments.
Table 2. Paired Sample t-Tests for the Grammar Tests of Both Groups for All Levels.
Note. TG = treatment group; CG = control group.
Learning attitudes after the experiment
However, in terms of learning attitudes, there were no statistically significant differences between entry and exit questionnaires of both groups at any level (p > .05) (Table 3), suggesting that the treatment had no effects on the development of the TG’s motivation and self-efficacy with regard to learning grammar.
Table 3. Paired Sample t-Tests for Entry-Exit Questionnaire Results for Both Groups for All Levels.
Note. TG = treatment group; CG = control group.
Differences Between the Groups After the Experiment: Answers to RQ2
Grammar differences
Before examining any differences between the groups on the grammar posttest, all the major ANCOVA assumptions were tested and found tenable. First, all the grammar test scores were entered into SPSS and were checked in terms of normality of distribution via their skewness and kurtosis indices. Both values were found to be within the range of ‒1 to +1, suggesting that the data were of acceptably normal distribution and thus suitable for developing parametric analysis (Field, 2017; Peng & Woodrow, 2010). Second, as discussed above, the assumption of the independence of the covariate and treatment effect was met, since no significant difference was found between the groups at all levels (p > .05). Third, Table 4 shows nonsignificant F-values at p > .05 at all levels when customizing the ANCOVA model to examine the independent variable and covariate interaction. This indicates no statistically significant interaction between the groups at all levels, justifying the assumption of the homogeneity of the regression slopes. In short, the assumption test results verified the appropriateness of using the grammar pretest results as a covariate in running the main ANCOVA analysis for the posttest grammar performance.
Table 4. The Interaction Between the Independent Variable Group and the Covariate (Pretest Grammar) When Posttest Scores Were Examined as the Dependent Variable.
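The normality screening described above (skewness and kurtosis within −1 to +1) can be sketched as follows. The sketch uses simple moment-based estimates; statistical packages such as SPSS report bias-adjusted versions, so values may differ slightly in small samples. The function names and the scores are hypothetical.

```python
from statistics import mean


def skewness_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (not the bias-adjusted estimates)."""
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skew, excess_kurtosis


def within_unit_range(scores):
    """True when both indices fall inside the -1 to +1 range used in the text."""
    skew, kurt = skewness_and_kurtosis(scores)
    return -1 <= skew <= 1 and -1 <= kurt <= 1


# Hypothetical score distribution, not taken from the study.
sample_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]
print(skewness_and_kurtosis(sample_scores), within_unit_range(sample_scores))
```

A distribution that is too flat or too peaked fails the check even when it is perfectly symmetric, which is why both indices are inspected.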
While Table 5 presents the grammar posttest scores before and after adjustment using pretest grammar scores, ANCOVAs for the scores show that the group effects are significant at all levels, Whole: F(1, 176) = 86.716, p < .001; High: F(1, 63) = 13.150, p < .001; Intermediate: F(1, 33) = 45.673, p < .001; Low: F(1, 71) = 37.314, p < .001, with large effects (partial η² = .330, .173, .581, .344). The results indicate that, following grammar treatments, the TG significantly outperformed the CG on the grammar posttest, lending support to the claim that DDL treatment affects grammar students of all levels of proficiency.
Table 5. Adjusted and Unadjusted Group Means and Variability for the Posttest Grammar Scores Using Pretest Scores as Covariates.
Note. TG = treatment group; CG = control group.
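The partial η² values reported for the ANCOVAs can be recovered from the F statistics and their degrees of freedom through the identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short check below applies it to the four reported results; only the function and variable names are introduced here.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text for the group effect.
reported_ancova = {
    "Whole": (86.716, 1, 176),
    "High": (13.150, 1, 63),
    "Intermediate": (45.673, 1, 33),
    "Low": (37.314, 1, 71),
}

effect_sizes = {
    level: round(partial_eta_squared(f, df1, df2), 3)
    for level, (f, df1, df2) in reported_ancova.items()
}
print(effect_sizes)
```

The recomputed values agree with the reported effect sizes to three decimal places.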
Learning attitude differences
Before analyzing the differences between the groups with regard to their learning attitudes, the assumptions for running MANCOVAs on the data were tested. First, the assumption of the independence of the covariate and treatment effect had previously been found to be tenable, since nonsignificant independent t-test results emerged from the groups at all levels. Second, the assumption of the homogeneity of regression slopes was also tenable, as shown in Table 6: no statistically significant interactions were found between any level and its corresponding entry questionnaire score (p > .05). Third, Levene’s tests of equality of error variance were all statistically nonsignificant for all the exit questionnaire scores (p > .05), suggesting that the assumption of the homogeneity of variances was met for the scores at all levels. The results endorsed the feasibility of performing main MANCOVA analyses on the data.
Table 6. The Interaction Between the Independent Variable Groups (Levels) and the Covariates (Entry Questionnaire Scores) When Exit Scores Were Examined as Dependent Variables.
Table 7 presents the adjusted mean scores for each exit questionnaire item after controlling for their overall entry scores as the covariates. In Table 8, no significant differences were found between the groups at all levels, Whole: Hotelling’s Trace = .029, F(3, 174) = 1.67, p > .05; High: Hotelling’s Trace = .015, F(3, 61) = 0.303, p > .05; Intermediate: Hotelling’s Trace = .055, F(3, 34) = 0.619, p > .05; Low: Hotelling’s Trace = .052, F(3, 69) = 1.197, p > .05. This suggests that, after the experiment, the participants at different levels between the groups had similar attitudes to learning grammar for a linear composite of the questionnaire dimensions. In other words, no significant DDL treatment effects were found on the learning motivation and self-efficacy of the TG.
Table 7. Descriptive Statistics for the Exit Questionnaire Results After Controlling for the Covariate.
Note. TG = treatment group; CG = control group.
Table 8. Multivariate Tests for the Groups’ Exit Questionnaire Scores by Hotelling’s Trace.
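When only two groups are compared, Hotelling’s trace converts exactly to an F statistic as F = trace × df_error / df_hypothesis. The sketch below applies this to the reported MANCOVA results as a rough consistency check; because the traces are reported to three decimals, the recomputed F values can only be expected to match approximately. The function and variable names are introduced here for illustration.

```python
def f_from_hotelling_trace(trace, df_hypothesis, df_error):
    """Exact F conversion for Hotelling's trace when two groups are compared."""
    return trace * df_error / df_hypothesis


# (trace, df_hypothesis, df_error, reported F) as given in the text.
reported_mancova = {
    "Whole": (0.029, 3, 174, 1.67),
    "High": (0.015, 3, 61, 0.303),
    "Intermediate": (0.055, 3, 34, 0.619),
    "Low": (0.052, 3, 69, 1.197),
}

recomputed = {
    level: f_from_hotelling_trace(trace, df1, df2)
    for level, (trace, df1, df2, _) in reported_mancova.items()
}
for level, (_, _, _, reported_f) in reported_mancova.items():
    print(level, round(recomputed[level], 3), reported_f)
```

Small discrepancies between the recomputed and reported F values are consistent with rounding of the trace rather than with any difference in the underlying analysis.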
Discussion
The aim of this article was to describe the effects of the DDL approach on EFL students with different grammar proficiencies. It reports on an empirical assessment that compared the performance of three groups of students of different proficiency levels who were either in the TG learning grammar with DDL or in the CG not doing so. Both groups’ grammar performance and learning attitudes (motivation and self-efficacy) were quantitatively investigated. It was found that, in grammar performance, all three proficiency levels of the TG students made significant gains and outperformed their counterparts in the CG on the posttest. However, neither the TG nor the CG made any statistically significant improvement in their learning attitudes; no significant differences were found between the groups at any proficiency level. These mixed findings from the field merit further discussion.
First, the finding that the intact TG had statistically significantly better gains in grammar than the CG echoes those of Crosthwaite et al. (2019), Hong (2010), Z. Huang (2014), Lin and Lee (2019), Smart (2014), and Moon and Oh (2018). These researchers also reported that corpus-aided language learning leads to significantly more grammar gains than do traditional deductive treatments. Taken together, these findings lend further support to the pedagogical practice of DDL in grammar classes. In addition, although the above scholars and the current researcher all examined different grammar points (e.g., relative clauses, determiner use, be verbs, passive voices), their positive findings collectively alleviate the concern raised by Lin (2016) and Liu and Jiang (2009) that the language or grammar skill focus of a lesson is a factor affecting students’ receptivity to the DDL approach and/or its effectiveness on their performance.
The pedagogical suitability of DDL to grammar learning has been especially validated by the fact that all three levels of the TG statistically significantly improved their grammar performance and outperformed their counterparts in the CG. To some extent, the positive learning gain may lend support to the theoretical claims for DDL. That is, having students learn inductively, which is the theoretical practice of DDL (L. S. Huang, 2017), does indeed have a beneficial effect on their acquisition (Shaffer, 1989). Similarly, the finding also dispels the doubts of some scholars (Johns, 1991; H. Lee et al., 2019) who advised against students of different levels learning with a discovery-learning approach such as DDL. Finally, since DDL was operationalized through focusing students’ attention on observing the grammar rules, the significant improvements that the TG participants collectively made here further verify that DDL does in fact effectively reflect the noticing hypothesis (Schmidt, 1990, 2001), as discussed in the literature review.
The fact that the greatest improvement was found in low-level DDL students is worth special attention. Echoing Saeedakhtar et al.’s (2020) study, this finding goes beyond the belief of most previous researchers in the effectiveness of using DDL with high and intermediate proficiency students (Aston, 2001; Boulton & Cobb, 2017; H. Lee et al., 2019; Lin, 2016). The finding goes so far as to recommend it also as beneficial for lower-proficiency learners. It also endorses the finding of C. Y. Lee and Liou (2003) that low-vocabulary-level students benefited more from concordancing than students at other levels. In addition, the result supports Boulton (2010), whose investigation shows that beginning grammar learners can also benefit from learning with corpus material. Further, the finding has helped allay concerns raised by Lin (2016), H. J. H. Chen (2011), Hirata and Hirata (2013), Hirata et al. (2013), and Liu and Jiang (2009), who all cast doubt on low-level students’ abilities to cope with DDL-centered activities, such as analyzing corpus data on their own.
Several probable reasons exist for the differing performance of the low-level students in the above studies. One possible cause, especially worth discussing here, is the different design of the corpus material used in them. Whereas the students of Hirata and Hirata (2013), Hirata et al. (2013), and Liu and Jiang (2009) were asked to consult electronic corpora by themselves, the students in the current study were presented with paper-based, short-listed concordance lines cherry-picked by the researcher. The latter material, containing clear and targeted linguistic features, may be easier for students, even those of lower-level proficiency, to understand and interpret (Boontam & Phoocharoensil, 2018; Boulton, 2010; H. Lee et al., 2017, 2019; Moon & Oh, 2018). As a matter of fact, many of the student complaints about corpus-aided treatments are mainly about the time and effort needed to understand and analyze the numerous difficult sentences that an electronic corpus randomly throws up (M. Chen & Flowerdew, 2018; Liu & Jiang, 2009; Wu et al., 2018). Likewise, “the cognitive burden of using new technology (the concordancing software tool) during DDL may [have also] inhibit[ed] learning” (Moon & Oh, 2018, p. 51) in the present study. This may even apply to low-proficiency students (H. Lee et al., 2019) who must struggle with linguistic knowledge while learning to cope with an electronic corpus. Likewise, when compiling the printed material, the present researcher not only selected sentences that were complete and stand-alone but also excluded those containing information too difficult to understand, for instance, those carrying too much sophisticated vocabulary or requiring field-specific knowledge.
However, the paper-based DDL material used in the study of Lin (2016) clearly did not consider such principles, thus probably increasing the difficulty of the task for his DDL students (Boontam & Phoocharoensil, 2018; Moon & Oh, 2018; Yoon & Hirvela, 2004), especially the lower-proficiency learners.
An important reason that low- and intermediate-level students benefited more from DDL grammar than did high-level students should be underlined. The advanced DDL students may have been somewhat more confined to their past learning experience. That is, they may have been able to gain greater benefit from the TDA, since their assignment to the higher-level group at the beginning of the current study resulted from their past learning model (Boulton, 2010). That being said, it does not mean that the DDL approach is inappropriate for higher-level students, because they also made significant gains with DDL. Rather, perhaps more of the higher-level students, such as those in the current study, may benefit more from traditional teaching than can those relatively unsuccessful language learners using a past learning method. Similarly, this reasoning does not suggest that all lower-level students would benefit from DDL treatments. Instead, for those specific low-proficiency students, such as those in the current study, who had no successful learning experience with the TDA, DDL can serve as an alternative and rather more effective pedagogical option.
Possible reasons should also be discussed for the significant improvement in the grammar performance of the low-level students in the CG, but not of the other students of higher-level proficiencies. One possibility is that, through the CG treatment where the main focus was on reading skills and appreciating English articles, the participants might at least have improved their general reading abilities and/or vocabulary. Such enhanced skills, however basic they may seem, may be advantageous for any language learners, especially those who are less successful, since they may be more likely than higher-level students to struggle in more than one area when learning a foreign language. Therefore, after being empowered to read slightly more effectively and/or to have more vocabulary, the lower-level CG students might have been able to do better than they originally would. This may be more deeply appreciated if one considers that reading was the main skill they had for comprehending the test items; better knowledge of vocabulary might also have helped them to better understand the meaning of item statements. Although this may sound like a side issue in this study, the discussion here seems able to suggest that general or broad reading may also be an effective way to help low-level language students to improve. In contrast, to enable higher-level learners to break through their originally strong bottleneck in learning a foreign language, more influential or intensive pedagogical treatments may be needed.
While the discussion tends to acknowledge DDL’s usefulness in the teaching of grammar, the results drawn from the questionnaires somewhat undermine this tendency: the DDL approach did not improve the learning motivation and self-efficacy of the TG students with different proficiencies; nor were differences found between the TG and the CG at any level. This is in line with Lin (2016), who also found that DDL had no effect superior to that of the traditional approach in terms of the same affective factors examined here. Although the correspondence of the two studies seems to cast some doubt on the effects of DDL on students’ learning attitudes, this may be explained by the fact that both of the studies lasted only 3 weeks, which is perhaps too short a time to nurture such important affective perceptions. This explanation is particularly likely given that, before participating, none of the participants of either study had any experience of learning with the corpus-based inductive approach. Instead, they had long been exposed to learning grammar using the TDA, the mainstream practice even in today’s English grammar classes in Taiwan (H. C. Lee, 2013; Lin & Lee, 2019; Smith, 2011; Tamney & Chiang, 2002). This very experience of the past may have caused those students to need much more time to adopt affirmative attitudes about a wholly new approach such as DDL than to pick up linguistic knowledge through it. The observation of M. Chen and Flowerdew (2018, p. 355) lends support to this argument: DDL learners are given more responsibilities in exploring answers from the concordance by themselves, thus “mak[ing] the learning process appear to be more challenging and take longer.”
Finally, it should be acknowledged that the current study has certain limitations that require readers to treat its findings carefully, and it awaits contributions from future researchers. First, the sample size of this study justifies confidence in judging its findings, but it may still fail to represent the whole population of EFL grammar students. Future researchers may consider participants other than college students and explore how such students with different proficiencies react to DDL instruction. This line of inquiry would be even more interesting if it examined whether paper-based DDL and hands-on DDL with a computer have different effects on students with different proficiency levels. Likewise, this study determined students’ levels of language proficiency only by means of grammar tests, neglecting other equally important language skills such as vocabulary/collocation. This leaves a gap for future researchers to fill. Furthermore, the current study focused solely on learners’ levels of language proficiency and did not take into account the participants’ learning preferences. This is worth pursuing, given the possibility that before the experiment, those who were deemed high achievers might prefer or be better able to benefit from their past learning model (i.e., TDA) than the low achievers. Understanding whether or not the DDL approach is beneficial for different learning styles can make it easier for future practitioners to tailor its application in class to more or less receptive students. Last but not least, in this study the TG learned grammar with DDL for a total of only 6 hr, which may have prevented them from sufficiently experiencing its practice, and thus suppressed some of its effects, positive or negative, on their affective perceptions. Future studies should thus consider providing students with more intensive or longer hours of DDL learning so as to reveal richer and more accurate treatment effects on this dimension.
Conclusion
This study empirically examines the effects of the DDL approach on learners with different levels of language proficiency. The study findings are particularly meaningful in that they enable the present researcher to verify that DDL is pedagogically suitable for enhancing the linguistic knowledge of university-level grammar learners, regardless of their proficiency. However, it must not be forgotten that this success seems to capitalize on the camera-ready DDL material prepared by the teacher. Any inference that most of the students would continue to benefit from hands-on DDL by using an electronic corpus by themselves should be treated with caution. This weighty issue is especially noteworthy for rigorous educators and practitioners who prioritize nurturing learner attitudes (e.g., motivation, self-efficacy, or others), since this study has not been able to offer robust evidence in favor of or against DDL on this matter.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This article was written with funding support from Taiwan’s Ministry of Science and Technology (MOST 108-2410-H-032-027; MOST 109-2410-H-032-063).
