Abstract
This study aimed to improve the English speaking ability of Taiwanese college freshmen through a video featuring connected speech instruction. Forty-eight students from a private university in northern Taiwan participated in the study, which lasted for 7 weeks. Pre- and post-tests were used to assess their speaking performance in terms of connected speech before and after the experimental treatment. Entry and exit questionnaires were also used to investigate students’ learning attitudes. The results show that such instruction significantly improved the English language learners’ connected speech skills. Positive results were also observed in the questionnaires, which showed significantly enhanced attitudes toward learning to speak English. It is hoped that the results may offer language teachers some insights into the practice of video-aided learning in English speech classes, particularly its efficacy for connected speech.
Introduction
Over the past few decades, multimedia technology has rapidly grown in popularity in the field of foreign language teaching because of its numerous advantages (Abrams, 2002; Al-Jarf, 2004; Chun, 2016; H. C. Huang, 2015; Kubler, 2018; Meskill & Anthony, 2005; Salaberry, 2001; Warschauer, 2000). Of all the educational technologies, video-enhanced instruction has been one of the most widely discussed in both research and educational settings (Chun, 2016; Davis & Vincent, 2019; C. Lai et al., 2018; Y. J. Lin & Wang, 2018). This may be because videos often contain authentic aids (Mayora, 2009), rich audio-visual information (e.g., sound effects, facial expressions, body language, and other visual clues), and are full of cultural references (Canning-Wilson & Wallace, 2000; Galbraith & Rodriguez, 2018; Ketcha, 2019; Y. J. Lin & Wang, 2018). The visual and special nature of videos can easily convert complicated concepts into straightforward ideas for learners to retain, making video-based materials a strong reinforcement of second language acquisition (L.-F. Lin, 2010, 2011).
The multiple benefits of video-based instruction are in line with Bax’s (2011) view of the normalization of technology in language education, which refers to the stage where educational technologies are so commonly applied in language learning settings that users are unaware of their roles as effective elements in the learning process. Specifically, Bax (2011) proposed a neo-Vygotskian perspective on pedagogical technology in language classrooms, recognizing the complicated interaction between technology and the classroom’s social activities. While Vygotsky’s communicative theory helps educators study how computers assist in learning, a neo-Vygotskian perspective emphasizes learning and cognitive development (Bax, 2011). As a broad approach, it includes “cultural psychology, sociocognitive-developmental theory and sociohistorical theory” (Bax, 2011, p. 6), highlighting the fact that learning is a social communicative process, not an individualized one (Mercer & Fisher, 1997). Indeed, a learning context as described above is judged with specific potential to develop EFL learners’ language skills, such as speaking (L.-F. Lin, 2010, 2011; Watkins & Wilkins, 2011). Many previous studies (Hişmanoğlu, 2006; C. K. Hsu, 2015; Y. H. Lai, 2010; Mayora, 2009; Sun & Yang, 2015; Weyers, 1999; Wu et al., 2017) have also generated evidence in favor of using video technology to teach speaking skills, indicating that it can help learners by enhancing self-confidence, motivation, long-term listening comprehension, vocabulary learning, and pronunciation skills.
Despite the strong endorsement of the use of educational technology, its effects have not yet been fully exhausted in contexts such as Taiwan (Gu et al., 2021), particularly in terms of students’ English speaking proficiency in general and their specific performance in connected speech. To begin with, Taiwan is an EFL milieu with limited exposure to English. It is thus much more difficult to acquire adequate speaking competence than it is for those in an ESL milieu or whose mother tongue has a language system similar to English. As noted by Gilakjani (2011), the lack of motivation, limited exposure to the target language, insufficient emphasis by language teachers on pronunciation, and mother-tongue interference with English sounds and rules are four common obstacles that EFL learners may encounter in an EFL learning classroom. With regard to this, video-based instruction may help create authentic, linguistic input to improve Taiwanese EFL students’ speech development. It would seem especially helpful to look at whether or not a video featuring connected speech instruction may be effective in a Taiwanese setting. Connected speech, as Field (2003) defines it, confers intelligibility which features the acoustic content of the oral message recognized by interlocutors. This is often seen in the speech of native English speakers, who generally talk fast and without breaks, full of connected speech features such as contraction, intrusion, elision, assimilation, and weak forms (Brown, 1990; Cauldwell, 2002; Field, 2003). However, as Y. H. Lai (2010) and Liang (2015) found, Chinese-speaking EFL learners, like Taiwanese, often had difficulties in pronouncing such sounds, particularly when tense and lax vowels were involved, due to the fact that they were not in the Chinese language system. This finding was later supported in the paper by Wong et al. (2019). 
They submitted that difficulties in connected speech usually stem from the articulatory variations between one’s mother tongue and the target foreign language. Unfortunately, for the purpose of demonstrating clear and comprehensible speech, Taiwanese language teachers tend to articulate every single English word at the cost of connected speech. In such circumstances, most EFL learners may “have considerable difficulty in understanding what is being said” by native speakers in a real-life context, as Brown (1990, p. 6) warned. This stark reality is what Taiwanese EFL students have been wrestling with (D.-C. Hsu, 2015). Therefore, it is proposed here that authentic connected speech must be taught in Taiwan’s EFL classrooms, and the effects of using videos featuring this specific speaking skill are worth investigation.
Connected Speech and Speaking Ability
Many researchers have perceived that speaking is a complex and remarkably challenging cognitive skill. It requires various mechanisms to operate simultaneously. One area that is particularly difficult for learners to master is connected speech. According to Weinstein (2001), connected speech occurs while one is speaking at a natural speed, in which impromptu pronunciation is altered by adjacent words or sounds. Rosa (2002) indicates that connected speech, such as reduced forms, is common in spoken English, and can be identified in all registers regardless of the speech rate. Brown and Kondo-Brown (2006) describe connected speech as the result of “the continuous chains in normal spoken language and conversation as compared with the typical linguistic analysis of individual phonemes analyzed in isolation” (p. 284). They note that connected speech exists in “all levels of speech” (p. 5), from formal conversations to small talk.
The importance of connected speech is nowadays given increasing weight in the teaching of pronunciation and has led to other relevant issues being addressed in various pronunciation textbooks (e.g., Hagen, 2000; Weinstein, 2001). In spoken language, phonological processes such as reduction, elision, assimilation, and contraction are the four main pillars in constructing connected speech. According to Griffee (1995, p. 28), “connected speech is the natural way we speak, linking together and emphasizing certain words, rather than each word standing alone.” Regardless of how it actually works in native speakers’ speech, in English courses, connected speech may be uncommon for teachers to utter or for students to hear in their audio materials, which focus on providing comprehensible speech. “Many learners are accustomed to hearing a very careful, clear pronunciation of words, such as native speakers might use when talking very emphatically or saying words in isolation” (Rixon, 1986, p. 38). Therefore, this emphasis could result in students lacking the knowledge of connected speech and cause frustration in conversations with native speakers.
Instruction in connected speech has been well recognized as an effective way to help learners comprehend rapid speech better (J. D. Brown & Hilferty, 2006; Celce-Murcia et al., 1996; Matsuzawa, 2006). If a non-native speaker talks word by word, without linking, his or her speech may sound fragmented and unnatural and could exhaust the listener (H. Brown, 2001; Celce-Murcia et al., 1996). Constantly practicing the essential features of connected speech in the target language is said to help non-native language learners obtain more native-like pronunciation and more understandable speech (Brown & Kondo-Brown, 2006). Hence, instruction in the features of connected speech will not only raise language learners’ awareness of the existence of these features but also help them advance their ability to use connected speech. As J. D. Brown (2006) asserts, it is crucial for learners to accommodate their registers and styles to the target language. To attain greater mastery of connected speech and a more native-like delivery, it is vital to understand and know how to use the features of speech.
English teachers may consider using video-based materials to enhance Taiwanese students’ speaking abilities. However, in the specific EFL context of Taiwan, no empirical studies have emerged on teaching connected speech by means of video-based treatments. In fact, connected speech remains an under-investigated area in Taiwan’s academia. The two most relevant Taiwanese studies about connected speech instruction over the last decade are Kuo et al. (2013) and D.-C. Hsu (2015), but neither of them records the use of video-based materials. Kuo et al. empirically examined the performance of three groups of college students: one taught with explicit connected speech-focused instruction, another with stress-focused instruction, and a third with no prosodic treatment. Their results showed that those who received connected speech instruction outperformed the other groups in rhythm. Similarly, D.-C. Hsu investigated the effects of connected speech instruction on the listening and speaking performance of junior high school students. His results show that the participants who were taught about connected speech improved their listening abilities more than did their counterparts who received no treatment of this kind. Both studies confirm the value of teaching connected speech to Taiwanese EFL students, but more studies are needed to shed light on the field, especially those employing video-based treatment. The present study thus aimed to contribute to the field by examining the pedagogical effects of watching videos featuring connected speech instruction on Taiwanese EFL college students.
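Research Questions
The study addressed two research questions. First, does video instruction enhance Taiwanese college students’ attitudes to learning connected speech? Second, is video instruction an effective way of improving Taiwanese college students’ performance in connected speech?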
The Present Study
Participants
Recruited for the teaching experiment was a convenience sample from two intact Freshmen English Lab courses, which aimed at fostering students’ general English listening and speaking abilities, including pronunciation. Both classes were taught by the same teacher (one of the researchers) and met for 2 hr per week for 18 weeks. Initially, 93 Taiwanese freshmen consented to the experiment, agreeing to attend the treatment, take the relevant tests, and fill out questionnaires. Nevertheless, the data of only 48 students who completed all the requirements were included in the final analysis, 28 of them from the Information and Library Science department and 20 from the Spanish department. Those who failed to complete a pre-test, a post-test, or a questionnaire had their data excluded. The remaining participants consisted of 21 females and 27 males, ranging in age from 18 to 20 years, with an average age of around 19. They had learned English for about 10 years before the experiment. The level of their general English proficiency was somewhere between low-intermediate and intermediate (i.e., about CEFR A2-B1).
Treatment
A 7-week video experiment was applied in this study. While Table 1 summarizes each week’s topic, content, and the clip used, details of the experiment are elaborated below. First, the present researchers selected suitable online clips that matched the topic of each week. All the clips were sourced from YouTube, one of the most popular free online video platforms among modern students and educators (Gu et al., 2021; C. K. Hsu, 2015; Mayora, 2009; Sun & Yang, 2015). The clips chosen met a series of criteria. For example, they had to present the main features of connected speech each week. The correctness of the content and material presented in each clip was also examined before use, to ensure that the pronunciation and connected speech presented in the video were accurate and clear. Next, the speaking speed of the clips had to be appropriate for the target participants. In addition, to keep the attention of the participants, the selected clips were short.
Table 1. The Teaching Unit of the Corresponding Weeks.
The clips finalized for the treatment contained various rules of connected speech for each corresponding week. Week 1 was about the general features of connected speech. At the same stage, the students were given an overview to show how sounds are linked in English. They were also taught how words ending in vowels or consonants were linked to the following word. Clip 1 shown in Week 1 featured these ideas. In Week 2, the participants learned through the teacher and Clip 2 about connecting past tense –ed, which can be pronounced as [t], [d], or [ɪd], to the word that follows it. In Weeks 3 and 4, the students were first taught about the three sounds of the plural “s” in English: [s], [z], or [ɪz]. They then learned the way to connect these sounds to the words that follow them. Clip 3 featured these rules. In Weeks 5 and 6, the participants were shown Clip 4, which covered features of pronunciation that could easily be neglected by some Taiwanese students, namely the pronunciation of tapping sounds and the difference between the pronunciations of [th] and [s]. They also learned about linking these sounds, where appropriate, with the words that follow them. In the final two weeks (7 and 8), the teacher and the videos (Clips 5 and 6) helped the students to practice the points that they had learned about linking and connected speech. A review of the overall points also took place in the last two weeks.
In showing these videos in the classroom, the teacher adopted five elements of educational practice (Bax, 2011):
Data Collection Procedure and Instruments
This section starts by describing the procedure of implementing the instruments and then describes their content and quality in detail. To begin with, after consenting to the study, the participants first finished a questionnaire about their learning attitude on entry. They then completed a reading aloud pre-test (Pre-test A) which had two texts. Then, the teacher commenced the 7-week experimental treatment. When the treatment was concluded, the same learning attitude questionnaire was administered again as an exit questionnaire. Then, the participants completed two different post-tests. One was Post-test A, which had exactly the same texts as Pre-test A had. The other was Post-test B; it had two different texts from those of Pre-test A. The reasons for and details of the design of the instruments are addressed below.
Questionnaire
The self-developed questionnaire (see Appendix A), designed with a 5-point Likert-type scale, has 20 items that examined both students’ general attitudes to English speaking/learning and their specific perceptions of learning pronunciation and connected speech. These were investigated together because, as discussed in the introduction, Chinese-speaking EFL learners often find pronouncing such sounds as connected speech challenging (Y. H. Lai, 2010; Liang, 2015; Wong et al., 2019), which is likely to affect their receptivity to learning English speaking in general. A questionnaire that investigated how the participants liked learning about English speaking in general and pronunciation and connected speech in particular should thus best reflect the pedagogical effects of a video featuring connected speech on student speakers’ learning attitudes as a whole.
To safeguard the quality of the questionnaire, an exploratory factor analysis (EFA) was conducted. This was done by means of a pilot study that involved 88 other participants, a sample size complying with the suggested ratio of participants for piloting a questionnaire (Cattell, 1978), namely, 3 (at least) participants:1 (a questionnaire item). As Table 2 shows, the KMO (Kaiser–Meyer–Olkin) value was high (KMO = .934,
Table 2. KMO Value and Bartlett’s Test.
GEPT Reading Aloud sections
The GEPT (General English Proficiency Test) is a five-level criterion-referenced testing system offered in Taiwan. The function of the GEPT is to assess EFL learners’ proficiency in general English, with the aim of advocating the practice of lifelong learning and fostering the use of the communicative approach in the field of English learning and teaching. The intermediate level of GEPT, the level of basic English communicators who can handle most conversations on everyday topics (D. Huang, 2017), was used in this study in view of the participants’ proficiency (i.e., between low-intermediate and intermediate levels). Note that only the Reading Aloud section of the GEPT intermediate level speaking tests was applied in this study given that its goal was to assess the participants’ ability to reproduce connected speech. Participants were requested to finish reading each test (i.e., Pre-test A, Post-test A, and Post-test B) (see Appendix B) within 2 min. Each test had two short Reading Aloud sections from two different GEPT intermediate level speaking tests. Pre-test A and Post-test A shared exactly the same texts, so as to carefully assess whether the participants had improved their speaking in terms of the same connected speech contained in them. In addition to Post-test A, the participants also took Post-test B, which contained totally different texts from the pre-test, so as to further examine whether they could effectively apply what they learned to different texts.
Raters and Rating Criteria
Two raters were involved in assessing the student speakers’ performance. One was the teacher of the course, and the other was an experienced college lecturer who also taught English speaking and listening at the same experimental site. Before assessment, they had consulted an expert in the field and a native speaker of English regarding the words in the test paragraphs that would naturally be connected in speech. The two consultants marked all the connected speech in the tests, thus providing rating criteria for the raters to follow. For example, in Pre-test A,
Data Analysis
To address the study’s objective, the data collected were quantitatively analyzed using IBM SPSS Statistics 23. First, a set of paired-sample
Results
This section first presents the results of the overall questionnaires and each questionnaire dimension. It then reports on the results of the connected speech pre- and post-tests.
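As a reading aid for the paired-sample comparisons reported in this section, the following minimal sketch shows how a paired-sample t statistic is computed from matched entry and exit (or pre- and post-test) scores. It is not the authors' procedure, which was carried out in SPSS; the function name `paired_t` and the scores are hypothetical and serve only to illustrate the calculation.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-sample t-test on matched scores.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    # Work on the per-participant differences (post minus pre).
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two matched pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, divisor n - 1
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up scores for four participants (not data from the study).
pre_scores = [10, 12, 9, 11]
post_scores = [11, 14, 12, 13]
t_value, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_value:.2f}")
```

The test operates on each participant's own change in score, which is why only participants with both a pre-test and a post-test can be included in such an analysis.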
Results of the Questionnaire as a Whole
A paired sample
Paired Sample
Results of the Questionnaire Subscales
According to Table 4 on the dimension of self-efficacy, there was a statistically significant difference between the entry scores (
Paired
Results of Pre-Test A and Post-Test A
According to Table 5, generally, in Pre-test A, the participants achieved 24% accuracy in word linking. The accuracy rate increased to 40% in Post-test A (namely, a rise of 16%). Such an increase was statistically significant since a statistical difference was found between Pre-test A (
Paired
Figure 1 presents a detailed analysis by illustrating the accuracy rates of each type of connected speech taught in this experiment. Notably, the participants showed especially great improvement in the linking of [k] and [a], with an increase of 31%. However, the participants in this study improved less with regard to linking [f] to sounds such as [a], [o], and [u]; they achieved only 19% accuracy in Pre-test A and 29% in Post-test A, a gain of 10%.

Figure 1. Accuracy rate of word linking in Pre-test A and Post-test A.
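The gains reported above are differences between two accuracy rates, that is, percentage points (24% to 40% is a rise of 16 points). The sketch below, offered only as an illustration, shows that arithmetic alongside the corresponding relative change. The function names are hypothetical, and the assumption that accuracy is the share of marked links produced correctly is ours; the percentages passed in are the ones reported in the text.

```python
def accuracy_rate(correct_links, total_links):
    """Accuracy as a percentage of marked links produced correctly.

    Assumed definition for illustration; the counts are hypothetical inputs.
    """
    if total_links <= 0:
        raise ValueError("total_links must be positive")
    return 100.0 * correct_links / total_links


def gain(pre_pct, post_pct):
    """Return (percentage-point gain, relative gain in percent)."""
    if pre_pct <= 0:
        raise ValueError("pre_pct must be positive to compute a relative gain")
    points = post_pct - pre_pct
    relative = 100.0 * points / pre_pct
    return points, relative


# Overall word linking: 24% in Pre-test A, 40% in Post-test A (reported figures).
overall_points, overall_relative = gain(24, 40)
# Linking [f] to a following vowel: 19% to 29% (reported figures).
f_points, f_relative = gain(19, 29)

print(f"overall: +{overall_points} points ({overall_relative:.1f}% relative)")
print(f"[f] linking: +{f_points} points ({f_relative:.1f}% relative)")
```

Reporting the point difference, as the text does, keeps the figures directly comparable across link types that start from different baselines.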
Results of Pre-Test A and Post-Test B
In this section, only the overlapping linking sounds of Pre-test A and Post-test B were examined to determine whether the students were able to apply the learned connected speech skills in a different speaking task, such as that of Post-test B. The results were fruitful, with an overall 28% increase (Table 6). The paired
Paired
Presented in Figure 2 is the specific linking of the same consonants (i.e., [t]-[t], [t]-[a], [t]-[i], [s]-[a], and [s]-[o]) examined in both Pre-test A and Post-test B. As shown, the participants made observable improvements in Post-test B, showing that they were capable of applying the learned speaking skills in different tests. With regard to accuracy, a 20% increase was found in omitting the same consonant [t]-[t]; a 67% increase was gained in connecting [t] and [i]; a 41% increase was gained in [s]-[a]; and a 23% increase was gained in [s]-[o]. However, no improvement was made by the participants in the linking of [t] and [a]; in fact, it declined by 10%.

Figure 2. Accuracy rate of specific linking sounds in Pre-test A and Post-test B.
Finally, this section shows the results of students’ performance by looking at all the linking sounds between the tests. As Table 7 shows, they made an overall 8% improvement from Post-test A to Post-test B. This was found to be a statistically significant difference between the tests (Pre-test A:
Paired
Discussion and Conclusion
The purpose of the current study was to investigate whether the use of video instruction could constitute an effective way of helping EFL students learn to properly connect words in speaking English. According to the results of the data analyses shown above, progress in articulating connected speech by learners was observed after the 7-week experiment in video instruction. The same positive results were also shown in the participants’ perception of learning spoken English. The findings merit discussion in the field.
First, the results of the questionnaire revealed that students’ self-efficacy and motivation in learning English speaking skills significantly increased after the experiment, which not only indicates the success of the treatment but also lends support to the pedagogical practice of video-enhanced instruction, as described in previous studies (Galbraith & Rodriguez, 2018; Hişmanoğlu, 2006; Ketcha, 2019; Y. J. Lin & Wang, 2018; Peters & Webb, 2018; Weyers, 1999). In particular, this finding relates to the results obtained by Herron (1994) and C. K. Hsu (2015). The former reported that video materials can help improve comprehension and that students usually consider them more entertaining and enjoyable, which may lead to better information retention. The latter observed improved learning motivation in students who learned through video-aided instruction. Together, these results support Bax’s (2011) normalization of technology, suggesting that multimedia technology, such as the online clips used in the present study, can enhance learning quality.
Nevertheless, although positive results can be found in students’ perception of self-efficacy and motivation, no significant gain was shown in students’ preference as far as learning English speaking skills was concerned. This may be attributed to the fact that Taiwan offers mostly EFL contexts, so it is difficult to create ample chances for learners to speak or use English outside of the classroom. Such an unpromising environment may have caused students to respond unfavorably to certain subscale questions assessing their preference, such as “I like speaking English” (Item 13) or “I like to seek opportunities to practice my English speaking skills in my everyday life” (Item 15).
In addition, according to previous studies, immersing language learners in multimedia instructional environments is regarded as a highly beneficial learning tactic (Mayer, 2005; Plass & Jones, 2005). The findings of the present study confirm this statement and further correspond to those of Watkins and Wilkins (2011), whose findings endorse the effectiveness of conducting video instruction in second-language classrooms. In the present study, the participants’ accuracy rates in reading aloud increased significantly after the experiment. This means that they were able to successfully connect more words. This lends support to the findings of Kuo et al. (2013) and D.-C. Hsu (2015), in that connected speech instruction is feasible and can be effective with Taiwanese EFL learners. In addition, the present finding further demonstrates that integrating video instructions for pronouncing connected speech in EFL speaking classes can also be pedagogically effective.
In reading aloud, the most frequently achieved linking sound is a consonant link to the same or a similar consonant. In other words, when a word ends in a consonant and the next word starts with the same or a similar consonant, the consonants are linked together and the consonant sound has to be pronounced once only. From what the researchers of the study observed, many Taiwanese students tended to skip the final consonant sounds in words, which made it remarkably easy for them to achieve the “consonant + consonant” linking, such as [t]-[t].
However, it should be pointed out that despite the overall improvement in each individual linking sound, no significant improvement was found in the linking of [t] and [a]. Judging from the participants’ recordings, this is possibly because some of the participants failed to recognize that certain words should be treated as a chunk when read aloud, for example, “at a public square,” “get a seat,” and “bought an umbrella” in the tests. This assumption is based on the evidence that quite a few of the participants separated the words in these phrases when reading. For example, some would utter “at,” pause briefly, and then in one breath read “a public square.” Likewise, others read “get” and then “a seat.” Still others said “bought” and then “an umbrella.” The reasons why they might tend to identify some chunks more than others are unclear, but this may be an interesting line of inquiry for future researchers when they consider examining students’ performance of certain connected speech sounds such as [t] and [a] or other similar sounds.
Even so, after the experiment, students gained more confidence in speaking English in general. They were also motivated to attend English classes to learn more about English speaking skills and were motivated to imitate proper intonation and pronunciation in speaking English. However, such improved attitudes or learning activity seem to have been confined only to the class times, because the classrooms are to them the main locations for learning English, where they have opportunities to speak English, and where they have tests. In light of this, teachers may consider creating opportunities for students to engage in using English in everyday life that would enable them to apply outside classes the skills they had learned within them.
The findings of the current study suggest a positive answer to the first research question, “Does video instruction enhance Taiwanese college students’ attitudes to learning connected speech?” Indeed, video instruction can help students gain self-efficacy and motivation in learning connected speech. The results also indicate a positive answer to the second research question: “Is video instruction an effective way of improving Taiwanese college students’ performance in connected speech?” This suggests that video instruction is an effective way for Taiwanese college students to learn connected speech, and one which has positive effects on improving connected speech competence.
According to Griffee (1995, p. 28), “connected speech is the natural way we speak, linking together and emphasizing certain words, rather than each word standing alone.” Connected speech is considered an integral part of language. However, some language learners may not be aware that connected speech occurs in every language, including their own. Therefore, it is essential for language teachers to address the features of connected speech in their teaching and raise learners’ awareness of its existence in the target language to prepare them to achieve fluency in speaking the target language. Constantly practicing the essential features of connected speech in the target language is said to help non-native language learners obtain more native-like pronunciation and more understandable speech (Brown & Kondo-Brown, 2006).
While the study design itself was valid and enriched the current knowledge of the field, future researchers may consider addressing the following issues to make further contributions. First, student speakers’ long-term ability remains uncertain. Future researchers may add a delayed post-test to evaluate whether students have internalized the knowledge of connected speech that they learned. Second, in addition to the implementation of a questionnaire, conducting interviews to gain in-depth perspectives on the learning of connected speech may shed a different light. Third, it should be acknowledged that the present study had no control group, so the findings should be treated with caution. Future studies may contribute by comparing the effects of a video featuring connected speech instruction with those of a conventional treatment. Similarly, whereas the present study looked at a sample from only one experimental site, future researchers may consider examining participants from diverse settings so as to obtain a more comprehensive view of the pedagogical effects of learning with a video featuring connected speech instruction. Last but not least, it should be noted that in the treatment, the teacher made an overall final comment on her students’ speaking. Although this seems to be a common pedagogical practice in most language classrooms, future researchers may wish to consider the possible effects that this particular teacher behavior may have on student speakers’ improvement.
Finally, the findings provide some suggestions for language teachers who wish to enhance their students’ speaking competence in English, especially in the aspect of connected speech, which serves as one of the fundamental elements in speaking a language. It is advised that teachers integrate video instruction in connected speech into their curriculum design, because video is very entertaining and easily arouses learners’ interest. Instruction in connected speech can not only help learners gain fluency in speaking the target language but also provide them with listening skills and better understanding.
Appendix A
Appendix B
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This article was written with funding support from Taiwan’s Ministry of Science and Technology (MOST 108-2410-H-032-027; MOST 109-2410-H-032-059; MOST 109-2410-H-032-063).
