Abstract
This study investigated Turkish preservice English language teachers’ self-efficacy beliefs over the course of a 4-year language teacher education (LTE) program. Teacher self-efficacy, or teachers’ confidence in their ability to perform specific teaching tasks, is a way of understanding teachers’ pedagogical capabilities. Drawing on survey data from 261 preservice teachers in two LTE programs in Türkiye, an exploratory factor analysis was conducted that resulted in a 34-item survey with 6 factors: (1) language instruction, (2) classroom proficiency and language use, (3) culture, (4) student-focused instruction, (5) technology, and (6) assessment. The newly developed survey accounted for 59.21% of variance and showed high internal consistency (α = .97). Utilizing this new survey, a cross-sectional analysis was conducted looking at teachers’ self-efficacy across their 4 years of study. This group of preservice teachers showed highest confidence in their ability to incorporate technology in their teaching, whereas they showed the least confidence in their ability for language instruction. Teachers’ self-efficacy did gradually increase over the 4-year program in some areas, but pairwise differences were not statistically significant. These results are discussed in relation to studies on LTE and teacher self-efficacy, which have historically shown mixed results. Pedagogical implications are discussed in relation to LTE programs and teachers’ personal development.
I Introduction
Language teaching, as a distinct area within education, has been the subject of ongoing discussions and changes, particularly regarding the preparedness and adaptability of in-/preservice teachers to adjust to emerging advancements, such as the increasing role of technology in language learning and the diverse needs of language learners (Hülshoff & Jucks, 2024; Matsumura & Tatsuyama, 2024). This requires language teachers to adapt to different circumstances, as language teaching necessitates different planning, classroom dynamics, strategies, and assessment compared with other subject areas (Thompson & Woodman, 2019). Given these pedagogical demands, a question arises about how confident teachers feel about their abilities to respond to these situations. Over the last 2 decades, research on the self-efficacy beliefs of language teachers has explored how in-service and preservice teachers perceive their capacity to teach a language within various space and time conditions (Wang et al., 2024). However, language teacher self-efficacy (LTSE) research has predominantly relied on research instruments that are inadequate to address the specific requirements of language classrooms (Karas et al., 2024). When current profound shifts in language teaching are considered, there is a need for domain- and context-specific self-efficacy instruments for language teachers to conduct more rigorous, relevant, and detailed analyses (Wyatt, 2018; Wyatt & Faez, 2024). Addressing this gap becomes significant in exploring language teachers’ confidence and capabilities amid dramatic changes in language teaching with a context- and content-focused lens.
Another issue arising from the ongoing changes is how language teacher education (LTE) accommodates the emerging necessities of learning spaces. The progress and capacity-building of language teacher candidates throughout their training are crucial for both material and emotional preparation for school settings. Hülshoff and Jucks (2024) suggest that preservice language teachers should be provided with opportunities to enhance their skills and self-efficacy beliefs about technology tools during both the initial and later stages of their programs. In addition, over their practicum experience, it is vital to mentor the professional development of language teachers to enhance their self-efficacy levels (T. Hoang & Wyatt, 2021; Matsumura & Tatsuyama, 2024). This elicits the question of how student teachers’ self-perceived competence towards complex dimensions of language teaching changes in different stages of their academic life while engaging in different courses, modules, and experiences in tertiary education. In light of these, focusing on how preservice language teachers’ self-efficacy levels evolve as they proceed in their training can help explore the influence and effectiveness of teacher education programs to equip their students with the necessary resources. Moreover, this may enable us to identify the factors that should be prioritized and optimized at different stages of training, helping prospective teachers build confidence in various aspects of contemporary classroom settings.
Within this framework, this research aims to make a valuable and up-to-date contribution to the field by providing insights into the embedded factors and discipline-specific aspects of self-efficacy beliefs among language teacher candidates. While exploring the underpinning structural domains of self-efficacy of language teachers, it also examines how their self-efficacy beliefs converge and diverge over the course of their teacher education programs.
II Measuring LTSE beliefs
LTSE beliefs have been predominantly measured via surveys (Wyatt, 2018). Early studies (e.g., Chacón, 2005; Eslami & Fatahi, 2008) relied heavily on the Teachers’ Sense of Efficacy Scale (TSES) developed by Tschannen-Moran and Woolfolk Hoy (2001). The TSES is a general education survey presented in two versions, a short 12-item and a long 24-item form, with three factors: student engagement, classroom management, and instructional strategies. The overuse of TSES, a general education measure for capturing LTSE beliefs, has been widely criticized (e.g., Faez & Wyatt, 2024). Ideally, self-efficacy beliefs should be measured by a survey that includes task-, domain-, and context-specific items (Bandura, 1997). However, even with some modifications, the TSES can only go so far in addressing domain- and context-specific LTSE issues.
Karas et al. (2024) synthesized 83 studies on LTSE up to 2023 and found that 78% (65 studies) used the TSES, either in its original (50) or modified (15) form. This reliance was prominent in the Turkish context, with 11 of 13 Turkish studies using the TSES in some form. Only 18 studies employed domain- or context-specific surveys, focusing on areas such as general English teaching (T. Hoang & Wyatt, 2021; Karas, 2019), pronunciation (Zhang & Faez, 2024), or grammar instruction (Wyatt & Dikilitaş, 2021). Some of these specialized scales were tailored for particular teaching contexts such as French as a second language (L2) in Canada (Cooke & Faez, 2018) or adult English as an L2 in Ontario (Faez & Valeo, 2012) and were developed using various survey structures. A few studies have developed measures specific to the Turkish English as a foreign language (EFL) context; however, the underlying factors covered more general aspects of teaching and language instruction, such as instructional planning, language teaching, assessment, and professional development (Üstünbaş & Alagözlü, 2021; Yaman et al., 2013). Despite the increased role of educational technologies and digital tools in language teaching, particularly over the past decade, the use of technology has been a neglected aspect of prior LTSE measures in this context. Thus, there was a need for an up-to-date, domain- and context-specific instrument that could be used with preservice teachers in Türkiye.
A range of LTSE scales have been developed, and their approaches and factor structures are even more varied. Some scales have been subject to factor analysis (e.g., Karas, 2019), whereas others have not (e.g., Kissau, 2012). In keeping with the focus of this study, we discuss the factor structure of general English teaching scales. Studies that have generated scales for teaching general English have addressed the specific tasks related to language teaching in their context. Akbari and Tavassoli’s (2014) 32-item scale consists of 7 subscales of (1) efficacy in classroom management and remedial action, (2) efficacy in classroom assessment and materials, (3) efficacy in skill and proficiency adjustment, (4) efficacy in teaching and correcting language components, (5) efficacy in age adjustment, (6) efficacy in social adaptation, and (7) core efficacy. Thompson and Woodman (2019) created a 15-item scale in the Japanese context across 5 subscales of (1) using English, (2) communicative teaching, (3) teamwork, (4) student achievement, and (5) managing workload. Karas (2019) developed a 26-item scale across 6 subscales of (1) classroom proficiency, (2) learner-focused instruction, (3) assessment, (4) language instruction, (5) culture, and (6) materials. In Vietnam, T. Hoang and Wyatt (2021) developed a 23-item scale consisting of 6 subscales of (1) motivational English instruction, (2) developing English teaching materials, (3) communicative English, (4) managing classroom activities, (5) managing misbehaviors, and (6) teaching exam-oriented English.
There are certainly similarities and differences among the factor structures of existing LTSE scales. Some of these differences stem from the specific contexts in which the studies were conducted. For example, classroom management, a subscale in the original TSES, is not prominently featured in other LTSE scales (e.g., Karas, 2019; Thompson & Woodman, 2019). Other variations arise from the nature of language teaching itself and the ways in which researchers, whether adopting a global or task-specific perspective, design their items. One challenge lies in the inherent interconnectedness of L2 teaching and target language proficiency, since language functions as both the content and the medium of instruction (Freeman, 2017; Richards, 2017). Earlier studies often measured proficiency and efficacy using separate scales, whereas more recent work has integrated aspects of proficiency into efficacy measurements. For instance, Nishino (2012) includes an L2 self-confidence subscale; Karas (2019) uses classroom proficiency; and Thompson and Woodman (2019) incorporate using English as a subscale. Culture also features prominently in several scales (e.g., Cooke & Faez, 2018; Karas, 2019; Swanson, 2012), whereas materials efficacy appears in a few (e.g., Akbari & Tavassoli, 2014; Karas, 2019). More unique dimensions, such as age adjustment (Akbari & Tavassoli, 2014) and teamwork (Thompson & Woodman, 2019), are specific to individual scales. Thus, the content and structure of LTSE scales are as diverse and context-dependent as the language programs/teachers they aim to measure.
III Development of self-efficacy beliefs
The formation and development of self-efficacy beliefs are characterized by a nonlinear process shaped by various sources. As underscored by Bandura (1997), an individual’s self-efficacy beliefs derive from 4 major sources: (1) enactive mastery experiences, the individual’s reflection on past performance outcomes; (2) vicarious experiences, self-evaluation based on the observation of others’ performances; (3) verbal persuasion, others’ feedback on one’s performance; and (4) physiological and affective states, how the individual feels while evaluating oneself. During the formation of LTSE beliefs, these sources can take the form of self-evaluation of one’s own teaching performances, observation of other teachers’ teaching performances, colleagues’ feedback, and one’s emotional states during self-reflection. While teachers’ past teaching performances in the form of enactive mastery experiences are noted as the most salient source of teacher efficacy, this may not necessarily be the case with novice and preservice teachers, who may not yet have enough mastery experience to reflect on (Tschannen-Moran & Woolfolk Hoy, 2007).
Although teachers’ self-efficacy beliefs were once thought to be fixed constructs that get stabilized upon formation of their basic structure (Chacón, 2005; Pajares, 1992), this misconception has been challenged, and self-efficacy beliefs are increasingly recognized as dynamic constructs that are prone to changes (Wyatt, 2013). However, exploration of the changes in LTSE beliefs is an under-researched strand that has only started to gain momentum more recently. For instance, findings of the limited research into changes in in-service LTSE beliefs evidenced how these beliefs can develop through processes such as professional development programs (Ortaçtepe & Akyel, 2015) and research experience (Wyatt & Dikilitaş, 2016).
LTSE beliefs of preservice teachers are equally important. Given that self-efficacy beliefs of teachers, especially those of preservice teachers, are malleable and can be responsive to efficacy building procedures (Henson, 2002; Woolfolk & Hoy, 1990), learning about the changes in LTSE beliefs is quite valuable for the refinement of teacher educational processes. However, the changes in LTSE beliefs of preservice teachers and their efficacy development are under-researched (Wyatt, 2018). In line with the potential impact of mastery experiences based on Bandura’s (1997) self-efficacy framework, prior studies mostly explored the development of LTSE during the teaching practicum (e.g., Atay, 2007; Yüksel, 2014). This limited body of research has provided empirical evidence for the dynamicity and variability of LTSE beliefs over the practicum period. However, the development of different domains of LTSE beliefs appears to be uneven.
Most of the studies on preservice teachers’ LTSE development relied on data gathered through the original form, or a slightly adapted version, of the TSES by Tschannen-Moran and Woolfolk Hoy (2001). One study by Atay (2007), for instance, traced the preservice EFL teachers’ efficacy changes in three general domains of teaching and demonstrated divergent developmental trajectories for these beliefs in different domains and tasks of teaching. She revealed a growth in efficacy for classroom management and student engagement, but an unexpected decline in efficacy for instructional strategies over the practicum period. Administering the TSES three times over the 2-semester practicum period, Yüksel (2014) displayed a weakening of overall efficacy beliefs following the school observation period, and then an advancement of efficacy beliefs at the end of the final semester during which preservice teachers practiced teaching. In line with the sources of self-efficacy beliefs (Bandura, 1997), this implies a decrease of efficacy due to the adverse effect of negative self-reflections related to vicarious experiences and the consolidating effect of mastery experiences, respectively. It also highlights the value of “sensitive mentoring that draws attention to their strengths and capacity to grow with a view to building self-efficacy beliefs” during the teaching practicum (Wyatt & Faez, 2024, p. 3). By tracking efficacy changes in both the practicum period and the induction year, Şahin and Atay (2010) evidenced a growth of efficacy in general after the practicum, but a decline after the induction year (though not significant). As to the domain-specific development of efficacy beliefs, whereas efficacy for student engagement and classroom management grew significantly after the practicum, efficacy for instructional strategies did not have significant growth. 
Following another route, Cabaroglu (2014) integrated action research engagement in the practicum period and evidenced the consolidation and a significant growth of efficacy for all three domains of the TSES. These changes in LTSE beliefs of preservice EFL teachers have evidenced the dynamicity and fluidity of efficacy beliefs over the teaching practicum period. However, these were limited to general teaching skills, and efficacy for domains and tasks specific to language teaching has largely been neglected.
In addition to the above studies in Türkiye (for a full review, see Ölmez-Çağlar, 2024), a few recent studies in other contexts similarly set out to gauge the development of LTSE beliefs of preservice EFL teachers. Relying on qualitative data, a case study in Bulgaria showed the ways reflected-upon teaching experiences can consolidate one’s practical knowledge and result in a growth in LTSE beliefs during the practicum (Markova, 2024). In the Vietnamese EFL context, T. Hoang and Wyatt (2021) implemented an instrument that combined items adapted from the TSES with, unlike the above studies, items developed for the context and the domain of EFL teaching. Although LTSE beliefs for all domains included in the instrument appeared to grow after the practicum, the most significant growth was seen in domains of efficacy for general pedagogical skills measured by the constructs and items in the TSES. However, there was not a similar substantial growth in efficacy for domain-specific sides of EFL teaching, and these areas were found to be less prone to changes. Although a significant increase was apparent in efficacy for motivational English instruction over the practicum period, no significant change was found in efficacy for developing English materials and teaching communicative English after the teaching practicum.
To summarize, prior research in various contexts has provided different findings regarding the development of and changes in preservice EFL teachers’ efficacy beliefs in different domains of teaching and has often done so by exploring the teaching practicum period. However, scrutiny of the entire preservice teacher education process with a focus on the development of domain- and context-specific LTSE beliefs, not just more general TSE beliefs, has been neglected. As preservice teachers gain both theoretical and practical knowledge through their initial LTE, not just the practicum period, an inquiry into LTSE beliefs during a 4-year undergraduate LTE program can offer fuller insights into how the efficacy beliefs specific to EFL teaching can develop over the years in preservice education.
IV Methodology
This study was guided by the following research questions.
RQ1. What factors underlie the LTSE beliefs of preservice English language teachers in Türkiye?
RQ2. What are preservice teachers’ levels of self-efficacy across 4 years of undergraduate training?
1 Research context and participants
Undergraduate English language teaching (ELT) programs in Türkiye are housed within faculties of education and serve as the major pathway for prospective English language teachers. These programs are currently offered by 58 state and 14 private universities in Türkiye. For admission to these undergraduate programs, high school graduates take a nationwide university entrance exam that is centrally organized and held once a year. To gain admission to English-language-oriented departments such as ELT, they take an English language test as well as a comprehensive test consisting of sections for Turkish language, social sciences, math and sciences. Students gain admission to relevant departments based on their scores and ranking determined through the test scores and their high school GPAs. Among state universities, the ELT programs at both University A and University B have been ranked in the top ten, based on their latest minimum admission scores. The minimum level of English proficiency required of preservice English language teachers at both universities is B2, as defined by the Common European Framework of Reference (CEFR). Most students admitted to undergraduate ELT programs at these universities are exempt from the preparatory class by taking a skills-based English proficiency test aligned with the CEFR. The rest of the students study at the English preparatory school to reach the B2 level before progressing to the 4-year faculty coursework.
A 4-year ELT program in Türkiye involves 8 semesters of compulsory and elective courses with a total number of 240 European Credit Transfer and Accumulation System (ECTS) credits. Under the coordination of the Turkish Council of Higher Education, undergraduate ELT programs consistently follow uniform standards for preservice English LTE, mainly through required courses. However, universities and their relevant boards can also determine the specific courses, such as electives, the placement of courses in the curriculum, and credits in these teacher education programs. The ELT programs include a wide array of pedagogical and field-specific courses with theoretical and practical sides as well as some courses for the improvement of general knowledge. They also offer a 1-year teaching practicum in the fourth year of undergraduate education. During the final year, students take two teaching practicum courses, one per semester. Each student is allocated to a practicum school under the Ministry of Education and practices teaching with a mentor teacher from that school. Each student is also supervised by an academic from their university during this process. As part of the practicum, preservice teachers observe the classes of mentor teachers and practice teaching English in these classes. Upon graduation, they can be appointed to both state and private schools as teachers of English at K-12 levels. However, to get appointed in K-12 state schools, they also take a comprehensive high-stakes test and a field-specific test about ELT.
Participants for this study were preservice teachers studying in two 4-year ELT programs in Türkiye. In total, there were N = 261 participants with n = 200 from University A and n = 61 from University B. The two state universities are located in highly populated, metropolitan cities in the northwest and southwest of Türkiye. Although university regions do not explicitly influence the quality of preservice education, the top 10 ELT programs among state universities in Türkiye, ranked by admission scores, are all located in metropolitan cities. This may also lead successful students to choose universities in these cities, which are not necessarily in their home regions. The majority of participants were female (n = 157), followed by males (n = 99), and a small number who elected not to identify their gender (n = 5). Participants were spread across the 4 years of study in the program with n = 60 first year students, n = 89 second year students, n = 55 third year students, and n = 57 fourth year students.
2 Research instrument and procedure
This study adopted a quantitative design. Data were collected from an online survey developed by the researchers. The survey consisted of 3 main sections. Part 1 of the survey asked participants to provide background information (e.g., gender and year of study). Part 2 of the survey investigated participants’ language proficiency, which is reported in a different study, and asked teachers to self-report their English proficiency using the CEFR self-assessment scale. Furthermore, to assess teachers’ actual proficiency, they reported the results of their most recent Yabancı Dil Testi (YDT) [Foreign Language Test], an important part of the nationwide standardized university entrance exam taken by students seeking to study in programs related to foreign languages. Finally, the last section of the survey included 44 LTSE items across a range of language teaching areas. Initial self-efficacy items were drawn from Karas (2019). In Karas (2019), items were initially generated from various TESOL standards documents (e.g., TESOL International Association, 2008, 2015); then, items were sent to experts for review and the initial survey was piloted with graduate students, with further revisions based on this feedback. After factor analysis, a 26-item, 6-factor survey was finalized with the following factors: (1) classroom proficiency, (2) learner-focused instruction, (3) assessment, (4) language instruction, (5) culture, and (6) materials (factors further explained in the following). However, the instrument used by Karas (2019) was not specific to the Turkish context, so further items were generated by the team of researchers to address any potential missing areas and make the instrument more suited to the Turkish context. Specifically, items related to the use of technology in language classrooms were included, reflecting the global increase in online teaching, with this new subscale of Technology serving as the seventh potential subscale before analysis. 
Items were presented on a sliding scale from 1 (cannot do at all) to 6 (highly confident can do) and the entire survey was presented in English through the online survey tool Qualtrics. Participants were invited to complete the survey via an email link and during some of their classes while they completed their degrees. Once participants consented to be part of the study, they used their own devices and could complete the survey in their own time; participation was completely voluntary with participants allowed to withdraw at any point by simply not finishing the survey.
3 Analysis
In order to assess the structure of the survey, an exploratory factor analysis (EFA) was conducted. With communalities mostly in the mid to high range (i.e., .5 and above; see Table 1), the sample size of N = 261 was sufficient to conduct EFA (Fabrigar et al., 1999; Field, 2018). The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy was .954 and Bartlett’s Test of Sphericity was significant, again indicating the data were appropriate for EFA. Multicollinearity was also inspected via the correlation matrix: the determinant of the R matrix was above the minimum value of .00001 outlined by Field (2018), and none of the items were too highly correlated with each other. To determine the number of factors to retain, numerous criteria were used as suggested by the literature (e.g., Fabrigar et al., 1999; Henson & Roberts, 2006; Plonsky & Gonulal, 2015). As an initial retention criterion, the Kaiser principle of eigenvalues above 1 was utilized. However, because this is not always reliable (Costello & Osborne, 2005), the scree plot, percentage of variance, and interpretability were also considered (Fabrigar et al., 1999; Loewen & Gonulal, 2015). The initial 44 items were entered for analysis using principal axis factoring. Direct oblimin rotation was utilized to allow factors to correlate (Field, 2018).
Table 1. Final factor analysis loadings and communalities.
Note: LI = language instruction; CP + LU = classroom proficiency and language use; Cul = culture; SFI = student-focused instruction; Tech = technology; Assess = assessment; Com = communalities.
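The suitability checks described in the Analysis section (KMO, Bartlett's Test of Sphericity, and the determinant of the R matrix) can be illustrated with a minimal sketch. The 3-item correlation matrix below is hypothetical and is not the study's data; the function names are illustrative, and the formulas are the standard textbook definitions rather than a record of the software procedure the authors used.

```python
import math

def invert(matrix):
    """Invert a square matrix with Gauss-Jordan elimination (partial pivoting)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def determinant(matrix):
    """Determinant via Gaussian elimination with row swaps."""
    n = len(matrix)
    m = [list(row) for row in matrix]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return det

def kmo(corr):
    """Overall KMO: squared correlations over squared correlations plus
    squared partial correlations, summed over off-diagonal cells."""
    inv = invert(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)

def bartlett_chi2(corr, n_obs):
    """Bartlett's sphericity statistic and its degrees of freedom."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    return chi2, p * (p - 1) // 2

# Hypothetical 3-item correlation matrix (NOT the study's data).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
det_r = determinant(corr)
kmo_value = kmo(corr)
chi2, df = bartlett_chi2(corr, n_obs=261)
print(f"det = {det_r:.5f}, KMO = {kmo_value:.3f}, chi2({df}) = {chi2:.2f}")
print("determinant above .00001:", det_r > 0.00001)
```

In practice these statistics would be obtained from a statistics package on the full 44-item correlation matrix; the sketch only shows what each quantity measures.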
Using this newly formed scale, RQ2 was answered by taking the descriptive levels of self-efficacy for the whole group and for each year of study. The individual item means, subscale means, and overall mean are listed. To determine whether there were significant differences in the preservice teachers’ levels of self-efficacy across the 4-year program, a series of analyses of variance (ANOVAs) was conducted for each individual subscale and for the overall scale. If a significant difference was found with the ANOVA, pairwise comparisons were conducted with the Bonferroni correction.
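As a rough illustration of the comparison across years of study, the sketch below computes a one-way ANOVA F statistic for four groups and the Bonferroni-adjusted alpha for the resulting pairwise comparisons. The scores are invented for illustration (three hypothetical respondents per year on the 1 to 6 scale), the function names are hypothetical, and an overall alpha of .05 is assumed because the text does not state it.

```python
from itertools import combinations

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-groups sum of squares: deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def bonferroni_alpha(n_groups, alpha=0.05):
    """Per-comparison alpha when every pair of groups is compared."""
    n_pairs = len(list(combinations(range(n_groups), 2)))
    return alpha / n_pairs, n_pairs

# Hypothetical subscale means for a few respondents per year (NOT study data).
year_scores = [
    [3.0, 4.0, 5.0],  # first year
    [3.5, 4.5, 5.5],  # second year
    [4.0, 5.0, 6.0],  # third year
    [4.0, 5.0, 6.0],  # fourth year
]
f_stat, df_b, df_w = one_way_anova(year_scores)
adj_alpha, n_pairs = bonferroni_alpha(len(year_scores))
print(f"F({df_b}, {df_w}) = {f_stat:.4f}")
print(f"{n_pairs} pairwise comparisons, adjusted alpha = {adj_alpha:.4f}")
```

With four year groups there are six pairwise comparisons, so under the assumed .05 level each comparison would be evaluated at roughly .0083.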
V Results
After confirming the data were appropriate for EFA, the initial analysis saw 6 factors emerge based on eigenvalues; however, the scree plot did not provide further clarification and the factors were not easily interpretable. To determine which items to keep, two criteria were utilized: factor loadings and interpretability. First, a benchmark of .3 was used for factor loadings (Field, 2018; Floyd & Widaman, 1995). However, the .3 benchmark was not rigid and we considered interpretability and the importance of certain items as well because EFA should not be solely about statistical benchmarks (Fabrigar et al., 1999). Some items did not load onto any factors and a few items cross-loaded with other factors. Many of the problematic items were related to using and adapting materials for instruction, which emerged as an unexpected factor in Karas (2019). In total, 10 items were removed, the analysis was rerun, and a 5-factor solution emerged based on eigenvalues. However, the results were not interpretable and a 6-factor solution appeared to better suit the data. Thus, a 6-factor solution was forced. The eigenvalue for the sixth factor was just below the cutoff of 1 (.88) and the scree plot did not offer further clarification. However, this solution was easily interpretable and accounted for a good percentage of variance (59.21%). Following Fabrigar et al. (1999), who note factor retention is a “substantive issue as well as a statistical issue” (p. 281), the 6-factor solution with 34 items was adopted. See Table 1 for the final 34-item factor results and structure.
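The retention logic above (eigenvalues above 1, then percentage of variance for a forced solution) can be sketched as follows. The eigenvalues are hypothetical apart from the sixth value of .88 reported in the text, and dividing by the number of items is one common way of expressing variance explained; with principal axis factoring, software typically reports a figure based on the extracted loadings, so this sketch is not expected to reproduce the 59.21% reported here.

```python
def kaiser_count(eigenvalues, cutoff=1.0):
    """Number of factors whose eigenvalue exceeds the Kaiser cutoff."""
    return sum(1 for ev in eigenvalues if ev > cutoff)

def cumulative_variance(eigenvalues, n_items, n_factors):
    """Percentage of total variance covered by the largest n_factors eigenvalues."""
    largest = sorted(eigenvalues, reverse=True)[:n_factors]
    return 100 * sum(largest) / n_items

# Hypothetical leading eigenvalues for a 34-item scale (NOT the study's values,
# except that the sixth is set to the .88 mentioned in the text).
eigenvalues = [14.0, 2.1, 1.6, 1.3, 1.1, 0.88]
n_items = 34

retained_by_kaiser = kaiser_count(eigenvalues)
pct_five = cumulative_variance(eigenvalues, n_items, 5)
pct_six = cumulative_variance(eigenvalues, n_items, 6)
print(f"Kaiser rule retains {retained_by_kaiser} factors")
print(f"5 factors: {pct_five:.2f}% of variance; 6 factors: {pct_six:.2f}%")
```

The sketch shows why a sixth factor with an eigenvalue just under 1 is excluded by the Kaiser rule alone and must be retained on other grounds, such as interpretability.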
1 Factor 1: language instruction (8 items)
The first factor to emerge included items that were related to language teaching; thus, the subfactor was labeled as language instruction. There were initially 10 language instruction items, but 2 of the oral language items (Items 13 and 16) loaded with Factor 2 (outlined next). The factor loadings for this factor were generally strong and 6 of the 8 items were above .4. Item 7 was only slightly below .4 (.392). However, Item 8, which focused on teachers’ confidence to teach vocabulary, was just below the .3 level (.290) and also cross-loaded onto Factor 5. The items on Factor 5 relate to using technology to enhance instruction, so Item 8 did not conceptually fit with that factor. Removing the item entirely was considered, but this would leave a clear conceptual gap for the language instruction subscale, as teaching vocabulary is a key element for language teachers everywhere and specifically in Türkiye. Thus, considering that its loading was just marginally below the arbitrary cutoff of .3, and given its conceptual importance (Fabrigar et al., 1999; Loewen & Gonulal, 2015), the item was maintained on Factor 1. The internal consistency of the subscale was α = .90 and this was not enhanced if any item was removed; thus, it was determined to keep the vocabulary item and have 8 items for language instruction.
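The internal consistency check described above, including whether alpha improves when an item is dropped, can be sketched with the standard Cronbach's alpha formula. The response matrix is invented (five hypothetical respondents, three items) and the function names are illustrative; this is a sketch of the general technique, not the authors' procedure.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item removed in turn."""
    k = len(rows[0])
    return [cronbach_alpha([r[:i] + r[i + 1:] for r in rows]) for i in range(k)]

# Hypothetical responses on a 1-6 scale (NOT the study's data).
responses = [
    [4.0, 5.0, 4.0],
    [3.0, 3.0, 4.0],
    [5.0, 6.0, 5.0],
    [2.0, 3.0, 2.0],
    [4.0, 4.0, 5.0],
]
alpha = cronbach_alpha(responses)
alphas_without = alpha_if_item_deleted(responses)
print(f"alpha = {alpha:.3f}")
for i, a in enumerate(alphas_without, start=1):
    print(f"alpha without item {i} = {a:.3f}")
```

An item is a candidate for removal on reliability grounds only if the alpha without it is noticeably higher than the full-scale alpha, which the text reports was not the case for the vocabulary item.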
2 Factor 2: classroom proficiency and language use (9 items)
The second factor was labeled as classroom proficiency and language use. The initial scale from Karas (2019) included items that related to teachers’ confidence in their abilities to teach English in English, which draws on the notion of English-for-teaching (Freeman et al., 2015). These items, labeled as classroom proficiency in Karas (2019), formed Factor 2 as expected, with the exception of 1 item that was removed (see Appendix in the online Supplemental Material); however, 2 items related to oral language instruction unexpectedly loaded onto this factor as well instead of Factor 1. With the exception of Item 15, which pertains to providing written corrective feedback in English, the other items on this factor involved orally using English in the classroom; therefore, there was sufficient conceptual basis to maintain the 2 oral language instruction items on this subscale. Thus, this factor looks at teachers’ confidence to use English as the medium of instruction and provide instruction on English language in the classroom. It showed high internal reliability of α = .91.
3 Factor 3: culture (4 items)
The third factor contained items focused on teachers’ confidence to enact culturally informed instruction. This factor had items that asked teachers to assess their self-efficacy to promote intercultural awareness and cultural diversity, as well as integrate cultural elements when teaching and effectively make cultural comparisons. This subscale initially contained 6 items, but 2 problematic items were removed because of poor loadings. This left a 4-item subscale with good internal consistency (α = .85).
4 Factor 4: student-focused instruction (4 items)
Factor 4 investigates teachers’ confidence in their capabilities to enact learner-focused instruction. There were initially 5 items on this subscale, but an item related to collaborating with student families showed poor loading and was removed. It is likely that these preservice teachers did not have much, if any, experience dealing with student families, which may have contributed to that item being problematic. After its removal, this factor’s items included teachers’ self-efficacy to motivate students, adapt lessons for differentiated instruction, develop student self-efficacy for English, and use students’ first-language skills advantageously. With 4 items, the internal consistency was good (α = .81).
5 Factor 5: technology (4 items)
The fifth factor comprised items related to teachers’ self-efficacy to use technology to enhance language instruction. This includes teachers’ self-efficacy to use appropriate educational technology to design teaching materials, monitor students, and support students learning English. This subscale initially included 4 items and factor analysis reaffirmed this as all 4 items loaded onto Factor 5 with an internal consistency of α = .84.
6 Factor 6: assessment (5 items)
The final factor contained items focused on teachers’ confidence for assessment. This includes teachers’ self-efficacy for designing and selecting appropriate assessments but also creating rubrics and making appropriate use of assessment results. Initially, this included 6 items, but 1 item was removed because it did not load onto any factor. The 5-item subscale showed strong internal consistency with α = .89.
The new survey accounted for 59.21% of variance. The overall scale also showed high internal consistency of α = .97. The factors were also correlated with one another (Table 2), suggesting that the overall scale is also suitable to measure overall self-efficacy at a more global level.
Table 2. Factor correlation matrix.
Note: LI = language instruction; CP + LU = classroom proficiency and language use; Cul = culture; SFI = student-focused instruction; Tech = technology; Assess = assessment.
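The internal consistency values reported for each factor are Cronbach’s α coefficients. As a minimal sketch of how such a coefficient is obtained from item-level responses, the snippet below implements the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical and purely illustrative; they are not the study’s data, which are not publicly available, and the authors’ own software is not specified in this section.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))  # regroup scores by item (column)
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses (5 respondents x 4 items); NOT the study's data.
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 6],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

The coefficient rises when items vary together across respondents, which is why it tends to be higher for subscales with more items or more closely related items.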
7 Levels of self-efficacy
Participants’ overall score for all items was M = 4.60 (SD = 0.75, 95% CI [4.51, 4.69]). Tables 3–8 present each individual item score and the subscale scores. All of the subscale scores were close together, but the highest was for technology (M = 4.75), whereas the lowest was for language instruction (M = 4.43). The highest individual item score was found on classroom proficiency and language use (i.e., “I can present information in English,” M = 5.03), whereas the lowest individual item score was found on language instruction and concerned teachers’ self-efficacy to teach the sound system of English (i.e., “I can teach the sound system of English (phonology),” M = 4.02).
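The confidence interval for the overall mean can be reproduced from the reported summary statistics alone. The sketch below assumes a normal approximation with a critical value of 1.96; the authors’ exact procedure is not stated, although a t-based interval with 260 degrees of freedom rounds to the same bounds. The function name `mean_ci` is illustrative.

```python
import math


def mean_ci(mean, sd, n, z=1.96):
    """Approximate 95% confidence interval for a mean from summary statistics."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


# Reported values: M = 4.60, SD = 0.75, N = 261
lower, upper = mean_ci(4.60, 0.75, 261)
print(round(lower, 2), round(upper, 2))
```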
Table 3. Language instruction.
Table 4. Classroom proficiency and language use.
Table 5. Culture.
Table 6. Student-focused instruction.
Table 7. Technology.
Table 8. Assessment.
8 Levels of self-efficacy across the LTE program
The second research question investigated how preservice teachers’ self-efficacy develops over their 4-year BA ELT program. The full sample of N = 261 participants was spread across the 4 years, with n = 60 in the first year of study, n = 89 in the second year, n = 55 in the third year, and n = 57 in the fourth year. Table 9 presents the full descriptive information for each subscale and the mean scores for each year. Most mean scores were between 4 and 5, with the fourth-year mean for technology being the only score above 5. For the subscales of classroom proficiency and language use and culture, the scores gradually increased in a linear fashion from first to fourth year. However, this was not the case for the other subscales or the overall scale. For student-focused instruction, technology, assessment, and the overall score, participants’ self-efficacy went up from first to second year but then showed a minor dip in the third year. For language instruction, scores increased from first to third year but went down in the final year.
Table 9. Preservice teacher self-efficacy across LTE.
Note: *p ⩽ .05.
To determine whether participants’ self-efficacy ratings differed significantly by year of study, a series of ANOVAs was conducted on the six subscales and the overall scale. Full ANOVA results can be found in Table 9. With the exception of the assessment subscale, none of the ANOVA results indicated significant differences, although the technology subscale did approach significance (p = .08). Because the ANOVA results did not show significant differences beyond assessment, no further post hoc results are reported for those analyses. For the assessment subscale, post hoc analysis using Bonferroni correction did not show any significant differences in the pairwise comparisons between specific years of study. The difference between first- and second-year students approached significance (p = .085), as did the difference between first- and fourth-year students (p = .070), but no pairwise comparison was statistically significant. This cross-sectional analysis shows that teaching confidence did not differ significantly across the 4 years of the LTE program for this group of preservice teachers in Türkiye.
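The logic of the omnibus test and the Bonferroni correction can be sketched as follows, assuming a one-way design with year of study as the only factor. The scores are hypothetical and are not the study’s data; the function name `one_way_anova_f` is illustrative. The sketch computes only the F statistic and the Bonferroni-adjusted per-comparison threshold, because Python’s standard library has no F distribution for deriving p values. It does not reproduce the authors’ software or output.

```python
from itertools import combinations
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical subscale scores by year of study; NOT the study's data.
years = {
    1: [4.0, 4.5, 4.2, 4.8],
    2: [4.6, 4.9, 4.4, 5.0],
    3: [4.5, 4.3, 4.9, 4.7],
    4: [4.8, 5.1, 4.6, 5.0],
}

f_stat, df1, df2 = one_way_anova_f(list(years.values()))

# Four groups give six pairwise comparisons; Bonferroni divides alpha by that count.
pairs = list(combinations(years, 2))
alpha_per_test = 0.05 / len(pairs)
print(f"F({df1}, {df2}) = {f_stat:.2f}; per-comparison alpha = {alpha_per_test:.4f}")
```

With four year groups there are six pairwise comparisons, so each comparison is tested against roughly .0083 (equivalently, each p value is multiplied by 6). This is how a significant omnibus result can coexist with no significant pairwise difference. With the study’s group sizes (60, 89, 55, and 57), the corresponding degrees of freedom would be 3 and 257.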
VI Discussion
1 LTSE factor structure
An important consideration in examining LTSE is identifying the factors and items that constitute the scale used to measure it. In developing the survey for this study, particular emphasis was placed on including items related to: (1) language instruction, (2) teacher language proficiency, (3) culture, (4) learner-focused instruction, (5) use of technology, (6) instructional materials, and (7) language assessment. Notably, items relating to teacher language proficiency, referred to by Freeman et al. (2015) as English-for-teaching, were included to account for teachers’ ability to teach English with English, which is an important consideration in ELT. Furthermore, the growing role of technology in education, particularly in the post-COVID context, also motivated the inclusion of technology-related items. After the original 44-item survey, grouped under 7 subscales, was administered to participants, an EFA yielded a refined 34-item scale organized into 6 subscales: (1) language instruction (8 items), (2) classroom proficiency and language use (9 items), (3) culture (4 items), (4) student-focused instruction (4 items), (5) technology (4 items), and (6) assessment (5 items). While the resulting factor structure was generally consistent with Karas (2019), some distinctions were observed.
The 3-item “Materials” subscale from Karas (2019), which was expanded to 5 items in the initial 44-item survey, failed to produce strong loadings in the EFA and was subsequently excluded. This discrepancy may reflect participant differences, mostly practicing teachers in Karas (2019) versus preservice teachers in the current study, highlighting how concerns around instructional materials may be more pressing for those already in the classroom. Another underlying reason could be the ability to find ready-made materials and adapt them with ease using a range of digital technologies and artificial intelligence (AI) tools, which can make materials development and adaptation less of a concern for preservice teachers today. However, teachers also play a pivotal role in decision-making when developing instructional materials through AI integration (Lo, 2025), with their AI literacy serving as a key factor (Wu et al., 2025). In line with this, technology emerged as a distinct and reliable subscale in this study, unlike in most existing LTSE instruments. All 4 technology-related items showed strong internal consistency and loadings, underscoring the increasing importance of digital tools in language education. This was consistent with recent research on preservice teachers in other contexts that underscored self-efficacy regarding the use of information and communication technologies for language instruction as a distinct LTSE domain (e.g., N.H. Hoang, 2024; Hülshoff & Jucks, 2024). Furthermore, 2 items initially classified under language instruction, related to teaching oral English language skills and teaching pronunciation, loaded more strongly with the classroom proficiency and language use subscale, suggesting that teachers’ confidence for teaching listening and speaking, including pronunciation, is closely aligned with their self-efficacy for language use in practice.
The 6-factor structure developed through this study presents a theoretically and empirically grounded survey for assessing LTSE beliefs in general ELT. It reflects both established dimensions of teacher efficacy and newer priorities, such as technology integration, relevant to contemporary educational contexts. Importantly, teacher language proficiency is a long-recognized cornerstone of effective language teaching (e.g., Richards, 2010). However, despite its foundational role, proficiency has typically been measured separately from LTSE beliefs, most often with self-report surveys that do not consider language proficiency in relation to specific teaching tasks (see Faez et al., 2021). By explicitly incorporating teacher language proficiency within the LTSE survey, this new instrument builds on the growing trend (e.g., Thompson & Woodman, 2019) to account for teachers’ confidence to enact teaching tasks in English and to treat English as both the content and the medium of instruction. Alongside emerging areas such as technology use, the present structure offers a comprehensive tool for future research and practical applications in LTSE studies. However, this study focused on preservice teachers in Türkiye, and researchers must consider their own context-specific needs and the domain-specificity of ELT that they seek to measure before adopting this survey for their own study (see Karas et al., 2024 for further discussion).
2 Levels of self-efficacy
Across all subscales and years of study, participants’ mean self-efficacy fell between 4 and 5, except for technology among year 4 students, which was marginally above 5 (M = 5.02). These levels of efficacy are in line with other self-efficacy reports. Synthesizing 83 studies of LTSE, Karas et al. (2024) determined a benchmark range of 70–80% for “normal” efficacy levels. In general, language teachers across a range of contexts and LTSE measures do not often rate their self-efficacy above 80% or below 70% once scores are converted to percentages. The technology subscale in year 4 was the only one to go above 5, which is higher than 80% and hence considered “high” relative to other subscales. Given the rise in technology usage and that these participants are university students, this is not surprising. Thus, these preservice teachers’ descriptive levels of self-efficacy are very much in line with those of other (preservice) teachers across LTSE studies, further suggesting a common trend: teachers do not often assess their self-efficacy using the highest scale points, potentially because doing so could be deemed overconfident, and they do not often use the lowest scale points either, because doing so could be interpreted as incompetence.
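To make the comparison with the 70–80% benchmark concrete, the sketch below converts the reported means to percentages. It assumes a 6-point response scale and a simple mean-to-maximum conversion. Neither is stated in this section, so both are assumptions chosen for illustration; they are consistent with a mean above 5 exceeding 80%. The constant `SCALE_MAX` and the function name `to_percent` are hypothetical.

```python
SCALE_MAX = 6  # ASSUMPTION: 6-point response scale (not stated in this section)


def to_percent(mean_score, scale_max=SCALE_MAX):
    """Express a mean rating as a percentage of the scale maximum."""
    return mean_score / scale_max * 100


# Means reported in the text
reported = {
    "overall": 4.60,
    "technology": 4.75,
    "language instruction": 4.43,
    "technology, year 4": 5.02,
}

for label, m in reported.items():
    pct = to_percent(m)
    band = "above" if pct > 80 else "below" if pct < 70 else "within"
    print(f"{label}: {pct:.1f}% ({band} the 70-80% range)")
```

Under these assumptions, only the year 4 technology mean falls above the benchmark range, which matches the pattern described above.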
Capturing preservice LTSE beliefs across the 4 years of the LTE program provides further evidence of the dynamic and fluctuating nature of self-efficacy beliefs (Wyatt, 2013). A possible explanation may be the overall structure of LTE programs in this context. In the first 2 years, the 4-year curriculum involves courses primarily focused on improving preservice teachers’ proficiency in English (e.g., reading skills and writing skills), general pedagogical knowledge (e.g., educational psychology and instructional technologies), and domain-specific knowledge (e.g., English literature and linguistics). The third and fourth years place a heavier focus on practice through courses on general pedagogy (e.g., classroom management and measurement and evaluation in education) and domain-specific practical knowledge and skills (e.g., teaching language skills and teaching English to young learners), as well as a 2-semester teaching practicum. The slight decreases in self-efficacy in year 3 for student-focused instruction, technology, assessment, and the overall scale may be due to this predominant shift in focus from theory to practice, implying a transition from a student’s perspective to a teacher’s perspective in language learning and teaching. Such a shift can be more challenging and lead to a mismatch between expectations and classroom realities at the start of the teaching practicum (Yin, 2019) and in the induction year (Voss & Kunter, 2020). In the current study, the absence of such a decrease, and even a slight increase, in overall efficacy in year 4 during the practicum period suggests that these beliefs were not undermined by classroom realities, possibly due to a balanced integration of theory and practice in the LTE programs, as underscored in prior research (Allen & Wright, 2014; Hennissen et al., 2017).
A fluctuating pattern was evident for all subscales except classroom proficiency and language use and culture. These two subscales showed a linear and steady increase from year 1 to year 4. This finding is interesting because it suggests that these two areas of LTSE beliefs may develop differently from the others. One interpretation is that these two subscales are more closely intertwined with general language proficiency than the other subscales. Participants may be more concerned about achieving greater proficiency in English and advancing their cultural awareness than about other areas, not just because of the important role these competencies play in their future profession, but also because of the status of English as a global language (Crystal, 2003). However, this observation should be interpreted cautiously, as the increase in these areas is rather small and nonsignificant; we offer this analysis as a way to move beyond mere null hypothesis testing (see Plonsky, 2015).
Most studies that have examined LTSE beliefs over time have used the TSES (e.g., Atay, 2007) or have done so in relation to completing a practicum (e.g., T. Hoang & Wyatt, 2021; Şahin & Atay, 2010). The current study examines LTSE beliefs over time, through both the 4-year coursework and the teaching practicum in the final year. Although prior research using the TSES has evidenced growth in self-efficacy as a result of the teaching practicum, the advancement was observed in classroom management and student engagement, but not in instructional strategies (Atay, 2007; Şahin & Atay, 2010). In T. Hoang and Wyatt’s (2021) sample, a significant increase in self-efficacy beliefs was identified only for the areas related to general pedagogical capabilities measured using items from the TSES. However, such significant growth was not observed in domain- and context-specific aspects of language teaching (e.g., efficacy in teaching communicative English), suggesting that confidence in capabilities in these areas exhibits lower malleability compared with general pedagogical skills. With a significant change only in efficacy for assessment (albeit nonsignificant following the post hoc analysis) and efficacy for technology use approaching significance, the results of the current study demonstrate partial alignment with the divergence in the development of self-efficacy for general pedagogical skills and those for domain- and context-specific aspects of language teaching. In line with prior research (Cabaroglu, 2014; T. Hoang & Wyatt, 2021), this can be explained by preservice teachers’ tendency to pay more attention to the enhancement of general pedagogical abilities compared to those specific to language teaching. 
Taken together, results related to the development of self-efficacy beliefs indicate that preservice language teachers’ overall, general pedagogical, domain- and context-specific self-efficacy beliefs fluctuate and exhibit distinct developmental patterns, requiring meticulous scrutiny, both holistically and in detail. Of particular importance is the lack of significant differences in LTSE beliefs during the 4-year teacher training, warranting close attention and further exploration for the refinement of LTE programs.
VII Conclusion, implications, and limitations
This study has unpacked the underlying factors of LTSE and provided a domain-specific perspective with a newly developed research instrument. First, it has confirmed that self-efficacy items must be customized in terms of task, domain, and context for a valid measurement (Bandura, 1997; Wyatt, 2018). The results offer a 6-factor structure that gauges various dimensions of LTSE with task-specific items related to ELT (domain) in Türkiye (context). Second, the subscales of our model revealed not only well-known aspects, such as culture and assessment, but also more contemporary elements, such as student-focused instruction and technology. This theoretically implies a contribution to self-efficacy studies with an exploration of what newly emerging factors constitute LTSE in the current landscape. Practically, it offers an up-to-date LTSE instrument that can be adapted to different contexts or further developed in future conditions by other researchers. However, to further validate the instrument, we recommend future researchers consider conducting a confirmatory factor analysis to additionally test the factor structure and assess potential measurement invariance across groups.
Furthermore, this study tracked the development of LTSE beliefs of language teacher candidates studying in two undergraduate ELT programs in Türkiye. Although LTSE levels were higher in year 4 than in year 1 for all subscales and the overall scale (i.e., teachers were more confident at the end of the program than at the beginning), this development was not linear and was not statistically significant. While language teacher candidates had relatively higher levels of LTSE regarding classroom proficiency and language use, culture, and technology, their LTSE levels were comparatively lower in language instruction, assessment, and student-focused instruction. Yet, in all subfactors, teacher candidates’ LTSE scores were above 4, even for year 1 and year 2 students. Although many of the results were not statistically significant and should be interpreted cautiously, this implies that preservice teachers have already developed a certain level of self-efficacy when they start undergraduate programs, which might stem from stringent admission requirements, competitive nationwide exams, and a 1-year English preparation program at the university. However, no significant change was observed in later stages after students had taken upper-level courses and diverse academic modules and attended the practicum. This suggests that ELT programs might face some constraints in raising their students’ LTSE beliefs to very high levels during teacher education. In particular, in the areas of language instruction, assessment, and student-focused instruction, further considerations and innovations can be made to meaningfully raise the LTSE of teacher candidates to more advanced levels. One possibility is for ELT programs to use LTSE surveys as checklists to help teacher candidates understand key areas of L2 instruction and motivate reflection.
General LTSE surveys such as this one, or more specific surveys that focus on different areas of language instruction (e.g., grammar, Wyatt & Dikilitaş, 2021; and pronunciation, Zhang & Faez, 2024), can be provided to teachers during their LTE experience. The surveys can help teachers understand what is required of language teachers and help them identify areas of confidence, and perhaps more importantly, areas where they need further assistance from teacher educators. These realizations can serve as an impetus for further reflection and areas of low confidence can potentially be addressed throughout preservice teachers’ LTE program. Identifying and addressing areas of low self-efficacy can be helpful for teacher educators, but also, for preservice teachers themselves who can take agency in their own development and attempt to improve their confidence for teaching tasks that are perceived as difficult.
This research is a cross-sectional study of two ELT programs in two different cities in Türkiye, which relied on quantitative self-report data. While the centralized teacher candidate selection processes provide insights and transferable knowledge, we acknowledge that the small sample limits generalization to other national and international contexts. Moreover, by design, this research captures only a snapshot of the LTSE of teacher candidates who began their education at different times. Although we compared students from various years of study, our comparison is limited by participants’ different admission years and educational experiences, among other factors. To overcome these limitations, more longitudinal studies that adopt mixed methods and follow the same participants’ progress through an ELT program may help the field monitor in greater detail how language teacher candidates develop their self-efficacy over the 4 years of higher education in the context of Türkiye.
Supplemental Material
Supplemental material, sj-docx-1-ltr-10.1177_13621688261432443 for Preservice English language teacher self-efficacy in Türkiye: A cross-sectional analysis by Michael Karas, Farahnaz Faez, Funda Ölmez Çağlar and Selçuk Emre Ergüt in Language Teaching Research
Footnotes
Data availability
To protect participant identities, data are not publicly available.
Ethical approval
This study received ethical approval from The Office of Human Research Ethics at Western University. Participants were presented with a Letter of Information before completing the online survey and could refuse to participate by closing their web browsers.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
