
Abstract

Second and foreign language (L2) learning boredom has triggered a spate of studies in recent years. Researchers have also developed instruments that tap into this emotion. However, such tools contain many items, may be culture-specific, or have a disputed factor structure. To address these shortcomings, we aimed to develop and validate a short version of the 23-item Boredom in Practical English Classes-Revised (BPELC-R) Scale. A dataset from 1,254 students in degree programs in English from different countries (i.e., Hungary, Iran, Iraq, Poland) was used. Data were split into two groups, and the first sample was used to develop the short-form measure, with principal component analysis (PCA) resulting in a unidimensional model. Through an ant colony optimization algorithm and traditional item analysis, 10 items were retained that constituted the Short-Form Foreign Language Classroom Boredom Scale (S-FLCBS). Based on data from the second sample, the tool was characterized by acceptable internal consistency reliability, as well as discriminant and convergent validity. The analysis also yielded evidence for measurement invariance with respect to age and gender, with limited invariance found for country.

Keywords

foreign language learning boredom, measurement of L2 boredom, negative emotions, short-form development, validation

I Introduction

Boredom was first investigated empirically in educational psychology, where it was identified as one of the most frequently and most intensely experienced academic emotions affecting students at different educational levels in relation to a wide variety of school subjects (Pekrun et al., 2014; Tulis & Fulmer, 2013; Tze et al., 2014). This emotion is viewed as an extremely complex phenomenon not only because it has a variety of antecedents (Daschmann et al., 2014), manifests itself in different ways (Goetz et al., 2014), is exhibited with varying levels of intensity, and interacts with other variables (Ally, 2008), but also because it consists of multiple interrelated components, encompassing affective, cognitive, expressive, motivational, and physiological factors (Nett et al., 2010; Pekrun, 2006; Scherer & Moors, 2019). It has also been described as a “silent” emotion whose detrimental effects can be easily overlooked or even ignored (Fahlman, 2009), and which can be erroneously equated with laziness, anxiety, or depression (Macklem, 2015). Irrespective of how exactly boredom is conceptualized and what theoretical lens is employed to account for its pervasive occurrence (see Pawlak, Zawodniak, & Kruk, 2020a; Pawlak, Kruk, & Zawodniak, 2024), there is a consensus that, due to its close affinity to disengagement, dissatisfaction, inattention, and lack of interest (Fahlman, 2009), its impact on learning processes and their outcomes is mostly negative, an assumption that has been confirmed empirically (e.g., Camacho-Morles et al., 2021; Pekrun et al., 2016; Tze et al., 2016).

In view of the potential of boredom to illuminate the causes of success and failure in any endeavor that human beings might undertake, it is not surprising that the construct has attracted the interest of researchers investigating second and foreign language (L2) learning and teaching. In this context, too, an agreed-upon definition of boredom is hard to come by, and the emotion has been characterized, for example, as “a state of disengagement caused by lack of interest and involvement” (Kruk et al., 2021, p. 21) or as “a negative emotion with extremely low degree of activation/arousal that arises from ongoing activities” (Li et al., 2023, p. 235). While boredom was introduced into second language acquisition (SLA) research by Chapman (2013), it was not until several years later that this negative emotion became the focus of more intensive empirical scrutiny spearheaded in the Polish educational context (e.g., Kruk, 2016; Kruk & Zawodniak, 2017, 2018; Pawlak et al., 2020a, 2020; Pawlak, Kruk, Zawodniak, et al., 2020; Pawlak, Kruk, & Zawodniak, 2022). Initially, such research predominantly focused on the causes and manifestations of this negative emotion as experienced by L2 learners as well as, much less frequently, the coping strategies that can be drawn on to prevent or combat it (e.g., Derakhshan et al., 2021, 2022; Nakamura et al., 2021; Pawlak, Derakhshan, et al., 2022). Some of the early studies employed mixed methods to trace changes in the intensity of boredom in single English classes and sequences of such classes, also seeking to uncover the factors that accounted for such fluctuations (e.g., Pawlak, Zawodniak, & Kruk, 2020a; Pawlak, Kruk, & Zawodniak, 2022). Another crucial step in the study of boredom involved attempts to reveal the antecedents of this emotion, uncover its factor structure, and develop related measurement tools (e.g., W. Chen et al., 2024; Li et al., 2023; Pawlak, Zawodniak, et al., 2020; Pawlak, Kruk, Zawodniak, & Pasikowski, 2022). The availability of such scales, in turn, has provided an impetus for a flurry of empirical investigations seeking to determine the links between boredom experienced in L2 classroom settings, other emotions, in particular anxiety and enjoyment, as well as an array of individual difference (ID) factors (e.g., mindsets, willingness to communicate, L2 grit, L2 self, engagement) and target language (TL) attainment (e.g., Dewaele & Li, 2021; Dewaele, Botes, & Greiff, 2023; Fathi et al., 2023; Kruk, Pawlak, Taherian, et al., 2023; Lan et al., 2023; Li et al., 2025; Li & Wei, 2023; Pawlak, Kruk, Zawodniak, et al., 2022; Solhi et al., 2025; Taherian et al., 2024; Zhang et al., 2024; Zhao et al., 2023).

This proliferation of studies testing various complex models in which boredom is included alongside a growing number of variables or L2 learning outcomes inevitably puts in the spotlight vital issues related to the accurate measurement of this negative emotion for the simple reason that the integrity of such models hinges upon the nature and quality of the research instruments used. One such issue is related to the fact that different tools have been developed to tap into boredom in different cultural and educational milieus, reflecting diverse underlying factor structures and thus generating results that are difficult to compare across studies. Another problem pertains to the fact that the two scales that have been drawn upon in the bulk of empirical investigations of boredom, that is, those proposed by Pawlak, Kruk, Zawodniak, et al. (2020) and Li et al. (2023), are relatively long, comprising 23 and 32 items, respectively. Given that such scales are typically employed together with measures of other constructs, such as different ID factors, participants are often expected to complete lengthy questionnaires, which is bound to negatively affect completion rates, increase costs of data collection, make the process of administration a daunting challenge, and ultimately pose considerable threats to reliability and validity (see Dörnyei & Dewaele, 2022; Galesic & Bosnjak, 2009; Heene et al., 2014; Rolstad et al., 2011; Schoeni et al., 2013). Thus, it is fully warranted to develop and validate short forms of existing boredom scales, as has been done for enjoyment (Botes et al., 2021) and anxiety (Botes et al., 2022). In line with this reasoning, Li et al. (2024) have recently developed the short form of Li et al.’s (2023) instrument in the Chinese context. However, this instrument is intended for a very specific cultural and educational setting. 
With this in mind, the study reported in this paper sought to develop and validate a short version of the Boredom in Practical English Classes-Revised Scale (BPELC-R; Pawlak, Kruk, Zawodniak, et al., 2020). This was the very first instrument proposed to specifically tap into L2 boredom and, while it was created to collect data from English majors in Poland, it has since been employed in other settings (e.g., Hungary, Iran, Turkey) and has the potential to provide insights into this negative emotion in various contexts and at different educational levels.

II Measurement of L2 boredom

Since boredom is a relative newcomer to the field of SLA, it should not come as a surprise that different approaches to its measurement have been adopted. Some empirical investigations (e.g., Csizér et al., 2024; Li, 2021) have opted for a domain-general approach to boredom, relying in particular on the relevant subscale of the Achievement Emotions Questionnaire (AEQ; Pekrun et al., 2011). However, in line with recent trends in SLA research where ID factors such as grit are considered in a domain-specific manner at different levels of granularity (e.g., Pawlak, Fathi, & Kruk, et al., 2024; Teimouri et al., 2022), most scholars have chosen to view boredom in L2 learning as a construct that is distinct from boredom that might be experienced by individuals in other educational settings or walks of life. The early data-collection instruments embracing this view comprised modifications of the 28-item Boredom Proneness Scale (BPS) developed by Farmer and Sundberg (1986) in the field of educational psychology. They include Chapman’s (2013) German Class Boredom Proneness Scale (GCBPS) and Kruk’s (2016) English Classroom Boredom Proneness Scale (ECBPS). The changes involved relatively minor transformations of the original BPS items so that they reflected the context of learning German (GCBPS) and learning English (ECBPS), respectively. A slightly modified version of the ECBPS was subsequently employed in the studies carried out by Kruk and Zawodniak (2017) as well as Pawlak, Zawodniak, and Kruk (2020b). In both cases, the modifications were mostly aimed at further adjusting the items to the context of learning English. 
While in the former study, the 28-item Boredom in Practical English Language Classes Questionnaire (BPELCQ) was used to uncover the relationship between boredom experienced in practical English classes and general boredom susceptibility, tapped into by means of the BPS, among Polish students majoring in English (N = 174), in the latter the 27-item Boredom in Practical English Language Classes (BPELC) questionnaire was employed to explore differences in the levels of boredom experienced in learning English by 2nd- and 3rd-year Polish students majoring in English (N = 111). It should be noted that internal consistency reliability measured with Cronbach’s alpha was only reported for the ECBPS (α = .76) and its iterations, that is, the BPELCQ (α = .81) and the BPELC (α = .91). Moreover, these tools provided a general picture of boredom, with no attempt being made to delve into the underlying structure of this construct.

In order to shed more light on the antecedents of boredom, Pawlak, Zawodniak, and Kruk (2020b) administered the 27-item BPELCQ (Kruk & Zawodniak, 2017) to 107 Polish university students majoring in English. Following initial analyses and the removal of four items, the data underwent exploratory factor analysis (EFA), which allowed the identification of two factors underlying the occurrence of boredom during practical classes in degree programs in English: (F1) disengagement, monotony, and repetitiveness (e.g., “It would be very hard for me to find an exciting task in language classes”), and (F2) lack of satisfaction and challenge (e.g., “I feel that I am working below my abilities most of the time in my language classes”). The resulting instrument was the BPELC-R scale, characterized by high levels of internal consistency, as indicated by the Cronbach’s alpha values: .91 for the entire instrument, .89 for F1, and .89 for F2.

The BPELC-R scale (Pawlak, Kruk, Zawodniak, et al., 2020) has since been successfully employed in other English as a foreign language (EFL) contexts: in Iran (e.g., Kruk et al., 2022) and Turkey (e.g., Coşkun & Yüksel, 2021), with data gathered from 412 and 680 EFL learners, respectively. The Persian version of the scale retained its two-factor structure and revealed a high level of reliability, measured with Cronbach’s alpha and McDonald’s omega, both for the entire tool (α = .87, ω = .88) and its two factors (F1: α = .88, ω = .89; F2: α = .87, ω = .88). In the Turkish context, the BPELC-R scale also retained its two-factor structure, but two items were eliminated, with the resulting instrument, the Boredom in English Language Classes Scale, encompassing 21 items. The Cronbach alpha values were .76, .79, and .74 for the whole scale, F1 and F2, respectively (Coşkun & Yüksel, 2021). Importantly, Kruk, Pawlak, Elahi Shirvan, et al. (2023) revisited the construct validity of BPELC-R through exploratory structural equation modeling (ESEM) using data from 549 Polish students majoring in English. Compared to confirmatory factor analysis (CFA), which failed to provide acceptable fit indices and discriminant validity, ESEM (specifically, bifactor ESEM) offered a better assessment and representation of the construct validity of the BPELC-R scale. Overall, Pawlak, Kruk, Zawodniak, et al.’s (2020) study made an important contribution to our understanding of L2 boredom. This is evident in the fact that several recent studies have employed the BPELC-R scale to examine this negative emotion, often in connection with other variables, in particular ID factors (e.g., Fathi et al., 2023; Kruk et al., 2023; Pawlak, Kruk, Csizér, & Zawodniak, 2024; Pawlak, Zarrinabadi, & Kruk, 2024; Taherian et al., 2024).

An effort to uncover the factorial structure of boredom and to create an instrument to tap into this construct has also been made in the Chinese context. Specifically, using questionnaire and interview data obtained from Chinese undergraduate students, non-English majors, and their teachers, Li et al. (2023) developed the 32-item Foreign Language Learning Boredom Scale (FLLBS), successfully validating the instrument by means of EFA and CFA. Seven factors were identified: (a) foreign language class boredom, (b) underchallenging task boredom, (c) PowerPoint presentation boredom, (d) homework boredom, (e) teacher-dislike boredom, (f) general learning trait boredom, and (g) overchallenging or meaningless task boredom. Li et al. (2023) reported high internal consistency reliability measured with Cronbach’s alpha: .95 for the entire scale, and .94, .91, .85, .90, .84, .90, and .77 for the seven factors, respectively. As is the case with the BPELC-R scale (Pawlak, Kruk, Zawodniak, et al., 2020), the FLLBS has greatly contributed to unraveling the role of boredom in L2 learning. This is because the tool as a whole or some of its subscales have been used in a number of studies focusing on emotions as well as other constructs (e.g., Botes et al., 2024; Dewaele, Botes, & Greiff, 2023; Lan et al., 2023; Li & Wei, 2023; Zhao & Wang, 2025). Of particular relevance to the present paper, Li et al. (2024) used data aggregated from previous studies involving secondary school students and non-English majors, as well as expert opinions, to develop and validate a short form of the FLLBS. EFA drawing on data from 320 participants and CFA using data from 3,341 participants offered a basis for the short form of the FLLBS (FLLBS-SF), which comprised 11 items and encompassed three factors: (a) foreign language activity boredom, (b) foreign language classroom boredom, and (c) general learning boredom.

Yet another attempt to develop a tool for tapping into L2 boredom was recently made by Mousavian Rad et al. (2024). Using data from a sample of university students, 139 for EFA and 991 for CFA, they developed and validated the 47-item Precursors of Students’ Boredom in EFL Classes (PSBEC) Scale. They identified 11 factors underlying L2 boredom: (a) teaching practices, (b) excessive class control, (c) inattentive behavior, (d) overchallenge, (e) underchallenge, (f) intrinsic values, (g) extrinsic values, (h) negative affective factors, (i) boredom proneness, (j) classroom-related factors, and (k) curriculum design. The PSBEC scale also demonstrated good convergent validity when measures of foreign language enjoyment and perceived teacher enthusiasm were taken into account, as well as predictive validity when participants’ academic achievement was considered. While the scale might represent another important step forward in disentangling the highly intricate construct of boredom, it has yet to be used more broadly.

Worth mentioning at this juncture is also the pioneering study conducted by Pawlak, Kruk, Zawodniak, and Pasikowski (2022), who investigated boredom experienced by L2 learners outside of educational settings and developed the 21-item Boredom in Learning English Outside of School (BLEOS) questionnaire. EFA, based on the data gathered from 107 Polish students majoring in English, resulted in the identification of three factors underlying after-class boredom: (F1) unwillingness to learn English and inability to find (interesting) tasks; (F2) lack of creativity, focus, and involvement; and (F3) altered perception of time, underused language abilities, and monotony. According to the researchers, these factors reflect the complex nature of boredom in out-of-class situations which, on the one hand, can be triggered by tasks and activities that are closely linked to what transpires in the classroom (e.g., homework, exam preparation) and, on the other hand, can also be induced when learners decide to take steps to improve their mastery of the TL on their own initiative, drawing on the multiple affordances available to them outside the classroom. Pawlak, Kruk, Zawodniak, et al. (2022) reported satisfactory internal consistency reliability for the BLEOS, as indicated by Cronbach’s alpha values of .90 for the entire scale, and .88, .77, and .74 for F1, F2, and F3, respectively. It should be noted that Pawlak, Solhi, et al. (2024) revisited the construct validity of BLEOS by means of ESEM drawing on data from 433 Polish students majoring in English. The study offered evidence for the effectiveness of the bifactor ESEM framework in view of the fact that the model provided strong support for the factorial structure of the BLEOS scale.

III Rationale for the current study and research questions

In view of the proliferation of instruments tapping into ID factors in the field of SLA research and the fact that several tools already exist for measuring L2 boredom, it is warranted to ask why yet another scale needs to be developed for this purpose. One reason for the development of a short version of the 23-item BPELC-R is quite obvious, has been highlighted by researchers who have sought to tackle a similar endeavor (e.g., Botes et al., 2021, 2022; Li et al., 2024), and has also been mentioned in the Introduction to this paper. Specifically, when shorter scales are used, the likelihood of participants actually completing the questionnaire increases, responses are less likely to be haphazard, logistical issues are diminished, and the obtained data have higher reliability and validity (Clark & Watson, 1995; Dörnyei & Dewaele, 2022; Galesic & Bosnjak, 2009; Heene et al., 2014; Rolstad et al., 2011; Schoeni et al., 2013). Thus, it is justified to make an effort to use sound statistical procedures to reduce the number of items included in the BPELC-R scale so that it can be confidently employed in studies where several scales are utilized to tap the variables under investigation.

More importantly perhaps, a question arises as to why another short form of a boredom scale is needed if one such instrument has been developed and validated by Li et al. (2024). In our view, there are at least four reasons why this undertaking is worthwhile and can in fact be considered necessary. First, the FLLBS-SF was developed based on data from Chinese learners of English and, considering cultural and contextual differences, its utility in other parts of the world such as Europe has yet to be demonstrated. Second, although impressive in size, the dataset in Li et al.’s (2024) study consisted of data from participants who were secondary school learners and university students majoring in subjects other than English. There are grounds to assume that university students majoring in English (but in other foreign languages as well) constitute a distinctive population, not least because they have made a deliberate choice to attain high levels of proficiency in the TL, with important consequences for their motivation, engagement, persistence, and also the emotions they might experience. Third, one of the subfactors of the FLLBS-SF reflecting foreign language activities (micro level) conflates to some extent boredom that learners experience inside and outside of the classroom. Drawing on previous research (e.g., Pawlak, Derakhshan, et al., 2022), it is reasonable to assume that while some of the mechanisms underlying this negative emotion are similar in different contexts, after-class boredom is distinct in many ways from in-class boredom, with the effect that a dedicated tool is needed to adequately capture it. Fourth, another subfactor of the FLLBS-SF is believed to encompass general learning boredom (macro level), being akin to some degree to general boredom proneness (Farmer & Sundberg, 1986). 
Previous research shows, however, that even though the general tendency to succumb to this aversive emotion is positively related to boredom learners experience in L2 classes, the two constructs cannot be equated (e.g., Pawlak, Kruk, Zawodniak, et al., 2020; Pawlak, Kruk, et al., 2022). On the whole, in relation to the last two points, the validation study reported in this paper is based on the assumption that the instruments developed to tap into boredom (but also many other ID factors, including emotions) should be domain-specific (see Li & Yang, 2024; Pawlak, Fathi, & Kruk, 2024).

Based on the rationale presented above, the study aimed to develop a reduced version of the BPELC-R, referred to here as the Short-Form Foreign Language Classroom Boredom Scale (S-FLCBS), to examine its psychometric properties and determine its measurement invariance for different groups. Specifically, the following research questions (RQs) were formulated:

RQ1: What is the factor structure of the S-FLCBS?

RQ2: What are the psychometric properties of the S-FLCBS in terms of measurement model fit, scale reliability, and convergent and discriminant validity evidence?

RQ3: Is the S-FLCBS measurement invariant in relation to gender, age group, and country?

IV Method

1 Participants

The sample consisted of N = 1,254 participants recruited from Poland (n = 550), Iran (n = 241), Hungary (n = 342), and Iraq (n = 121). All of them were enrolled in degree programs in English in the four countries, which aimed to develop a mastery level of the TL to be used successfully in a variety of professional contexts. While there were some inevitable differences in how the programs were structured or the exact requirements needed for successful completion, all of them included an intensive English course typically divided into different modules (e.g., grammar, speaking, writing) as well as a number of content courses that were also taught in English (e.g., linguistics, literature, electives). Participants were recruited through snowball sampling using the authors’ professional contacts, and the dataset was aggregated from previous empirical investigations. The sample was predominantly female (n = 903), as is a consistent trend in foreign language learning classes (Chaffee et al., 2020), with n = 349 male participants and n = 2 not providing gender. The average age of participants was 22.23 years (SD = 4.84), with a mean of 11.21 years (SD = 4.23) spent learning English. Their English proficiency oscillated between the B2 and C1 levels according to the Common European Framework of Reference for Languages (Council of Europe, 2001), with the students themselves rating it as intermediate and high, respectively.

2 Instruments

The data employed for the purpose of this study were collected by means of several scales probing into the variables under investigation. All assessments were self-report scales, and all items were measured on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). Beyond the BPELC-R, additional self-report measures were included to assess convergent and divergent validity. Basic demographic information was also gathered (e.g., age, sex, nationality). The scales are described below, and the rationale for their inclusion is provided whenever deemed necessary.

1. BPELC-R (α = .890, ω = .896). The original 23-item scale examines boredom in the EFL classroom as a two-dimensional construct (Pawlak, Kruk, Zawodniak, et al., 2020). The first dimension captures disengagement, monotony, and repetitiveness with 14 items (e.g., “I often do not feel like doing anything in English classes”; α = .876, ω = .880). The second dimension captures lack of satisfaction and challenge with nine items (e.g., “I am seldom excited about my English classes”; α = .632, ω = .658). Five items out of the total of 23 were negatively worded (e.g., “I always feel entertained in my English language classes”).

2. L2 Grit Scale (α = .839, ω = .810). The two-factor, nine-item scale developed by Teimouri et al. (2022) was used to examine domain-specific grit within the context of language learning. The two factors were perseverance of effort (five items; e.g., “I am a diligent English language learner”; α = .877, ω = .879) and consistency of interest (four items; e.g., “I think I have lost interest in learning English”; α = .807, ω = .815). Previous research has established a negative relationship between L2 grit and L2 boredom (see Pawlak, Zarrinabadi, et al., 2024), and thus L2 grit was utilized to examine divergent validity.

3. Language Learning Curiosity Scale (α = .796, ω = .796). Curiosity in the EFL class was examined via the two-factor scale created by Mahmoodzadeh and Khajavy (2019). The two factors were language curiosity as a feeling of interest (four items; e.g., “I wonder how well I can speak English when meeting a native English speaker”; α = .703, ω = .752) and language curiosity as a feeling of deprivation (seven items; e.g., “When I have a language question in mind, I cannot rest without knowing the answer”; α = .704, ω = .706). Previous research has confirmed a negative relationship between curiosity and boredom in general education as well as within the context of L2 learning (see Eren & Coskun, 2016; Kruk & Zawodniak, 2018); thus, language learning curiosity was included as a divergent validity measure.

4. Language Learning Enjoyment Scale (α = .871, ω = .879). This unidimensional, eight-item scale was constructed with items extracted from the 21-item Foreign Language Enjoyment Scale (Dewaele & MacIntyre, 2014) and the Achievement Emotions Questionnaire (Pekrun et al., 2011). Previous research has confirmed a negative relationship between L2 boredom and foreign language enjoyment (FLE; see Dewaele, Botes, & Greiff, 2023; Li, 2022); therefore, FLE was utilized as a divergent validity measure.

5. Short-Form Foreign Language Anxiety Scale (α = .895, ω = .898). This unidimensional, eight-item measure was developed by MacIntyre (1992) as a short form of the 33-item scale developed by Horwitz et al. (1986). The scale was recently validated by Botes et al. (2022) and includes items such as “Even if I am well prepared for FL class, I feel anxious about it.” Previous research has confirmed a consistent positive association between L2 boredom and foreign language classroom anxiety (FLCA; see Dewaele et al., 2022; Dewaele, Botes, & Greiff, 2023); as such, the measure was used to examine convergent validity.

6. Motivated Behavior Scale (α = .882, ω = .884). This unidimensional, 10-item scale, developed by Taguchi et al. (2009), was utilized to capture motivation to learn EFL, with items such as “I am prepared to expend a lot of effort in learning English.” Previous research has found that more motivated language learners tend to experience less boredom in the L2 classroom (see Kruk, 2016b; Zhao et al., 2023). Therefore, the scale was included as a measure of divergent validity.

English was used in all the questions and scales in view of the high proficiency level of participants. Since the dataset comes from several previous studies, data collection procedures may have differed to some extent, but composite questionnaires were administered online in all cases. In all of these investigations, students were requested to sign a consent form, were informed that they could withdraw from the study at any time, and were assured that their data would only be employed for research purposes.

3 Data analysis

Data analyses proceeded in seven sequential steps, guided by established procedures for short-scale construction and psychometric validation (e.g., Botes et al., 2021, 2022; Marsh et al., 2005). A flowchart of the steps is provided in Figure 1.

Figure 1. Research methods flowchart.

a Step 1: Splitting the dataset

The dataset was randomly split in two using SPSS Version 28.0.1.1. The first sample was utilized to explore the data and develop the S-FLCBS (PCA and item selection), whereas the second sample was used to confirm the structure, reliability, and validity of the scale, in line with best practice recommendations (see Hagtvet & Sipos, 2016; Marsh et al., 2005). The two datasets were compared using independent-samples t-tests to ensure that no statistically significant differences were present.
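The split-and-compare logic of this step can be sketched in a few lines of Python. This is an illustration only: the study used SPSS, the scores below are simulated, and the function names, the seed, and the choice of Welch's variant of the independent-samples t statistic are all assumptions made for the example.

```python
import random
import statistics
from math import sqrt

def split_in_two(records, seed=2024):
    """Randomly split records into two halves (Sample 1, Sample 2)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def welch_t(a, b):
    """Welch's t statistic for two independent samples (statistic only, no p-value)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Simulated total scores for 1,254 respondents (hypothetical, not study data).
rng = random.Random(1)
scores = [rng.gauss(55, 12) for _ in range(1254)]

sample1, sample2 = split_in_two(scores)
t_value = welch_t(sample1, sample2)
print(len(sample1), len(sample2), round(t_value, 3))
```

A t statistic close to zero for each compared variable would be in line with the two halves not differing systematically; a full check would also require the degrees of freedom and a p-value, which a statistics package provides.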

b Step 2: Exploring the factor structure

Using SPSS, a PCA was performed using oblique (promax) rotation. Factor selection was performed based on the Kaiser criterion (eigenvalue > 1) and a visual interpretation of the scree plot (Tabachnick & Fidell, 2001). PCA was selected to uncover the factor structure as the extraction method creates a simplified description of data that has been found to be particularly useful in the creation of short-form scales (McGuire et al., 2010). Oblique rotation was selected as underlying factors in the two-dimensional BPELC-R scale were assumed to correlate (Field, 2013). Factor loadings were interpreted as low (< .40), intermediate (.40 to .60), and high (> .60; see Kline, 2014).
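The two decision rules named in this step, the Kaiser criterion and the loading bands, can be expressed as a short Python sketch. The eigenvalues and loadings below are hypothetical values chosen for illustration, and the function names are not part of any package; the boundary handling at exactly .40 and .60 is an assumption, since the text does not specify it.

```python
def kaiser_count(eigenvalues):
    """Number of components retained under the Kaiser criterion (eigenvalue > 1)."""
    return sum(1 for value in eigenvalues if value > 1.0)

def loading_band(loading):
    """Label a loading by absolute size: low (< .40), intermediate (.40 to .60), high (> .60)."""
    size = abs(loading)
    if size < 0.40:
        return "low"
    if size <= 0.60:
        return "intermediate"
    return "high"

# Hypothetical values for illustration only (not results from the study).
eigenvalues = [6.2, 1.4, 0.9, 0.7]
loadings = {"item_01": 0.72, "item_02": 0.55, "item_03": -0.31}

print(kaiser_count(eigenvalues))
print({item: loading_band(value) for item, value in loadings.items()})
```

In practice the Kaiser count would be read alongside the scree plot, as described above, rather than used as the sole retention rule.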

c Step 3: Short-form item selection

Individual factors and items were considered on the basis of the PCA conducted in Step 2 using the first sample of the data. The strongest items were selected based on the interitem correlations, factor loadings, and general theory underlying the construct of foreign language boredom (Botes et al., 2021). The manually selected items were further validated by comparing and contrasting them with items chosen on a purely mathematical basis through an ant colony optimization (ACO) algorithm (Olaru et al., 2015). The ACO offers an alternative to the manual selection of items in the development of a scale’s short form by employing “probabilities to create a set of items that cannot be improved upon in terms of pre-specified criteria” (Botes et al., 2021, p. 864). Thus, the algorithm has the ability to identify items that would lead to the shortest possible route to a close-fitting measurement model (Dörendahl & Greiff, 2020). The “ShortForm” R package with “MplusAutomation” was used to implement the ACO algorithm (Raborn & Leite, 2018). Furthermore, inconsistencies between the manually selected items and the ACO-selected items were resolved by considering the theoretical rationale underlying L2 boredom, item statistics, and psychometric rationale of the S-FLCBS.
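To convey the general idea behind ACO-based item selection, the following toy Python sketch may be helpful. It is not the algorithm implemented in the R package used in the study: the objective here is simply the mean of hypothetical, positive item loadings (a stand-in for model-fit criteria), and the function name, parameters, and simulated loadings are all assumptions made for the example.

```python
import random

def aco_select(quality, k, n_ants=20, n_iter=50, evaporation=0.1, seed=0):
    """Toy ant colony optimization: choose k items that maximize mean quality."""
    rng = random.Random(seed)
    items = list(quality)
    pheromone = {item: 1.0 for item in items}  # equal trails at the start
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant samples k distinct items, weighted by current pheromone.
            pool, subset = items[:], []
            for _ in range(k):
                weights = [pheromone[item] for item in pool]
                pick = rng.choices(pool, weights=weights, k=1)[0]
                subset.append(pick)
                pool.remove(pick)
            score = sum(quality[item] for item in subset) / k
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate all trails, then reinforce the best subset found so far.
        for item in items:
            pheromone[item] *= 1.0 - evaporation
        for item in best_subset:
            pheromone[item] += best_score
    return best_subset, best_score

# Simulated loadings for 23 items (hypothetical, not taken from the study).
rng = random.Random(42)
loadings = {f"item_{i:02d}": round(rng.uniform(0.3, 0.9), 2) for i in range(1, 24)}

selected, score = aco_select(loadings, k=10)
print(selected, round(score, 3))
```

Items that appear in good subsets accumulate pheromone and are sampled more often in later iterations, which is the sense in which the search converges on a set that is hard to improve upon under the chosen criterion.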

d Step 4: Confirming the structure of the short form

Using Sample 2 of the dataset, the structure and items of the measurement model of the S-FLCBS proposed in Step 3 were examined. The measurement model was tested via the “lavaan” package in RStudio (Rosseel, 2012), with maximum likelihood estimation and standard errors. Model fit was determined by the fit indices of the root mean square error of approximation (RMSEA; close fit < .05, reasonable fit < .08), the standardized root mean square residual (SRMR; close fit < .05, reasonable fit < .08), comparative fit index (CFI; close fit > .95, reasonable fit > .90), and Tucker–Lewis index (TLI; close fit > .95, reasonable fit > .90; see Kenny, 2020). Factor loadings, modification indices, and errors were further interpreted to determine fit.
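The cutoffs can be applied mechanically, as in the Python sketch below. It assumes the conventional direction of the indices: lower values are better for RMSEA and SRMR (close fit < .05, reasonable fit < .08), and higher values are better for CFI and TLI (close fit > .95, reasonable fit > .90). The function name, the "poor" label, and the example values are hypothetical.

```python
def classify_fit(rmsea, srmr, cfi, tli):
    """Label each fit index as 'close', 'reasonable', or 'poor'."""
    def lower_is_better(value):   # RMSEA and SRMR
        if value < 0.05:
            return "close"
        return "reasonable" if value < 0.08 else "poor"

    def higher_is_better(value):  # CFI and TLI
        if value > 0.95:
            return "close"
        return "reasonable" if value > 0.90 else "poor"

    return {
        "RMSEA": lower_is_better(rmsea),
        "SRMR": lower_is_better(srmr),
        "CFI": higher_is_better(cfi),
        "TLI": higher_is_better(tli),
    }

# Hypothetical fit indices for illustration only.
print(classify_fit(rmsea=0.04, srmr=0.06, cfi=0.96, tli=0.93))
```

Such labels are only a first screen; as noted above, loadings, modification indices, and residuals also inform the judgment of fit.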

e Step 5: Reliability and validity

Using Sample 2 of the dataset, reliability as well as convergent and divergent validity were examined. Reliability was measured through internal consistency via McDonald’s omega and Cronbach’s alpha. Convergent validity was examined based on positively correlating variables within the known nomological network of L2 boredom. Specifically, FLCA was utilized to confirm convergent validity (see Dewaele, Botes, & Greiff, 2023). Divergent validity was examined based on negatively correlating variables within the nomological network, namely, L2 grit (see Pawlak, Zarrinabadi, et al., 2024), language learning curiosity (see Kruk & Zawodniak, 2018), FLE (see Dewaele, Botes, & Meftah, 2023), and motivated behavior (see Zhao et al., 2023). The thresholds suggested by Plonsky and Oswald (2014) were used to interpret the strength of the correlations.
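For readers less familiar with the two reliability coefficients, the following minimal sketch shows how they can be computed. It assumes the textbook formulas (alpha from item and total-score variances; omega for a one-factor model with standardized loadings and uncorrelated errors). The function names, responses, and loadings are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def mcdonald_omega(loadings):
    """Omega for a one-factor model with standardized loadings and uncorrelated errors."""
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


# Hypothetical responses (5 respondents x 4 items on a 1-5 scale).
responses = [
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
omega = mcdonald_omega([0.70, 0.65, 0.75, 0.60])  # hypothetical standardized loadings
print(round(alpha, 3), round(omega, 3))
```

Alpha assumes that all items contribute equally to the total score, whereas omega weights items by their loadings, which is why reporting both is common practice.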

f Step 6: Recombining the dataset

The dataset was recombined (Sample 1 + Sample 2) in order to have the necessary statistical power for invariance testing.

g Step 7: Invariance testing

Utilizing the full dataset, invariance testing was used to examine the generalizability of the newly developed S-FLCBS. Invariance was analyzed across genders, age groups, and countries using the “lavaan” package in RStudio (Rosseel, 2012). Measurement invariance examines the consistency of measures across groups and whether group membership affects the properties of a measure (Meredith, 1993). As such, if a scale is fully invariant, individuals from different groups (e.g., Polish and Iranian participants) who have the same standing on the construct (i.e., FLLB) will also have the same observed score (Millsap, 2011). Measurement invariance was tested using increasingly restrictive models across the different groups. Firstly, the factor structure across groups was examined through configural invariance testing, where all parameters were freely estimated across groups. Secondly, metric invariance was tested by constraining the factor loadings to be equal across groups. The confirmation of metric invariance implies that participants across different groups respond to items in a similar way, with, for example, male and female participants not differing in their interpretation of an item (Botes et al., 2022). Lastly, scalar invariance was tested by specifying both factor loadings and item intercepts to be equal across groups. A scalar invariant measure implies that direct comparisons of latent means can be made across groups (Meredith, 1993). Invariance was established by comparing each more restrictive model with the previous less restrictive one based on differences in fit indices. Fit index differences were interpreted according to guidelines by Cheung and Rensvold (2002) and F.F. Chen (2007), where a decrease in CFI of no more than .010 (∆CFI ⩾ −.010), an increase in RMSEA of no more than .015 (∆RMSEA ⩽ .015), and an increase in SRMR of no more than .030 for metric invariance or .015 for scalar invariance indicate an invariant model.

V Results

1 Step 1: Splitting the dataset

The dataset was randomly split into two samples. Demographic information of Sample 1 and Sample 2 can be found in Table 1. Descriptive statistics for all variables can be found in Table 2. Independent-samples t-tests confirmed that there were no statistically significant differences between Sample 1 and Sample 2.

Table 1.

Demographic information for Samples 1 and 2.

	Sample 1	Sample 2
Sample size	633	621
Gender (% female)	73.10	70.90
Age M (SD)	22.33 (4.53)	22.14 (5.14)

Table 2.

Descriptive statistics.

Variable	n	M	SD	t value	p value
BPELC-R				−0.172	.863
Sample 1	633	2.576	0.609
Sample 2	621	2.582	0.613
L2 grit				0.254	.800
Sample 1	633	3.917	0.686
Sample 2	621	3.907	0.709
Curiosity				1.155	.249
Sample 1	633	4.096	0.579
Sample 2	621	4.057	0.599
FLE				−1.107	.268
Sample 1	633	3.777	0.800
Sample 2	621	3.834	0.749
FLCA				0.116	.907
Sample 1	633	3.020	0.977
Sample 2	621	3.013	1.049
Motivation				0.144	.886
Sample 1	633	3.692	0.722
Sample 2	621	3.686	0.783

Note. BPELC-R = Boredom in Practical English Classes-Revised Scale; FLE = foreign language enjoyment; FLCA = foreign language classroom anxiety.
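As a rough check on the comparisons in Table 2, a t statistic can be recomputed from the reported summary statistics. The sketch below assumes a pooled-variance (Student) independent-samples t-test, which the text does not specify, and it works from rounded means and standard deviations, so small deviations from the tabled values are to be expected. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic (equal variances assumed) from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2


# BPELC-R row of Table 2: Sample 1 vs. Sample 2.
t, df = pooled_t(2.576, 0.609, 633, 2.582, 0.613, 621)
print(round(t, 2), df)
```

For the BPELC-R row, this yields a value of about −0.17, in line with the reported t value of −0.172.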

2 Step 2: Exploring the factor structure

With Sample 1, the full 23-item BPELC-R scale was examined via PCA. Four items were identified as problematic, namely, Item 7 (“I get a kick out of most things I do in a language class”), Item 9 (“I can usually find something interesting to do in my language classes”), Item 12 (“I would like to have more challenging things to do in my English classes”), and Item 23 (“In situations where I have to wait [e.g., for everyone to finish their task], I get very restless”). All four items had nonsignificant or negative correlations with other items in the interitem correlation matrix. The items were removed in order to prevent weak items from unduly influencing the results, and a second PCA was conducted using the remaining 19 items.

Three factors emerged from the PCA (see Table 3). However, all items loaded significantly onto the first factor, with cross-loadings shown by the three reverse-scored items. As reverse-scored items are known to result in atypical responses that may create difficulties in uncovering factor structures (Carlson et al., 2011; Woods, 2006), these additional two factors were disregarded in favor of a unidimensional solution. In addition, the scree plot further suggested a single factor solution (see Figure 2), with the first factor (eigenvalue = 7.15, R² = .38) showing a clear drop-off and “elbow” structure (Field, 2013).
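The proportion of variance attributed to the first factor follows directly from its eigenvalue. The short sketch below assumes that the PCA was run on the correlation matrix, so that the total variance equals the number of items (19); the function name is hypothetical.

```python
def variance_explained(eigenvalue, n_items):
    """Share of total variance captured by one component of a correlation-matrix PCA."""
    return eigenvalue / n_items


share = variance_explained(7.15, 19)
print(round(share, 2))  # 0.38
```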

Table 3.

Principal component analysis of the 19-item BPELC-R with a promax rotation.

Item	Factor 1	Factor 2	Factor 3
1. Time always seems to be passing slowly in my language classes.	.656¹
2. I often find myself at loose ends in a language class.	.541²
3. I often have to do meaningless things in my language classes.	.736¹
4. I always feel entertained in my English classes.*	.464¹		.627²
5. I often have to do repetitive or monotonous things in my language classes.	.651²
6. It takes more stimulation to get me going in English classes than most students from my group.	.578¹
8. I am seldom excited about my English language classes.	.486¹
10. I often do not feel like doing anything in English classes.	.722²
11. It would be very hard for me to find an exciting task in language classes.	.749²
13. I feel that I am working below my abilities most of the time in my language classes.	.479¹
14. I am more interested in other subjects than in practical English classes.	.489¹
15. If I am not doing something interesting/exciting during English classes, I feel tired and bored.	.560¹
16. It takes a lot of change and variety to keep me really satisfied during my English classes.	.670²
17. It seems that English classes are the same all the time; it is getting boring.	.743²
18. It is easy for me to concentrate on the activities in my English language classes.*	.405¹	.576¹
19. During language classes, I often think about unrelated things.	.661²
20. Having to listen to my English language teachers present material bores me tremendously.	.742²
21. I actively participate in English classes.*	.411¹	.628²
22. Much of the time I just sit around doing nothing in my English classes.	.699²

Note. *Reverse-scored items. All loadings < .40 are omitted. BPELC-R = Boredom in Practical English Classes-Revised Scale.

¹Acceptable loading: .40 to .60; ²high loading: > .60.

Figure 2.

Scree plot.

A unidimensional solution was therefore selected as the factor structure underlying the 19-item BPELC-R. The S-FLCBS was thus constructed as a unidimensional scale as well. The two-factor structure of the original BPELC-R was therefore not supported in the short version of the scale.

3 Step 3: Development of the S-FLCBS

Based on the results of Step 2 with Sample 1, specifically regarding interitem correlations and factor loadings, 10 items were manually selected for the S-FLCBS (see Table 4). All items had moderate to high interitem correlations.¹

Table 4.

10-item S-FLCBS scale.

Item
1. Time always seems to be passing slowly in my language classes.
3. I often have to do meaningless things in my language classes.
5. I often have to do repetitive or monotonous things in my language classes.
10. I often do not feel like doing anything in English classes.
11. It would be very hard for me to find an exciting task in language classes.
16. It takes a lot of change and variety to keep me really satisfied during my English classes.
17. It seems that English classes are the same all the time; it is getting boring.
19. During language classes, I often think about unrelated things.
20. Having to listen to my English language teachers present material bores me tremendously.
22. Much of the time I just sit around doing nothing in my English language classes.

Note. S-FLCBS = Short-Form Foreign Language Classroom Boredom Scale.
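The role of factor loadings in the manual selection can be made concrete with a small sketch. It assumes that the first loading listed for each item in Table 3 is its Factor 1 loading, and it ranks items on that single criterion only, whereas the actual selection also drew on interitem correlations and theory. The function name is hypothetical.

```python
# First-factor loadings of the 19 retained BPELC-R items, copied from Table 3.
factor1_loadings = {
    1: .656, 2: .541, 3: .736, 4: .464, 5: .651, 6: .578, 8: .486,
    10: .722, 11: .749, 13: .479, 14: .489, 15: .560, 16: .670,
    17: .743, 18: .405, 19: .661, 20: .742, 21: .411, 22: .699,
}


def top_items(loadings, k):
    """Return the k item numbers with the highest loadings, in ascending item order."""
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    return sorted(ranked[:k])


selected = top_items(factor1_loadings, 10)
print(selected)  # [1, 3, 5, 10, 11, 16, 17, 19, 20, 22]
```

Ranking by loading alone reproduces the 10 items listed in Table 4, and Item 6 ranks eleventh on this criterion.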

Subsequently, the manually selected items were compared to the items selected by the ACO algorithm. The algorithm was set to select 10 items in order to create the best-fitting unidimensional measure of L2 boredom on a purely mathematical basis. The ACO algorithm selected the exact same items as those that were manually selected, with one exception. The algorithm selected Item 6 (“It takes more stimulation to get me going in English classes than most students from my group”) as opposed to the manually selected Item 17 (“It seems that English classes are the same all the time; it is getting boring”). The decision to retain Item 17 rather than Item 6 was based on theoretical reasoning. First, Item 6 implies a social comparison (“than most students from my group”), which makes responses to this item dependent on the individual classroom context and should therefore be avoided when aiming for an invariant measure. Second, the wording of Item 17 explicitly refers to the construct of interest (“it is getting boring”), which provides the scale with a salient anchor: The item addresses L2 boredom directly rather than relying on a second-order operationalization of the construct.

4 Step 4: Confirming the structure of the short form

Using Sample 2, CFA of the measurement model of the proposed S-FLCBS was conducted (see Figure 3). The overall fit statistics indicated close-to-reasonable fit, χ²(35) = 133.420, p < .001, with CFI (.960) and SRMR (.034) both indicating close fit, whereas RMSEA (.067) and TLI (.936) indicated reasonable fit (Kenny, 2020). All item loadings were satisfactory. As such, the measurement model proposed in the previous steps using Sample 1 was confirmed in Sample 2.
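The reported RMSEA can be recovered from the chi-square statistic, its degrees of freedom, and the size of Sample 2 (n = 621). The sketch below assumes the common point-estimate formula with an N − 1 denominator; some software uses N instead, which does not change the result at three decimal places here. The function name is hypothetical.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Point estimate of RMSEA using the common (N - 1) denominator."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


value = rmsea(133.420, 35, 621)
print(round(value, 3))  # 0.067
```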

Figure 3.

Measurement model.

5 Step 5: Reliability and validity

The internal consistency of the S-FLCBS was found to be acceptable (α = .888; ω = .890). Convergent and divergent validity were confirmed via bivariate correlations with L2 grit, language learning curiosity, FLE, FLCA, and motivated behavior (see Table 5). The bivariate correlations between the S-FLCBS and the original 23-item BPELC-R were also examined. Convergent validity was confirmed via the statistically significant relationships between the S-FLCBS and FLCA, as well as the long-form BPELC-R. FLLB, as measured through the newly developed short form, was positively related to FLCA (r = .211; p < .001) and very highly positively correlated with the long-form BPELC-R (r = .954; p < .001). In turn, divergent validity was confirmed via the statistically significant negative correlations between the S-FLCBS and L2 grit, language learning curiosity, FLE, and motivated behavior (−.503 ⩽ r ⩽ −.222; p < .001; see Table 5). We are therefore confident that the S-FLCBS is an accurate measure of FLLB on the basis of its model fit, reliability, and relationships to other constructs within the known nomological network.

Table 5.

Correlation matrix.

	1.	2.	3.	4.	5.	6.	7.
1. L2 grit	-	.388**	.578**	−.198**	.660**	−.525**	−.503**
2. Curiosity		-	.414**	.144**	.522**	−.234**	−.222**
3. Enjoyment			-	−.316**	.649**	−.517**	−.489**
4. Anxiety				-	−.017	.244**	.211**
5. Motivation					-	−.410**	−.376**
6. BPELC-R						-	.954**
7. S-FLCBS							-

Note. BPELC-R = Boredom in Practical English Classes-Revised Scale; S-FLCBS = Short-Form Foreign Language Classroom Boredom Scale.

**p < .001.

6 Steps 6 and 7: Recombining the dataset and invariance testing

Multigroup CFA models were used to test invariance across genders, age groups, and countries (see Table 6). Firstly, the S-FLCBS was found to be fully invariant across genders, meeting all necessary cutoffs (i.e., a decrease in CFI of no more than .010, an increase in RMSEA of no more than .015, and an increase in SRMR of no more than .030 for metric invariance and .015 for scalar invariance). Therefore, male (n = 349) and female (n = 903) participants understood and completed the S-FLCBS in similar ways. Secondly, the S-FLCBS was also found to be fully invariant across the three constructed age groups: teenagers (< 20 years old; n = 285), young adults (20 ⩽ x ⩽ 24 years old; n = 834), and adults (⩾ 25 years old; n = 158). Therefore, participants from the three age groups showed invariant factor structures, factor loadings, and item intercepts. Lastly, invariance was examined across participants from three countries: Poland (n = 550), Hungary (n = 342), and Iran (n = 241).² Configural invariance (equal factor structures) and metric invariance (equal factor loadings) were supported; however, scalar invariance (equal item intercepts) failed the cutoff guidelines by Cheung and Rensvold (2002) and F.F. Chen (2007; see Table 6). As such, the S-FLCBS meets the requirements for weak measurement invariance (up to metric invariance) but does not show evidence of strong measurement invariance across nationality groups (Meredith, 1993). We therefore conclude that gender and age comparisons can reliably be made with the S-FLCBS; however, cross-country comparisons ought to be approached with caution.

Table 6.

Invariance testing results.

Invariance model	χ²	df	p value	CFI	RMSEA	SRMR	∆CFI	∆RMSEA	∆SRMR
Invariance across genders
Configural	257.722	70	<.001	.962	.065	.033
Metric	264.470	79	<.001	.962	.061	.038	<−.001	−.004	−.005
Scalar	283.680	88	<.001	.960	.060	.036	−.001	−.001	−.002
Invariance across age groups
Configural	338.922	105	<.001	.952	.074	.038
Metric	363.374	123	<.001	.951	.069	.049	−.001	−.005	−.009
Scalar	401.304	141	<.001	.947	.067	.048	−.004	−.002	−.001
Invariance across countries
Configural	326.585	105	<.001	.956	.075	.038
Metric	366.681	123	<.001	.952	.072	.057	−.004	−.003	−.019
Scalar	623.225	141	<.001	.905	.095	.069	−.050	−.023	−.012

Note. Indicators that failed to meet the cut-off guidelines are boldfaced.
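The decision rule applied to Table 6 can be sketched as follows. The example recomputes the differences from the rounded fit indices in the table, taking each difference as the more restrictive model minus the less restrictive one, so the values and signs may differ slightly from the tabled ∆ columns. The cutoffs follow the guidelines cited in the text (CFI may drop by at most .010, RMSEA may rise by at most .015, SRMR may rise by at most .030 for metric and .015 for scalar invariance). All names are hypothetical.

```python
# (CFI, RMSEA, SRMR) for the configural, metric, and scalar models, from Table 6.
FITS = {
    "gender": [(.962, .065, .033), (.962, .061, .038), (.960, .060, .036)],
    "age": [(.952, .074, .038), (.951, .069, .049), (.947, .067, .048)],
    "country": [(.956, .075, .038), (.952, .072, .057), (.905, .095, .069)],
}
SRMR_CUTOFF = {"metric": .030, "scalar": .015}
EPS = 1e-9  # tolerance for floating-point comparisons


def step_ok(prev, curr, step):
    """Check whether the more restrictive model (curr) stays within the cutoffs."""
    d_cfi = curr[0] - prev[0]
    d_rmsea = curr[1] - prev[1]
    d_srmr = curr[2] - prev[2]
    return (d_cfi >= -.010 - EPS
            and d_rmsea <= .015 + EPS
            and d_srmr <= SRMR_CUTOFF[step] + EPS)


def highest_level(fits):
    """Return the most restrictive invariance level supported by the cutoffs."""
    configural, metric, scalar = fits
    if not step_ok(configural, metric, "metric"):
        return "configural"
    if not step_ok(metric, scalar, "scalar"):
        return "metric"
    return "scalar"


results = {group: highest_level(fits) for group, fits in FITS.items()}
print(results)  # {'gender': 'scalar', 'age': 'scalar', 'country': 'metric'}
```

The sketch presupposes that the configural model itself fits acceptably, which is judged separately from the difference tests.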

VI Discussion

The present investigation aimed to develop a short version of Pawlak et al.’s (2020) BPELC-R scale, examine the reliability and validity of the tool, and determine whether it is characterized by measurement invariance in terms of gender, age, and nationality. The analyses conducted allowed a considerable reduction in the number of statements included in the initial instrument, resulting in the development of the S-FLCBS, which comprises a total of 10 items. The psychometric properties of the tool are discussed below, taking as a point of reference the three RQs that guided the study.

When it comes to RQ1, PCA run on one part of the dataset (Sample 1) provided support for a single-factor solution. This structure of L2 boredom is also reflected in the 10-item S-FLCBS and was corroborated through CFA conducted with the other part of the dataset (Sample 2). Results of the PCA as well as the interitem correlation matrix were used to manually select 10 items for inclusion in the S-FLCBS. These manually selected items refer directly to boredom (e.g., Item 17: “It seems that English classes are the same all the time; it is getting boring”) as well as to the experience of being bored (e.g., Item 10: “I often do not feel like doing anything in English classes”). This manual selection of items was further corroborated with the ACO algorithm, with one exception. The ACO algorithm selected Item 6 (“It takes more stimulation to get me going in English classes than most students from my group”), whereas we manually selected Item 17 (“It seems that English classes are the same all the time; it is getting boring”). Item 17 was primarily selected on theoretical grounds as it directly refers to boredom and does not include a peer comparison caveat. In addition, it should be noted that none of the items selected were reverse-scored, which means that the use of reverse-scored items as attention checks is not possible in the S-FLCBS. Although we are confident that the 10 items selected for the S-FLCBS reflect the underlying theory regarding boredom in L2 learning, admittedly, the reduction in the number of items likely reduced the level of depth with which boredom is measured in comparison to the long form of the scale.

Beyond the items selected, the identification of a unidimensional solution for the S-FLCBS was somewhat unexpected. On the one hand, this finding is surprising in view of the fact that previous studies have identified more than one factor underlying the construct under investigation. The most obvious case in point is the BPELC-R (Pawlak, Kruk, Zawodniak, et al., 2020), which constituted a point of departure for the present empirical investigation and which reflects a two-factor structure of boredom that was confirmed in a number of subsequent studies (e.g., Coşkun & Yüksel, 2021; Kruk et al., 2022, 2023). After-class boredom was also found to be multidimensional, as reflected in the BLEOS, which includes three factors (Pawlak, Kruk, Zawodniak, et al., 2022; Pawlak, Solhi, et al., 2024). Other researchers have identified even more factors underlying this negative emotion, as evident in the FLLBS developed by Li et al. (2023) in China, which includes seven subscales, or the PSBEC validated by Mousavian Rad et al. (2024) in the context of Iran, which comprises 11 subscales. Importantly, Li et al.’s (2024) recent SF-FLLBS is also characterized by a three-factor structure even though the tool only consists of 11 items. On the other hand, it could reasonably be argued that there is an overarching higher order dimension of boredom that might trump the individual contribution of the two lower order factors included in the BPELC-R: (1) disengagement, monotony, and repetitiveness; and (2) lack of satisfaction and challenge. The importance of this higher order factor may have become more pronounced when the number of items was considerably reduced in the S-FLCBS, all the more so because, in contrast to other instruments (e.g., Li et al., 2024), including the BPELC-R (Pawlak, Kruk, Zawodniak, et al., 2020), the newly developed instrument is only intended to tap into boredom as it is experienced during formal foreign language instruction. This reasoning has some empirical support since recent studies examining different facets of in-class and after-class boredom with state-of-the-art statistical procedures (e.g., Elahi Shirvan et al., 2024; Kruk et al., 2022; Pawlak, Kruk, Csizér, et al., 2023) have demonstrated that the global factor of boredom may play an important role in the measurement theory of this construct, also in relation to other emotions (e.g., foreign language enjoyment).

The reduction in the number of factors from the long-form BPELC-R to the short-form S-FLCBS needs to be considered by future researchers when selecting a measure of boredom in their studies. Reducing both the factor structure and the number of items inevitably reduces the granularity with which boredom is measured (i.e., from a fine-grained two-factor construct to a broader unidimensional one). Researchers are therefore faced with a trade-off. On the one hand, a reduced number of items results in quicker administration and the possibility to include a greater number of constructs in data collection. On the other hand, the complexity level of the construct resulting from the use of the measure is also reduced. This trade-off is not uncommon in the field of psychometrics and is often seen in personality scale development. For example, the International Personality Item Pool has been utilized to develop and validate both the 24-item, six-subfacet measure of conscientiousness (see Maples et al., 2014) and the four-item, unidimensional “mini-measure” of conscientiousness (see Donnellan et al., 2006). Both scales provide a valid and reliable measure of conscientiousness, but the long-form version inevitably provides more fine-grained information, whereas the short form can be quickly and easily administered in complex studies with multiple variables. The choice between long-form and short-form measures therefore needs to be made by researchers depending on the need and design of a particular study: Is it necessary to “zoom in” to examine the construct and understand the intricate complexities of each underlying facet, or is the aim to “zoom out” and examine the construct as a broader measure within a larger nomological network of variables?

In regard to RQ2, which pertained to the psychometric properties of the S-FLCBS, the analyses demonstrated that the instrument has acceptable reliability and validity, which bodes well for its use in future empirical studies. More specifically, Cronbach’s alpha and McDonald’s omega values (α = .888, ω = .890), which, interestingly, were almost identical to those identified for the two factors underpinning the BPELC-R (Pawlak, Kruk, Zawodniak, et al., 2020), indicate satisfactory internal consistency of the newly developed scale. Bivariate correlations also provided convincing evidence for convergent and divergent validity of the S-FLCBS. When it comes to the former, the scale correlated positively (albeit with a small effect size, r = .211) with the FLCAS, which is in line with theoretical assumptions as both boredom and anxiety represent negative emotions. It also largely corroborates previous research findings (e.g., Dewaele, Botes, & Greiff, 2023; Li & Wei, 2023) even if, possibly due to reliance on different data collection instruments, the strength of the correlation was lower in the present study. Even more importantly, the S-FLCBS was strongly related to the BPELC-R (r = .954), which, on the one hand, should hardly come as a surprise given that the former constitutes a reduced form of the latter, and, on the other, clearly indicates that the new scale successfully taps into the negative emotion of boredom. In relation to divergent validity, small-to-medium negative correlations were revealed between the S-FLCBS and variables believed to play a positive role in L2 learning processes and outcomes within the nomological network, that is, in the order of the magnitude of the relationship detected, L2 grit (r = −.503), FLE (r = −.489), motivated learning behavior (r = −.376), and language learning curiosity (r = −.222). Also in this case, the results are comparable to those of previous empirical studies, with the important caveat that different tools may have been used to collect the requisite data and different dimensions of these constructs may have been investigated (e.g., Dewaele, Botes, & Meftah, 2023; Kruk, 2016; Li & Wei, 2023; Pawlak, Kruk, Zawodniak, et al., 2022; Zhao et al., 2023). Taken together, the results demonstrate that the S-FLCBS is a valid and reliable research instrument that can be confidently used to tap into the negative emotion of in-class boredom.

Finally, in relation to RQ3, multigroup CFA models calculated for the entire dataset (Samples 1 and 2) supported measurement invariance of the S-FLCBS with respect to age (i.e., teenagers vs. young adults vs. adults) and gender (i.e., male vs. female participants), and, in part, country (i.e., Hungary vs. Iran vs. Poland). The important qualification is that while the analysis provided the necessary evidence for configural (factor structures), metric (factor loadings), and scalar (item intercepts) invariance for age and gender, this was not the case for nationality, where scalar invariance was not supported. Thus, it is possible to claim only weak measurement invariance of the S-FLCBS (Meredith, 1993) when this research instrument is employed to collect data on in-class boredom in different countries. One plausible explanation for this finding could be that boredom is perceived and manifested in somewhat different ways in various cultural and educational contexts, which may affect how the items included in the scale are understood and responded to. Such a situation is not limited to measuring boredom and has been observed in relation to other ID factors, such as grammar learning strategies, in which case numerous items needed to be eliminated in the process of validating the requisite tools (see, e.g., Pawlak, Derakhshan, et al., 2023). One way or another, these results indicate that caution should be exercised when the S-FLCBS is employed to compare students from diverse national backgrounds with respect to their in-class boredom (perhaps alongside other ID variables). Further research is needed to examine cross-cultural differences in the experience and conceptualization of boredom. In addition, future research utilizing the S-FLCBS with diverse participants needs to examine measurement invariance across groups before embarking on any cross-cultural, group-level comparison studies.

Inevitably, the present study has some limitations that should be acknowledged. First, the level of English proficiency was not taken into account in the analysis and this could potentially have been an important variable when validating the S-FLCBS and also when determining the extent to which it is measurement invariant. However, since participants came from different national contexts and the datasets originated from independent studies, there was no foolproof way of ensuring that information in this regard would be valid and comparable (i.e., different assessment criteria, different scales used to elicit self-perceived proficiency). Second, due to the nature of the dataset, the sample sizes from different countries were not sufficient for the necessary analyses in all cases, which resulted, for example, in the exclusion of the students from Iraq, which could have impacted the results for measurement invariance. Third, again owing to the way the dataset was collated, this study did not include a qualitative component that would have allowed further validation of the scale by, for example, asking students representing different ages, genders, and nationalities to provide their insights on items that should ultimately be retained in the S-FLCBS. Fourth, since the data collected in this study were cross-sectional in nature, test–retest reliability could not be determined to establish the stability of boredom as measured with the S-FLCBS. For this reason, research examining the test–retest reliability of the S-FLCBS with longitudinal data is needed.

VII Conclusion

The empirical investigation reported in the present paper allowed the development and validation of the reduced version of the BPELC-R scale (Pawlak, Kruk, Zawodniak, et al., 2020), that is, the 10-item S-FLCBS, and offered evidence for the measurement invariance of the scale in relation to age, gender, and, to some extent, nationality. As a result, researchers embarking on studies into boredom in L2 learning and teaching have at their disposal yet another valid and reliable data collection instrument that they can include in composite questionnaires tapping other emotions and ID factors when complex designs need to be employed. In fact, the S-FLCBS represents a valuable addition to the FLLBS-SF (Li et al., 2024) not only because it is likely to be more suitable in some national and educational settings, but also because it squarely focuses on boredom experienced when learning a foreign language (not only English) in instructional settings. That said, it would be instructive to further validate the scale with students of other nationalities and also to make it more domain-specific by, for example, adjusting it to reflect the experience of boredom when learning specific TL areas (e.g., grammar, reading, writing) as well as specific contexts in which L2 learning and teaching may take place (e.g., computer-assisted or intermediate-level English language learning). In addition, there are grounds to assume that, thanks to its brevity and the way in which the items are worded, in contrast to the BPELC-R, the utility of the S-FLCBS could extend beyond English majors and thus the instrument could be successfully used with L2 learners in other university programs and at lower educational levels, such as secondary school. These assumptions, however, need to be verified empirically in further studies.

Footnotes

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Science Center, Poland (NCN 2022/45/B/HS2/00187, 2023–2025).

ORCID iDs

Mirosław Pawlak

Elouise Botes

Mariusz Kruk

Notes

References

Ally

(2008). What is wrong with current theorizations of ‘boredom’? Acta Academica, 40(3), 35–66.

Barchard

K.A.

Grob

K.E.

Roe

M.J.

(2017). Is sadness blue? The problem of using figurative language for emotions on psychological tests. Behavior Research Methods, 49, 443–456. https://doi.org/10.3758/s13428-016-0713-5

Botes

Dewaele

J.-M.

Greiff

(2021). The development and validation of the short form of the Foreign Language Enjoyment Scale. The Modern Language Journal, 105(4), 858–876. https://doi.org/10.1111/modl.12741

Botes

Dewaele

J.-M.

Greiff

Goetz

(2024). Can personality predict foreign language classroom emotions? The devil’s in the detail. Studies in Second Language Acquisition, 46(1), 51–74. https://doi.org/10.1017/S0272263123000153

Botes

van der Westhuizen

Dewaele

J.M.

MacIntyre

Greiff

(2022). Validating the Short-Form Foreign Language Classroom Anxiety Scale. Applied Linguistics, 43(5), 1006–1033. https://doi.org/10.1093/applin/amac018

Camacho-Morles

Slemp

G.R.

Pekrun

Loderer

Hou

Oades

L.G.

(2021). Activity achievement emotions and academic performance: A meta-analysis. Educational Psychology Review, 33(3), 1051–1095. https://doi.org/10.1007/s10648-020-09585-3

Carlson

Wilcox

Chou

C.P.

Chang

Yang

Blanchard

Marterella

Kuo

Clark

(2011). Psychometric properties of reverse-scored items on the CES-D in a sample of ethnically diverse older adults. Psychological Assessment, 23(2), 558–562. https://doi.org/10.1037/a0022484

Chaffee

K.E.

Lou

N.M.

Noels

K.A.

Katz

J.W.

(2020). Why don’t “real men” learn languages? Masculinity threat and gender ideology suppress men’s language learning motivation. Group Processes & Intergroup Relations, 23(2), 301–318. https://doi.org/10.1177/1368430219835025

Chapman

K.E.

(2013). Boredom in the German foreign language classroom (Publication No. 3566370) [Doctoral dissertation, University of Wisconsin-Madison]. ProQuest Dissertations & Theses.

10.

Chen

F.F.

(2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834

11. Chen, Sun, & Yang (2024). Understanding Chinese second language learners’ foreign language learning boredom in online classes: Its conceptual structure and sources. Journal of Multilingual and Multicultural Development, 45(8), 3291–3307. https://doi.org/10.1080/01434632.2022.2093887

12. Cheung, G.W., & Rensvold, R.B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 9(2), 233–255. https://doi.org/10.1207/S15328007SEM0902_5

13. Clark, L.A., & Watson (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7(3), 309–319. https://doi.org/10.1037/1040-3590.7.3.309

14. Coşkun & Yüksel (2021). Examining English as a foreign language students’ boredom in terms of different variables. Acuity: Journal of English Language Pedagogy, Literature and Culture, 7(1), 19–36. https://doi.org/10.35974/acuity.v7i2.2539

15. Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.

16. Csizér, Smid, Zólyomi, & Albert (2024). Motivation, autonomy, and emotions in foreign language learning: A multi-perspective investigation in Hungary. Multilingual Matters.

17. Daschmann, E.C., Goetz, & Stupnisky, R.H. (2014). Exploring the antecedents of boredom: Do teachers know why students are bored? Teaching and Teacher Education, 39, 22–30. https://doi.org/10.1016/j.tate.2013.11.009

18. Derakhshan, Kruk, Mehdizadeh, & Pawlak (2021). Boredom in online classes in the Iranian EFL context: Sources and solutions. System, 101, Article 102556. https://doi.org/10.1016/j.system.2021.102556

19. Derakhshan, Kruk, Mehdizadeh, & Pawlak (2022). Activity-induced boredom in online EFL classes. ELT Journal, 76(1), 58–68. https://doi.org/10.1093/elt/ccab072

20. Dewaele, J.-M., Botes, & Greiff (2023). Sources and effects of foreign language enjoyment, anxiety, and boredom: A structural equation modeling approach. Studies in Second Language Acquisition, 45(2), 461–479. https://doi.org/10.1017/S0272263122000328

21. Dewaele, J.-M., Botes, & Meftah (2023). A three-body problem: The effects of foreign language anxiety, enjoyment, and boredom on academic achievement. Annual Review of Applied Linguistics, 43, 7–22. https://doi.org/10.1017/S0267190523000016

22. Dewaele, J.-M. (2021). Teacher enthusiasm and students’ social-behavioral learning engagement: The mediating role of student enjoyment and boredom in Chinese EFL classes. Language Teaching Research, 25(6), 922–945. https://doi.org/10.1177/13621688211014538

23. Dewaele, J.-M., & MacIntyre, P.D. (2014). The two faces of Janus? Anxiety and enjoyment in the foreign language classroom. Studies in Second Language Learning and Teaching, 4(2), 237–274. https://doi.org/10.14746/ssllt.2014.4.2.5

24. Donnellan, M.B., Oswald, F.L., Baird, B.M., & Lucas, R.E. (2006). The mini-IPIP scales: Tiny-yet-effective measures of the Big Five factors of personality. Psychological Assessment, 18(2), 192–203. https://doi.org/10.1037/1040-3590.18.2.192

25. Dörendahl & Greiff (2020). Are the machines taking over? Benefits and challenges of using algorithms in (short) scale construction. European Journal of Psychological Assessment, 36(2), 217–219. https://doi.org/10.1027/1015-5759/a000597

26. Dörnyei & Dewaele, J.-M. (2022). Questionnaires in second language research: Construction, administration, and processing. Routledge. https://doi.org/10.4324/9781003331926

27. Elahi Shirvan, Taherian, Kruk, & Pawlak (2024). Testing associations between global and specific levels of foreign language enjoyment and foreign language boredom: The moderator role of L2 savoring beliefs using bifactor exploratory structural equation modeling. Journal of Multilingual and Multicultural Development. Advance online publication. https://doi.org/10.1080/01434632.2024.2396054

28. Eren & Coskun (2016). Students’ level of boredom, boredom coping strategies, epistemic curiosity, and graded performance. The Journal of Educational Research, 109(6), 574–588. https://doi.org/10.1080/00220671.2014.999364

29. Fahlman, S.A. (2009). Development and validation of the Multidimensional State Boredom Scale [Unpublished doctoral dissertation]. York University.

30. Farmer, R.F., & Sundberg, N.D. (1986). Boredom proneness: The development and correlates of a new scale. Journal of Personality Assessment, 50(1), 4–17.

31. Fathi, Pawlak, Kruk, & Naderi (2023). Modelling boredom in the EFL context: An investigation of the role of coping self-efficacy, mindfulness, and foreign language enjoyment. Language Teaching Research. Advance online publication. https://doi.org/10.1177/13621688231182176

32. Field (2013). Discovering statistics using SPSS (5th ed.). Sage.

33. Galesic & Bosnjak (2009). Effects of questionnaire length on participation and indicators of response quality in a web survey. Public Opinion Quarterly, 73(2), 349–360. https://doi.org/10.1093/poq/nfp031

34. Goetz, Frenzel, A.C., Hall, N.C., Nett, Pekrun, & Lipnevich (2014). Types of boredom: An experience sampling approach. Motivation and Emotion, 38, 401–419. https://doi.org/10.1007/s11031-013-9385-y

35. Hagtvet & Sipos (2016). Creating short forms for construct measures: The role of exchangeable forms. Pedagogika, 66(6), 689–713. https://doi.org/10.14712/23362189.2016.346

36. Heene, Bollmann, & Bühner (2014). Much ado about nothing, or much to do about something? Effects of scale shortening on criterion validity and mean differences. Journal of Individual Differences, 35(4), 245–249. https://doi.org/10.1027/1614-0001/a000146

37. Horwitz, E.K., Horwitz, M.B., & Cope (1986). Foreign language classroom anxiety. The Modern Language Journal, 70(2), 125–132. https://doi.org/10.1111/j.1540-4781.1986.tb05256.x

38. Kenny, D.A. (2020, October 6). Measuring model fit. http://davidakenny.net/cm/fit.htm

39. Kline (2014). An easy guide to factor analysis. Taylor & Francis.

40. Kruk (2016a). Investigating the changing nature of boredom in the English language classroom: Results of a study. In Dłutek & Pietrzak (Eds.), Nowy wymiar filologii (Vol. 1, pp. 252–263). Płock.

41. Kruk (2016b). Variations in motivation, anxiety and boredom in learning English in Second Life. The EuroCALL Review, 24(1), 25–39. https://doi.org/10.4995/eurocall.2016.5693

42. Kruk, Pawlak, Elahi Shirvan, & Soleimanzadeh (2023). Revisiting boredom in practical English language classes via exploratory structural equation modeling. Research Methods in Applied Linguistics, 2(1), Article 100038. https://doi.org/10.1016/j.rmal.2022.100038

43. Kruk, Pawlak, Elahi Shirvan, Taherian, & Yazdanmehr (2022). A longitudinal study of foreign language enjoyment and boredom: A latent growth curve modeling. Language Teaching Research, 29(3), 1007–1038. https://doi.org/10.1177/13621688221082303

44. Kruk, Pawlak, Taherian, Yüce, Shirvan, M.E., & Barabadi (2023). When time matters: Mechanisms of change in a mediational model of foreign language playfulness and L2 learners’ emotions using latent change score mediation model. Studies in Second Language Learning and Teaching, 13(1), 39–69. https://doi.org/10.14746/ssllt.37174

45. Kruk, Pawlak, & Zawodniak (2021). Another look at boredom in language instruction: The role of the predictable and the unexpected. Studies in Second Language Learning and Teaching, 11(1), 15–40. http://doi.org/10.14746/ssllt.2021.11.1.2

46. Kruk & Zawodniak (2017). Nuda a praktyczna nauka języka angielskiego [Boredom in practical English language classes]. Neofilolog, 49(1), 115–131. https://doi.org/10.14746/n.2017.49.1.07

47. Kruk & Zawodniak (2018). Boredom in practical English language classes: Insights from interview data. In Szymański, Zawodniak, Łobodziec, & Smoluk (Eds.), Interdisciplinary views on the English language, literature and culture (pp. 177–191). Uniwersytet Zielonogórski.

48. Lan, Zhao, & Gong (2023). Motivational intensity and willingness to communicate in L2 learning: A moderated mediation model of enjoyment, boredom, and shyness. System, 117, Article 103116. https://doi.org/10.1016/j.system.2023.103116

49. (2021). A control-value theory approach to boredom in English classes among university students in China. The Modern Language Journal, 105(1), 317–334. https://doi.org/10.1111/modl.12693

50. (2022). Foreign language learning boredom and enjoyment: The effects of learner variables and teacher variables. Language Teaching Research, 29(4), 1499–1524. https://doi.org/10.1177/13621688221090324

51. Dewaele, J.-M. (2023). Foreign language learning boredom: Conceptualization and measurement. Applied Linguistics Review, 14(2), 223–249. https://doi.org/10.1515/applirev-2020-0124

52. Feng (2025). Boredom and achievement in L2 learning: A meta-analysis. Applied Linguistics Review, 6(5), 2373–2399. https://doi.org/10.1515/applirev-2024-0266

53. Feng, Zhao, & Dewaele, J.-M. (2024). Foreign language learning boredom: Refining its measurement and determining its role in language learning. Studies in Second Language Acquisition, 46(3), 893–920. https://doi.org/10.1017/S0272263124000366

54. Wei (2023). Anxiety, enjoyment, and boredom in language learning amongst junior secondary students in rural China: How do they contribute to L2 achievement? Studies in Second Language Acquisition, 45(1), 93–108. https://doi.org/10.1017/S0272263122000031

55. Yang (2024). Domain-general grit and domain-specific grit: Conceptual structures, measurement, and associations with the achievement of German as a foreign language. International Review of Applied Linguistics in Language Teaching, 62(4), 1513–1537. https://doi.org/10.1515/iral-2022-0196

56. MacIntyre, P.D. (1992). Anxiety and language learning from a stages of processing perspective [Unpublished doctoral dissertation]. The University of Western Ontario.

57. Macklem, G.L. (2015). Boredom in the classroom: Addressing student motivation, self-regulation, and engagement in learning. Springer.

58. Mahmoodzadeh & Khajavy, G.H. (2019). Towards conceptualizing language learning curiosity in SLA: An empirical study. Journal of Psycholinguistic Research, 48, 333–351. https://doi.org/10.1007/s10936-018-9606-3

59. Maples, J.L., Guan, Carter, N.T., & Miller, J.D. (2014). A test of the International Personality Item Pool representation of the Revised NEO Personality Inventory and development of a 120-item IPIP-based measure of the Five-Factor model. Psychological Assessment, 26(4), 1070–1084. https://doi.org/10.1037/pas0000004

60. Marsh, H.W., Ellis, L.A., Parada, R.H., Richards, & Heubeck, B.G. (2005). A short version of the Self-Description Questionnaire II: Operationalizing criteria for short-form evaluation with new applications of confirmatory factor analyses. Psychological Assessment, 17(1), 81–102. https://doi.org/10.1037/1040-3590.17.1.81

61. McGuire, B.E., Morrison, T.G., Hermanns, Skovlund, Eldrup, Gagliardino, Kokoszka, Matthews, Pibernik-Okanović, Rodríguez-Saldaña, de Wit, & Snoek, F.J. (2010). Short-form measures of diabetes-related emotional distress: The Problem Areas in Diabetes Scale (PAID)-5 and PAID-1. Diabetologia, 53(1), 66–69. https://doi.org/10.1007/s00125-009-1559-5

62. Meade, A.W. (2005, April 14–17). Sample size and tests of measurement invariance [Paper presentation]. 20th Annual Conference of the Society for Industrial and Organizational Psychology, Los Angeles, CA.

63. Meredith (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825

64. Millsap, R.E. (2011). Statistical approaches to measurement invariance. Routledge. https://doi.org/10.4324/9780203821961

65. Mousavian Rad, S.E., Roohani, & Mirzaei (2024). Developing and validating precursors of students’ boredom in EFL classes: An exploratory sequential mixed-methods study. Journal of Multilingual and Multicultural Development, 45(8), 3010–3027. https://doi.org/10.1080/01434632.2022.2082448

66. Nakamura, Darasawang, & Reinders (2021). The antecedents of boredom in L2 classroom learning. System, 98, Article 102469. https://doi.org/10.1016/j.system.2021.102469

67. Nett, U.E., Goetz, & Daniels, L.M. (2010). What to do when feeling bored? Students’ strategies for coping with boredom. Learning and Individual Differences, 20(6), 626–638. https://doi.org/10.1016/j.lindif.2010.09.004

68. Olaru, Witthöft, & Wilhelm (2015). Methods matter: Testing competing models for designing short-scale Big-Five assessments. Journal of Research in Personality, 59, 56–68. https://doi.org/10.1016/j.jrp.2015.09.001

69. Pawlak, Derakhshan, Mehdizadeh, & Kruk (2022). Boredom in online English language classes: Mediating variables and coping strategies. Language Teaching Research, 29(2), 509–534. https://doi.org/10.1177/13621688211064944

70. Pawlak, Derakhshan, Mehdizadeh, & Kruk (2023). Yet another look at strategies for learning grammar: Validating the Grammar Learning Strategy Inventory in the Iranian EFL context. System, 118, Article 103139. https://doi.org/10.1016/j.system.2023.103139

71. Pawlak, Fathi, & Kruk (2024). The Domain-Specific Grammar Grit Questionnaire: A cross-cultural validation study. Journal of Multilingual and Multicultural Development. Advance online publication. https://doi.org/10.1080/01434632.2024.2322692

72. Pawlak, Kruk, Csizér, & Zawodniak (2024). Investigating in-class and after-class boredom among advanced learners of English: Intensity, interrelationships and learner profiles. Applied Linguistics Review, 15(6), 2537–2564. https://doi.org/10.1515/applirev-2022-0150

73. Pawlak, Kruk, & Zawodniak (2022). Investigating individual trajectories in experiencing boredom in the language classroom: The case of 11 Polish students of English. Language Teaching Research, 26(4), 598–616. https://doi.org/10.1177/1362168820914004

74. Pawlak, Kruk, & Zawodniak (2024). Teachers reflecting on boredom in the language classroom. Equinox Publishing.

75. Pawlak, Kruk, Zawodniak, & Pasikowski (2020). Investigating factors responsible for boredom in English classes: The case of advanced learners. System, 91, Article 102259. https://doi.org/10.1016/j.system.2020.102259

76. Pawlak, Kruk, Zawodniak, & Pasikowski (2022). Examining the underlying structure of after-class boredom experienced by English majors. System, 106, Article 102769. https://doi.org/10.1016/j.system.2022.102769

77. Pawlak, Solhi, Elahi Shirvan, Kruk, & Taherian (2024). Revisiting after-class boredom via exploratory structural equation modeling. International Review of Applied Linguistics in Language Teaching, 62(4), 1827–1851. https://doi.org/10.1515/iral-2022-0151

78. Pawlak, Zarrinabadi, & Kruk (2024). Positive and negative emotions, L2 grit and perceived competence as predictors of L2 motivated behavior. Journal of Multilingual and Multicultural Development, 45(8), 3188–3204. https://doi.org/10.1080/01434632.2022.2091579

79. Pawlak, Zawodniak, & Kruk (2020a). Boredom in the foreign language classroom: A micro-perspective. Springer.

80. Pawlak, Zawodniak, & Kruk (2020b). The neglected emotion of boredom in teaching English to advanced learners. International Journal of Applied Linguistics, 30(3), 497–509. https://doi.org/10.1111/ijal.12302

81. Pekrun (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18(4), 315–341. https://doi.org/10.1007/s10648-006-9029-9

82. Pekrun, Goetz, Frenzel, A.C., Barchfeld, & Perry, R.P. (2011). Measuring emotions in students’ learning and performance: The Achievement Emotions Questionnaire (AEQ). Contemporary Educational Psychology, 36(1), 36–48. https://doi.org/10.1016/j.cedpsych.2010.10.002

83. Pekrun, Hall, N.C., Goetz, & Perry, R.P. (2014). Boredom and academic achievement: Testing a model of reciprocal causation. Journal of Educational Psychology, 106(3), 696–710. https://doi.org/10.1037/a0036006

84. Pekrun, Vogl, Muis, K.R., & Sinatra, G.M. (2016). Measuring emotions during epistemic activities: The Epistemically-related emotion scales. Cognition and Emotion, 31(6), 1268–1276. https://doi.org/10.1080/02699931.2016.1204989

85. Plonsky & Oswald, F.L. (2014). How big is “big”? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912. https://doi.org/10.1111/lang.12079

86. Raborn, A.W., & Leite, W.L. (2018). ShortForm: An R package to select scale short forms with the ant colony optimization algorithm. Applied Psychological Measurement, 42(6), 516–517. https://doi.org/10.1177/0146621617752993

87. Rolstad, Adler, & Rydén (2011). Response burden and questionnaire length: Is shorter better? A review and meta-analysis. Value in Health, 14(8), 1101–1108. https://doi.org/10.1016/j.jval.2011.06.003

88. Rosseel (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02

89. Scherer, K.R., & Moors (2019). The emotion process: Event appraisal and component differentiation. Annual Review of Psychology, 70, 719–745. https://doi.org/10.1146/annurev-psych-122216-011854

90. Schoeni, R.F., Stafford, McGonagle, K.A., & Andreski (2013). Response rates in national panel surveys. The Annals of the American Academy of Political and Social Science, 645(1), 60–87. https://doi.org/10.1177/0002716212456363

91. Solhi, Derakhshan, & Ünsal (2025). Associations between EFL students’ L2 grit, boredom coping strategies, and emotion regulation strategies: A structural equation modeling approach. Journal of Multilingual and Multicultural Development, 46(2), 224–243. https://doi.org/10.1080/01434632.2023.2175834

92. Tabachnick, B.G., & Fidell, L.S. (2001). Using multivariate statistics. Allyn & Bacon.

93. Taguchi, Magid, & Papi (2009). The L2 motivational self system among Japanese, Chinese and Iranian learners of English: A comparative study. In Dörnyei & Ushioda (Eds.), Motivation, language identity and the L2 self (pp. 66–97). Multilingual Matters.

94. Taherian, Shirvan, M.E., Yazdanmehr, Kruk, & Pawlak (2024). A longitudinal analysis of informal digital learning of English, willingness to communicate and foreign language boredom: A latent change score mediation model. Asia-Pacific Education Researcher, 33, 997–1010. https://doi.org/10.1007/s40299-023-00751-z

95. Teimouri, Plonsky, & Tabandeh (2022). L2 grit: Passion and perseverance for second-language learning. Language Teaching Research, 26(5), 893–918. https://doi.org/10.1177/1362168820921895

96. Tulis & Fulmer, S.M. (2013). Students’ motivational and emotional experiences and their relationship to persistence during academic challenge in mathematics and reading. Learning and Individual Differences, 27, 35–46. https://doi.org/10.1016/j.lindif.2013.06.003

97. Tze, V.M.C., Daniels, L.M., & Klassen, R.M. (2016). Evaluating the relationship between boredom and academic outcomes: A meta-analysis. Educational Psychology Review, 28, 119–144. https://doi.org/10.1007/s10648-015-9301-y

98. Tze, V.M.C., Klassen, R.M., & Daniels, L.M. (2014). Patterns of boredom and its relationship with perceived autonomy support and engagement. Contemporary Educational Psychology, 39(3), 175–187. https://doi.org/10.1016/j.cedpsych.2014.05.001

99. Woods, C.M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28(3), 186–191. https://doi.org/10.1007/s10862-005-9004-7

100. Zhang, Saadeian, & Fathi (2024). Testing a model of growth mindset, ideal L2 self, boredom, and WTC in an EFL context. Journal of Multilingual and Multicultural Development, 45(8), 3450–3465. https://doi.org/10.1080/01434632.2022.2100893

101. Zhao, Lan, & Chen (2023). Motivational intensity and self-perceived Chinese language proficiency: A moderated mediation model of L2 enjoyment and boredom. Language Teaching Research. Advance online publication. https://doi.org/10.1177/13621688231180465

102. Zhao & Wang (2025). The role of enjoyment and boredom in shaping English language achievement among ethnic minority learners. Journal of Multilingual and Multicultural Development, 46(3), 668–680. https://doi.org/10.1080/01434632.2023.2194872

Development and validation of the Short-Form Foreign Language Classroom Boredom Scale (S-FLCBS)
