Abstract
In line with current trends in the field to reduce the number of items in scales and to provide thorough validation evidence, the study aimed to develop and validate a short-form version of the 21-item Boredom in Learning English Outside of School (BLEOS) scale. A multinational dataset of n = 1,133 participants was divided into two samples: the first sample was used to develop the Short-form Boredom in Learning English Outside of School (S-BLEOS) scale while the second sample was used to validate the scale. Development efforts included examining the underlying factor structure of the 21-item scale and selecting items through a combination of factor loadings, item reliabilities, and inter-item correlations. Validation efforts included testing the resultant measurement model through confirmatory factor analysis, as well as establishing convergent and discriminant validity by contrasting the S-BLEOS with existing boredom and other individual difference variables in the known foreign language learning nomological network. The development of the S-BLEOS resulted in a unidimensional model of 10 items. The measurement model indicated a close fit, with convergent and discriminant validity established. Importantly, the S-BLEOS was shown to be a construct that was related to but distinct from in-class language learning boredom. In addition, invariance testing was carried out across gender, age, and nationality groups, with the S-BLEOS determined to be fully invariant. As such, the S-BLEOS can be used to measure out-of-class language learning boredom for adolescents and adults and can be fairly administered across genders, age groups, and nationalities.
I Introduction
Boredom has been the focus of empirical inquiry in educational psychology as one of the most pervasive and intensely experienced academic emotions, which can have a deleterious effect on the learning of different school subjects, even if it has been often ignored, being mistaken for apathy, laziness or depression (Camacho-Morles et al., 2021; Macklem, 2015; Pekrun et al., 2014; Tulis & Fulmer, 2013; Tze et al., 2014). The construct is exceedingly complex, not only because it entails various affective, cognitive, expressive, motivational, and physiological factors (Nett, 2010; Scherer & Moors, 2019), but also because it can be triggered by a range of causes, come in different shapes and colors, and vary considerably in intensity (Daschmann et al., 2014; Goetz et al., 2014). While it has been less than a decade since research into boredom in second and foreign language (L2) education got under way in earnest, mainly thanks to the work undertaken in the Polish educational setting (e.g. Kruk, 2016; Kruk & Zawodniak, 2018; Pawlak, Kruk, et al., 2020; Pawlak, Zawodniak, et al., 2020), empirical studies have proliferated with lightning speed. This line of inquiry has moved beyond its initial focus on examining the causes of this negative emotion, its fluctuations over time and underlying structure to exploring its complex links to other academic emotions and individual difference (ID) variables as well as attainment, increasingly relying on longitudinal designs and complex statistical procedures (e.g. Dewaele et al., 2023a, 2023b; Fathi et al., 2023; Kruk & Pawlak, 2022; Kruk et al., 2023; Lan et al., 2023; C. Li et al., 2023, 2025; L. Li et al., 2025; Nakamura et al., 2021; Pawlak, Derakhshan et al., 2022; Pawlak & Kruk, 2022; Pawlak, Kruk et al., 2020; Solhi et al., 2023; Taherian et al., 2024; Zhao & Wang, 2023, 2025; Zhang et al., 2022).
However, the bulk of research into L2 boredom has focused on this negative emotion as it is experienced in educational settings, that is, during L2 classes, largely ignoring its causes, manifestations and effects in learning an additional language in a wide range of out-of-class situations (Pawlak et al., 2022, 2023). This is unfortunate given that a substantial share of L2 learning, especially of English, presently occurs outside language classrooms, ‘in the wild’ or ‘extramurally’, thanks to widespread access to the media as well as the ubiquity of computer games and other types of digital technologies (De Wilde & Eyckmans, 2017; Sundqvist, 2024). In fact, as research indicates, informal digital learning of English (Guo & Lee, 2023), but also of languages other than English (Liu et al., 2024), is playing an increasingly important role and this trend can only be expected to gain momentum in view of the growing impact of artificial intelligence (AI) (Koç & Savaş, 2025). Although, on the face of it, there should be no place for boredom in out-of-school contexts since they allow students to manifest a considerable degree of autonomy, agency, and self-direction, this assumption would be quite unwarranted. For one thing, extramural learning of additional languages is often closely related to classroom-based L2 learning as students work on homework assignments or take steps to better prepare for upcoming tests or examinations. While this scenario is perhaps easy to understand, boredom can also set in when learners make entirely their own choices about what, when and how to learn, thus thwarting their well-intentioned efforts to enhance target language (TL) skills. For example, watching shows with the original soundtrack and L2 subtitles or using AI-based software may also become tedious at some point, even if these activities are not directly linked to schoolwork.
To situate this distinction theoretically, the control-value theory of achievement emotions (Pekrun, 2006; Pekrun et al., 2010) offers a useful explanatory framework, according to which boredom arises when learners appraise low control over the task and/or low value of the activity. In classroom contexts, control is largely external and teacher-driven, whereas after class it is nominally transferred to the learner. Nevertheless, boredom may still occur when self-selected or teacher-imposed out-of-school tasks are perceived as unchangeable, trivial, or lacking in personal relevance. This interpretation is consistent with the attentional theory of boredom proneness (Eastwood et al., 2007) and with under- and over-stimulation models (Hill & Perkins, 1985; Larson & Richards, 1991), which explain how lapses of attention and mismatches between challenge and skill can generate disengagement outside the classroom.
In light of the above considerations, there is an evident need for more empirical studies that would explore different aspects of what Pawlak et al. (2022) referred to as ‘after-class boredom’. If such studies were to mirror the main lines of inquiry in the case of in-class boredom – with their emphasis on examining links between after-class boredom, other ID variables, and TL attainment with the help of advanced statistical procedures – then a necessary condition is reliance on measures of this construct that are not only valid and reliable, but also practical with respect to data collection. When we examine the existing tools, it becomes clear that they do not fully meet all of these criteria. The Foreign Language Learning Boredom Scale (FLLBS; C. Li et al., 2023) includes a subscale reflecting ‘homework boredom’, measured by means of four items but, as indicated above, these cannot do justice to the complexity of out-of-class boredom, not to mention that none of these items are included in the recently published short version of the scale (C. Li et al., 2024). On the other hand, the Boredom in Learning English Outside of School (BLEOS) scale, developed by Pawlak, Kruk, et al. (2024), is specifically intended to tap into different aspects of after-class boredom, and it has been shown to be valid and reliable. Nonetheless, it consists of 21 items, which makes it impractical to employ together with several other scales, since this would result in a lengthy questionnaire with highly negative consequences for completion rates, the logistics of data collection and ultimately also validity and reliability (Dörnyei & Dewaele, 2022; Galesic & Bosnjak, 2009; Heene et al., 2014; Rolstad et al., 2011). Thus, the present article reports a study which aimed to develop and validate a shorter version of the BLEOS, to examine its psychometric properties and to explore its measurement invariance.
II Literature review
1 Boredom in educational and out-of-school contexts
Boredom is a low-arousal, negative emotional state characterized by feelings of dissatisfaction, disengagement, and the perception that time passes slowly (Fahlman, 2009). It often arises when individuals perceive their environment or activities as lacking in novelty, stimulation, or meaningfulness, resulting in reduced motivation and cognitive engagement, making tasks feel tedious or unworthy of effort (Eastwood et al., 2012; Hill & Perkins, 1985). Factors contributing to boredom include a perceived lack of control over tasks, insufficient challenge, and repetitive or monotonous activities (Hill & Perkins, 1985; Pekrun, 2006). Although typically seen as detrimental, boredom can sometimes prompt individuals to seek new experiences or engage in creative thinking to alleviate discomfort, potentially acting as a catalyst for positive change (Craven & Frick, 2024; Daniels et al., 2015). However, when boredom becomes chronic, it may lead to avoidance behaviors, reduced performance, and negative psychological outcomes, highlighting its potentially harmful effects across various domains of life (Eastwood et al., 2012).
In L2 learning, boredom emerges as a complex and multifaceted emotional state that occurs when learners feel disengaged, dissatisfied, and perceive classroom activities as uninspiring or insufficiently challenging (Kruk et al., 2021). In this setting, boredom is marked by a lack of interest, passivity, and withdrawal from assigned tasks, which significantly hinders L2 learning by reducing motivation, diminishing cognitive engagement, and ultimately impairing performance. It is particularly prevalent when instructional content is seen as monotonous, irrelevant, or overly repetitive, leading to frustration, restlessness, and a lack of enthusiasm for learning (C. Li et al., 2023; Pawlak, Zawodniak, et al., 2020). Factors contributing to L2 boredom can be both external, such as teaching methods that lack variety or fail to encourage active participation, and internal, including learners’ motivational orientations, emotion regulation skills, or overall disposition toward L2 learning (C. Li et al., 2023). Accordingly, the occurrence of boredom can be explained within the framework of the control-value theory of achievement emotions (Pekrun, 2006; Pekrun et al., 2010), which holds that boredom arises when learners perceive low control over an activity or attach little value to it. From the perspective of self-determination theory (Deci & Ryan, 1985, 2000), the onset of this negative emotion reflects frustration of the basic needs for autonomy, competence and relatedness, while flow theory (Csikszentmihalyi, 1990) links it to an imbalance between challenge and skill. When learners perceive they have little control over the tasks they are expected to complete or consider these tasks to lack value or relevance (Pekrun, 2006), boredom can result in mental withdrawal, reduced effort, and resistance, ultimately hindering progress in L2 learning (Zawodniak et al., 2023). 
While L2 boredom is generally considered detrimental, it can occasionally encourage learners to reassess their learning strategies and seek more engaging study methods. However, its negative impacts, such as decreased participation, diminished motivation, and lower learning outcomes, underscore the need for diverse, meaningful, and appropriately challenging instructional strategies to sustain interest and promote effective L2 development (Al-Amri, 2025; Mahmoudi-Gahrouei et al., 2025; Pawlak, Zawodniak, et al., 2020).
As highlighted in the introduction, boredom in L2 learning is not confined to the classroom but also occurs outside formal educational settings. It can take on such forms as ‘leisure boredom’ and ‘homework-related boredom’, which have been extensively studied in educational psychology and sociology to understand the negative emotions learners experience during self-directed activities (Dettmers et al., 2011; Shaw et al., 1996). Leisure boredom arises when individuals feel they have too much free time but cannot find meaningful or engaging activities to fill it. It is defined as a negative emotional state triggered when self-selected activities are seen as lacking significance or intrinsic motivation (Iso-Ahola & Weissinger, 1990). When the need for stimulation or optimal arousal fails to be met, feelings of incompetence or an inability to use leisure time constructively may develop, leading to dissatisfaction and disengagement (Ellis & Witt, 1994). Leisure boredom often reflects an unwillingness to participate in activities that do not provide personal satisfaction or are chosen to meet external expectations rather than personal interests (Hill & Perkins, 1985). Homework-related boredom, in turn, involves the emotional responses learners manifest toward tasks that are assigned to be completed outside school. Unlike classroom activities, which are typically structured and guided by teachers, homework requires self-regulation and often lacks immediate support or feedback, making it more prone to boredom, especially when assignments are perceived as irrelevant, overly challenging, or lacking personal significance (Trautwein, 2007; Zeidner et al., 2005). 
The degree of boredom experienced during homework depends largely on the perceived quality and challenge of the tasks, with assignments seen as unimportant or excessively demanding being more likely to evoke negative emotions such as boredom, anxiety, and frustration, while well-designed, appropriately challenging tasks are less likely to do so (Dettmers et al., 2010; Warton, 2001).
Building on such work, Pawlak et al. (2022) introduced the notion of ‘after-class boredom’, specifically referring to the boredom experienced by L2 learners in uninstructed settings outside the formal classroom environment. This type of boredom is marked by several interrelated factors, including a reluctant attitude towards learning, difficulty in finding engaging activities, passivity, poor attentional control, a sense of monotony, and a feeling that one’s language abilities are not being fully utilized. It typically arises when learners must independently manage their L2 learning without the structure and guidance provided in the classroom, often resulting in diminished motivation and reduced effort. While both leisure and homework boredom involve low stimulation and reduced motivation, after-class boredom is usually experienced in self-regulated but learning-related situations, where learners work towards L2 goals on their own, outside formal instruction but still within an educational frame of reference. These different manifestations of boredom outside the classroom highlight the unique challenges learners face in maintaining motivation and engagement during self-directed or unstructured activities. Conceptually, after-class boredom can be viewed as an academically oriented counterpart of leisure boredom, positioned between formal instruction and free leisure, where control is self-directed but value and structure remain linked to learning objectives. By highlighting the interplay of these factors, the concept of after-class boredom sheds light on the challenges that learners face in maintaining interest and engagement in L2 learning tasks performed independently (Pawlak et al., 2022). Despite the importance of the construct, its role has been investigated in just a handful of studies conducted to date (e.g. Kruk & Pawlak, 2022; K. Li et al., 2025; Pawlak et al., 2022; Pawlak, Kruk, et al., 2024), only some of which have relied on validated data-collection tools. Existing work has largely identified components of after-class boredom and validated measures such as the BLEOS, with limited evidence on antecedents, motivational underpinnings, or links with other emotions. This points to the need for theory-driven research to clarify how after-class boredom develops and affects independent L2 learning. The development of the short form of the BLEOS represents an important step in this direction.
2 Measurement of L2 after-class boredom
As indicated above, research on L2 boredom has predominantly centered on classroom contexts, leaving the phenomenon of after-class boredom relatively underexplored. Key milestones in the study of after-class boredom include the development of the BLEOS questionnaire by Pawlak et al. (2022) and its subsequent validation by Pawlak, Solhi, et al. (2024). Additionally, the FLLBS, designed by C. Li et al. (2023), provides insights into after-class boredom, particularly through its focus on homework boredom, which reflects the experience of boredom related to tasks and activities performed outside the classroom. Although the FLLBS does not exclusively measure after-class boredom, its inclusion of homework boredom offers valuable insights into how boredom can occur beyond the classroom and serves as a useful point of reference when compared with the BLEOS in terms of construct coverage and psychometric scope.
A groundbreaking advancement in measuring after-class boredom was the development of the BLEOS (Pawlak et al., 2022). This 21-item instrument was specifically designed to capture the boredom experienced by L2 learners when engaging in English learning activities outside the classroom, such as self-study or independent practice. The BLEOS was administered to a sample of 107 Polish university students majoring in English (80 females and 27 males) enrolled in BA and MA programs. Exploratory factor analysis (EFA) allowed identification of a three-factor structure: (1) unwillingness to learn English and inability to find (interesting) tasks (F1), (2) lack of creativity, focus, and involvement (F2), and (3) altered perception of time, underused language abilities, and monotony (F3). Importantly, these factors indicate that after-class boredom can stem from both independent learning that is closely linked to what transpires in the classroom (e.g. homework, preparation for tests), and learning that reflects L2 learners’ own initiative to enhance their TL ability (e.g. using AI to improve speaking skills). The BLEOS demonstrated high internal consistency, with Cronbach’s alpha values of .90 for the entire scale and .88, .77, and .74 for F1, F2, and F3, respectively, which supports the reliability of the instrument. It should be emphasized at this juncture that the study relied on a relatively small and homogeneous sample of Polish university students, which limits the generalizability of the results and raises the possibility of cultural or contextual bias. No cross-validation in other educational settings was conducted, making it difficult to determine whether the identified factor structure would hold in different populations.
To further examine the validity of the BLEOS, Pawlak, Solhi, et al. (2024) conducted a follow-up study with a larger sample of 433 English majors (313 females and 120 males) from Polish universities. Utilizing exploratory structural equation modeling (ESEM), which offers a more nuanced exploration of complex constructs compared to traditional confirmatory factor analysis (CFA), the study confirmed the robustness of the scale. First, the bifactor ESEM model demonstrated a good fit, indicating that the BLEOS effectively captures both a general factor of after-class boredom and its specific dimensions. Second, measurement invariance testing revealed no significant differences between male and female participants. Third, criterion-related validity was supported, showing that the general factor of after-class boredom was negatively associated with students’ self-evaluation of their language proficiency. The results support the internal validity of the scale, but, yet again, the study relied on a single, Polish university sample, which restricts its generalizability. Further research in other contexts and among learners of different proficiency levels is required to confirm the wider usefulness of the instrument.
Compared with the BLEOS, C. Li et al.’s (2023) FLLBS is a 32-item instrument designed to assess various types of L2 boredom. It was developed with a large sample of Chinese university English as a foreign language (EFL) students (over 2,000 participants across multiple institutions) and validated through both exploratory and confirmatory factor analyses. Among the seven dimensions of L2 boredom identified in the scale, homework boredom is particularly pertinent, comprising four items that directly address the experience of boredom during tasks completed outside the classroom. The internal consistency of this subscale was high, with the value of Cronbach’s alpha equaling .90, suggesting it is a reliable measure of this aspect of after-class boredom. Unlike the BLEOS, which focuses exclusively on out-of-class learning, the FLLBS provides a multidimensional picture of L2 boredom but captures only a narrow slice of after-class experiences. Moreover, the dimension of homework boredom and the items it comprises were discarded in the short version of the FLLBS (FLLBS-SF; C. Li et al., 2024), further limiting its relevance for independent L2 learning. In psychometric terms, the FLLBS benefits from a large, well-stratified national sample, yet its construct coverage of after-class boredom remains limited, whereas the BLEOS offers stronger theoretical alignment but lower practicality due to its length and single-context validation.
When considered together, the BLEOS and the FLLBS reveal complementary strengths and weaknesses. The BLEOS provides broader construct coverage and stronger theoretical grounding, whereas the FLLBS addresses after-class boredom more economically, through a four-item homework boredom subscale, and is supported by a large, well-validated sample of Chinese university students. Each tool has distinct advantages: the BLEOS captures the complexity of after-class boredom but is less practical for large-scale or cross-cultural research, while the FLLBS offers efficiency and sound statistical support but covers only a narrow aspect of the construct. This comparison highlights the need for a scale that combines comprehensive representation of after-class boredom with practical brevity and broader applicability.
Despite these contributions, research specifically targeting after-class boredom is scant and the data-collection instruments used for this purpose, while shedding invaluable light on this construct, are in need of further refinement to allow comparisons across different cultural and educational contexts. From a psychometric perspective, both construct coverage and test efficiency are central concerns. For example, the FLLBS demonstrates limited content validity, as it captures only one facet of after-class boredom (i.e. homework boredom), thus underrepresenting the broader construct domain. The issue is compounded in its short form, which omits boredom experienced in out-of-class learning activities altogether. Although this tool was not originally intended to measure after-class boredom comprehensively, such construct underrepresentation poses a challenge for validity and comparability. Conversely, while the BLEOS includes multiple dimensions of after-class boredom and thus ensures broader construct representation, its considerable length raises questions about test efficiency. Extended scales can increase participant burden, reduce response quality, and ultimately compromise the reliability and validity of the data (Koğar, 2020; Smith et al., 2000). In addition, both existing instruments were validated within single-country samples, which prevents evaluation of their cross-cultural stability and may limit their use in comparative research. Addressing these issues requires additional work on scale adaptation and validation in other linguistic and educational contexts. Therefore, the present study sought to reconcile this trade-off between psychometric precision and practical feasibility by developing a more compact yet theoretically representative version of the BLEOS suitable for diverse contexts. Accordingly, the following research questions were formulated:
Research question 1: What is the factor structure of the S-BLEOS?
Research question 2: What are the psychometric properties of the S-BLEOS?
Research question 3: Is the S-BLEOS measurement invariant in relation to gender, age, and nationality?
III Method
1 Participants
Participants were 1,133 learners of English as a foreign language in Poland (n = 550), Iran (n = 241), and Hungary (n = 342). All of them were university students enrolled in degree programs in English, which meant that their goal was to use English for professional purposes in the future. Depending on the year in the program, their TL proficiency ranged from B2 to C1 according to the Common European Framework of Reference for Languages (Council of Europe, 2001), with the students rating their proficiency as intermediate to high. The participants had an average of 11.66 years (SD = 3.89) of experience in learning English, the majority were female (n = 824), and their average age was 22.12 years (SD = 4.17). The sample was homogeneous in terms of academic background but diverse across national contexts. Participants were recruited through snowball sampling via the authors’ professional contacts in the three academic settings. This recruitment procedure and inclusion of students enrolled in English-related programs ensured broadly comparable proficiency levels and learning goals across groups. No further exclusion criteria were applied and demographic characteristics reflected the composition of the participating university populations.
2 Instruments and procedures
In addition to the full-length BLEOS scale, several other scales were administered in order to establish convergent and discriminant validity. Details about all the data collection tools are provided below:
Boredom in Learning English Outside of School (BLEOS; α = .909; ω = .908): The 21-item BLEOS scale examines language learning boredom outside of the classroom context (Pawlak et al., 2022). The original long-form BLEOS has three subfactors: unwillingness to learn English and inability to find (interesting) tasks (9 items, e.g. ‘I don’t really know what to learn after classes when it comes to English’, α = .840; ω = .840); lack of creativity, focus and involvement (7 items, e.g. ‘When I learn English after class I often think about unrelated things’, α = .815; ω = .808); and altered perception of time, underused language abilities and monotony (5 items, e.g. ‘Time always seems to pass slowly when I am learning English after class’, α = .712; ω = .719). All items are measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’. It should also be noted that the BLEOS scale contains eight reverse-scored items.
Short-form Foreign Language Classroom Boredom Scale (S-FLCBS; α = .898; ω = .899): The unidimensional, 10-item S-FLCBS, developed and validated by Pawlak et al. (2025), was used to examine in-class foreign language boredom and is a short-form of the Boredom in Practical English Language Classes – Revised (BPELC-R; Pawlak, Kruk, et al., 2020). Items such as ‘I often have to do meaningless things in my language classes’ are measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’. The in-classroom boredom scale was included in the study in order to examine discriminant validity.
Grit Scale (α = .834; ω = .819): This scale was used to examine the general perseverance and passion of participants in setting and meeting non-domain specific goals in their lives (Duckworth et al., 2007). The two-factor, 12-item scale measures consistency of interest (6 items, e.g. ‘I often set a goal but later choose to pursue a different one’, α = .821; ω = .819) and perseverance of effort (6 items, e.g. ‘Setbacks don’t discourage me’, α = .791; ω = .781). All items were measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’ and all items in the consistency of interest subscale were reverse scored. Grit was included in the study as a measure of discriminant validity, given previous research that has found a negative relationship between boredom and grit (Hosseini et al., 2023; Zhao & Wang, 2023).
L2 Grit Scale (α = .852; ω = .837): The domain-specific version of the Grit Scale, namely, the L2 Grit Scale (Teimouri et al., 2022), was used to measure grit in learning English. The scale has a similar design to the original Grit Scale, with two factors: perseverance of effort (5 items, e.g. ‘I am a diligent English language learner’; α = .882; ω = .882) and consistency of interest (4 items, e.g. ‘I think I have lost interest in learning English’; α = .826; ω = .835). All items are measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’. Similar to domain-general grit, L2 grit was included as a measure of discriminant validity, as a trend has emerged in the literature demonstrating a negative relationship between L2 grit and foreign language learning boredom (e.g. Csizér et al., 2024).
Language Learning Curiosity Scale (α = .792; ω = .776): The 11-item, two-factor scale developed by Mahmoodzadeh and Khajavy (2019) was used to examine curiosity in the language classroom, with the two factors of language curiosity as a feeling of interest (4 items; e.g. ‘I wonder how well I can speak English when meeting a native English speaker’; α = .707; ω = .735) and language curiosity as a feeling of deprivation (7 items; e.g. ‘When I have a language question in mind, I cannot rest without knowing the answer’; α = .696; ω = .690), measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’. Language learning curiosity was included as a discriminant validity measure since previous research has found a negative relationship between L2 boredom and L2 curiosity (see Eren & Coskun, 2016; Kruk & Zawodniak, 2018).
Language Learning Enjoyment Scale (α = .877; ω = .878): The scale was constructed with items from the 21-item Foreign Language Enjoyment Scale (Dewaele & MacIntyre, 2014) and the Achievement Emotions Questionnaire (Pekrun et al., 2011). Eight items were selected and modelled as a unidimensional measurement construct (e.g. ‘I enjoy being in my language class’), with a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’. Again, language learning enjoyment was included as a measure of discriminant validity, as a negative correlation with language learning boredom has been previously established in the literature (see Dewaele et al., 2023a; C. Li, 2022).
Short-form Foreign Language Anxiety Scale (α = .904; ω = .905): A short-form version of the 33-item scale introduced by Horwitz et al. (1986), developed by MacIntyre (1992) and validated by Botes et al. (2022), was used to examine anxiety in the language classroom. The unidimensional, 8-item scale (e.g. ‘Even if I am well prepared for FL class, I feel anxious about it’) measures all items on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’ and contains two reverse-scored items. Previous research has found a positive correlation between L2 boredom and language anxiety (see Dewaele et al., 2023a, 2023b); thus, the measure was used to examine convergent validity.
Motivated Behavior Scale (α = .876; ω = .871): A unidimensional, 10-item scale examining motivation in the language classroom (Taguchi et al., 2009) was included as an additional measure of discriminant validity, as previous studies have found a negative relationship between L2 boredom and L2 motivation (see Kruk, 2016; Zhao et al., 2023). Items such as ‘I am prepared to expend a lot of effort in learning English’ were measured on a 5-point Likert scale from ‘strongly disagree’ to ‘strongly agree’.
All the instruments were administered in English, which was fully warranted given the level of TL proficiency of the participants. The data were collected online with the help of Google Forms. The participants provided informed consent to complete the questionnaires and were informed that they could request that their responses be removed from the database if they so wished.
3 Data analysis
Data analysis was conducted in line with the recommendations provided in Pawlak et al. (2025) and Botes et al. (2021, 2022), through a series of sequential steps (see Figure 1).

Research methods flowchart.
Step 1: Creating two samples: Two samples were created, the first of which (Sample 1) was used to develop the short-form and the second of which (Sample 2) was used to validate it (Marsh et al., 2005). To this end, the full sample was randomly split in two using the SPSS 28.0.1.1 function for randomly selecting 50% of all cases; the selected cases formed Sample 1 and the remaining cases were allocated to Sample 2. In order to ensure that the samples were not significantly different, a series of independent-samples t-tests was carried out and Cohen’s d effect sizes were calculated.
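The logic of Step 1 can be illustrated with a minimal sketch in Python. It is not the SPSS procedure used in the study: the function names, the fixed seed, and the exact half-and-half split are assumptions made for illustration only.

```python
import math
import random
import statistics


def split_half(case_ids, seed=0):
    """Randomly allocate half of the cases to Sample 1 and the rest to Sample 2."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 2
    return shuffled[:cut], shuffled[cut:]


def cohens_d(x, y):
    """Cohen's d for two independent groups, based on the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)


def t_statistic(x, y):
    """Independent-samples t statistic, assuming equal variances."""
    nx, ny = len(x), len(y)
    return cohens_d(x, y) / math.sqrt(1 / nx + 1 / ny)


# Splitting 1,133 case identifiers yields two non-overlapping halves.
sample_1, sample_2 = split_half(range(1133))
```

A value of d close to zero for each variable, together with a non-significant t-test, would indicate that the two halves are comparable.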
Step 2: Factor structure: Using Sample 1, the factor structure of the BLEOS was explored via principal component analysis (PCA) in JASP (0.18.3) with Promax rotation. PCA was employed because it offers a straightforward, interpretable summary of the data structure (Field, 2013). In addition, given its primary goal of reducing the number of observed variables while retaining maximal variance, PCA is particularly well suited for the development of short-form instruments (McGuire et al., 2010; Stevanovic, 2014). A factor solution was determined through the Kaiser criterion (Eigenvalue > 1) and the scree plot (Tabachnick & Fidell, 2001). Factor loadings were interpreted as low (< .4), intermediate (.4 to .6), and high (> .6; see Kline, 2014).
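The two decision rules mentioned in Step 2 can be sketched as follows. The helper names and the eigenvalues in the usage line are hypothetical and are not taken from the study; the treatment of absolute values for negative loadings is also an assumption.

```python
def kaiser_count(eigenvalues):
    """Number of components retained under the Kaiser criterion (eigenvalue > 1)."""
    return sum(1 for value in eigenvalues if value > 1.0)


def loading_band(loading):
    """Label a loading as low (< .4), intermediate (.4 to .6), or high (> .6)."""
    size = abs(loading)  # the sign of a loading does not affect its strength
    if size < 0.4:
        return "low"
    if size <= 0.6:
        return "intermediate"
    return "high"


# Hypothetical eigenvalues: two of the four exceed 1, so two components are retained.
n_components = kaiser_count([6.2, 1.3, 0.9, 0.7])
```

The Kaiser criterion is only one input; as described above, the scree plot was consulted alongside it.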
Step 3: Item selection: Item retention was guided by statistical criteria. Items were retained if they loaded ⩾ .40 on their intended factor and showed item-total correlations of ⩾ .40. Items with cross-loadings were not considered. Thus, using Sample 1, the short-form was developed.
Step 4: Testing the measurement model: The proposed short-form developed using Sample 1 was subsequently tested in Sample 2 via confirmatory factor analysis (CFA). The testing of the measurement model with the help of CFA was conducted in JASP (0.18.3) using maximum likelihood estimation and standard errors. Missing data were deleted listwise. The model fit was determined by the fit indices of the root mean square error of approximation (RMSEA; close fit < .05; reasonable fit < .08), the standardized root mean square residual (SRMR; close fit < .05; reasonable fit < .08), comparative fit index (CFI; close fit > .95; reasonable fit > .90), and Tucker-Lewis index (TLI; close fit > .95; reasonable fit > .90; see Kenny, 2020).
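The cut-offs in Step 4 amount to a simple classification rule, sketched below. The sketch assumes the conventional ordering in which the close-fit threshold is the stricter one (CFI and TLI: close > .95, reasonable > .90; RMSEA and SRMR: close < .05, reasonable < .08). The function names and the label "outside" are assumptions, not terms used in the study.

```python
# (close cut-off, reasonable cut-off, True if larger values mean better fit)
CUTOFFS = {
    "RMSEA": (0.05, 0.08, False),
    "SRMR": (0.05, 0.08, False),
    "CFI": (0.95, 0.90, True),
    "TLI": (0.95, 0.90, True),
}


def classify_fit(value, close, reasonable, higher_is_better):
    """Return 'close', 'reasonable', or 'outside' for a single fit index."""
    if higher_is_better:
        if value > close:
            return "close"
        if value > reasonable:
            return "reasonable"
    else:
        if value < close:
            return "close"
        if value < reasonable:
            return "reasonable"
    return "outside"


def evaluate_model(indices):
    """Classify every reported fit index against its cut-offs."""
    return {name: classify_fit(value, *CUTOFFS[name]) for name, value in indices.items()}


# Fit indices of the second CFA as reported in the Results section.
second_cfa = evaluate_model({"CFI": 0.952, "TLI": 0.921, "RMSEA": 0.074, "SRMR": 0.042})
```

Applied to the second CFA, the rule labels CFI and SRMR as close and TLI and RMSEA as reasonable.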
Step 5: Validation: Continuing with the use of Sample 2, the reliability and validity of the short-form scale were determined. Reliability was tested using McDonald’s omega and Cronbach’s alpha. Validity was examined by comparing the newly developed short-form with existing variables in the known nomological network of L2 boredom, that is, in-class L2 boredom, grit, L2 grit, language learning curiosity, language learning enjoyment, language learning anxiety, motivated behavior and self-perceived proficiency. In doing so, after-class boredom, as measured through the short-form BLEOS, could be directly compared to in-class boredom and a wider range of ID variables and L2 learning outcomes. For the ID factors compared to after-class boredom, bi-directionality was assumed (e.g. relationships between language boredom, enjoyment, and anxiety are persistently modelled as correlated in the literature; see Dewaele et al., 2023a).
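For readers who wish to reproduce the reliability estimates, Cronbach’s alpha can be computed from a respondents-by-items matrix as in the minimal sketch below. The response data are invented, and the 1 to 5 bounds in `reverse_score` are an assumption that should be replaced with the actual response scale. McDonald’s omega is not shown because it requires the factor loadings from a fitted model.

```python
import statistics


def reverse_score(value, low=1, high=5):
    """Recode a reverse-scored item; the 1-5 bounds are an assumed response scale."""
    return low + high - value


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented responses in which the three items rise and fall together perfectly.
alpha = cronbach_alpha([[1, 2, 2], [2, 3, 3], [3, 4, 4], [4, 5, 5]])
```

Reverse-scored items need to be recoded before alpha is computed; otherwise they lower the estimate artificially.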
Step 6: Recombining the dataset: The dataset was recombined (Sample 1 + Sample 2) in order to achieve the necessary statistical power for invariance testing.
Step 7: Invariance testing: Invariance testing was conducted to ensure the fairness of using the newly developed short-form with different groups in terms of gender, age, and nationality (Meredith, 1993). Subgroup sizes were as follows: gender (male = 307, female = 824), age (teen (< 20 years old) = 211, young adult (20–25 years old) = 766, adult (> 25 years old) = 137), and nationality (Poland = 550, Hungary = 342, Iran = 241). As subgroups < 100 have been found to yield unreliable results (Meade & Bauer, 2007) and the smallest subgroup comprised n = 137, invariance testing included all possible subgroups. However, subgroups were somewhat unequal, which may affect invariance results (Yoon & Lai, 2018). Invariance was tested in JASP (0.18.3) through three sequential steps. First, configural invariance was examined in order to establish that the factor structure of the measurement did not differ across groups. Second, metric invariance was tested to ensure that factor loadings, that is, the strength of the relationship between each item and the latent factor, did not differ due to group membership. Lastly, scalar invariance was tested to establish that item intercepts were not dependent on group membership. Invariance was evaluated at each step, with the analysis only progressing to the next step if model fit requirements were met. Model fit was only allowed to alter within the parameters laid out by Cheung and Rensvold (2002) and Chen (2007) where ∆CFI ⩽ .010, ∆RMSEA ⩽ .015, ∆SRMR ⩽ .030 (for metric invariance), and ∆SRMR ⩽ .015 (for scalar invariance).
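The change-in-fit rule can be sketched as a small check. The function name is an assumption, and the changes are treated as absolute magnitudes. The example values are those reported for scalar invariance across nationality in the Results section.

```python
# Maximum permitted change in each fit index when moving to the next invariance step.
CUTOFFS = {
    "metric": {"CFI": 0.010, "RMSEA": 0.015, "SRMR": 0.030},
    "scalar": {"CFI": 0.010, "RMSEA": 0.015, "SRMR": 0.015},
}


def check_step(step, deltas):
    """Return, for each index, whether its absolute change stays within the cut-off."""
    return {index: abs(change) <= CUTOFFS[step][index]
            for index, change in deltas.items()}


# Scalar step across nationality: the CFI change exceeds its cut-off, the others do not.
nationality_scalar = check_step("scalar", {"CFI": 0.036, "RMSEA": 0.011, "SRMR": 0.009})
fully_supported = all(nationality_scalar.values())
```

A mixed result such as this one cannot be settled by the rule itself and calls for the kind of judgement described in the Results section.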
IV Results
1 Step 1: Creating two samples
The descriptive statistics for the demographic variables in Sample 1 and Sample 2 are presented in Table 1. The descriptive statistics for the in-class (S-FLCBS) and after-class boredom (BLEOS) measures are presented in Table 2, along with t-test results and Cohen’s d effect sizes, which demonstrate that there were no statistically significant differences between the two samples. In addition, descriptive statistics and t-test results for all other variables used in the validation of the short-form can be found in the supplementary material.
Demographic information for Sample 1 and Sample 2.
Descriptive statistics.
Notes. BLEOS = Boredom in Learning English Outside of School scale; S-FLCBS = Short-form Foreign Language Classroom Boredom Scale.
2 Step 2: Factor structure
The 21-item BLEOS was analysed via a PCA using Sample 1. One item was revealed to be problematic, with low inter-item correlations and no significantly large factor loadings, namely, item 8 (‘Learning English after classes rarely excites me’). In addition, three items cross-loaded onto multiple factors (i.e. item 2: ‘I don’t really know what to learn after classes when it comes to English’, item 3: ‘It often happens that I can’t find things to do with English after classes that would make a deeper sense to me’, and item 16: ‘When I was younger [e.g. in junior high school or senior high school], I used to find learning English after lessons monotonous and tiresome’). The results of the first PCA can be found in the supplementary material. These weak and cross-loaded items were removed from the analysis and the PCA was rerun.
The second PCA with 17 items yielded a two-factor solution (see Table 3), with all reverse-scored items loading onto a separate factor. Reverse-scored items have been known to create factors that are statistical artifacts due to the atypical response patterns negative item wordings may cause (Horan et al., 2003). Indeed, one of the main criticisms leveled against the Grit Scale is that the two-factor structure is likely a mere statistical artifact due to the inverse nature of the items in one of the subscales (see Postigo et al., 2024). Given that the second factor only contained reverse-scored items and that the scree plot could be interpreted to indicate either a one- or two-factor solution (see Figure 2), a decision was made to additionally run a forced one-factor solution PCA (see Table 4).
Principal component analysis (PCA) for the 17-item Boredom in Learning English Outside of School scale (BLEOS).
Note. The applied rotation method was promax.
*Reverse-scored items.

Scree plot.
Principal component analysis (PCA) of the 17-item Boredom in Learning English Outside of School scale (BLEOS): Forced single factor.
Notes. *Reverse-scored items. All loadings < .4 are omitted. aHigh loading (γ ⩾ .6). bAcceptable loading (.4 ⩽ γ ⩽ .6).
The forced unidimensional solution resulted in a factor model with all items loading significantly onto a single factor, including the reverse-scored items (see Table 4). The factor solution of the forced single factor PCA, the nature of reverse-scored items, and the statistical artifacts created by these items in factor analyses (see Horan et al., 2003; Postigo et al., 2024), as well as the scree plot, resulted in a unidimensional solution being chosen for the short-form. Thus, the three factors of the original BLEOS scale could not be replicated in this sample and instead a unidimensional measurement model served as a basis for selecting items for the short-form BLEOS.
3 Step 3: Item selection
Item selection was made taking into account the factor loadings of the unidimensional PCA tested in Step 2 (see Table 4), the inter-item correlations, and the alpha-reliabilities of items. The resulting 10 statistically strongest items were selected for the S-BLEOS scale (see Table 5). The inter-item correlations and item reliabilities can be found in the supplementary material.
Ten-item Short-form Boredom in Learning English Outside of School (S-BLEOS) scale.
Note. *Reverse-scored items.
4 Step 4: Testing the measurement model
The unidimensional measurement model of the 10 items selected in Step 3 was tested via CFA using the data from Sample 2. The CFA resulted in a reasonable model fit (χ2(35) = 165.516; p < .001; CFI = .932; TLI = .913; RMSEA = .087; SRMR = .047) but the modification indices suggested correlating items 17 and 19, thus ‘fixing’ the pathway between the two items (∆χ2 = 38.793). Items 17 and 19 were rather similarly worded (‘I can easily focus on activities when I learn English after classes’ and ‘I am very active when it comes to learning English after classes’) and were both reverse-scored items. Reverse-scored items and similar item wordings can be confounding factors in measurement models (see Botes et al., 2022; Brown, 2003; Weijters et al., 2013). As such, another CFA was tested with the pathway between items 17 and 19 fixed (see Figure 3). This second CFA found an improved, close to reasonable fitting model (χ2(34) = 126.244; p < .001; CFI = .952; TLI = .921; RMSEA = .074; SRMR = .042), providing evidence of construct validity for the newly created S-BLEOS scale.
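The improvement between the two nested models can be checked with a chi-square difference test, sketched below using only the chi-square values and degrees of freedom reported above. The function name is an assumption, and the closed-form p-value applies only to a difference of one degree of freedom. The observed difference differs slightly from the reported modification index, which is an estimate of the expected drop made before the model is refitted.

```python
import math


def chi_square_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Chi-square difference test for nested models differing by one parameter."""
    delta_chi2 = chi2_restricted - chi2_free
    delta_df = df_restricted - df_free
    if delta_df != 1 or delta_chi2 < 0:
        raise ValueError("sketch only covers a non-negative difference with 1 df")
    # For 1 df, the upper-tail probability of chi-square equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(delta_chi2 / 2))
    return delta_chi2, delta_df, p_value


# First CFA: chi2(35) = 165.516; second CFA: chi2(34) = 126.244.
delta_chi2, delta_df, p_value = chi_square_difference(165.516, 35, 126.244, 34)
```

The difference of roughly 39.27 on one degree of freedom is far above the critical value of 3.84 at the .05 level, which is consistent with the improved fit of the second model.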

Measurement model.
5 Step 5: Reliability and validity
The reliability of the S-BLEOS, as measured through Cronbach’s alpha and McDonald’s omega, was found to be acceptable (α = .884; ω = .886). The correlations between the S-BLEOS and ID variables in the known nomological network of L2 boredom are presented in Table 6. As expected due to the existing empirical evidence, after-class L2 boredom, as measured through the S-BLEOS, was significantly positively correlated with language learning anxiety and significantly negatively correlated with language learning enjoyment, language learning curiosity, grit, L2 grit, and motivated behavior. The bivariate correlations therefore provide convergent and divergent validity evidence, as the S-BLEOS conformed to expectations regarding significant relationships with other ID factors. In addition, the S-BLEOS was highly correlated with the long-form BLEOS (r = .959; p < .001). Furthermore, the S-BLEOS was moderately positively correlated with S-FLCBS (r = .563; p < .001), indicating that in-class boredom and after-class boredom are significantly related but can be considered two separate constructs.
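One rough way to read these coefficients is in terms of shared variance, that is, the squared correlation. The sketch below uses the two correlations reported above; the function name is an assumption, and squared observed correlations are only a heuristic for judging distinctness, not a formal test of discriminant validity.

```python
def shared_variance(r):
    """Proportion of variance shared by two measures, given their correlation."""
    return r ** 2


# S-BLEOS with the long-form BLEOS (r = .959) and with the S-FLCBS (r = .563).
with_long_form = round(shared_variance(0.959), 3)
with_in_class = round(shared_variance(0.563), 3)
```

On this reading, the short form shares about 92% of its variance with the long form but only about 32% with in-class boredom.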
Correlation matrix.
Notes. All correlations are significant at p < .001, with the exception of the correlation between anxiety and motivated behavior (r = .025; p = .553). S-BLEOS = Short-form Boredom in Learning English Outside of School scale; S-FLCBS = Short-form Foreign Language Classroom Boredom Scale.
6 Steps 6 and 7: Recombining the dataset and invariance testing
The full dataset (Sample 1 + Sample 2) was used to conduct invariance testing across gender, age, and nationality (see Table 7). The S-BLEOS was first tested across two gender groups (male: n = 307; female: n = 824) and found to be fully invariant. The necessary cut-offs were met for scalar invariance (∆CFI ⩽ .010; ∆RMSEA ⩽ .015; ∆SRMR ⩽ .015) and it can be concluded that male and female participants did not respond differently to different items. Next, the S-BLEOS was again found to be fully invariant across age groups (Teen [< 20 years old]: n = 211; Young adult [20–25 years old]: n = 766; Adult [> 25 years old]: n = 137), with all age groups responding to items in a similar manner. Lastly, invariance was tested across countries (Poland: n = 550; Hungary: n = 342; Iran: n = 241). Configural and metric invariance were confirmed with all fit indices meeting the necessary cut-off requirements. However, in terms of scalar invariance, while the RMSEA (∆RMSEA = .011) and SRMR (∆SRMR = .009) comfortably met the cut-off requirements and indicated an invariant model, the CFI did not (∆CFI = .036). In cases such as these, where mixed evidence is given, there are no clear statistical guidelines as to whether to accept or reject a model. Invariance results can be affected by small sample sizes, ratios of sample sizes of groups, the pattern of invariance, and the complexity of models, with different fit indices reacting more sensitively than others (see Chen, 2007, for an overview). As both the RMSEA and the SRMR are known to be sensitive fit indicators with a tendency ‘to over-reject an invariant model’ (Chen, 2007, p. 501) and both indicators provided evidence of scalar invariance, the S-BLEOS is considered to be invariant across countries.
However, additional research is needed to further investigate the cultural sensitivity of the S-BLEOS and researchers comparing results across countries are encouraged to establish invariance before cross-cultural differences are investigated.
Invariance testing results.
Note. Indicators that failed to meet the cut-off guidelines are marked in bold.
V Discussion
The study reported above aimed to develop a condensed version of the BLEOS, which was constructed by Pawlak et al. (2022) with the aim of tapping into different dimensions of after-class L2 boredom, to examine the psychometric properties of the short-form scale, and to explore its measurement invariance in terms of gender, age and nationality (i.e. Hungarian, Iranian, Polish). Three specific research questions were formulated for this empirical investigation and the discussion is structured around them. The limitations of the study and directions for future research are also addressed.
1 Research question 1: Factor structure of the S-BLEOS
When it comes to research question 1, a series of PCAs conducted on the data from Sample 1 resulted in a single-factor solution, with the 10 strongest items selected for inclusion in the S-BLEOS, based on factor loadings, inter-item correlations and reliabilities. It should be reiterated at this juncture that the unidimensional structure of the scale resulted from a forced one-factor solution PCA, applied in view of the fact that one of the factors in the previously obtained two-factor solution included exclusively reverse-scored items, which have been shown to underlie factors likely representing little more than statistical artifacts due to atypical response patterns (Horan et al., 2003; Postigo et al., 2024). The one-factor solution was subsequently corroborated through CFA with Sample 2, with the caveat that a pathway needed to be created between similarly worded, reverse-scored items 17 (‘I can easily focus on activities when I learn English after classes’) and 19 (‘I am very active when it comes to learning English after classes’).
On the one hand, the unidimensional nature of the S-BLEOS is somewhat unexpected in view of the fact that the underlying structure of the original BLEOS (Pawlak et al., 2022) included three factors. In addition, most of the other available scales for measuring L2 boredom, such as the BPELC-R (Pawlak et al., 2022), the Precursors of Students’ Boredom in EFL Classes (PSBEC) scale (Mousavian Rad et al., 2022) as well as the FLLBS (C. Li et al., 2023) and its shorter iteration (C. Li et al., 2024), are multidimensional. On the other hand, the recently developed S-FLCBS (Pawlak et al., 2025), which is the short form of the BPELC-R, also represents a single-factor model. This might reasonably suggest that also in this case a higher-order dimension of boredom emerged when analyses were run to develop a condensed version of the BLEOS, and this factor superseded the three lower-order factors that were found to underlie the structure of after-class boredom in the long-form instrument (Pawlak et al., 2022), that is, unwillingness to learn English and inability to find (interesting) tasks, lack of creativity, focus, and involvement, and altered perception of time, underused language abilities, and monotony. In fact, the 10 items ultimately selected for inclusion in the S-BLEOS constitute quite a proportionate representation of these three original factors. This interpretation also finds empirical support, since, for example, the studies undertaken by Elahi Shirvan et al. (2025), Kruk, Pawlak, et al. (2022), or Pawlak, Solhi, et al. (2024) have shown that the global factor of boredom may play a crucial role in the measurement theory of this negative emotion. What this means in practical terms is that while it surely makes sense to investigate the factors that underpin the construct of after-class boredom included in the original BLEOS (Pawlak et al., 2022) and corroborated in subsequent research (e.g. Pawlak, Solhi, et al., 2024), it is equally plausible to explore after-class boredom in its entirety, without focusing on its underlying structure. It should be emphasized, however, that whether the global factor of after-class boredom and/or its underlying factors are explored, the focus is still on the general concept rather than on its domain-specific (e.g. related to grammar) or task-based (e.g. related to a specific type of L2 learning activity) manifestations (see Li, 2025; Pekrun & Perry, 2014).
Obviously, the reduction in the number of items and underlying factors in order to enhance the practicality of administration inevitably comes at a cost. Specifically, the level of granularity at which after-class boredom is tapped into decreases because, for example, it might be more difficult with the use of the S-BLEOS to tease apart out-of-class boredom that is related to classroom L2 learning in some way from after-class boredom that might emerge from learners’ more autonomous efforts to improve their TL skills. This is an important trade-off that researchers need to consider when designing their studies into after-class boredom. When the aim is to further disentangle the construct of after-class boredom, perhaps in terms of its links to in-class boredom or L2 learning processes and outcomes, it is fully warranted to opt for the long form. A similar approach would also be preferable if the scale were to be adapted to shed light on the experience of after-class boredom in the performance of specific learning activities and tasks because a multidimensional perspective on the construct could help better understand how after-class boredom fluctuates over time and how this relates to changes in engagement and motivation. However, when after-class boredom is investigated in connection with an array of other variables such as ID factors with large samples, it may make more sense to choose the short form to facilitate data collection and ward off the problems that the administration of lengthy questionnaires entails.
2 Research question 2: Psychometric properties of the S-BLEOS
Research question 2 concerned the psychometric properties of the newly developed S-BLEOS. First, the CFA conducted with the data from Sample 2 provided evidence for a reasonably good fit of the measurement model of the S-BLEOS. Second, the scale was characterized by acceptable reliability, as evident from the values of Cronbach’s alpha (α = .884) and McDonald’s omega (ω = .886). Third, correlational analyses revealed that the S-BLEOS possesses the requisite convergent and divergent validity. On the one hand, the scale was positively, moderately and significantly correlated with the S-FLCBS, which indicated that although in-class and after-class boredom are closely related, they nevertheless constitute distinct constructs. Moreover, after-class boredom, as measured by the S-BLEOS, was also positively, significantly, albeit weakly, correlated with foreign language anxiety, which is consistent with the existing empirical evidence (e.g. Dewaele et al., 2023a, 2023b; C. Li & Wei, 2023). At the same time, the S-BLEOS proved to be correlated significantly, negatively and moderately with ID variables that have been shown to play a beneficial role in L2 learning, such as domain-general grit, L2 grit, language learning enjoyment and motivated learning behavior. The relationship was also negative and significant but smaller in the case of language learning curiosity, which can be linked to the way in which the construct is operationalized in Mahmoodzadeh and Khajavy’s (2019) scale (i.e. feeling of interest vs feeling of deprivation). These negative relationships do not come as a surprise because, in line with the control value theory (Pekrun, 2006), being associated with low control and low perceived value, after-class boredom is bound to negatively impact motivation, perseverance in the attainment of long-term goals, the desire to resolve information gaps or positive affect. 
These findings are also in line with previous research, although it should be kept in mind that some of the constructs may have been gauged through alternative tools and different underlying factors may have been taken into account (e.g. Csizér et al., 2024; Dewaele et al., 2023a, 2023b; Eren & Coskun, 2016; Hosseini et al., 2023; Kruk & Zawodniak, 2018; C. Li, 2022; Zhao & Wang, 2023; Zhao et al., 2023). Overall, these results indicate that the S-BLEOS can serve as a reliable and valid measure of after-class boredom in L2 learning, although of course there is a need for further validation in other contexts and with respect to specific TL domains.
3 Research question 3: Measurement invariance of the S-BLEOS
Research question 3 focused on measurement invariance in terms of gender, age and nationality (i.e. Hungary, Iran and Poland), with the analyses being conducted on the entire, combined dataset (Sample 1 and Sample 2). Full measurement invariance (i.e. configural, metric and scalar) was uncovered for gender and age, and it can also be claimed for nationality, which indicates that the members of the different groups responded to the items included in the scale in the same way. That said, in the case of nationality, while the RMSEA and SRMR indicators met the requisite cut-off requirements, the CFI did not. This is an important takeaway for researchers investigating after-class boredom in various countries as they should be alert to the potential differences in how this negative emotion can be manifested cross-culturally in out-of-school contexts. Interestingly, Pawlak et al. (2025) also failed to report scalar invariance for the S-FLCBS across nationality groups. In addition, studies of other ID factors have shown that data-collection tools may be in need of far-reaching modifications when they are used in diverse national and cultural settings, a good case in point being the Grammar Learning Strategy Inventory (Pawlak et al., 2023; Wang et al., 2024). Keeping such findings in mind, it is clearly necessary to exercise caution when the S-BLEOS is employed, perhaps alongside other data-collection tools, to investigate cross-cultural or cross-national differences. It may also be prudent to undertake empirical studies that would shed more light on how after-class boredom is experienced, manifested and conceptualized across cultural groups.
VI Limitations and directions for future research
The present study is not free from limitations that should be considered in future research. First, it was cross-sectional and, for this reason, it was not possible to explore the measurement stability by establishing the test–retest reliability of the S-BLEOS. Future attempts at validating the scale in other contexts could collect longitudinal data to help address this issue. Second, although measurement invariance across gender was established, the sample was predominantly female (73%), which may limit the generalizability of the findings and the precision of estimates for the smaller male subgroup. This imbalance is further seen in the subgroups of the invariance tests, where some subgroups were considerably larger than others. This imbalance may have affected the results of the invariance testing (Yoon & Lai, 2018) and therefore future research using balanced samples is recommended. Third, the analyses did not take into account the link between the S-BLEOS and TL attainment, which could help shed light on the predictive validity of the instrument, but also add an important piece of the puzzle in relation to measurement invariance. However, since the participants came from different national and educational contexts which differed considerably in their grading systems, and no universally acknowledged proficiency indices were available (e.g. standardized tests), the inclusion of this variable was not feasible in this case, but this is definitely something that future research should consider. Fourth, it might be argued that after-class boredom is a very broad construct, not only because boredom in such settings can be a corollary of L2 learning in the classroom or the consequence of self-directed attempts to master a specific TL, but also on account of the fact that it may be experienced differently when learners do traditional exercises, interact with other TL users through social media, or use AI to facilitate L2 learning. 
Obviously, the S-BLEOS, just like the long-form version, does not differentiate between such domains and research is needed to develop dedicated scales for tapping after-class boredom in such contexts.
VII Conclusions
The present study developed the S-BLEOS, a data-collection instrument for investigating after-class boredom, and provided evidence for its requisite psychometric properties (i.e. measurement model, reliability, construct, convergent and divergent validity). It also tested its measurement invariance with respect to gender, age, and nationality. Researchers interested in examining the role of L2 learning boredom in out-of-class situations have thus been equipped with yet another useful tool that can particularly come in handy when after-class boredom is investigated together with a number of other variables, and the length of the scales needs to be reduced for the sake of practicality and ease of administration. It should also be emphasized that the S-BLEOS is the only compact tool that can be employed to provide insights into the experience of boredom in out-of-class situations since, for example, C. Li et al.’s (2024) FLLBS-SF does not tap into this dimension of L2 learning boredom at all. Its contribution notwithstanding, more research is needed to further extend the application of the S-BLEOS. One obvious line of inquiry is to explore its utility in different cultural, educational and national contexts, as well as with participants at lower educational levels (e.g. secondary school). As indicated above, the scale could also be adjusted to reflect the distinctive character of after-class boredom in specific domains, such as out-of-class L2 learning aided by new technologies and, in particular, AI. Finally, with minor modifications in wording, the S-BLEOS could also be used to investigate after-class boredom in learning languages other than English. On the whole, further research into after-class boredom is urgently needed, given the fact that the importance of L2 learning in out-of-class situations is bound to grow in the coming years thanks to advances in the use of new technologies.
Supplemental Material
Supplemental material, sj-docx-1-ltr-10.1177_13621688251412206 for Less is more: Developing and validating the short scale for investigating L2 boredom beyond the classroom by Mirosław Pawlak, Mariusz Kruk and Elouise Botes in Language Teaching Research
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Science Center, Poland (NCN 2022/45/B/HS2/00187 (2023–2025)).
Supplemental material
Supplemental material for this article is available online.
