Abstract
At the turn of the new millennium, in an article published in Language Teaching Research in 2000, Dörnyei and Kormos proposed that ‘active learner engagement is a key concern’ for all instructed language learning. Since then, language engagement research has increased exponentially. In this article, we present a systematic review of 20 years of language engagement research. To ensure robust coverage, we searched 21 major journals on second language acquisition (SLA) and applied linguistics and identified 112 reports satisfying our inclusion criteria. The results of our analysis of these reports highlighted the adoption of heterogeneous methods and conceptual frameworks in the language engagement literature, as well as indicating a need to refine the definitions and operationalizations of engagement in both quantitative and qualitative research. Based on these findings, we attempted to clarify some lingering ambiguity around fundamental definitions, and to more clearly delineate the scope and target of language engagement research. We also discuss future avenues to further advance understanding of the nature, mechanisms, and outcomes resulting from engagement in language learning.
Keywords
I Introduction
Engagement defines all learning. Learning requires active involvement on the part of the learner, and action is the defining characteristic of learner engagement (Mercer, 2019). In the everyday sense, engagement has a generic meaning related to being occupied or busy doing something. However, in the realm of teaching and learning, engagement extends beyond this and refers to the amount (quantity) and type (quality) of learners’ active participation and involvement in a language learning task or activity. An engaged learner is actively involved in and committed to their own learning, and without engagement meaningful learning is unlikely. The growing recognition of the importance of engagement in contemporary education has also made it one of the most popular research topics in education, to the extent that it has been described as ‘the holy grail of learning’ (Sinatra et al., 2015, p. 1). Specifically in language learning, the notion of learner action for learning is deeply embedded in the dominant paradigms of communicative and constructivist language learning and teaching, which view language use and interaction as critical for language development. Furthermore, the predominant line of thinking in many theoretical understandings of language acquisition (e.g. cognitive-interactionist approaches, sociocultural theory and complexity/dynamic systems theory) is also that learning occurs through meaningful use of the language. As such, it is apparent why learner engagement has come to be of particular interest to scholars and practitioners in the field of language learning. The domain of language learning has begun to build on the considerable body of work in the learning sciences and educational psychology (Fredricks et al., 2019), extending it in domain-specific ways (Hiver, Al-Hoorie & Mercer, 2021b).
Considering this wave of interest in engagement, it seems that it is both appropriate and necessary to assess this body of empirical work and evaluate the strength of its contribution to the field by examining the methods and conceptual frameworks adopted in language learning (L2) engagement research (Oga-Baldwin, 2019). In the present review of 20 years of L2 engagement research, we had two main objectives. Our first aim was to look back at the methodological characteristics of previous empirical L2 engagement research in second language acquisition (SLA) and applied linguistics to note trends and tendencies in designs and analytical choices. By defining the shape of existing research designs, we can take stock of study quality and chart a path forward, evaluate what evidence it has provided for specific domains of study, and what conclusions this empirical work has allowed the field to draw relative to shared concerns and issues. In addition to methodological characteristics, we were also interested in the conceptual application of definitions and operationalizations of engagement across various subdomains of language education. Study design and analytical choices in empirical work are often informed by rigorous conceptual or theoretical understandings of relevant constructs. For this reason we intended to explore whether there were limitations and potential areas to clarify lingering ambiguity around fundamental definitions, and more clearly delineate the scope and target of L2 engagement research. This dual focus will, we hope, allow us to identify future directions for engagement research that will continue to advance our understanding of the ways and means by which learners pursue meaningful participation in language learning and use.
To our knowledge, this is the first systematic review and synthesis of engagement research in language learning. In the wider educational research literature, reviews synthesizing research on student engagement are centered on conceptual and theoretical issues (e.g. Lawson & Lawson, 2013; Reschly & Christenson, 2012). None have been methodological. And, because theory and conceptual models have proven so diverse, the methods used to study student engagement reflect this heterogeneity. For instance, there has been no attempt to meta-analyse empirical engagement reports in educational research (Christenson et al., 2012; Fredricks et al., 2019). Consequently, when considering methodological aspects and study quality, we followed recommendations to examine broader and more generic methodological issues first as these can inform later reviews that assess more fine-grained aspects of study quality (Siddaway et al., 2019). In our review we aimed to survey the methods employed by L2 engagement researchers broadly, looking at generic characteristics such as research objectives, design and methodological orientation, sampling characteristics, data elicitation measures, and analytical strategies. We turn now to outlining the topic, scope, and rationale for the present review.
II Defining engagement
1 Characteristics of language learning engagement
The central characteristic of engagement in learning is the notion of action (Skinner & Pitzer, 2012). While there are differing perspectives and definitions of engagement, this feature of engagement as action is consistent across definitions and frameworks (Reschly & Christenson, 2012). As mentioned above, engagement refers to how actively involved a student is in a learning task and the extent to which that physical and mental activity is goal-directed and purpose-driven.
A second characteristic is that engagement is highly context-dependent. A learner’s engagement does not emerge in a vacuum. It is in part a product of cultures, communities, families, schools, peers, classrooms and specific tasks and activities within those classrooms (e.g. Finn & Zimmer, 2012; Pianta et al., 2012; Shernoff, 2013). These different contextual layers influence each other and extend their influence across various layers of engagement. For example, academic engagement in school settings is a long-term form of action that covers months or years, but within a specific classroom at school, there are task-level forms of engagement that function at a timescale of minutes or hours.
Third, engagement always has an object. It is possible, for instance, to be engaged with a topic, a person, a situation, or in an activity or task. This means that while definitions have often focused on the intrapersonal components of engagement, there must also be a commensurate understanding of its situated characteristics. An understanding of the ‘person-environment fit’ (Reschly & Christenson, 2012, p. 13) of learners in their learning contexts can better reveal how engagement impacts learning, and how it can be enhanced. Engagement is inherently situated. As such, engagement research must be clear about the contexts and the timescales of relevance to engagement.
A final characteristic is that engagement is dynamic and malleable (Appleton et al., 2008). Although research investigating developmental trajectories of engagement remains rare (e.g. Aubrey et al., 2020), this characteristic provides a promising point of direct action for educators as it suggests that learners can become more engaged with the right kinds of intrapersonal and contextual conditions (Fredricks et al., 2004). This also indicates the potential for well-constructed interventions exploring the dynamism of learning engagement on various timescales (Reschly & Christenson, 2012).
2 Dimensions of language engagement
Scholars posit at least three (though sometimes four or more) core dimensions of engagement. A main strand of work in the field suggests that engagement is manifested not only in its behavioral facet (i.e. individuals’ qualitative behavioral choices in learning), but also in demonstrations of action through the cognitive (i.e. learners’ mental activity in the learning process) and social dimensions (i.e. relations between interlocutors that support interaction and learning), as well as in students’ emotional responses to learning tasks and peers (Baralt et al., 2016; Henry & Thorsen, 2020; Lambert et al., 2017).
Behavioral engagement corresponds with the amount and quality of learners’ active participation in learning, and early L2 research operationalized behavioral engagement by measuring word counts and turn counts (Bygate & Samuda, 2009; Dörnyei & Kormos, 2000; Platt & Brooks, 2002). Examples of behavioral engagement in L2 learning include learners’ voluntary involvement in speaking, interactional initiative, time on task, the amount of semantic content produced while on task, and persistence on task without the need for support or direction (Philp & Duchesne, 2016). Whereas all domains of engagement involve some degree of action, more recent reviews view behavioral engagement as students’ expenditure of effort on learning tasks, the quality of their participation, and their degree of active involvement in the learning process (Sang & Hiver, 2021). While more subjective than conventional dichotomous perceptions (i.e. of being on-task vs. off-task), this perspective of behavioral engagement taps into the quality of action and opens the possibility for researchers to link behavioral engagement to other dimensions.
Cognitive engagement refers to learners’ mental effort and mental activity in the process of learning. Learners are cognitively engaged when they exhibit deliberate, selective, and sustained attention to achieve a given task or learning goals (Reeve, 2012; Svalberg, 2009). In L2 classroom settings, research on cognitive engagement has focused primarily on verbal manifestations, including peer interactions, students’ questioning, hesitation and repetition, volunteering answers, exchanging ideas, offering feedback, providing direction, informing and explaining. In addition to such negotiation of meaning, others see language-related episodes (LREs) as fitting indicators of cognitive engagement (e.g. Baralt et al., 2016; Lambert et al., 2017; Svalberg, 2017). Non-verbal communication, private speech and exploratory talk (i.e. learner discourse that occurs as they attempt to make sense of learning) are also seen by some as further indicators of this dimension (see, for example, Hiver et al., 2021b). In addition to these more obvious communication cues, it is also possible to study cognitive engagement through nonverbal cues such as body language, facial expressions, eye movements and body positioning (Fredricks & McColskey, 2012).
In L2 instructional settings, emotional engagement is often manifested in learners’ personal affective reactions as they participate in target language-related activities or tasks. Emotionally engaged learners are characterized as having a ‘positive, purposeful, willing, and autonomous disposition’ towards language, associated learning tasks, and peers (Svalberg, 2009, p. 247). Expressions of discrete positive emotions such as enjoyment, enthusiasm, and anticipation are thought to be representations of students’ affective engagement, whereas negative emotions such as anxiety, boredom, frustration and anger demonstrate emotional disengagement or disaffection (Mercer, 2019). Emotional engagement is considered to have a key impact on other dimensions of engagement because the subjective attitudes or perceptions learners carry with them in a class or through language-related tasks are fundamental to the other dimensions of engagement (Dao, 2019; Henry & Thorsen, 2020). Affective engagement is, therefore, related to learners’ attitudes towards learning contexts, the members in that context, the learning tasks, and their own participation in learning (Skinner et al., 2009; Reeve, 2012).
Social engagement, too, occupies a central place in language learning (Philp & Duchesne, 2008; van Lier, 2004). The social aspect of engagement is defined in light of the social forms of activity and involvement that are prominent in communities of language learning and use including interaction with interlocutors, and the quality of such social interactions (Linnenbrink-Garcia et al., 2011; Mercer, 2019). The social dimension can be distinguished from other forms of engagement when considering that it is explicitly relational in nature and its purpose is interaction with and support of others. Social engagement underlies the connections among learners in terms of the learner’s affiliation with peers in the language classroom or community, and the extent of their willingness to take part in interactional episodes, turn-taking and topic development, and collaborative activities with others (e.g. Lambert et al., 2017). This dimension of engagement is also linked to phenomena such as reciprocity, mutuality, and other prosocial expressions of affiliation that are manifested in empathetic discourse moves such as learners’ willingness to listen to one another or pay attention to teacher talk (Storch, 2008). Social engagement also pertains to learners’ active connection to the learning environment (Järvelä & Renninger, 2014).
III Relating engagement to learning
1 Importance of engagement for language learning theory
Given the multiple dimensions and the diverse topical areas of concern that engagement touches on, including task-based learning and L2 interaction, engagement can be positioned as a meta-construct that unites many separate lines of research within the field. A prime example of this is in the domain-specific framework of Engagement with Language (EWL) that originates in Svalberg’s (2009, 2017) work on how language awareness is developed. In her work on the topic, she offers the following definition: ‘In the context of language learning and use, Engagement with Language is a cognitive, affective, and/or social process in which the learner is the agent and language is the object (and sometimes vehicle)’ (Svalberg, 2009, p. 247).
One point that drives work on engagement for language learning is the object of the learner’s attention and engagement. For Svalberg (2009), the focus is clearly on the language itself. This EWL gives rise to the critical notion of language awareness, which has been connected with language acquisition by some researchers (Svalberg, 2017). Much of the work on EWL itself has focused on language as the object being studied and the form of interaction in classroom tasks. This is equally the case in related areas of inquiry such as L2 interaction research using LREs (Storch, 2008; Swain & Lapkin, 2001). This focus on language as the process of learning and the outcome of learning is especially pronounced in L2 interaction research.
Another important dimension of this work has been a consideration of attention, which is critical to engagement – that is, a learner must direct their attention to tasks and to connections between language form and its meanings in use in order to be truly engaged. While there are marked parallels to Schmidt’s (2001) pioneering work on noticing, the field of language learning is still notoriously divided regarding the role of deliberate attention and awareness in language acquisition (Rebuschat, 2015). Yet, as Philp and Duchesne (2016) explain, attention itself is the gatekeeper of our working memory, and the ultimate currency of instructed L2 settings. Because engagement is ‘the major force of learning’ (Ellis, 2019, p. 48), engagement research in language learning raises critical questions about the link to implicit and explicit learning mechanisms and knowledge, and the elements that learners’ attention is being directed to – whether that is formal features of the language, the task, the content and/or the social interaction.
This also raises important questions about what is considered as an indicator or proxy of engagement. In a considerable number of studies looking at engagement in language learning, the indicators of engagement have centered around the quantity, quality, and form of individual learner discourse and participatory behavior (Baralt et al., 2016; Lambert & G. Zhang, 2019). However, there are likely to be other indicators of learner engagement (e.g. Dao et al., 2019; Z. Zhang, 2020), given the less visible dimensions of engagement (e.g. cognition and affect). Additionally, given the typical format of language learning contexts, the interaction that defines language development, and the social nature of all language interaction, Svalberg (2009) stresses the importance of social indicators of engagement within language learning processes. Adopting broader, more inclusive markers of engagement is likely to make this domain relevant to all areas of language acquisition research.
2 Importance of engagement for language learning practice
In both language learning and in educational research and practice more generally, one of the appeals of engagement as a construct is that it can provide a broad portrait of how students think, act, and feel in instructional settings (Oga-Baldwin, 2019). Engagement is intertwined with many other individual and situational factors and relates to broad aspects of students’ and teachers’ functioning in school contexts (Mystkowska-Wiertelak, 2020). High learner engagement has been linked to many positive outcomes in education. These include high levels of academic persistence, effort and achievement, high academic aspirations, better mental health, lower dropout rates, and fewer high-risk behaviors (Christenson et al., 2012).
There are also important policy implications of L2 learner engagement. In formal L2 classroom settings, language development is a driver of equity. However, the emphasis on standards, outcomes and teacher accountability has intensified. With the progress and achievement of L2 students under greater scrutiny than ever, students need to be engaged to actually succeed (Hiver et al., 2021b). In many educational systems, the makeup of the local communities that schools serve has become more linguistically and culturally diverse, pushing schools and teachers to manage a broader, more ambitious role in supporting their community. This is also perhaps why many educational systems keep close tabs on student engagement and disengagement to identify students who are struggling and might benefit from targeted interventions (Fredricks et al., 2019).
Engagement also resonates with practitioners because it is easily understood as an essential ingredient for learning and for quality instruction. Educators across the globe, in language education and beyond, increasingly recognize the difficulties of keeping learners engaged and focused on their learning in the face of a myriad of distractions (Mercer & Dörnyei, 2020). What many teachers witness in their daily classrooms is closely related to learner attention and action: these are problems of engagement. Studying engagement brings together teaching and learning perspectives, and for this reason it can help to identify the classroom and instructional conditions that shape student outcomes and build meaningful involvement and participation (Fredricks et al., 2004).
IV Aims and research questions
The L2 research community has witnessed a growth in interest and activity around the construct of engagement over the last two decades. This points to a clear desire to probe the nature of engagement, capture the necessary conditions for engagement, explore the development of engagement over time, and uncover ways to maintain and sustain learners’ engagement as well as re-engage disaffected students (Hiver et al., 2021b). However, several unsettled issues that might hamper this program of research have yet to be resolved. The first and foremost of these is related to the fuzziness surrounding how engagement is defined and operationalized. Engagement still suffers from a jingle (i.e. the same terminology being used to describe distinct notions and constructs) and jangle (i.e. different terms being used to refer to identical notions or constructs) in the way it is defined and operationalized (see also Reschly & Christenson, 2012). Given the variety of operational definitions used across studies, it is common to discover, for instance, that one researcher’s conceptualization of cognitive engagement is used as another’s measurement of behavioral engagement.
In addition to such operational issues, other challenges relate to eliciting, measuring and analysing this multidimensional construct. In particular, the large variation in the measurement of this construct has made it challenging to compare findings across studies. To date, the most frequently used approach in evaluating engagement is self-report, an indirect measure of the construct (Fredricks & McColskey, 2012). However, as some reviews of designs and measurement techniques for engagement show, too few valid and psychometrically sound indirect measures of student engagement exist with which to assess the multidimensional nature of engagement (Hofkens & Ruzek, 2019). Exacerbating matters, some view engagement as an outcome that predicts learning, while others consider it a resource progressively built during the process of learning (Symonds et al., 2019).
To assess the unique contribution of engagement to student learning and development across these dimensions as accurately as possible, it is essential to take stock of existing operational definitions, study designs, and analytical strategies in the field. As 20 years have passed since the first study explicitly investigated engagement in language learning, it is time to look back at this body of research and systematically review it. As mentioned above, we approached this systematic review project with two parallel objectives – one descriptive and one substantive. These correspond with our research questions. On the basis of a body of research spanning the 20 years from 2000 to 2020, we asked the following research questions:
Research question 1: What are the methodological characteristics of engagement studies in the field (including trends in study design and analytical choices)?
Research question 2: What conceptual definitions and operationalizations of engagement are adopted in empirical reports?
Research question 3: What, if any, areas for improving engagement study quality are apparent?
V Method
1 Report pool creation
We constructed the report pool through a sequential process: 1) journal selection, 2) automated search, and then 3) report exclusion/inclusion. Our aim was to ensure robust coverage by compiling a pool of all studies on engagement in language learning from major field-specific journals.
In line with recent L2 research syntheses (see, for example, Andringa & Godfroid, 2020) we restricted our analysis to reports published in L2 journals in the Web of Science’s Social Science Citation Index (SSCI) as these journals have been observed to present high-quality research (for a rationale on exclusive SSCI focus, see also X. Zhang, 2020). We must concede however that this exclusive focus is not without challenge. L2 research syntheses, especially meta-analyses such as Bryfonski and McKay (2019) and Vitta and Al-Hoorie (2020), have alternatively included reports from a divergent range of databases with calls for unpublished reports. Such comprehensiveness is in line with Norris and Ortega (2000), and the rationale for such comprehensiveness includes the mitigation of selection and publication biases where large and significant effects are favored to be published (Fanelli, 2010). With systematic reviews, however, there appears to be a trend to focus on the SSCI to capture the methods and research aspects published in the journals which the field trusts as both robust and consistent (for a similar rationale, see also Zou et al., 2020). Thus, there is precedent for the current study’s SSCI focus, but we acknowledge that doing so presents a representativeness limitation.
As the SSCI does not have a specific sub-category for L2 or applied linguistics journals (Vitta & Al-Hoorie, 2017), we reviewed two recent comprehensive reviews of SSCI L2 journals to systematically construct the SSCI journal list (Al-Hoorie & Vitta, 2019; X. Zhang, 2020), and combined their respective journal lists (n = 20 journals). Subsequently, the decision was made to include the journal Language Awareness given that seminal research on L2 engagement has been published there (e.g. Svalberg, 2009, 2017). A total of 21 journals (see Appendix 1) were selected from which the report pool was constructed.
Each journal was searched using the keyword ‘engagement’, and the date of publication was set at 01-01-2000 to 02-02-2020 in accordance with our research questions. We restricted the keyword search to titles, abstracts, and keywords. This decision was intended to avoid the false positives likely to arise from the more generic use of the term engagement in applied linguistics research. Such restriction also enhances the replicability of our approach. The journal searches were conducted using the Scopus platform, as Scopus allowed for a parsimonious title-abstract-keyword (TITLE-ABS-KEY) search while facilitating an easy download of the reports’ bibliometric data. To ensure that Scopus did not omit relevant reports, redundancy checks followed using ProQuest, EBSCOhost, Web of Science, and manual checks of journal websites. At the end of the journal searches (process shown in Figure 1), there were 13,710 unique reports in the 21 journals published within our selected time range. Of these, 351 reports were automatically selected by the keyword search and 247 were retained as empirical reports via manual inspection. A second researcher inspected these empirical judgments and 100% agreement was observed (κ = 1).

Figure 1. PRISMA flow chart illustrating journal search and report pool creation process.
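The keyword search described above can be sketched as one query string per journal. This is only an illustrative reconstruction, not the authors' actual search script: the helper `build_query`, the choice of the `SRCTITLE` and `PUBYEAR` field codes, and the two journal titles are assumptions made for the example (the full list of 21 journals is in Appendix 1). `PUBYEAR` works at the level of whole years, so the exact cut-off date of 02-02-2020 would still need to be applied to the downloaded records.

```python
# Minimal sketch (assumed, not the authors' script): build one Scopus
# advanced-search string per journal for a title-abstract-keyword search.

def build_query(journal: str, keyword: str = "engagement",
                first_year: int = 2000, last_year: int = 2020) -> str:
    """Return a query limited to one source title and an inclusive year range."""
    return (
        f'TITLE-ABS-KEY("{keyword}") '
        f'AND SRCTITLE("{journal}") '
        # PUBYEAR comparisons are strict, so widen the bounds by one year.
        f"AND PUBYEAR > {first_year - 1} AND PUBYEAR < {last_year + 1}"
    )

# Placeholder journal titles for illustration only; see Appendix 1 for the real list.
journals = ["Language Awareness", "Language Teaching Research"]
queries = [build_query(title) for title in journals]

for query in queries:
    print(query)
```

Restricting the match to titles, abstracts, and keywords is what keeps generic, in-passing uses of the word out of the automated selection; the manual inspection step then handles whatever generic uses remain.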
2 Coding
The resulting 247 reports were then coded by the authors using the categorization scheme in Table 1, which operationalizes L2 engagement as the amount (quantity) and/or type (quality) of learners’ active participation and involvement in a language learning task (see, for example, Hiver et al., 2021b). At the end of this process, 112 empirical reports were retained (for number of reports by year, see Figure 2). Within these selected reports, there was further differentiation, as highlighted in Table 1 where a report was coded as being either a bona fide L2 engagement report as defined above (k = 39) or an ambiguous one (k = 73; i.e. a study that adopts a generic notion of engagement as participatory behavior of any kind within a language learning context).

Figure 2. Studies filtered from initial search by year of publication.
Table 1. Report pool summary.
These reports were then coded individually by a team of 3 trained coders using a detailed categorization scheme (see supplementary material) that included descriptive markers such as study aim and unit of analysis as well as more substantive descriptors such as indicators included in operational measurement. To validate these judgments, one researcher independently coded 30% of reports. The observed interrater agreement (75.7%) approached the conventional 80% threshold (McHugh, 2012) and the observed kappa (κ = .612, p < .001) was within a magnitude range (.60 ⩽ κ ⩽ .79) considered to be either substantial (Landis & Koch, 1977) or moderate (McHugh, 2012). As with Plonsky (2013) who had a κ value (.56; 82% agreement) below conventional thresholds, we highlight the conservative nature of kappa especially as possible categories increase (Brutus et al., 2010). Thus, we consider the reliability of the coding to be acceptable but we concede that future researchers may improve upon it.
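To make the two reliability statistics reported above concrete, the following sketch computes raw percent agreement and Cohen's kappa for two coders. It is a generic illustration rather than the authors' analysis: the function names and the eight toy codes are invented for the example and do not come from the report pool.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of items on which both coders assigned the same category."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions, summed
    # over all categories used by either coder.
    p_expected = sum(counts_a[c] * counts_b[c]
                     for c in set(coder_a) | set(coder_b)) / (n * n)
    if p_expected == 1:  # both coders used a single identical category
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)

# Toy data (hypothetical): eight reports coded for operational domain.
coder_1 = ["behavioral", "behavioral", "cognitive", "emotional",
           "behavioral", "cognitive", "social", "behavioral"]
coder_2 = ["behavioral", "cognitive", "cognitive", "emotional",
           "behavioral", "behavioral", "social", "behavioral"]

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
print(f"agreement = {agreement:.3f}, kappa = {kappa:.3f}")
```

In this toy case the coders agree on 6 of 8 items (75%), yet kappa is noticeably lower because part of that agreement is expected by chance. Spreading codes across more categories changes the expected-agreement term, which is one reason kappa is often read alongside raw agreement rather than on its own.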
VI Results
Research question 1: Methodological characteristics
Starting with the characteristics of participants found in engagement research, as expected a range of sample sizes and participant age groups were included (Table 2). Fifteen studies included a sample of ⩽ 5. A similar number of studies in this pool sampled between 6 and 20 participants (18.7%), 51 to 100 participants (15.2%), and 101 to 500 participants (17.9%). The largest single category was studies with between 21 and 50 participants (25%), and the biggest sample size in the article pool was N = 43,463 (Mdn = 31; IQR = 85.75, 12–97.75). Three studies featured multiple samples, and two did not specify any sample size information. Table 2 also presents an overview of participant ages in engagement research. Studies with younger participants were clearly the minority, with 63 studies (56.3%) sampling either university students or adults aged 18 or older. The rarest were studies with participants aged seven years and younger (3 studies) followed by those with respondents aged 7–12 (8 studies). Fifteen studies featured multiple, mixed age groups, while the age of participants was unspecified in 7 studies.
Table 2. Participant characteristics.
Note. Largest sample size in pool is N = 43,468.
Although we expected equal representation of a variety of research contexts in the study pool, as Table 3 shows, foreign and second language learning contexts accounted for 100 studies (89.3%) of the total. Other research contexts were only minimally present, such as bilingual/multilingual language contexts (4.5%), and a mix of several of these within the same study (2.7%). Various instructional settings were also part of this pool. In addition to the 56 studies (50%) which took place in conventional instructed language settings, our pool showed that the next most frequent instructional setting was online, app-based, or a virtual learning environment (19.6%). Only a handful of engagement studies have been conducted in immersion environments, in study abroad contexts, in content-based language classrooms, language for specific purposes classrooms, or with a combination of these. Furthermore, only 3 studies investigated engagement in untutored, naturalistic language learning.
Table 3. Contextual characteristics under study.
Notes. k = 112. The multiple/mixed category includes additional languages including Czech, Dutch, Hungarian, Korean, Mandarin Chinese, Portuguese, and Swahili, among others. VLE = virtual learning environment; CLIL = content and language integrated learning; EAP = English for academic purposes.
Participants also represented various first language (L1) backgrounds and target L2s (Table 3). We categorized roughly 20 different L1s here based on their geographical origin for the sake of parsimony (i.e. some studies featured multiple languages). The majority of L1s were languages of Asian origin, followed by European and Middle Eastern languages. An additional 22.3% of studies we reviewed included participants with mixed L1s (25 studies), and 1 study did not specify participants’ first language(s). When it comes to target languages being learned, far fewer languages were featured. Among these, what stands out is the dominance of L2 English, accounting for just over 70% in the pool. Though we only included reports written in English, this imbalance is perhaps not surprising given the global importance of L2 English (Dörnyei & Al-Hoorie, 2017). It also stands in contrast to the relatively low frequency of other languages that are, arguably, equally widespread and important target languages. Spanish was the second most represented L2 in our pool (5.4%), while some world languages were featured in just a single study. Six studies featured learners of multiple and mixed languages, and 4 studies did not specify the target language in question.
As mentioned earlier, one point of interest in this study was to examine the conceptual definitions and operationalizations of engagement adopted in empirical reports across various domains of language learning. Having excluded 140 reports from the initial pool (see Table 1) due to their use of the term engagement as a non-specific catch-all or synonym for activity in language learning – i.e. with no additional focus or elaboration on its substantive meaning – we expected more precise and focused definitions in the remaining 112 reports. However, Table 4 shows that 65.2% of reports in this pool (73 studies) adopted a generic notion of engagement as participatory behavior of any kind within language learning contexts. Notably, there was a low bar concerning what forms of learner participation/behavior were indicative of engagement such that nearly any desultory student behavior counts as ‘engagement’ or ‘engaging’. These reports featured ambiguity or no specific information regarding how that engagement is conceptualized, operationalized, and/or measured. The remaining 39 studies (34.8%) adopted a specific definition of engagement as deliberate attention to and volitional action for language learning. These studies examined the amount (i.e. the quantity) and/or type (i.e. the quality) of learners’ active participation and involvement in a language learning task, whether in a classroom setting or other instructed setting. In these studies, engagement was directed either to language learning tasks/activities as the process or vehicle for learning or to language itself as the outcome of learning.
Table 4. Definition and domain of engagement.
Notes. k = 112. Number of studies for Operational Domain may not sum to 112 because multiple domains of engagement were coded in several studies.
We additionally coded the operational subdomains of engagement adopted in all reports. In line with generally accepted definitions of engagement as action, the behavioral domain was featured most often (52.7%), followed by relative parity between the cognitive (20.5%) and emotional (21.4%) domains of engagement. A handful of studies explored the social domain of engagement (3.6%) and language domain-specific aspects of engagement (11.6%) applicable only to a particular domain of language learning (e.g. oculomotor engagement, shared reading engagement). EWL and LREs were featured in 3.6% and 5.6% of reports respectively, and were coded as separate categories given that these each combine several operational domains of engagement. Disengagement was the explicit focus of very few studies (2.7%). Finally, in over 15% of reports, we were unable to ascertain which operational domain(s) of engagement had been adopted or were the area of focus. This finding, coupled with the very large number of reports (65.2%) that did not feature a clear definition and/or operationalization of the construct itself in the first instance, is a genuinely puzzling state of affairs, and one that we will devote more attention to in our discussion.
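Because a single report could be coded for several operational domains, the domain percentages are each computed against the full pool and need not sum to 100%. The following is a minimal sketch of that multi-label tallying in Python. The four-study coding sheet and the function name `domain_shares` are hypothetical and serve only to illustrate the arithmetic; they are not the coding scheme or data used in the review.

```python
from collections import Counter

# Hypothetical toy coding sheet: each report maps to the set of
# operational domains coded for it (an empty set = could not be ascertained).
coded = {
    "study_a": {"behavioral"},
    "study_b": {"behavioral", "cognitive"},
    "study_c": {"behavioral", "cognitive", "emotional"},
    "study_d": set(),
}

def domain_shares(coded):
    """Return each domain's share of the pool, in percent of k reports."""
    k = len(coded)
    counts = Counter(d for domains in coded.values() for d in domains)
    counts["unclear"] = sum(1 for domains in coded.values() if not domains)
    return {d: round(100 * n / k, 1) for d, n in counts.items()}

shares = domain_shares(coded)
for domain, pct in sorted(shares.items()):
    print(f"{domain}: {pct}%")
print("sum of shares:", sum(shares.values()))
```

In this toy sheet the shares add up to well over 100% because two reports carry more than one domain code, which is the same reason the counts for Operational Domain may not sum to 112.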
Returning to study design characteristics (Table 5), we looked at the general approach to study design as well as the method adopted in the reviewed studies. Nearly half of reports in the pool (55 studies; 49.1%) were observational cross-sectional studies. More than 15% of studies (17 studies) adopted longitudinal observational designs, and an additional 14.3% were quasi-experimental. The overall approach to study design was ambiguous in the remaining 24 studies. Examples of these include studies of machine translation-assisted editing of student writing, pedagogical priorities in a flipped classroom, learners’ use of metalanguage in interaction, and graduate student writing in a community of practice – all presumably legitimate topics of focus within the rubric of student engagement, but accompanied by inadequate detail about the overall approach or study setup. Table 5 further shows that choice of method was split across qualitative (35.7%), quantitative (37.5%), and mixed methods studies (26.8%). Our review showing this near proportionate split and the prominent adoption of mixed and multi-method studies may mirror growing trends in the field to value multiple methods as equally productive and integrate methods as an innovative way forward (Hiver, Al-Hoorie & Larsen-Freeman, 2021a).
Table 5. Study design.
Note. k = 112.
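As a quick arithmetic check, the study design shares can be recomputed from the counts given in the text. The sketch below is in Python and assumes k = 112. The quasi-experimental count of 16 is not stated in the text; it is inferred here from the reported 14.3% and should be treated as an assumption.

```python
K = 112  # size of the report pool, from the table note

# Counts stated in the text, except "quasi-experimental", which is
# inferred from the reported 14.3% (an assumption, not a reported count).
design_counts = {
    "cross-sectional observational": 55,
    "longitudinal observational": 17,
    "quasi-experimental": 16,
    "ambiguous": 24,
}

def to_shares(counts, k):
    """Convert raw study counts into percentages of the pool (1 decimal)."""
    return {name: round(100 * n / k, 1) for name, n in counts.items()}

design_shares = to_shares(design_counts, K)

# The four design categories are mutually exclusive, so they cover the pool.
assert sum(design_counts.values()) == K

for name, pct in design_shares.items():
    print(f"{name}: {design_counts[name]} studies ({pct}%)")
```

Unlike the multi-coded categories elsewhere in the review, these four design categories are mutually exclusive, so the counts sum exactly to the pool size.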
Turning now to the study aims and the purposes for which the engagement construct featured in studies (Table 6), we found that many studies (33%) adopted engagement as the outcome of interest or as the dependent variable (e.g. in quantitative designs). The next most frequent study aim was found in studies (25.6%) examining engagement as a predictor of language learning and use, which is analogous to the independent variable in quantitative studies. A further 23.2% of studies investigated the dimensions and make-up of engagement in a more descriptive way. Although theorizing on the topic plays a clear role in driving empirical work and the operational domains adopted in respective studies (see Table 4), this exploratory strand of work on the composition of engagement remains prominent in the report pool. Only 9 reports had as their aim to study the processes and mechanisms of developing engagement. By itself, this seems to signal a preference in most studies for viewing engagement as a product rather than a process. Finally, in 46 reports engagement was coded as being incidental to the study aims. This is cause for some concern as it means that 41.1% of engagement studies were self-labeled as such but were not actually doing engagement research per se. Examples of this include studies in which engagement was listed in the keywords and abstract or reviewed as part of the background literature or theoretical framework but did not subsequently feature in data elicitation, data analysis, or the presentation and discussion of results. We return to this finding later when reflecting critically on our research questions. Table 6 also shows that the unit of analysis in these studies was split across the group-level (56.3%) and individual-level (28.6%). The remaining 16.1% of studies adopted texts as their unit of analysis.
Table 6. Purpose of study.
Notes. k = 112. Number of studies for Study Aim may not sum to 112 due to studies employing the construct of engagement for multiple purposes.
Closely related to the study design characteristics we reviewed above are the choices of data elicitation methods and data analytical strategies. We found that a range of indirect and direct measures and techniques for data collection were present in reviewed studies (Table 7). The techniques most frequently adopted were surveys or questionnaires (42 studies; 37.5%) and interviews or focus groups (34 studies; 30.3%), both forms of self-report. Other commonly used data elicitation methods included lesson observations, tasks, tests, analysis of written samples of learner language, and oral language/interaction samples. Other data sources used more sparsely included field notes, stimulated recall, journals or diaries, and think-aloud protocols. Nine studies featured other types of data elicitation tools such as samples of student academic work or class artifacts, screen captures, or chat logs. The fact that studies generally do not rely on a single form of data or adopt exclusively indirect (i.e. self-report) measures of engagement is a promising finding, and one future work should build on.
Table 7. Analytical strategy.
Notes. k = 112. Number of studies for Data Collection and Analysis Technique may not sum to 112 respectively due to many studies eliciting multiple forms of data and adopting multiple techniques for analysing those data.
Looking also at analysis techniques, it was not surprising – given the large number of studies that were qualitative in design or that adopted qualitative data collection techniques – that qualitative coding and analysis methods were employed in roughly half the reviewed studies (55 studies; 49.1%). This included qualitative data analysis techniques such as inductive thematic coding, grounded theory analysis, conversation and discourse analysis, or ethnographic analysis. Examining other data analytical strategies further revealed studies that relied on descriptive statistics (16.1%) and studies that adopted conventional inferential statistical analyses (30.3%). These included analyses such as chi-square tests, parametric and non-parametric correlations, t-tests and analyses of variance (ANOVA), and linear regression analysis. A handful of other advanced multivariate statistical analyses were used (14 studies) including factor analysis, Rasch analysis, mixed effects modeling, and latent variable modeling (i.e. SEM). We also found a number of instances (9 studies; 8.0%) in which the data analysis technique was either unclear or unspecified – examples of this include unintuitive descriptions such as ‘data were identified and coded’, ‘interviews were transcribed and analysed’ or ‘data analysis was conducted recursively.’ The finding that such key design details remain ambiguous in a number of studies is again cause for concern, and one we will return to below in our discussion.
Research question 2: Operationalizations of L2 engagement
The current report pool was also informative regarding the measures or indicators used to assess student engagement (Table 8). As previous reviews show (e.g. Finn & Zimmer, 2012) studies of student engagement often tend to include a range of obfuscating antecedents, indicators, and facilitators, at times collecting, analysing, and reporting data on these measures simultaneously. Here we focused specifically on indicators of the various subdomains of engagement included in studies in this pool. This included indicators measured through direct and indirect means.
Table 8. Comparison of indicators of engagement included in operational definitions.
Notes. Full references to the studies cited in this table are listed in the supplementary material. k = 112. Number of studies across indicators may not sum to 112 as studies included multiple indicators. Number of studies within a domain may not correspond with numbers in Table 4 as studies included multiple indicators of a single domain. * Indicators marked with an asterisk are ambiguous and fall outside the scope of the engagement construct; for completeness, we have classified these in their nearest domain.
Indicators used to assess the cognitive domain relate to deliberate, selective, and sustained mental effort on the part of learners. Included here are indicators that reference mental activity more broadly such as focus (7.1% of total pool), attention (5.6%), and self-regulation (3.6%), as well as those marked by the quality of such mental activity including depth of processing (2.7%), negotiation of meaning or form (2.7%), and mental elaboration (4.5%). Additional indicators related to higher-order mental effort included monitoring (0.9%) and metacognitive capacities (1.8%). Several indicators used to tap into cognitive engagement were unrelated to the type and quality of learners’ mental effort in the process of learning. Some were either too narrow in focus (i.e. building linguistic knowledge), tangential in nature (e.g. motivational intensity), or a consequence resulting from cognitive engagement (e.g. understanding/comprehension).
Turning to indicators of behavioral engagement, many were related to the degree learners were actively involved in the learning process and the quality of their participation more generally. This includes indicators such as active participation (11.6%), effort expended (8.9%), task completion (18.7%), and time on task (9.9%). Other indicators were specific to individuals’ involvement in language learning such as interaction and language use (23.2%), number of spoken turns (4.5%) or written texts/posts (3.6%), out-of-class language use (3.6%), or instances of practice (2.7%). As with the other domains of engagement, several measures used as indicators of behavioral engagement were ambiguous and fell outside the scope of the engagement construct (e.g. learner agency and vocabulary learning).
Indicators of emotional engagement included general affective responses (1.8%) and positive appraisals (1.8%), as well as dispositions such as interest (3.6%), satisfaction (0.9%), and enthusiasm (1.8%). Several studies adopted discrete emotions such as enjoyment (3.6%) or emotional responses such as laughter (0.9%) as indicators of emotional engagement. This domain, however, is marked by the greatest number of ambiguous and extraneous indicators. This includes indicators that tap into qualitatively distinct constructs such as flow (4.5%), intrinsic motivation (0.9%), or autonomy (0.9%). Emotional engagement should be manifested in learners’ personal affective reactions as they participate in target language-related activities or tasks. This is why indicators that relate to antecedents or outcomes of emotional engagement, like a sense of meaningfulness (0.9%), a sense of purposefulness (1.8%), or the willingness to engage (3.6%), are ambiguous at best.
Only 4 studies in the entire pool investigated social engagement. The indicators adopted here included a sense of community/belonging (0.9%), one’s supportiveness of others (0.9%), and affiliation in discourse (0.9%). In this domain too, seemingly unrelated indicators were found (i.e. motivation contagion). Language domain-specific indicators (11.6%) were those related to a specifically circumscribed area and only applicable to that domain of language learning (e.g. oculomotor engagement or shared reading engagement). LREs (5.6%) and EWL (3.6%) are listed as separate categories here given that these indicators each combine multiple operational domains of engagement. Disengagement indicators were rare (3.6%) and were coded as a separate category here because they include measures from multiple domains (e.g. boredom, apathy, and frustration = emotional disengagement; avoidance = behavioral disengagement).
VII Discussion
Having described prominent design and analytical choices from engagement studies in the field as well as the conceptual definitions and operationalizations of L2 engagement adopted in these empirical reports, we turn now to a discussion of the areas this review has highlighted for improving engagement study quality, which relates to our final research question.
1 Methodological issues
One positive trend in studies we reviewed was the inclusion of multiple measurements and complementary data sources to tap into domains of L2 engagement. Such studies have provided a valuable understanding of the nature and function of the various dimensions of engagement and their role in learners’ development, particularly at the aggregate level. The fact that studies generally do not rely on a single form of data or adopt exclusively indirect (i.e. self-report) measures of engagement is a distinct strength, and one that future work should capitalize on by continuing to supplement indirect measures with direct measures of the relevant domains of L2 engagement (Zhou et al., 2021). Going forward, there will be clear value in increased use of skill- and domain-specific measures of engagement as well as measures that allow the dynamics of engagement (e.g. how it is sustained and how it deteriorates) to be investigated. Areas not yet explored in this report pool include implicit measures of engagement that can tap into the subconscious side of engagement and technology-driven real-time, authentic data (i.e. big data) that can be used to test hypotheses and the effectiveness of different interventions. Advancing the measurement of engagement has potential to open entirely novel avenues of research and increase the depth of insights obtained from this research.
Secondly, the sample size characteristics of the studies reviewed, in combination with several other design characteristics, highlight the prevalence of group-based and cross-sectional designs in existing L2 engagement research. Capturing general profiles in L2 engagement or studying the tendencies and patterns of groups of learners has shown where teacher practice is likely to have an immediate and sustained impact. Operational and design choices in L2 engagement research can have an important impact on the research-practice interface that results from work on the topic. Here too, there is potential for more individual-based work that looks beyond isolated aspects of student engagement, pooled across participants. The level of granularity in a study’s sample, unit of analysis, and design can be fine-tuned by varying the agent (the individual student vs. a group), the task (individual L2 tasks vs. a course of learning), and the time (a momentary timescale vs. an ongoing developmental timescale) (Symonds et al., 2019). Individual-based designs that examine intra-individual variation can shed light on the more proximal processes of engagement that are specific to momentary language learning such as those involved in particular learning tasks (e.g. attention, processing, retrieval, reconstruction) and provide new insights that complement advances already made in the field.
Third, from our review we found very little work either assessing the malleability of engagement, investigating the dynamics of its development, or focusing on re-engaging disengaged and disaffected students. Student engagement is not static or immutable – it can change. How it is dynamic, under what conditions, and for whom remains unclear from the current body of work. Our analysis shows that student engagement is most often conceptualized as a desired outcome, and this is a sound design choice in many instances. With some small design modifications, however, it is possible to take more explicit temporal considerations into account and contribute to an expanding picture of this topical area. This can be done, for example, by investigating the role of teachers, peers, and learning tasks on the development of engagement over time, and examining how classroom learning opportunities, assessments, and extramural interests and experiences influence learners’ engagement. There are also other means through which engagement can be studied to foreground the ways in which it is dynamic and emergent. Engagement can be studied as the process or activity through which development occurs, as a vehicle to better understand how learners achieve success, or as a mediator in the mechanisms of learning and development. These are all in line with the growing trend of taking time and change into account in research designs (Hiver, Al-Hoorie & Evans, under review) in order to reflect the way learners develop in interaction with the environment synchronically and diachronically.
2 Operational and definitional issues
First and foremost, our review has shown that the issue of definition needs further attention in L2 engagement research. Engagement models can be used to inform pedagogy and bolster student performance only to the extent that engagement itself and its components are well defined and clear for researchers and practitioners to understand. As our analysis shows, fewer than 35% of studies reviewed featured a clear definition and/or operationalization of the construct itself. If further work is to contribute substantively to this domain, this is a concern that must be addressed. Put in a larger context, the aims of the studies might also shed light on these definitional concerns: a large number of reports (41%) were self-labeled as engagement studies and included engagement in the keywords and abstract, or mentioned engagement in a review of background literature. On closer scrutiny of these reports, however, engagement did not feature in any data elicitation, data analysis, or the presentation and discussion of results. If engagement is incidental to roughly two in five reports in this pool, it should not be surprising that such a large number of studies adopted such generic, unfocused, or potentially ambiguous operational definitions of engagement within language learning contexts.
Looking more closely at definitional issues, our review showed that a range of subdomains were featured in reports, with behavioral engagement the most frequent and social engagement the least frequent. However, coding these specific domains was not always straightforward given the lack of explicit detail and transparency in the operational descriptions across many reports. For example, we had to extrapolate from notions such as students’ preparation, text reconstruction, linguistic sensitivity, and sense-making which domain of engagement was being targeted (if any) in reports. Ultimately, in over 15% of reports, we were unable to ascertain which operational domain of engagement had been adopted or was the area of focus. In addition to domain-specificity, our review also showed that even when combining EWL and LREs with other L2-specific definitions, only roughly 20% of the measures adopted were skill- or language learning-specific. The issue of whether domain-general (i.e. academic participation) definitions are adequate for the field or if measures should be skill- and language learning-specific straddles both conceptual and methodological territory. More focused measurements are often more informative psychometrically and reflect a more mature and sophisticated theoretical framing of a construct. However, like the early work on student engagement outside our field (for one review, see Reschly & Christenson, 2012), many of the studies we reviewed seem to suffer from conceptual haziness and lack of focus.
With regard to measures and indicators of engagement adopted more specifically, our analysis revealed that many indicators used by studies in the report pool were ambiguous and fell outside the scope of the engagement construct. For example, our earlier review of the literature makes clear that engagement functions as a precursor to learning. However, several indicators in the cognitive and behavioral domains did not distinguish engagement from learning or comprehension/understanding, confusing a necessary antecedent with a desired outcome. Indicators in the domain of emotional engagement in particular were also marked by imprecision regarding their intensity. Some indicators were more prosaic (e.g. satisfaction) or categorical (e.g. interest), while others equated qualitatively and functionally distinct notions such as flow, a supercharged mental and emotional state that emerges from engaging in an activity, with emotional engagement. Many indicators also featured overlapping notions and invited unnecessary complexity into the accompanying conceptual definitions. As just one example, goal-directed behavior is an indicator that combines goal-setting (a cognitive indicator) with strategic pursuit (a behavioral indicator).
There were also lingering issues with murkiness in many reports regarding the distinctions between engagement and constructs such as motivation, agency, autonomy, and strategy use. In many cases these were used as indicators or proxies for engagement. To elaborate on just one comparison, action is key in distinguishing engagement from motivation. Motivation represents initial intention and engagement is the subsequent action (see Noels et al., 2019). The ways in which learners engage in L2 learning are no doubt linked to their initial intent to participate in learning. Still, the consensus is that motivation is meaningfully distinct from engagement in the sense that the intensity and the quality of student involvement in the learning activity or environment (engagement) differs from the forces that energize and direct that behavior (motivation) (Martin et al., 2017). These concerns may strike some as merely semantic in nature, but such imprecision in thinking no doubt affects study design and methodological choices, rendering results and conclusions potentially invalid. It is no doubt also the case that application to classroom-level research and practice is near impossible unless theoretical and conceptual boundaries are both coherent and consistent in the field. Our review, therefore, highlights the importance of specifying the boundaries of the engagement construct and its indicators to increase clarity in this body of empirical work.
3 Pedagogical implications
The reports we have reviewed pointed out some pedagogical implications which we review here. First, because engagement is at the core of all language learning, we feel there would be a meaningful return on investment for fundamental work in establishing clear definitions and toolkits for assessment that help practitioners better understand how the various dimensions of engagement interact with one another, and whether there are phenomenological differences in how individuals experience engagement across various settings (see also Dao et al., 2019; Egbert, 2020; Z. Zhang, 2020).
Second, it is clear that language pedagogy must sharpen its focus on the necessary conditions for engagement. Language instruction must also attend to issues of what makes language learning engaging for students both inside and outside of classroom settings, what conditions are part of an engaging instructional context, what makes for engaging language learning tasks and how these differ across groups of culturally and linguistically diverse learners with varied levels and learning objectives (Nakamura et al., 2020). Many reports point out the importance of task characteristics in generating engagement, whether these relate to the level of support provided, challenge built into tasks, or choice, sequencing, and focus of the task (see also Dao, 2019, 2020; Lambert et al., 2017; Lambert & G. Zhang, 2019; Phung, 2017).
With regard to the ways teachers can build language learning environments that are engaging, it seems that many of the pedagogical implications to come out of our review have to do with technology. For example, there are interesting suggestions about engaging students using video games, synchronous CMC tools, authentic social media and web-based uses of language, and other affordances to establish rapport and increase L2 interaction during online exchanges (see, for example, Henry, 2019; Henry & Thorsen, 2020). Similar suggestions include connecting learners with both L1 and L2 speakers of the language to raise their awareness of the different varieties of the language through cross-cultural communication. These ideas seemingly blur the lines between classroom language learning and extramural language development, and although they may be unstructured in nature, they reflect more meaningful purposes for language use while also stimulating learner engagement (Mercer, 2019).
Finally, identifying disaffected learners and disengaging learning environments is an important way to shed light on the policies, practices and contextual influences that provide a disincentive for active learner involvement and meaningful participation. Engaged language learners develop further and faster, and benefit from many desirable ‘side-effects’ such as deeper interest, greater motivation, and stronger self-efficacy and persistence (Egbert, 2020). By comparison, chronic disengagement and lack of interest in language learning can lead to passivity and feelings of alienation from teachers and peers in the learning environment, unfocused and wasted attention and effort for learning, and poor persistence and commitment to learning more broadly. This highlights the importance of targeted interventions that can help disengaged learners recapture their energy for action and rediscover their appetite for meaningful involvement in language learning. Doing so should also include an appropriate focus on student voices and perspectives and encourage an upward trajectory towards personal investment in language learning (see also Aubrey et al., 2020; Mercer, 2019; Mystkowska-Wiertelak, 2020).
VIII Conclusions
Engagement is a dynamic, multidimensional construct comprising situated notions of cognition, affect and behaviors – including social interactions – in which action is a requisite component. Our review of the past 20 years of work in this area has been revealing. The purpose of our review was to take stock of empirical work in the field and draw conclusions across the subdomains of language education (e.g. classroom settings, technology assisted learning), based on the totality of this research. In particular, we examined the methodological characteristics of previous empirical L2 engagement research as well as the conceptual definitions and operationalizations of engagement.
While engagement research has avoided some of the pitfalls of other instructed SLA and psychology of language learning domains (e.g. general language motivation research), notably the overreliance on self-report measures and the leap to pedagogical claims that are not substantiated by empirical evidence and hypothesis testing, there are still areas to be addressed in future research. Examples include increasing the use of longitudinal and individual-based investigations uncovering the dynamic nature of engagement as well as interventions targeting disengaged and disaffected learners. Work is also needed to enhance the conceptual and definitional precision of engagement and its different subdomains. The definitional/conceptual issues that are persistent across the studies reviewed raise the question of whether engagement represents a concept too broad to be meaningful or if it is in fact a helpful construct. We believe it is meaningful, and suggest that future success in researching the construct be reflected by greater operational/definitional transparency in empirical studies. We look forward to scholars in the field taking up the call to clarify definitional concerns in order to broaden and sharpen our understanding of how engagement connects to other aspects of language learning and teaching processes in diverse settings. Future work on L2 engagement, whether it privileges an exploratory or experimental paradigm, will no doubt build on the field’s understanding and advance the quantity and quality of work, opening up new avenues of research that will lead to a richer, more informative and more diverse range of studies in language learner engagement.
Supplemental Material
sj-pdf-1-ltr-10.1177_13621688211001289 – Supplemental material for Engagement in language learning: A systematic review of 20 years of research methods and definitions
Supplemental material, sj-pdf-1-ltr-10.1177_13621688211001289 for Engagement in language learning: A systematic review of 20 years of research methods and definitions by Phil Hiver, Ali H. Al-Hoorie, Joseph P. Vitta and Janice Wu in Language Teaching Research
Appendix
List of journals.
| Applied Linguistics |
| Applied Psycholinguistics |
| Bilingualism: Language and Cognition |
| Computer Assisted Language Teaching |
| English for Specific Purposes |
| English Language Teaching Journal (Oxford) |
| Foreign Language Annals |
| International Review of Applied Linguistics in Language Teaching (IRAL) |
| Journal of English for Academic Purposes |
| Journal of Second Language Writing |
| Language Assessment Quarterly |
| Language Awareness |
| Language Learning |
| Language Learning & Technology |
| Language Teaching Research |
| Language Testing |
| Modern Language Journal |
| ReCALL |
| Studies in Second Language Acquisition |
| System |
| TESOL Quarterly |
Acknowledgements
We would like to thank Hyejin An and Dr. Samuel Reid for their assistance with coding the report pool.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.