Abstract
A central concern among qualitative researchers over the last two decades has been enhancing the visibility and credibility of their work, particularly among quantitative researchers and general audiences. Addressing calls for top-down insights to help research stakeholders take stock of an increasingly large and complex body of literature, this bibliometric analysis provides quantitative insights into 3,758 qualitative studies in language education, published in 34 major academic journals from 1999 to 2021. It investigates patterns of research productivity across authors, their institutions and the countries in which these are located, the journals authors publish in, the research approaches they use, the topics they address, and the sources they commonly cite. The study uncovered a sizeable increase in the literature body, particularly in 2019–21, driven by growing interest in staple topics relating to teaching and learning, featuring predominantly case study, conversation analytic, and ethnographic methods. The literature body, which started off as largely Anglophone-centric and individually authored, has become more diversified. Implications, largely in the form of gaps in the dataset and suggestions for future research, are discussed.
I Introduction
A pressing concern within qualitative language education research (QLER), as in other disciplines over the previous 25 years, has been the extent of its visibility and credibility vis-à-vis quantitative research (Benson et al., 2009; Dooley, 2020; Lazaraton, 2005; Mirhosseini, 2017), particularly among the ‘uninitiated’ (i.e. some quantitative researchers and the public). Nearly 30 years ago, Lazaraton (1995, p. 467) contemplated ‘whether 10 years hence qualitative research will be on an equal footing with quantitative research in how frequently it is employed and how it is received by the profession.’ While the empirical data is somewhat inconsistent (Amini Farsani et al., 2021; Benson et al., 2009; Richards, 2009), applied linguists have grown increasingly receptive to qualitative research (QR) in recent years (X. Gao, 2017). There are now also several books (e.g. Barkhuizen et al., 2014; Burns & Varshney, 2012; Hadley, 2017; Heigham & Croker, 2009; Mirhosseini, 2020), discussion articles (e.g. Holliday, 2004; Mirhosseini & Pearson, 2025; Pavlenko, 2002; Richards, 2009; Shohamy, 2004), and journal special issues (CALICO Journal, 2015; Language Learning and Technology, 2018; TESOL Quarterly, 1995, 2011) devoted to various aspects of doing qualitative research in language education. Evidence for its greater legitimacy is also apparent in the increasing visibility of qualitative research guidelines (e.g. Chapelle & Duff, 2003; Language Learning and Technology, 2023; Mahboob et al., 2016).
While it is admittedly not possible to do justice to the contribution of qualitative research to language education within one paragraph, we feel it is important to highlight its value across three important areas. As language teaching and learning are uniquely human phenomena (Kress, 2011), there is much we can learn from the textual artefacts people create, particularly those that serve to mediate teaching and learning. It is also the case that individuals’ attitudes, emotions, beliefs, and opinions are far from incidental to the processes of language teaching and learning and that such types of knowledge have played important roles throughout human history (Mirhosseini, 2020). Furthermore, QR usefully supports the investigation of phenomena that go beyond language education and have become deeply ingrained in contemporary human cultural thought patterns, namely identity, agency, and self-concept. Finally, from a practical point of view, the qualitative tradition may be the entry point into the world of research, since the demands of the scientific method could exceed what is feasible within the constraints of degree programmes. Or it could be that graduate students are encouraged to undertake qualitative analysis so as to ensure there is sufficient material for a rigorous discussion. Regardless, there is much to be gained from introspection within QR, addressing questions, such as ‘how/where/when is it done?’ and ‘how impactful is it?’.
Yet, in spite of the increasing number of methodological syntheses in language education (e.g. Byrnes, 2013; Phakiti et al., 2018; Plonsky, 2013), QR has not yet been well surveyed using empirical approaches, nor comprehensively in recent years. Much progress has been witnessed in qualitative inquiry in language education in recent years, but contemporary high-profile instances of meta-research specifically focusing on qualitative research in the field are rare (e.g. Dooley, 2020; Henry & MacIntyre, 2024; A.M. Riazi et al., 2023), with most being around two (e.g. Benson et al., 2009; Lazaraton, 2002; Richards, 2009) or even three decades old (e.g. Lazaraton, 1995). The methodological turn (also termed ‘methodological awareness’) denotes a scholarly movement encompassing ‘research on research’ (or meta-research) (Amini Farsani et al., 2021). The turn is not merely a drive to take stock of existing knowledge domains. It is rooted in the realization of the importance of professional scrutiny that, as members of a scholarly community, directly addresses ‘the core of what we do, what we know, and what we can tell our publics that we know’ (Byrnes, 2013, p. 825). Meta-research contributes to heightening the expectations for methodological rigor (X. Gao, 2017). This point seems to resonate particularly within QR, where measures of rigor are generally less well known and can be considered (sometimes derisively) subjective and open ended (X. Gao, 2017; Mirhosseini & Pearson, 2025).
Studies focusing on the quantitative tradition are comparatively better represented in empirical meta-research (e.g. through the influential works of Luke Plonsky). Where QR is addressed, rarely does it constitute the focal point of meta-research (e.g. Amini Farsani et al., 2021; Benson et al., 2009). This is exhibited in two prominent approaches, one being methodological syntheses within disciplinary areas or written registers which touch upon qualitative methodologies and methods (e.g. Amini Farsani & Babaii, 2020; Amini Farsani et al., 2021; Hyland, 2016) and the other being research that maps the proportion of qualitative studies vis-à-vis quantitative and mixed methods research (e.g. Amini Farsani et al., 2021; Y. Gao et al., 2001; Henning, 1986; Lazaraton, 2002, 2005). Of particular interest is Amini Farsani et al.’s (2021) analysis which, in addition to tracking the scope of QR in applied linguistics, identified the prevalence of 13 qualitative designs in 18 leading journals from 2009–18. A rare instance of meta-research with a sole qualitative focus is Benson et al.’s (2009) survey of QLER, which focuses on tracking QR as a proportion of all research articles across 10 major journals from 1997 to 2006. Mirhosseini and Pearson’s (2025) examination of research quality is another instance of meta-research that adopts a qualitative approach itself but focuses on a specific topic (research quality) and a relatively small body of research. However, a new area of inquiry, akin to the quantitative meta-analysis, the QR synthesis (see Chong & Plonsky, 2024), offers promise to rebalance meta-research within language education.
1 Bibliometric research in language education
One meta-research approach which has yet to be applied to research methodology is bibliometric analysis (in language education, see Han et al., 2023; Hyland & Jiang, 2021; Lei & Liu, 2019; Mohsen et al., 2024, 2025; Pearson, 2022; A.M. Riazi & Amini Farsani, 2024; Zhang, 2020; Zhou et al., 2009). Bibliometric study denotes ‘a type of quantitative analysis that extracts patterns from publications to provide insights into a field or a discipline based on characteristics of publications’ (Zhang, 2020, p. 200). One notable characteristic of interest in bibliometric analysis is productivity, typically of research article authors (and their affiliated institution and country) and academic periodicals, which also encompasses patterns in the frequency of research topics and how these are investigated (i.e. methodology) (Zhang, 2020). An additional perspective is citation impact, accessed via a research document’s citation count according to the index used to retrieve the data (e.g. the Web of Science, Scopus, Google Scholar) (e.g. Hyland & Jiang, 2021; Lei & Liu, 2019; Pearson, 2022; Zhang, 2020). Other forms of research impact, such as the number of document downloads, social media engagement, and how a document is used, are naturally not captured by research indices and hence not incorporated into such studies.
Bibliometric studies often feature both cross-sectional and longitudinal analyses. The former are salient for providing an overview of the field, for example, of the most productive authors or institutions (e.g. Pearson, 2022; Zhang, 2020) or the most commonly cited sources (e.g. Crosthwaite et al., 2022; Mohsen et al., 2025). Additionally, historical patterns and new areas of interest in topics and methodological approaches can be identified through analysing trends across fixed time periods (e.g. Hyland & Jiang, 2021; Lei & Liu, 2019). To enable meaningful comparison in citation impact, raw citation frequencies are normalized according to the number of research documents. Furthermore, by factoring in a document’s age, the usually uneven accrual of citations over a document’s lifespan can be accounted for. As a research output, bibliometric studies often report gaps or deficiencies in what issues are addressed and how, or reveal a lack of diversity and representation among contributors. As such, they provide important global perspectives on a particular field or sub-discipline, informing researcher and institutional decision-making about what to research, the selection of topics for journal special issues, and the allocation of research funding.
As researchers look for top-down perspectives on, and take stock of, the rapidly expanding literature base, the scope of bibliometric analyses has increased. There now exists a raft of bibliometric studies within language and linguistics covering a range of topical concerns, including second language acquisition (SLA) (Zhang, 2020), computer-assisted language learning (CALL) (Mohsen et al., 2024, 2025), English for academic purposes (EAP) (Hyland & Jiang, 2021), and written feedback on second language (L2) writing (Crosthwaite et al., 2022). Despite the heterogeneity of focal areas, several studies note the rapid and extensive expansion of the literature body (Pearson, 2022; Zhang, 2020), including greater diversification of contributions, especially from ‘non-Western’ affiliated authors and institutions (Lei & Liu, 2019). Additionally, analyses indicate that the practical matters of language learning and teaching continue to predominate as central research concerns (Hyland & Jiang, 2021; Lei & Liu, 2019), although new areas of interest are also highlighted (e.g. MALL, eye tracking), with both practical and theoretical implications (Hyland & Jiang, 2021).
2 Study aims
Aligned with this trend of meta-research, this study constitutes a bibliometric analysis of QR in language education, published between 1999 and 2021. To some readers it may seem anomalous to apply a quantitative form of data analysis to the literature body on QR. However, we believe that, in order to capture the extensive breadth of the research publication landscape in today’s world, such an approach usefully serves to highlight the current state of the literature, including questions concerning what is researched qualitatively and how, who the main contributors to QLER are, where they come from, and how they collaborate. The following five research questions guided the design of the study:
Research question 1: Which academic journals commonly publish QLER? Which have the highest citation impact?
Research question 2: Which authors, affiliated institutions, and countries of authors’ institutions have the highest productivity and citation impact?
Research question 3: What qualitative research approaches and methods are adopted in QLER and how have these changed over time?
Research question 4: What language education topics are explored in QLER as measured by keywords from authors and abstracts? How have these changed over time?
Research question 5: What are the most cited sources among the retrieved literature?
II Method
1 Data retrieval
The present study followed PRISMA procedures for conducting and reporting data retrieval, screening, and selection (Page et al., 2021), outlined in Figure 1. Data for this bibliometric study was retrieved from Scopus, one of the largest curated indices of research in the world that claims to list over 76 million records (Baas et al., 2020). The index has wide global and regional coverage of academic journals, books, and conference proceedings with a notable focus on quality assurance through rigorous content selection and (re-)evaluation by its independent Content Selection and Advisory Board (Baas et al., 2020). As such, we are aware that our sample is weighted towards more widely acknowledged QLER, since Scopus is significantly more selective than less-regulated indices, such as Google Scholar (Martín-Martín et al., 2018). We should also acknowledge that the efficiency with which bibliometric records can be identified using search strings, exported for analysis, and analysed using the application’s powerful in-house metrics was an additional consideration that led to the selection of Scopus.

Figure 1. Schematic representation of study identification, screening, and retrieval.
As with other bibliometric analyses, we began by generating search terms to determine the body of relevant research. We consulted QR handbooks (e.g. Creswell, 2007; Denzin & Lincoln, 2005; Hammersley, 2013; Lincoln & Guba, 1985; Miles & Huberman, 1994) to devise a set of both inclusionary and exclusionary terms (see Table 1). These were discussed between the two of us in multiple rounds and further refined using Scopus. For instance, we specifically discussed if qualitative approaches, traditions, and methods appearing in the table of contents of these books could be adopted as search terms. However, we opted for more open-ended terms (e.g. qualitative, rather than qualitative study, qualitative research, qualitative analysis, etc.) and the use of wildcards (*) to reduce the prospect of missing relevant studies. The full search string was constructed using Boolean operators and was applied to the title, abstract, and keyword from author fields within Scopus (see supplementary material part A). Data were retrieved in May 2023.
Table 1. Search terms used to identify the literature.
As we wanted to obtain a body of both recent and older QLER, studies were limited to being published between 1999 and 2021. We felt 1999, the year after Edge and Richards’ (1998) seminal work on warrants in QR in TESOL, constituted a suitable cut-off. Commensurate with other bibliometric analyses, in an effort to identify meaningful patterns and trends, we divided the time period into three timeframes (1999–2006, 2007–13, 2014–21). We opted to incorporate timeframes of roughly equal length (seven or eight years), rather than according to particular milestones or developments in QLER (e.g. the publication of other renowned sources), since these seemed too idiosyncratic or subjective. Owing to the explosion in language education research over the whole timeframe, the distribution of research documents is far from equal across the three time-windows (a focal area of our analysis). As such, we have normalized the findings to enable meaningful comparison by using the formula, ‘(frequency count / total number of documents in the given timeframe) × 100’.
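As an illustration only, the normalization formula quoted above could be sketched in Python as follows. The function name and the raw counts are hypothetical, and the sketch assumes the denominator is the number of included documents per timeframe (the totals reported in the Results section).

```python
def normalize_frequency(count: int, total_documents: int) -> float:
    """Return a frequency count expressed per 100 documents in a timeframe."""
    if total_documents <= 0:
        raise ValueError("total_documents must be positive")
    return count / total_documents * 100


# Included documents per timeframe, as reported in the Results section.
DOCUMENTS_PER_TIMEFRAME = {"1999-2006": 614, "2007-13": 1105, "2014-21": 2039}

# Hypothetical raw counts for a single term (illustrative values only).
example_counts = {"1999-2006": 40, "2007-13": 90, "2014-21": 210}

normalized_counts = {
    timeframe: round(
        normalize_frequency(example_counts[timeframe], total), 2
    )
    for timeframe, total in DOCUMENTS_PER_TIMEFRAME.items()
}
```

Expressing each count per 100 documents keeps the later, much larger timeframe from dominating comparisons simply because more was published in it.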
To ensure we retrieved qualitative studies situated within language education, we decided to limit the scope of academic publications to a pre-determined list that we assembled. We restricted our coverage to research published in academic journals, since these are considered the key repository for cutting-edge academic knowledge (Plonsky, 2013). While Scopus categorizes publications for the purposes of ranking publications within disciplines, we felt that the 968 discrete publications listed as ‘Language and Linguistics’ (a sub-category of ‘Arts and Humanities’) constituted too broad a literature base for the findings to be meaningful. As a consequence, we set about determining a list of language education journals with reference to prior research.
We consulted 10 meta-research studies situated within applied linguistics, second language acquisition, and TESOL (Amini Farsani et al., 2021; Benson et al., 2009; Egbert, 2007; Lei & Liu, 2019; M. Riazi et al., 2022; Richards, 2009; VanPatten & Williams, 2002; Zhang, 2020) and incorporated only those periodicals that featured a minimum of three times collectively across these works. Thirty-four publications met this criterion and were incorporated into the study as a result (see supplementary material part B). One of these publications, Second Language Research, focuses on ‘experimental studies and contributions aimed at exploring conceptual issues’. However, we decided to retrieve and screen records for this journal so as not to miss any potential QR.
Out of an initial body of 4,277 items, we excluded forms of research publishing not classified by Scopus as ‘article’, ‘review’, or ‘conference paper’ (n = 55). This was because other forms of publishing (e.g. errata, notes, letters to the editor) do not always address methodological concerns and usually lack keywords and abstracts (employed in the investigation of research topics). We also removed documents missing abstracts (n = 53) and those not written in English (n = 20), since the research methodology and topic analysis drew on article abstracts. Bibliometric records for the remaining 4,149 studies were downloaded and imported into Excel for eligibility checking. To ensure relevance, the first author manually checked all document titles and abstracts, which was followed by the second author checking 10% of the dataset. We excluded studies that were not qualitative (n = 315), adopted a mixed methods approach (n = 40), and featured narrative as a written genre, rather than research approach (n = 36). Difficulties (mostly relating to ambiguous or rare QR approaches) were resolved by occasionally retrieving the full text of articles and through regular discussion between the authors. Notable content words in abstracts which indicated a non-qualitative approach that did not feature in the initial literature identification (e.g. control group, narrative essay, sequential analysis) were recorded and, along with uses of the term narrative, applied to the entire data set to filter out further irrelevant entries. The final dataset comprised 3,758 documents that were deemed QLER studies.
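The screening counts reported above can be checked with simple arithmetic. The following is a minimal sketch; the dictionary labels and the helper function are our own shorthand, while the figures are those given in the text.

```python
# Records identified by the search string.
initial_records = 4277

# Exclusions applied before eligibility checking (counts from the text).
pre_screening_exclusions = {
    "not article, review, or conference paper": 55,
    "missing abstract": 53,
    "not written in English": 20,
}

# Exclusions applied during manual eligibility checking (counts from the text).
eligibility_exclusions = {
    "not qualitative": 315,
    "mixed methods": 40,
    "narrative as written genre": 36,
}


def remaining(start: int, exclusions: dict) -> int:
    """Subtract every exclusion count from the starting number of records."""
    return start - sum(exclusions.values())


records_for_eligibility_check = remaining(initial_records, pre_screening_exclusions)
final_dataset = remaining(records_for_eligibility_check, eligibility_exclusions)
```

The two intermediate totals reproduce the 4,149 records downloaded for eligibility checking and the final dataset of 3,758 documents.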
2 Data analysis
a Documents and journals
The prevalence of published QLER documents and their citation impact were calculated and are presented on a year-by-year basis to highlight macro trends across the timeframe. Since it typically takes several years for research articles to accrue citations (Aksnes et al., 2019), a measure of documents’ age-weighted citation rate (AWCR, i.e. the total number of citations divided by the age of the article in years) was used. We acknowledge that citation impact is but one form of research impact. The use of publications as teaching materials, the insights they offer to students (not necessarily publishing themselves and, therefore, not citing their readings), and the use of academic publications by practitioners, professionals, and policy makers, are other notable impacts that cannot be captured by citation figures. Similarly, while we are aware that not all citations are ‘positive’ – authors may cite for self-serving reasons (Egbert, 2007) and research quality is not necessarily measurable by citation counts – they remain the pre-eminent indicator upon which decisions of what to research, how, and by whom are based. Additionally, we calculated the productivity and citation impact of the 34 academic journals included in the dataset and report the 10 most impactful in terms of accumulated AWCR (by adding the AWCRs of all included documents published by the journal). This accounts for variations in citations over time, albeit such a metric naturally favours higher volume journals.
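A minimal sketch of the AWCR calculation and its accumulation by journal is given below. The reference year (2023, the year of data retrieval), the floor of one year on document age, and the example records are assumptions made for illustration, not details taken from the study.

```python
from collections import defaultdict

REFERENCE_YEAR = 2023  # assumption: age is measured from the retrieval year


def awcr(citations: int, publication_year: int,
         reference_year: int = REFERENCE_YEAR) -> float:
    """Age-weighted citation rate: total citations divided by age in years."""
    # Assumption: documents younger than one year are treated as one year old.
    age = max(reference_year - publication_year, 1)
    return citations / age


def accumulated_awcr(records):
    """Sum the AWCR of every document, grouped by journal."""
    totals = defaultdict(float)
    for journal, citations, publication_year in records:
        totals[journal] += awcr(citations, publication_year)
    return dict(totals)


# Hypothetical records: (journal, citation count, publication year).
example_records = [
    ("Journal A", 100, 2013),
    ("Journal A", 30, 2020),
    ("Journal B", 12, 2019),
]
journal_totals = accumulated_awcr(example_records)
```

Because the journal-level figure is a sum rather than a mean, a journal with many modestly cited documents can outrank one with a few highly cited ones, which is the volume effect acknowledged above.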
b Countries, institutions, and (co-)authorship
The 10 most productive and impactful countries, institutions, and authors were identified using Scopus’ Analyse Search Results function. Since these results were limited to the first author’s affiliated institution and country only, formulae were created to obtain more precise document and citation counts for the 30 most impactful countries, institutions, and authors (by their raw citations). In the event of multi-authored works, details of all contributing authors’ affiliated institutions and countries were incorporated into the analysis, in contrast to prior analyses (e.g. Lei & Liu, 2019). Frequency counts and distributions of documents and citations are presented as totals across the entire research period for institutions and authors, since we considered such figures too low to identify meaningful trends over time. Citation counts were also normalized relative to the number of research documents (n citations divided by n documents), given that authorial (and institutional) productivity naturally varies greatly.
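The counting approach described above, in which every contributing author's affiliation is credited, might be sketched as follows. The record structure and field names are hypothetical, and the sketch assumes that a document is credited once per distinct country even when several co-authors share it.

```python
from collections import Counter


def tally_by_affiliation(documents, field):
    """Count documents and citations for every value listed under `field`."""
    doc_counts, citation_counts = Counter(), Counter()
    for doc in documents:
        # set() credits each distinct value once per document (assumption).
        for value in set(doc[field]):
            doc_counts[value] += 1
            citation_counts[value] += doc["citations"]
    return {
        value: {
            "documents": doc_counts[value],
            "citations": citation_counts[value],
            # Normalization: n citations divided by n documents.
            "citations_per_document": citation_counts[value] / doc_counts[value],
        }
        for value in doc_counts
    }


# Hypothetical records listing the countries of all co-authors.
example_documents = [
    {"countries": ["United Kingdom", "Iran"], "citations": 40},
    {"countries": ["United Kingdom", "United Kingdom"], "citations": 10},
    {"countries": ["Iran"], "citations": 4},
]
country_totals = tally_by_affiliation(example_documents, "countries")
```

The same function could be applied to an institutions or authors field; a first-author-only count would instead credit only the first entry in each list.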
c Research approaches
QR is a broad umbrella term that denotes a very large and diverse group of research methodologies. We employed both keywords from authors and keywords from abstracts (see Pearson, 2024) to identify prevalent approaches that authors claim to use. Initially, the first author manually examined all author keywords that were utilized in more than one study across the dataset (n = 1,890) to identify candidate methodological terms. The second author reviewed the resulting 109 candidate items, and after discussion of discrepancies (for instance, whether a broad term such as qualitative analysis can be indicative enough), 11 were excluded because they were considered too general (e.g. qualitative, quality study). As automated measures lack sophistication in identifying the heterogeneous and nuanced ways in which authors characterize their research approaches and because authors do not always specify an overall approach (Benson et al., 2009), we also manually analysed 5% of document abstracts to identify salient methodological terms. These were compared with the keywords from authors, and after uncertainties over their qualitative and/or methodological nature were resolved through discussion (e.g. action ascription, screen movements, reflexive approach), a further 52 approaches were incorporated. Twenty-three borderline acceptable items (e.g. narrative, multimodality, accounts) were reviewed in context to ensure they were methodological, with frequency counts being adjusted to exclude non-methodological uses of the terms.
The 125 keywords were then synthesized into broader approaches to avoid overlap, for example, open-ended questionnaire, qualitative questionnaire, qualitative survey, and open-ended survey as qualitative survey methods and ethnomethodology and ethnomethodological as ethnomethodology (supplementary material part C). For a more meaningful comparison, these categories were further synthesized into ways of conceiving of research at two levels of abstraction, methodological approaches and data collection methods, following guidance from the literature (e.g. Amini Farsani et al., 2021; Creswell, 2007; Heigham & Croker, 2009). This delineation was not always clear (e.g. discourse analysis, genre analysis), requiring careful reading of abstracts and discussion between ourselves. Since we found authors rarely outlined their methods of data analysis in the abstract (a finding in and of itself), we did not code such information in the dataset. However, we opted to include qualitative approaches to secondary research, drawing upon the scheme developed by Chong and Plonsky (2024).
The findings are presented as raw and normalized frequency counts, generated by cross-referencing the list of uncovered methodological terms with document titles, abstracts, and keywords, as reliance upon one field alone may provide an insufficient picture of the literature (Pearson, 2024). We accounted for linguistic variations in the presentation of research approaches through the use of wildcards (e.g. narrative?analy* to retrieve narrative analysis, narrative analyses, narrative-analytic [study]).
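To illustrate how a wildcard term such as narrative?analy* can be cross-referenced with titles, abstracts, and keywords, the following sketch translates the pattern into a regular expression. It assumes that ? stands for exactly one character and * for any run of word characters; the function names and document fields are hypothetical, and the tool actually used for matching is not specified in the text.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple wildcard pattern into a compiled regex."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(r"\w*")  # assumption: zero or more word characters
        elif char == "?":
            parts.append(".")     # assumption: exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile(r"\b" + "".join(parts), re.IGNORECASE)


def count_matching_documents(pattern: str, documents) -> int:
    """Count documents whose title, abstract, or keywords match the pattern."""
    regex = wildcard_to_regex(pattern)
    fields = ("title", "abstract", "keywords")
    return sum(
        1 for doc in documents
        if any(regex.search(doc.get(field, "")) for field in fields)
    )


# Hypothetical records for illustration.
example_documents = [
    {"title": "A narrative analysis of teacher talk", "abstract": "", "keywords": ""},
    {"title": "Learner stories", "abstract": "Narrative analyses revealed...", "keywords": ""},
    {"title": "Classroom accounts", "abstract": "", "keywords": "narrative-analytic study"},
    {"title": "Assessing the narrative essay", "abstract": "", "keywords": ""},
]
matches = count_matching_documents("narrative?analy*", example_documents)
```

Each document is counted once regardless of how many fields match, so the result is a document frequency rather than a token frequency.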
d Research topics
An important rationale for bibliometric analyses is the identification of prevalent, consistent, and rising/declining research topics across the body of literature (Hyland & Jiang, 2021; Lei & Liu, 2019; Zhang, 2020). In this study, we adopted an approach that combines both keywords from abstracts and authors, in an effort to mitigate limitations with each approach and provide a more complete account of the literature (Pearson, 2024). The process of identifying research topics began with the extraction of keywords from abstracts (KFAs). These were identified by searching the document abstracts for n-grams of 2–7 words in length using AntConc (Anthony, 2018), following the patterns uncovered by Pearson (2024). These included, for example, noun + noun (e.g. language choices) and noun + preposition + noun + noun (e.g. usefulness of peer feedback). N-grams that were eight or more words in length yielded no hits. While we did initially include single-word topics, upon closer examination, we noticed that many prevalent candidate topics were too general to be considered meaningful (e.g. teachers, research). As such, we excluded single-word topics from the list. Furthermore, we omitted items that occurred fewer than three times for reasons of manageability.
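A simplified sketch of the n-gram extraction step is shown below. It counts all word sequences of 2–7 tokens and keeps those occurring at least three times; it does not reproduce AntConc's behaviour or the part-of-speech patterns mentioned above, and the tokenizer and example abstracts are assumptions for illustration.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_ngrams(abstracts, min_n=2, max_n=7, min_count=3):
    """Count n-grams of min_n..max_n tokens and drop infrequent ones."""
    counts = Counter()
    for abstract in abstracts:
        # Assumption: lower-cased alphabetic tokens, hyphenated words kept whole.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", abstract.lower())
        for n in range(min_n, max_n + 1):
            counts.update(ngrams(tokens, n))
    return {gram: count for gram, count in counts.items() if count >= min_count}


# Hypothetical abstracts for illustration.
example_abstracts = [
    "Peer feedback shapes language choices.",
    "Language choices and peer feedback matter.",
    "We examine peer feedback in writing.",
]
candidates = candidate_ngrams(example_abstracts)
```

N-grams are built within each abstract separately, so no candidate spans the boundary between two documents.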
The initial n-gram search minus the excluded items yielded 1,344 candidate KFA-based topics, which were supplemented with a further 500 author-provided keywords (APKs). At first, we extracted all APKs from the dataset, totalling 4,212 items after duplicates were removed. From these, we omitted 3,712 that either constituted one-word keywords (because we felt these were too general to be considered viable topics) or occurred fewer than three times (since these could not be considered common topics in language education). The remaining 500 items were merged with the list of KFAs, and any duplicate items removed (n = 268) before further screening was applied. Candidate topics were manually checked in multiple rounds to ensure their meaningfulness. We omitted 1,483 items that, semantically, were not topics (e.g. order to promote) – all of which were KFAs. We also removed topics that we considered too general (e.g. years of teaching) or that were methodological (e.g. case study) to prevent overlap with our analysis of methodological approaches. We were left with a final list of 917 items, which we labelled as topics.
Thereafter, we inductively coded the topics into 21 topic themes (e.g. coding academic writing, academic emotions, research article introductions as academic language and purposes). We undertook several rounds of coding, iteratively revising, merging, and collapsing themes, resolving differences through discussion. Our debates on the representativeness of the categories and their labelling were an important aspect of our reflexivity and corroboration as well as a basis for maximum transparency in this section of the paper, aimed at greater trustworthiness (Mirhosseini & Pearson, 2025). A feature of some keywords was their multi-topic nature (e.g. automated writing evaluation relating to both assessment and feedback and reading, writing, and literacy). We opted to incorporate such items within two themes. Finally, we applied the final list of topics and topic themes to the document abstracts to quantitatively identify prevalence (both raw and normalized) across the dataset. We acknowledge that this approach provides quantitative perspectives on qualitative research, though we argue that such data requires careful qualitative interpretation from the analyst (Pearson, 2022). We measured topic (theme) frequency as dispersion across abstracts, i.e. multiple occurrences of a term within an individual abstract were recorded as one. A complete list of topics and topic themes can be found in supplementary material part D.
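The dispersion measure described above, in which repeated mentions within one abstract count once, could be sketched as follows. Plain substring matching, the function names, and the example abstracts are simplifying assumptions for illustration.

```python
def dispersion(term: str, abstracts) -> int:
    """Number of abstracts containing the term at least once."""
    needle = term.lower()
    return sum(1 for abstract in abstracts if needle in abstract.lower())


def theme_dispersion(topics, abstracts) -> int:
    """Number of abstracts containing at least one topic from a theme."""
    needles = [topic.lower() for topic in topics]
    return sum(
        1 for abstract in abstracts
        if any(needle in abstract.lower() for needle in needles)
    )


# Hypothetical abstracts for illustration.
example_abstracts = [
    "Peer feedback and peer feedback uptake in academic writing.",
    "Teacher identity in bilingual classrooms.",
    "Academic writing instruction for postgraduates.",
]

peer_feedback_count = dispersion("peer feedback", example_abstracts)
theme_count = theme_dispersion(["peer feedback", "academic writing"], example_abstracts)
theme_share = theme_count / len(example_abstracts) * 100
```

Counting abstracts rather than tokens prevents a single document that repeats a term many times from inflating the apparent prevalence of a topic.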
e Cited sources
To identify the sources most frequently cited by authors publishing QLER, the 100 most-cited sources contained within the reference lists of the 100 documents with the most citations among the dataset were analysed using Scopus’ Analyse References function. This data was supplemented with a corpus analysis of the reference lists of all included documents. These were extracted into a corpus and input into the concordance software AntConc (Anthony, 2018). The corpus was queried for n-grams of 2–8 tokens in length to identify recurring title patterns. Candidate titles were checked with the lists of references until the top-100 most cited works were identified. Attention was paid to variations in spelling (e.g. organisation/organization and encyclopedia/encyclopaedia) and punctuation (commas sometimes featured instead of colons to separate phrases in a document title), while works with multiple editions were merged (since the specific edition was not always included in the references). For manageability, only the 15 most-cited sources are presented, normalized according to the number of documents per time window.
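The handling of spelling and punctuation variation when merging cited titles could be approximated as below. The variant map contains only the two spelling pairs mentioned above, and the function name and example titles are hypothetical; this is a sketch under those assumptions rather than the procedure actually used.

```python
import re
from collections import Counter

# Spelling pairs named in the text, mapped to a single form.
SPELLING_VARIANTS = {
    "organisation": "organization",
    "encyclopaedia": "encyclopedia",
}


def normalize_title(title: str) -> str:
    """Lower-case a title, strip punctuation, and unify known spellings."""
    text = re.sub(r"[^\w\s]", " ", title.lower())
    words = [SPELLING_VARIANTS.get(word, word) for word in text.split()]
    return " ".join(words)


# Hypothetical reference-list entries for illustration.
example_titles = [
    "The Organisation of Talk: A Handbook",
    "The organization of talk, a handbook",
    "Encyclopaedia of Language Teaching",
]
title_counts = Counter(normalize_title(title) for title in example_titles)
```

After normalization, the first two entries collapse into one key, so their citations are counted together.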
III Results and discussion
1 Research documents
Of the 3,758 documents retrieved, 2,039 (54.26%) were published in 2014–21, a share 24.85 percentage points higher than that of 2007–13 (n = 1,105) and 37.92 percentage points higher than that of 1999–2006 (n = 614). Figure 2 shows the upward trend in QLER productivity and citation impact (as measured by AWCR) over the timeframe. When normalized in Scopus across all retrieved documents published during the time window, until 2002 the growth can best be described as stable and steady, rather than dramatic. We found the proportion of qualitative studies relative to the total number of documents for each timeframe rose from 10.27% in 1999–2006 to 13.60% in 2007–13 and 16.38% in 2014–21. These increases broadly align with (now rather dated) previous meta-research that had tracked earlier trends in QLER publication over a similar timeframe (e.g. Benson et al., 2009; Lazaraton, 2002, 2005; Richards, 2009). While the proportion of QR in the most recent timeframe was somewhat below that of Amini Farsani et al. (2021) (24.9%), this discrepancy could be because we sampled a wider range of journals (34 versus 18), some of which from our own analysis appear less predisposed towards qualitative inquiry (e.g. Studies in Second Language Acquisition, Second Language Research). We do not seek to conflate the contribution of QLER with the mere number of articles published. Instead, the figures provide tentative insights into the extent to which qualitative researchers have disrupted the traditional construction of language teaching and learning within applied linguistics as a field that was particularly amenable to quantitative methods (Benson et al., 2009; Richards, 2009).
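The relationship between the three period totals and the 24.85 and 37.92 figures reported above can be worked through as follows. This is an illustrative check using only the counts given in the text; the variable names are our own.

```python
# Included documents per timeframe (from the text).
period_totals = {"1999-2006": 614, "2007-13": 1105, "2014-21": 2039}
dataset_size = sum(period_totals.values())

# Each period's share of the whole dataset, in per cent.
shares = {
    period: count / dataset_size * 100
    for period, count in period_totals.items()
}

# Differences in share between the latest period and the two earlier ones,
# expressed in percentage points.
gap_vs_2007_13 = round(shares["2014-21"] - shares["2007-13"], 2)
gap_vs_1999_2006 = round(shares["2014-21"] - shares["1999-2006"], 2)

# For comparison, the relative growth in raw document counts.
relative_growth_vs_2007_13 = round(
    (period_totals["2014-21"] - period_totals["2007-13"])
    / period_totals["2007-13"] * 100, 2
)
```

The two gaps correspond to differences in each period's share of the dataset, whereas the relative growth in raw counts between 2007–13 and 2014–21 is considerably larger.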

Figure 2. Trends in qualitative language education research (QLER) output and citation impact, 1999–2021.
As can be seen in Figure 2, there is a general upward trend in citation impact using the metric of AWCR, which is notable since, as elsewhere (Jamali, 2018), the use of qualitative methods may have constituted the primary reason the paper was cited. Looking at the figure, the cumulative AWCR for each year tended to mirror research output, with spikes in citations across a few productive years (e.g. 2007, 2009, 2011). The jump in AWCR in 2011 is particularly notable and was driven by Canagarajah’s (2011) Codemeshing in academic writing: Identifying teachable strategies of translanguaging, Norton and Toohey’s (2011) Identity, language learning, and social change, and Borg’s (2011) The impact of in-service teacher education on language teachers’ beliefs. As research takes time to accrue citations (Aksnes et al., 2019), it is no surprise that documents published in 1999–2006 feature the highest average citation per document ratio, of 48.85 (compared with 37.59 in 2007–13 and 14.43 for 2014–21). Promisingly, rates of qualitative studies with no citations were very low, standing at just 0.98% and 0.72% for 1999–2006 and 2007–13 respectively. While 5.35% of 2014–21 documents were uncited, it is likely this figure will reduce by virtue of prolonged exposure to potential readers (Aksnes et al., 2019).
2 Journals
Table 2 outlines the 10 journals considered the most impactful in QLER according to the accumulated AWCR over the timeframe. As found elsewhere (Benson et al., 2009; Lew et al., 2018), there is notable variation in QR output across journals. The Journal of Pragmatics bestrides the dataset in terms of output, issuing a total of 809 documents, a figure noticeably higher than the next three journals, Language and Education (248), System (244), and the International Journal of Bilingual Education and Bilingualism (215). One explanation is that scholars of pragmatics seek to generate rich understandings of the complexity of the underlying meanings, intentions, social dynamics, and organizational principles of spoken discourse. Similarly, pragmatics naturally ascribes importance to the role of context (e.g. participants, the setting) in the analysis of language use, often necessitating qualitative (or mixed methods) approaches. However, we concede that the four most productive journals are high-output publications generally, with the Journal of Pragmatics publishing an impressive 16 issues per year. It is also important to highlight the contribution of the Journal of Language, Identity and Education, absent from the top 10 most impactful journals by virtue of being launched in 2002, but already having accrued the fifth largest number of QLER documents (171).
Table 2. The 10 academic journals issuing the most impactful qualitative language education research (QLER), by accumulated age-weighted citation rate (AWCR).
All journals with the exception of the Canadian Modern Language Review exhibited increases in QLER output (with the mean being +62.74%). The largest rise among journals that regularly publish QR (defined as having at least 50 QR documents) was shown by the International Journal of Bilingual Education and Bilingualism (+91.18%), followed by Applied Linguistics Review (+89.33%), and System (+85.12%). This indicates a greater desire among scholars to generate understandings of the human experiences of bilingualism and educational technology, along with the raft of topics covered in Applied Linguistics Review. One cross-sectional measure of citation impact, the citation to document ratio, indicated that articles published in TESOL Quarterly (55.68), Applied Linguistics (54.26), and Language Learning and Technology (52.11) performed well. All three featured an AWCR that was noticeably lower than that of the Journal of Pragmatics (1,681.86) and System (811.12), indicating these two journals publish more citable recent works. Of particular note is the dramatic increase in System’s AWCR, from a mere 69.75 in 1999–2006 to 521.90 in 2014–21, demonstrating a tangible enhancement in both the quantity and citation impact of QLER.
3 Countries, institutions, and (co-)authorship
Table 3 outlines the productivity and citation impact of the countries of author-affiliated institutions, regardless of status as first, second, third author, etc. As found by Lew et al. (2018), authors working at US institutions have been especially productive in QLER, although their overall share of research has fallen slightly from 31.27% to 29.52% over the timeframe. A similar trend is exhibited by UK- and Canada-based researchers, although the falls represent a greater proportion of the original figures (13.52% to 11.43% and 6.84% to 4.61% respectively). Australia stands out as the only English-dominant country to increase its share of research, though the rise is minor (0.31%). Instead, the data paint a picture of researchers in English as a foreign or second language contexts making inroads into the dominance of Anglophone nations (Hsiung, 2015) by embracing qualitative methods. This is particularly the case with China (see Liu et al., 2015), where authors went from a low uptake of qualitative approaches in 1999–2006 (0.33%) to becoming the sixth largest proponents (at 4.41%).
Table 3. Frequency counts and distributions of documents and citations according to country of author affiliation, by accumulated age-weighted citation rate (AWCR).
Similar patterns are exhibited in the citation impact of scholars affiliated with institutions located in these countries. US-based authors accrued by far the highest citation count (34,559), but are matched by their UK and Australian counterparts in terms of citations per document (30.66 versus 27.64 and 27.29 respectively). Among the most impactful countries listed in Table 3, Hong Kong scholars produce QLER with the most citations per document (32.87). Factoring in the effects of age on a document, American QLER leads the way, with an AWCR of 2,959.72, more than double the UK (1,167.89), which is itself approximately double the next country, Australia (663.36). This indicates that older US QLER continues to be influential, for example, Goodwin’s (2000) Action and embodiment within situated human interaction (AWCR = 66.17), and that newer studies are also well cited, for example, Gkonou and Miller’s (2021) An exploration of language teacher reflection, emotion labor, and emotional capital (AWCR = 14.33). The accumulated AWCR across nearly all countries increased notably during the timeframe (by a minimum of 45.34%, in Canada). It was researchers situated in China that exhibited the most impressive gains (99.15%), although these were driven not by a handful of standout research studies but by the broader mass of Chinese-authored documents.
In terms of co-authorship patterns (Table 4), while the majority of documents were single-authored (58.04%), the proportion of individually authored QLER has dropped notably over the timeframe (from 74.02% to 50.87%). Offsetting these falls have been increases in both domestic (+11.17%) and international (+11.98%) collaboration, reflecting broader trends towards research cooperation in the social sciences (Amini Farsani et al., 2021; Liu et al., 2015; Zhou et al., 2009), likely in light of increased multidisciplinarity and resource requirements (Bates et al., 2023; Ma et al., 2014). Additionally, the language analysis and the contexts and experiences of language use that interest qualitative researchers are typically complex and multifaceted, leading to a move away from the notion of the lone researcher (Mulvihill & Swaminathan, 2023), centred around individual instinct and feeling one’s way (Milford et al., 2017). The potential of increased collaboration to yield richer insights and more equitable researcher-participant dynamics (Mulvihill & Swaminathan, 2023) is reflected in the increased citability of domestic (+85.05% increase in AWCR) and international (+89.68%) collaborative research. The heightened citation impact of domestic collaborations is indicative of the creation of enduring QR groups within institutions. Similarly, more impactful international collaboration underscores the merits of researchers pooling their expertise and resources to address QLER topics and issues that cut across national boundaries.
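The three authorship patterns discussed above can be sketched as a simple classification over the countries of author affiliation. The rule used here is an assumption made for illustration (one author is single-authored; several authors in one country is domestic collaboration; authors in two or more countries is international collaboration), and the function names and records are hypothetical. The final lines check that the reported fall in single authorship equals the combined rise in the two forms of collaboration.

```python
from collections import Counter


def authorship_pattern(author_countries: list[str]) -> str:
    """Classify one document from the countries of its authors' affiliations.

    Assumed rule: one author -> 'single'; several authors in one country
    -> 'domestic'; authors in two or more countries -> 'international'.
    """
    if len(author_countries) == 1:
        return "single"
    if len(set(author_countries)) == 1:
        return "domestic"
    return "international"


def pattern_shares(documents: list[list[str]]) -> dict[str, float]:
    """Percentage of documents falling under each authorship pattern."""
    tally = Counter(authorship_pattern(doc) for doc in documents)
    return {name: round(100 * n / len(documents), 2) for name, n in tally.items()}


# Hypothetical records: each inner list holds one country per author.
docs = [["US"], ["UK"], ["China", "China"], ["US", "Australia"]]
print(pattern_shares(docs))

# Consistency check on the figures reported in the text.
drop_single = round(74.02 - 50.87, 2)
rise_collab = round(11.17 + 11.98, 2)
print(drop_single, rise_collab)
```

Authors with several affiliations, or affiliations that change between submission and publication, would need an explicit rule that this sketch does not attempt.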
Table 4. Trends in productivity and citation impact by patterns of (co-)authorship, by accumulated age-weighted citation rate (AWCR).
Tables 5 and 6 outline the authors and their affiliated institutions that have generated the most impactful QLER in terms of citation. Despite the preponderance and citation impact of US-affiliated authors, American institutions did not dominate the list of top 10 institutions. Pennsylvania State University (48 documents/AWCR = 222.20) constituted both the most productive and the most impactful institution globally for QLER, thanks primarily to the contributions of S. Canagarajah (6/73.28), A. Pavlenko (7/49.66), J.K. Hall (4/27.91), and P. Golombek (3/18.81). Other US institutions, notably the Universities of New York (39/88.54), Pennsylvania (29/127.58), and Illinois Urbana-Champaign (27/88.98), constituted prominent QLER centres. Institutions in other Anglophone countries also productively published QR. These include Australia, notably through the University of Melbourne (44/123.98), helped along by the works of N. Storch (5/35.06), G. Wigglesworth (3/22.36), and J. Morton (4/16.55), and New Zealand (University of Auckland, 38/101.12). This suggests that highly cited QLER is distributed across a broad range of US institutions, but more concentrated among a smaller number of more specialist centres in Australia, Canada, and the United Kingdom. This appeared to be especially the case for the UK, which despite being the second most productive country was home to only one of the 20 most productive centres (the federated University of London).
Table 5. Most productive authors and their citation impact.
Note. AWCR = age-weighted citation rate.
Table 6. Productivity and citation impact across author-affiliated institutions.
Note. AWCR = age-weighted citation rate.
4 Research approaches
Table 7 outlines the most prevalent research approaches identified from document abstracts. It is noticeable that there was a slightly higher frequency of methodological approaches found (n = 3,847) than there were studies (n = 3,758), due to 809 papers featuring a combination of approaches (e.g. longitudinal case study). At the same time no methodological approach was identified in 881 studies. Far fewer documents explicitly stated the method(s) used (n = 1,401), of which 466 featured more than one method. Manual examination of the abstracts of studies which outlined no recognized approach or method (also 466) yielded a diffuse array of terms to denote a qualitative approach, notably combinations with qualitative and interpretive, and uses of reporting verbs (e.g. explores, examines, adopts, etc.), sometimes together (e.g. ‘Applying an intersectionality lens that foregrounds political and structural critique, we conducted an interpretive policy analysis . . .’). This finding indicates that authors of QLER value creativity, conceptual flexibility, and freedom of spirit in characterizing their research (Davids et al., 2025), although this could impair retrievability, particularly if only methodological search terms are employed.
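The keyword-based identification of approaches described above, and the reason approach mentions can outnumber documents, can be sketched as follows. The patterns and abstracts are hypothetical and far smaller than any realistic search list; the study's actual search terms are not reproduced in this section.

```python
import re

# Hypothetical keyword patterns, for illustration only.
APPROACH_PATTERNS = {
    "case study": r"\bcase stud(?:y|ies)\b",
    "ethnography": r"\bethnograph(?:y|ic|ies)\b",
    "longitudinal": r"\blongitudinal\b",
    "narrative": r"\bnarratives?\b",
}


def tag_approaches(abstract: str) -> list[str]:
    """Return every approach label whose pattern occurs in the abstract."""
    text = abstract.lower()
    return [label for label, pattern in APPROACH_PATTERNS.items()
            if re.search(pattern, text)]


# Hypothetical abstracts, not taken from the dataset.
abstracts = [
    "This longitudinal case study follows two teachers over three years.",
    "An ethnographic case study of a bilingual classroom.",
    "We explore how learners interpret feedback through an interpretive lens.",
]

tags = [tag_approaches(a) for a in abstracts]
mentions = sum(len(t) for t in tags)      # total approach mentions
combined = sum(len(t) > 1 for t in tags)  # documents combining approaches
untagged = sum(len(t) == 0 for t in tags) # documents with no recognized label
print(tags, mentions, combined, untagged)
```

In this toy example, four approach mentions arise from three abstracts because two of them combine approaches, while the third, which signals a qualitative stance only through terms such as interpretive, receives no label at all.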
Table 7. Changes in the prevalence of qualitative research approaches and methods.
As can be seen, qualitative case study (n = 958) and ethnographic methods (n = 623), suitable for investigations into sociocultural and ecological contexts of teaching and learning (Harklau, 2011), constituted prevalent methodological approaches. As with Amini Farsani et al. (2021), these results underscore the versatility of qualitative case studies, which allow language education researchers to explore a range of complex, uniquely human phenomena (often the processes of language learning and teaching [Lew et al., 2018]), through close and extended analysis (Hood, 2009). Consistent with the findings of Benson et al. (2009) and the observations of Lazaraton (2003) around two decades ago is the prevalence of ethnography, which offers rich, contextually-based insights into group cultures (Heigham & Sakui, 2009). Since language education researchers are often interested in issues of language teaching, learning, and acquisition in the home and the classroom, it is of little surprise that ethnography is a prevalent approach, given that it enables researchers to create a portrait of how people within a context live (Heigham & Sakui, 2009). Conversation and interaction analysis (n = 646) was also shown to be a prominent approach, of which 71 documents also highlighted corpus methods (out of a wider 362 studies), indicating the value of corpora of spoken and written texts to qualitative as well as quantitative analysis.
Trends across the timeframe paint a complex picture, which may say more about how authors utilize language resources to characterize QLER than changes in approach. As such, we invite the reader to interpret these results cautiously. For instance, we note a stagnation in ethnographic approaches (+0.21) which was accompanied by a rise in observation methods (+3.00). Likewise, reductions were exhibited in approaches that commonly feature interviewing, namely qualitative case studies (−5.90) and, to a lesser extent, action research (−1.37), whilst references to qualitative interviewing as a whole rose dramatically (+15.48), cementing its status as a core method in QLER (Lew et al., 2018; Richards, 2009). Some of these rises can be accounted for by increases in multimodal analysis (+3.99) and narrative approaches (+3.18). Nevertheless, it seems that many authors are increasingly emphasizing clarity at the method level, whilst not explicitly aligning their study to a particular approach (Benson et al., 2009; Lew et al., 2018), perhaps for pragmatic reasons (Richards, 2009). It may also be the case that authors increasingly perceive stating the methodological approach as unnecessary in the abstract. We believe this to be the case with analytical procedures, which were infrequently explicated, an observation noted across entire QR papers by X. Gao (2017). Additionally, while we attempted to provide an inclusive range of keywords, it may be that the nuanced or even opaque ways in which authors characterize their approach (e.g. ‘This article adopts a poststructuralist lens to reconceptualize native and nonnative speakers as complex, negotiated social subjectivities . . .’) impacted on frequency counts, suggesting the value of manual analysis in future studies.
A further salient finding is the low prevalence of methodological approaches that are well established in educational research outside of language education, for example, phenomenology (n = 28), grounded theory (n = 45), ethnomethodology (n = 68), and longitudinal research (n = 158). Phenomenology and grounded theory appear to be less commonly used by novice researchers, which may partly explain their low prevalence. A lack of coverage in QR handbooks leading to such approaches being less well-known may also be partly responsible (Richards, 2009). Another cause, specific to phenomenology, could be the need for bracketing, which alongside being a complex and demanding process, conflicts with the notion of acknowledging or even celebrating the significance of researchers’ experiences and conceptions as a tool to facilitate interpretation (Hammersley, 2013). It is likely that researchers do indeed explore individuals’ lived experiences, albeit through other methodological approaches and methods that are not conceived of as phenomenological. As applied linguistics has grown into a mature disciplinary area (Phakiti et al., 2018), it is also the case that many issues feature well-developed theories (e.g. acquisition, L2 motivational self system), necessitating an abductive form of analysis. Alternatively, researchers are able to draw on relevant frameworks beyond language education (e.g. student engagement, performance anxiety) or, under pressure to publish, may view grounded theory as too time-consuming or complex.
5 Research topics
The overall prevalence and patterns of change among the 20 topic themes are shown in Table 8. QLER approaches are typically underpinned by an interest in how participants experience and interact with particular phenomena at a given point in time in natural settings. It is not surprising that topics which seem intuitively at the forefront of language education, especially those contributing understandings to the intersection of teaching and learning (see also Dooley, 2020), including classroom approaches and practices (n = 1,565), linguistic approaches and features (n = 1,305), communication and interaction (n = 1,185), language learner characteristics (n = 1,081), and teachers and teacher development (n = 952), prevailed as phenomena that have profoundly engaged researchers adopting qualitative approaches. In contrast, critical approaches to language education (n = 38) appeared seldom, perhaps because such approaches, which naturally interrogate power dynamics and hegemonic ideological constructs, may be seen as too subjective or politically charged by institutions or research funders. The low prevalence of QLER approaches focusing on literacy is noteworthy, and may be explained by the importance attributed in such research to standardized measures of reading and writing ability/outcomes, which are usually quantified. Language policy and planning (n = 262) was also an uncommon topic theme, indicating future opportunities for authors to further explore how policies in context shape curricula, teaching and learning, access to education, and the preservation or erosion of an individual or group’s linguistic and cultural identity.
Table 8. Frequencies and distributions of topic themes over the timeframe.
All but one of the topic themes exhibited normalized increases in frequency over the timeframe. Tertiary-level education (+7.67) demonstrated the most sizeable gains. When contrasted with the stagnation in academic language and purposes (+0.68), it can be seen that authors of QLER evince a much broader array of interests in the academy, captured in such papers as Brown’s (2017) Understanding the NS/NNS division of labor in the creation and assessment of a Japanese university English entrance exam and Cui and De Costa’s (2024) Becoming Uyghur elites: How Uyghur women in a mainstream Chinese university negotiate their gendered identities. At the same time, school-level language education (−1.06) was the only topic theme to demonstrate a decrease in interest, although this was not substantial. This could be because of numerous factors, including the contrasting availability of participants, an increase in the expectations of ethical rigor (particularly when recruiting children as participants), and greater collaboration mechanisms between faculty and students in the academy.
The rise in teachers and teacher development (+7.56) testifies to the growing recognition of qualitative approaches to explore and explain what happens in language teaching and learning contexts, particularly the classroom (Richards, 2009), by querying teachers’ beliefs, attitudes, and experiences and investigating their practices (and possibly synthesizing self-reported and observed forms of data). Likewise, the increase in technology and language learning (+7.47) mirrors rises in the output of technology-focused journals, including Computer Assisted Language Learning and System. This notable trend is driven not only by the increasing ways in which technology has been used to support teaching and learning over the last 30 years, but also by learners using their own powerful, personal technologies to work independently, outside the classroom and without the teacher, which makes it important for researchers to go to their participants, rather than extricate themselves from them (Levy, 2015).
6 Cited sources
Table 9 shows normalized frequency counts for the 15 most-cited sources among the documents that comprised the dataset. In spite of the dataset being underpinned by QR, only one source among the 15 most commonly cited (n = 141), Denzin and Lincoln’s (1994) edited Handbook of qualitative research, addressed QR as a broad research tradition (albeit one comprising an array of topics). The other commonly cited documents either zeroed in on a particular QR approach (notably, conversation analysis, discourse analysis, ethnomethodology, case study) or did not focus explicitly on QR. This latter group mostly comprised texts adopting sociocultural perspectives on teaching and learning (especially the writings of L.S. Vygotsky) and philosophical explorations of the role and purpose of language. Given the prevalence of conversation and interaction analysis, it is not surprising that half of the 12 most-cited sources overall (n = 307) focused on the theoretical and methodological foundations of the approach. In the initial period, the data suggested a reliance upon Atkinson and Heritage’s (1984) highly influential edited volume Structures of social action: Studies in conversation analysis. While persistently popular across the timeframes, the work has been supplemented by several publications that have become prominent sources in recent years. Most notably, this includes Sidnell and Stivers’ (2012) volume The handbook of conversation analysis, which, with its 36 chapters of theoretical and descriptive research, is by far the most cited work across the documents. Interestingly, few well-cited sources synthesize language and education (e.g. Encyclopedia of language and education) or language education and QR (e.g. Richards’ [2003] Qualitative inquiry in TESOL, which was cited 32 times).
Table 9. Top 15 most highly cited sources with normalized frequency counts by time period.
Note. To save space, compressed citation information is given.
Eight of the 15 most cited sources retained their prominent position across the timeframe, indicating enduring influence in QLER. These titles appear highly cited for the foundational theories, models, concepts, and/or ideas which they contribute, particularly in conversation analysis and sociocultural theory. While this may indicate a predilection towards older sources in QLER, it is also the case that many of the questions and issues raised in these works remain relevant today, for example how language educators (should) support language learning through scaffolding and mediation. However, five of these exhibited normalized falls by 2014–21, indicating authors have shifted their attention to a greater diversity of (likely more recent) sources as the field matures (note the lower normalized frequency counts for documents cited in 2014–21). An additional trend is a move away from influential single-authored texts (not explicitly focused on QR), many of which date back long before the initial timeframe of this study (e.g. Fairclough’s [1989] Language and power), to multi-authored volumes that tend to be more QR focused, including a QLER work (e.g. Garcia’s [2009] Bilingual education in the 21st century: A global perspective).
IV Conclusions
The results of this bibliometric analysis of 3,758 documents show that since the turn of the millennium, there has been a steady increase in QLER, turning into an explosion in 2020–21. Accompanying this is growth in citation impact as calculated by both raw and age-weighted citation measures. The analysis of topic themes showed research has primarily focused on classroom teaching and learning, linguistic features, communication, learner characteristics, and teacher development, with critical approaches, literacy, and language policy being less explored. Over the timeframe, most topic themes exhibited increases in prevalence – especially tertiary education, teacher development, and technology-mediated learning – while school-level language education has slightly declined, reflecting technological developments, shifts in participant accessibility, and new methodological opportunities. Future researchers may seek to adopt a comparative approach to identify how trends in productivity and citation impact compare with quantitative and mixed methods research, and to identify the extent to which qualitative approaches in language education are making inroads into a field where quantitative methods are well-established (Amini Farsani et al., 2021; X. Gao, 2017).
We uncovered a number of notable trends in prevalent, falling/rising, and stagnating approaches within QLER over the timeframe. Case study, conversation and interaction analysis, and ethnography were shown to be well-established for investigating language education issues qualitatively. Yet, many studies did not explicitly align with a given approach or state the methods used (or at least did not employ keywords that allowed us to automatically establish these). The uncovered increases were not distributed evenly across the 34 publications investigated. Instead, certain journals appear more predisposed towards QLER, often because the topics within the scope of such journals align with QR approaches (e.g. Journal of Pragmatics and Journal of Language, Identity and Education) or, in the case of journals with a broader readership, because they stress their openness to incorporating QLER (e.g. International Journal of Bilingual Education and Bilingualism and TESOL Quarterly).
Patterns of author collaboration changed significantly over the timeframe, with the tendency for single-authored articles by researchers working in institutions in Anglophone countries being supplanted by research teams composed of members from various institutions within one country and abroad. Such changes have ramifications for how QLER is done, suggesting a move away from the individual researcher acting as the sole instrument of data collection to a more collaborative process that is better able to generate nuanced understandings and creative solutions, and in which team members may need to reconcile competing interpretations (Mulvihill & Swaminathan, 2023). Increases in collaborative approaches also reflect the need to account for the complexity that characterizes the phenomena language education researchers are interested in, and are also the result of institutional decision-making that favours collaboration in deciding what to research, by whom, and when/where.
1 Implications
First, authors can capitalize on the momentum stemming from the recent surge in QLER by identifying future research possibilities within established trends (e.g. teacher development) or underexplored areas (e.g. critical approaches) to maximize its relevance and influence. Secondly, given that both output and citation impact were far from evenly distributed across journals, authors should strategically select the publication to which they submit their manuscript. Targeting the journals found in the study to be receptive to qualitative methods could increase the visibility, readership, and citation impact of their research. Thirdly, while we appreciate the paradigmatic importance of valuing authors’ creativity and freedom of spirit (Davids et al., 2025), an implication for authors of QLER is that retrievability can be improved by embedding established labels in document abstracts (and author-provided keywords) to characterize the approach at different levels of abstraction (i.e. methodology, method, and data analysis), particularly since the research approach itself may constitute a/the reason for retrieving a study (Jamali, 2018). Finally, as the study revealed a marked shift from single to multi-authored QLER, institutions should further support collaborative research endeavours (see Milford et al., 2017) through facilitating networks, recognizing collaborative outputs, and allocating funding and resources.
2 Limitations
We are cognizant of the limitations associated with bibliometric analysis and that the approach provides a quantitative way of viewing qualitative research. We are also well-aware that our particular design decisions influenced the findings (e.g. the choice of Scopus, the journals incorporated, the keywords used to identify qualitative research), and that a study adopting alternative parameters could present a different picture of QLER. An important limitation relates to the process of delimiting QR studies through the use of keywords (e.g. qualitative, narrative, interview). These are, naturally, a blunt instrument, resulting in both irrelevant hits (e.g. narrative as a genre of writing) and potentially excluded relevant ones (although we cannot be sure here). We also accept that the overlapping coding of topic keywords is unconventional because we were unable to follow Patton’s (2002) principle of the external heterogeneity of themes. Furthermore, we acknowledge that, as with other bibliometric analyses (e.g. Hyland & Jiang, 2021; Lei & Liu, 2019; Pearson, 2022), the uncovered patterns of research approaches and topic themes are based on prevalence in research abstracts. Clearly, researchers cannot provide a comprehensive outline of their study when limited to 200–250 words, which may mean the actual prevalence of approaches and topics as keywords may be higher. This is accentuated with research approaches, given that authors may devote little material to explicating these in the abstract.
Supplemental Material
Supplemental material, sj-docx-1-ltr-10.1177_13621688251388282 for Qualitative language education research in the past quarter century: A bibliometric analysis by William S. Pearson and Seyyed-Abdolhamid Mirhosseini in Language Teaching Research