Abstract

This study provides a bibliometric analysis of International English Language Testing System (IELTS) research from 1989 to 2024, incorporating 641 research documents. Among these, 482 were obtained from five online indices (Web of Science, Scopus, ERIC, EBSCOhost, Google Scholar) and 159 were identified manually (including 146 studies sponsored by the IELTS co-owners). The analysis focuses on patterns and trends in research topics, methodological approaches, publications and disciplinary areas, author-affiliated institutions and countries, and co-authorship. The results show that topics which were well-covered by researchers extended beyond psychometrics and validation of the testing system to include consequential factors, such as teaching and learning, academic/professional contexts, test preparation, and non-linguistic constructs (e.g., stakeholders’ attitudes, beliefs). Quantitative studies constituted the most common approach but were exceeded by (often well-cited) mixed-methods inquiries sponsored by the co-owners. Research authorship diversified from Anglophone to English as a Foreign Language (EFL)/English as a Second Language (ESL) contexts, reflecting a broader view of IELTS’s validity, warranting further exploration. Implications of the findings are discussed both for IELTS and language assessment research more generally.

Keywords

Bibliometric analysis, bibliometric study, IELTS, International English Language Testing System, meta-research

Introduction

IELTS (International English Language Testing System) is an on-demand, high-stakes English language proficiency test developed and co-owned by Cambridge English, the British Council, and International Development Programme Australia. It provides a measure of a candidate’s speaking, listening, reading, and writing skills (along with a composite overall score) across a nine-band scale from band 1.0 (“non-user”) to 9.0 (“expert user”) that is widely recognised around the world (Merrifield & GBM & Associates, 2012; Taylor, 2009). The most important use of the test is its predictive capacity to facilitate test-users’ (notably higher education institutions, professional organisations, and government entities) assessment of applicants’ linguistic readiness for English-medium academic purposes and professional registration (Merrifield & GBM & Associates, 2012). In the United Kingdom, IELTS formally holds the status of SELT (Secure English Language Test), affording it the privilege of being accepted as visa evidence for workers, entrepreneurs, and students on sub-degree level programmes. Minimum linguistic requirements, as measured in IELTS band scores, are usually established by test-users in light of guidance from the IELTS co-owners (MacDonald, 2019). IELTS is not a pass/fail test per se. Rather, candidates may be perceived to have “failed,” if the overall or sub-scores they achieve do not meet the requirements of their chosen test user, often resulting in repeated test taking (Barkaoui, 2016; Estaji & Banitalebi, 2023) or an Enquiry on Result (the process by which a candidate may challenge their score).

IELTS has adapted to developments in language education research and international student recruitment, leading to its global candidature reaching more than 4 million per year in 2023 (IELTS, 2024a). The testing system has seen a number of major changes, including the 1995 replacement of disciplinary-specific areas with Academic and General Training Modules (Davies, 2008), the 2001 revision to speaking test format and scale (Brown, 2006), the 2008 expansion of the pronunciation scale (Isaacs et al., 2015), and the 2017 introduction of computer-based testing. While test-taker numbers were badly hit by the imposed hiatus caused by the COVID-19 global pandemic (Clark et al., 2021), they have since recovered, thanks in part to the co-owners’ initiatives to respond to changing market conditions through the IELTS Indicator Test (Isbell & Kremmel, 2020) and One Skill Retake (in selected countries). The latter is a relatively new process whereby a candidate can retake one of the four test components if there is a band score shortfall relative to their needs, affording greater flexibility and peace of mind to test-takers (IELTS, 2024b).

The IELTS co-owners have long emphasised the test’s research-based foundation to enhance its credibility with stakeholders, notably test-users. They have been keen to position research into IELTS as important for ensuring “an ongoing relationship with the broader linguistics and language testing community” (IELTS, 2019, p. 11), “continuous improvement of the test” (IELTS, 2019, p. 11), and “an up-to-date testing system” (IELTS, 2002, p. 24). An important way these commitments are realised is through what is termed Internal research (IELTS, 2019), that is, studies undertaken internally by IELTS’ Research and Validation teams that “bring together specialists in testing and assessment . . . and provides rigorous quality assurance for the IELTS test at every stage of development” (p. 10). This is a notable research strand as investigators may be able to access important internal datasets (e.g., samples of candidates’ speaking and writing responses, test-takers’ background characteristics). Owing to the sensitive nature of some data, not all such outputs can be conveyed to an external audience. Those that can are published in Cambridge Assessment English and Cambridge University Press’ Studies in Language Testing and Research Notes series, respectively, full volumes of which have recently been uploaded as open access documents.

Also at the forefront of IELTS’ commitment to research is test validation through External research (IELTS, 2019). As of 2025, this constitutes over 140 IELTS Research Reports (IRRs) undertaken by more than 350 researchers from a cross-section of countries globally and funded by the co-owners (with up to £45,000 available per successful proposal) (IELTS, 2025). Projects are typically co-researched in small teams and run for 1–2 years, resulting in a written report of no more than 20,000 words (significantly longer and more detailed than the outputs of articles published in most academic journals). Reports are provided open access on the IELTS website but are not currently listed on indices such as the Web of Science (WoS) or Scopus (unlike the ETS Research Report Series). As such, IRRs may not contribute to metrics used to measure authors’ and institutions’ productivity and impact, which may discourage some from undertaking research into IELTS via the IRR series or, as in some cases (e.g., Hyatt, 2013), cause them to repurpose their report into a journal article.

Meta-research

In recent years, there has been an increasing trend towards meta-research in applied linguistics (AL), manifested in the creation of new periodicals (e.g., Research Methods in Applied Linguistics, Research Synthesis in Applied Linguistics), special interest groups (e.g., BAAL Research Synthesis in Applied Linguistics), and an increasing number of empirical studies concerned with the study of research itself. The aim of many such papers is for research stakeholders, particularly readers and authors, “to understand and improve how we perform, communicate, verify, evaluate, and reward research” (Ioannidis, 2018, p. 1). Meta-research encompasses the synthesis of existing studies with a view to suggesting enhancements to how research is undertaken and reported, perhaps through the examination of research designs and methods, publication and peer review practices, and scientific standards (Ioannidis, 2018). Well-established meta-research publishing formats in the discipline, such as systematic and state-of-the-art reviews and meta-analyses, have been complemented by an array of newer types, including narrative reviews and qualitative research syntheses (Chong & Plonsky, 2023).

The meta-research turn is especially salient in language assessment, given the volume and diversity of studies (Bachman, 2000; Dong et al., 2022; Yang & Wang, 2025). The sheer amount of research, illustrated by Aryadoust et al.’s (2020) dataset comprising 4736 articles (now 6 years old), makes it increasingly difficult to keep pace with global trends across the literature. This has led scholars to conclude that further synthesis is needed to contextualise existing research, identify enduring concerns and emerging priorities, and highlight areas requiring additional scrutiny (Yang & Wang, 2025). Such synthesis has taken a range of forms, including systematic reviews (e.g., Chen et al., 2024), meta-analyses (e.g., Gagen & Faez, 2024), and scoping reviews (e.g., He et al., 2025). For high-stakes language tests, meta-research is crucial because it systematically synthesises existing evidence to evaluate test validity, fairness, and impact, thereby guiding policy decisions that affect large numbers of test-takers. While many studies have been undertaken into IELTS over its 36-year existence, including over 140 empirical studies funded by the IELTS co-owners and published as IRRs, to our knowledge, aside from two recent meta-analyses of IELTS predictive validity (Gagen & Faez, 2024; Ihlenfeldt & Rios, 2022), there have been few attempts to synthesise this research.

Bibliometric approaches

One approach to meta-research that has been increasingly taken up within applied linguistics is bibliometric analysis or bibliometrics (e.g., Aryadoust et al., 2020; Crosthwaite et al., 2022; Hyland & Jiang, 2021; Jing et al., 2024; Lei & Liu, 2019; Li, 2022; Pearson, 2024). This refers to “the application of mathematics and statistical methods” to analyse scientific publications (Pritchard, 1969, p. 348). Bibliometric researchers generate quantitative insights into publication trends, often examining the raw frequencies and distributions of research topics, research approaches, co-authorship, document citations, journals, and author-affiliated institutions by assembling and analysing datasets comprising bibliometric records (e.g., author names, document titles, abstracts, and keywords) from research databases, such as Web of Science (WoS), Scopus, and Google Scholar (GS) (Lei & Liu, 2019). The individual study constitutes the unit of analysis (Chong & Plonsky, 2023), with researcher(s) usually seeking to highlight patterns across large datasets to characterise the domain under investigation.

Of particular value in our view are bibliometric studies, in which authors guide the reader through the evolution of the domain, identifying salient/rising/falling topics of inquiry, research methodologies, the locations of research activity, and patterns of collaboration. Such information allows readers to gain a holistic view of the domain, feeding through to more informed agenda setting or decision-making about what is studied, how, and by whom (Barrot, 2024). At a higher level, the information generated may also help stakeholders understand and improve how authors communicate, verify, evaluate, and reward research (Ioannidis, 2018). This seems particularly relevant in the case of IELTS, where much research has been carried out over 30 years and where, to the best of our knowledge, no existing paper has attempted to synthesise the comprehensive body of research. Investigating the distribution of research topics is vital to ensure that any validity arguments are applied equitably across all test components. For example, if certain modules receive disproportionately less attention, the empirical basis for overall test scores becomes uneven. Likewise, if the field relies predominantly on, for example, statistical modelling while marginalising, say, qualitative approaches, research captures only a partial view of the test. As such, bibliometric research may act as a form of oversight, providing a robust, up-to-date evidence base that could lead to refinements in test constructs.

Despite being a novel form of inquiry in AL (Chong & Plonsky, 2023), bibliometric analyses have already addressed a range of subject areas within the discipline, including automated writing evaluation (Barrot, 2024), English for academic purposes (Hyland & Jiang, 2021), second language acquisition (Zhang, 2019), technology-supported learning environments (Jing et al., 2024), and written corrective feedback (Crosthwaite et al., 2022), to name but a few. A common finding across these studies is the (often burgeoning) growth in scholarship and widening of research participation (especially internationally) over the last decade (Barrot, 2024; Crosthwaite et al., 2022; Hyland & Jiang, 2021). Sometimes, individual authors most influential within a domain are highlighted (e.g., Crosthwaite et al., 2022; Lei & Liu, 2019). Such information can also be aggregated to the level of author-affiliated institution (e.g., Dong et al., 2022; Yan & Zhang, 2023) or country (e.g., Barrot, 2024; Hyland & Jiang, 2021), helping readers identify teams or departments in particular contexts that focus on specific research topics or adopt particular patterns of co-authorship (e.g., Amini Farsani et al., 2021). This is salient in the context of IELTS, where the co-owners implicitly encourage collaboration through the IELTS joint-funded research programme that emphasises the interdisciplinary nature of research (IELTS, 2025). Studies that document evolving disciplinary priorities through analyses of research topics reveal factors relevant to research into IELTS, including the impacts of technological developments (Crosthwaite et al., 2022), internationalisation (Hyland & Jiang, 2021), and growing methodological maturity (Barrot, 2024) on AL scholarship.

Of particular relevance to the present study are existing bibliometric studies that address language testing and assessment (see Aryadoust et al., 2020; Dong et al., 2022; Yang & Wang, 2025). Aryadoust et al.’s (2020) analysis of a dataset of 4736 documents focusing on measurement and validity published between 1918 and 2019 showed that language testers have long been pre-occupied with the measurement of reading, writing (also found by Yang & Wang, 2025), and oral production skills, potentially neglecting pertinent constructs such as language knowledge, feedback, and washback. That 67.03% of the dataset comprised documents from non-specialist journals indicates that much language assessment research is distributed across a wide array of publications. This is likely the case for IELTS, given the test’s impact (Hamid & Hoang, 2018; Pearson, 2019). In contrast, Dong et al. (2022) zeroed in on the journal Language Testing to analyse patterns across 759 documents published between 1984 and 2020. They found that authors typically gave greater attention to large-scale international tests (including IELTS), often addressing the staple topics of validity, reliability, and test development, albeit the methods researchers use have evolved within that timeframe (e.g., by incorporating eye-tracking technology). The enduring and significant role of IELTS in the international English language testing landscape and the large amount of research that has been generated about, on, and around the test, provide a convincing argument for a research synthesis in the form of a bibliometric analysis.

Research aims

This study reports on a bibliometric analysis of 641 research documents published across the existing lifespan of IELTS (up until 2024), retrieved from a combination of online indices, official databases of co-owner published research, and manual searching. The study examines the prevalence of particular research topics and approaches, as well as prevailing and impactful publications, affiliated author institutions, the countries of those institutions, and patterns of co-authorship. Guiding the design of the study are the following research questions (RQs):

RQ1. What topics in IELTS research have been most frequently explored?

RQ2. What are the most prevalent research approaches used in IELTS research and how have they changed over time?

RQ3. Which publications, authors, institutions, and countries have been the most productive and impactful (in terms of citations)? What is the prevalence of co-authorship?

Method

Data collection and retrieval

In this study, we sought to retrieve a complete body of IELTS research in the fields of social sciences and arts and humanities. As with other recent bibliometric analyses (e.g., Jing et al., 2024; Shi & Aryadoust, 2024), a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram (Page et al., 2021) was employed to improve the robustness of data identification, retrieval, and screening (see Figure 1). We began the process of identifying relevant bibliometric records using the keywords IELTS or International English Language Testing System in any bibliometric field (e.g., title, keywords, abstract) across five popular online indices of research: the Web of Science (WoS), Scopus, ERIC, EBSCOhost, and Google Scholar (GS). This yielded 2194 bibliometric records which, after duplicates were removed (n = 943), were subject to screening to determine their relevance. Data were extracted on April 15, 2025.

Figure 1.

Processes for identifying, screening, and including documents.

Complementing these data were records manually identified and retrieved from other sources, the most notable of which were 131 IRRs, published on the IELTS website. We manually created bibliometric records using data (e.g., author names, institutions, countries, abstracts) by downloading and extracting the contents of the research reports themselves. These were complemented by a further 28 studies from publications not listed on the research indices, notably Studies in Language Testing (SiLT) and Studies in Language Assessment (SiLA, formerly Melbourne Papers in Language Testing). For the purposes of the analysis, following IELTS (2019), we grouped IRRs and SiLT papers together as sponsored research (n = 146) and differentiated these from external research (n = 495), which denotes studies not published by one of the co-owners (e.g., Cambridge English) or collectively under the IELTS banner. The rationale for this delineation is that the former entails co-owner provision of funding, endorsement, and potentially access to valuable internal datasets, whereas external research is more likely to address IELTS from the outside. This relates to the fundamental bibliometric questions of what is researched, how, and by whom. We included 14 journal articles that seemed to provide a more condensed version of sponsored research as external studies (e.g., Hyatt, 2013; Moore & Morton, 2005), although we did not include Cambridge English Research Notes, since material from the reports could not be copied to create bibliometric records.

Data screening and eligibility

Inclusion/exclusion criteria were applied to the documents retrieved from the research indices (mainly their title, abstracts, and author-provided keywords, as it was not feasible to review complete research reports). We set the timeframe of included studies to commence from 1989, corresponding with the inception of the IELTS test (although the first document in our sample was from 1990), and to end in 2024. We limited included documents to those classified as journal articles (including “early access” on the WoS and Scopus), books, book chapters, proceeding papers, and review articles, and excluded formats that do not typically constitute research (e.g., book reviews, letters to the editor, n = 142). Articles written in languages other than English (n = 6) were not removed, as each featured an English language abstract. An absence of metadata (usually because an article text could not be retrieved) meant that bibliometric records for 33 studies were largely incomplete and, hence, removed. As the focus was on characterising a comprehensive dataset of IELTS research and avoiding publication bias, no individual publication or study was screened out based on quality measures.

Examination of document titles, abstracts, and keywords resulted in the inductive creation of four further categories for excluding irrelevant studies. We omitted 290 reports which did not focus on IELTS (often because they contained a passing reference to the test in the abstract). We also excluded 164 studies where a simulated IELTS test was administered as a pre-/post-test for some pedagogical intervention that had little to no relevance to the test itself. Likewise, we removed 47 papers that referenced IELTS band scores as markers of research participants’ proficiency without focusing on the test itself. However, studies aiming to establish relationships between IELTS band scores and other variables where the focus was on the nature or suitability of the former were included. Finally, we removed a further 25 studies where the authors had utilised the IELTS Speaking or Writing band descriptors as a means of assessing learner speaking/writing without a specific focus on the test. Since other disciplinary areas also use the abbreviation IELTS, we additionally removed 67 wholly irrelevant results. Ultimately, bibliometric records for 482 indexed IELTS research documents were included, supplemented by 159 records manually identified from other sources, totalling 641 documents. See Supplementary Appendix A on the Open Science Framework (OSF) for the list of records alongside all other supplementary materials (Pearson & Zou, 2026). We did not scrutinise the methodological quality of the included publications because our goal was to map the overall body of literature, rather than to synthesise findings or assess the rigour of individual research papers, in line with existing applied linguistics bibliometric research.

Screening of document eligibility was undertaken independently by both researchers, each working on half of the bibliometric records. Each author then checked 35 records screened by the other, resulting in an inter-rater agreement figure of 95%, which was deemed acceptable. Most discrepancies (e.g., Tavakoli et al., 2014) related to whether a study was sufficiently focused on IELTS to be included, with uncertainties resolved via discussion between the authors.
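The agreement figure reported for the double-checked records can be read as simple percent agreement, that is, the share of records on which both screeners reached the same include/exclude decision. The source does not name the statistic used, so this reading is an assumption. The following minimal sketch illustrates the calculation with invented decisions; the function name and the data are hypothetical and are not the study's records.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of records on which two screeners made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must judge the same non-empty set of records")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical decisions for 20 double-checked records (not the study's data).
screener_1 = ["include"] * 12 + ["exclude"] * 8
screener_2 = ["include"] * 11 + ["exclude"] * 9  # differs on exactly one record

agreement = percent_agreement(screener_1, screener_2)
print(f"Agreement: {agreement:.0f}%")
```

Percent agreement does not correct for chance agreement; a chance-corrected coefficient would need the full table of decisions, which the source does not report.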

Data analysis

In line with prior studies (e.g., Crosthwaite et al., 2022; Lei & Liu, 2019; Zhang, 2019), this bibliometric analysis investigates salient patterns and trends concerning the output and impact of publications, author-affiliated institutions, and the countries in which such institutions are located. In addition, it identifies prevalent research topics and approaches, and how these have changed over the timeframe. We present raw frequencies and proportional figures for research output, along with normalised frequency counts to account for changes in the prevalence of research topics and approaches within IELTS across three timespans, 1989–2009, 2010–2019, and 2020–2024. These are not equal in duration, since the number of research documents was not evenly distributed.
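Because the three timespans differ in length and in the number of documents they contain, raw topic counts are not directly comparable across them. The sketch below shows one common way of normalising such counts, expressing them per 100 documents published in the timespan. The normalisation base is an assumption, since the source does not state it, and all counts are invented for illustration.

```python
def normed_frequency(topic_count, documents_in_span, per=100):
    """Occurrences of a topic per `per` documents published in a timespan."""
    if documents_in_span <= 0:
        raise ValueError("documents_in_span must be positive")
    return per * topic_count / documents_in_span


# Hypothetical counts for a single topic: (topic occurrences, documents in span).
spans = {
    "1989-2009": (12, 120),
    "2010-2019": (45, 300),
    "2020-2024": (40, 200),
}

normed = {
    span: normed_frequency(hits, docs) for span, (hits, docs) in spans.items()
}
for span, value in normed.items():
    print(f"{span}: {value:.1f} per 100 documents")
```

In this invented example the raw count falls between the second and third timespans (45 to 40) while the normalised figure rises, which is the kind of distortion that normalisation is intended to remove.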

We operationalise the academic impact of research based on individual documents’ citation counts, which we acknowledge is a rudimentary measure. Given that we extracted bibliometric records from an assortment of indices and manually added records from other sources, we employed GS citations, manually downloading counts for each study (valid for May 28, 2025). We present citation impact data in three ways: raw citations, citations normalised relative to the number of documents (to control for output size across factors such as research approaches, publications, and publication types), and as an age-weighted citation rate (AWCR, calculated by dividing the number of citations by the age of the document in years), to account for the effects of time (i.e., a lower AWCR figure indicates that older documents are less impactful). AWCR figures are accumulated by aggregating values across discrete studies. While providing a time-sensitive dimension of citation impact, the AWCR does not equate to a paper’s total influence, given that a few highly recent papers (which may be cited for recency alone or owing to the fast-moving nature of some topics) can outweigh moderately cited older ones.
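The AWCR calculation described above can be sketched as follows. The reference year of 2025 follows the date on which citation counts were downloaded. The treatment of documents less than a year old (a minimum age of one year) is an assumption, as the source does not say how such cases were handled, and the documents listed are invented.

```python
def awcr(citations, publication_year, reference_year=2025):
    """Age-weighted citation rate: citations divided by document age in years."""
    # Assumption: documents are treated as at least one year old,
    # which avoids division by zero for very recent publications.
    age = max(reference_year - publication_year, 1)
    return citations / age


def accumulated_awcr(documents, reference_year=2025):
    """Sum of per-document AWCR values for a group of documents."""
    return sum(awcr(c, year, reference_year) for c, year in documents)


# Hypothetical (citations, publication year) pairs, not taken from the dataset.
documents = [(200, 2005), (30, 2022), (12, 2024)]

per_document = [awcr(c, year) for c, year in documents]
total = accumulated_awcr(documents)
print(per_document, total)
```

The invented figures illustrate the caveat noted in the text: a 2024 paper with 12 citations receives a higher AWCR than a 2005 paper with 200 citations.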

We retrieved full texts of all included external research studies (along with IRRs) in order to provide a comparison of study length (in words) between sponsored and external research. Document PDFs were converted into text files, with an overall word count generated using AntConc (Anthony, 2018), which was then divided by the number of documents to obtain an average. The full texts of 37 external studies could not be retrieved (often conference proceedings or older papers) and were discounted from this calculation.

Research topics

We analysed topics within IELTS research cross-sectionally and longitudinally by drawing on both author-provided keywords and keywords from abstracts (Pearson, 2024). We began by extracting all 1117 author keywords from the external studies (and none from sponsored research, which featured no keywords). We excluded 725 keywords that appeared only once or twice, since we did not consider these to be common topics in the dataset. We removed a further 94 for being too general (e.g., strategies), 29 for being methodological (e.g., correlation), nine country names because these were attended to in a separate analysis, five because we did not consider them topics (e.g., boosting), and two where an abbreviation of the concept was provided in parentheses (e.g., artificial intelligence [AI]). This final list of 253 keyword-from-author topics was supplemented with a further 487 candidate keywords from abstracts. To delineate the latter, we applied the topic patterns based on word class combinations (e.g., noun + noun, noun + preposition + noun) generated by Pearson (2024) using AntConc’s n-gram function (Anthony, 2018). Among these keywords, we removed 153 that were not research topics (often chunks that did not stand alone as something meaningful, e.g., based approach), 114 that were duplicated author-provided keywords (e.g., cognitive processes), 38 that were methodological, 30 that appeared too infrequently, three that we felt were too general, and two author countries. After manual screening, we were left with 400 keywords that, in accordance with prior bibliometric studies, we regard as topics.
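The keyword screening reported above can be checked as a simple tally. The sketch below uses only the counts given in the text; the category labels are shortened paraphrases, and the figure of 147 retained abstract keywords is implied by the reported numbers rather than stated in the source.

```python
# Author-provided keywords extracted from external studies, and removals.
author_keywords = 1117
author_removed = {
    "appeared only once or twice": 725,
    "too general": 94,
    "methodological": 29,
    "country names": 9,
    "not considered topics": 5,
    "abbreviation given in parentheses": 2,
}

# Candidate keywords identified from abstracts, and removals.
abstract_candidates = 487
abstract_removed = {
    "not research topics": 153,
    "duplicated author-provided keywords": 114,
    "methodological": 38,
    "appeared too infrequently": 30,
    "too general": 3,
    "author countries": 2,
}

author_topics = author_keywords - sum(author_removed.values())
abstract_topics = abstract_candidates - sum(abstract_removed.values())
total_topics = author_topics + abstract_topics

print(author_topics, abstract_topics, total_topics)
```

The tally reproduces the 253 author-keyword topics and the final total of 400 topics reported in the text.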

Topic frequency counts were generated by cross-referencing the uncovered keywords with document titles and abstracts. As in other bibliometric studies (e.g., Crosthwaite et al., 2022; Hyland & Jiang, 2021; Lei & Liu, 2019), we operated on the assumption that research articles may feature or address multiple topics (particularly since IRRs constituted comprehensive, wide-ranging investigations). Along with identifying increasing, decreasing, and stagnating topic trends over time, we also synthesised the 400 identified topics into 15 broader topic themes (see Supplementary Appendix C; Pearson & Zou, 2026). We did this manually by assigning each to a topic theme (e.g., preparation course, practice tests, and use of exemplars were synthesised along with 32 other topics into Test-takers and test preparation) through an inductive and iterative process that involved revising, merging, and collapsing topic themes and resolving differences through discussion.
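The cross-referencing of keywords with titles and abstracts can be sketched as below. This is an illustration under stated assumptions, not the procedure actually used: the matching rule (case-insensitive, whole phrase, optional plural "s") and the three records are hypothetical. Each document is counted once per topic, and a document may contribute to several topics.

```python
import re


def count_topic_documents(topics, documents):
    """For each topic, count documents whose title or abstract mentions it."""
    counts = {}
    for topic in topics:
        # Assumed matching rule: whole phrase, any case, optional plural "s".
        pattern = re.compile(r"\b" + re.escape(topic) + r"s?\b", re.IGNORECASE)
        counts[topic] = sum(
            1
            for doc in documents
            if pattern.search(doc["title"] + " " + doc["abstract"])
        )
    return counts


# Hypothetical bibliometric records (not taken from the dataset).
documents = [
    {"title": "Validity of IELTS band scores",
     "abstract": "We examine test scores and reliability."},
    {"title": "Test preparation practices",
     "abstract": "Learners used practice tests; validity is discussed."},
    {"title": "Rater behaviour",
     "abstract": "Raters applied the band descriptors."},
]
topics = ["validity", "band score", "practice tests", "raters"]

counts = count_topic_documents(topics, documents)
print(counts)
```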

Research approaches

AL bibliometric analyses have addressed methodological features of research in different ways. While acknowledged as not constituting research topics per se, methodological keywords identified from abstracts have been incorporated into prior analyses (e.g., Hyland & Jiang, 2021; Lei & Liu, 2019; Pearson, 2024), since such items say something about the disciplinary area. In the present study, methodological keywords were addressed separately to topic ones, to investigate phenomena including methodological change over time, the citation impact of varying approaches, and the intersection with external versus sponsored research. The study drew upon the first author’s unpublished list of methodological keywords that were manually identified from a dataset of 2416 language assessment studies (see Supplementary Appendix B; Pearson & Zou, 2026). We took the position that authors are best placed to characterise the methodological approach(es) of their research, adopting the approach of tallying occurrences of these keywords across document titles and abstracts. We did not calculate their frequency across a document’s keywords since these were absent in sponsored studies.

To obtain global perspectives on research approaches and because many keywords registered zero hits, we created combinations of Excel formulae that looked up methodological keywords and assigned the labels quantitative, qualitative, or mixed methods (MMs) depending upon whether one or more of the keywords listed were present in the title or abstract. Studies were labelled as MM if they featured this phrase (or variation) or if keywords indicating both qualitative and quantitative approaches were found in the title and/or abstract with the absence of the phrase MMs (albeit such entries were manually checked). Formulae featuring synonyms of the term proficiency interview (e.g., oral interview, speaking interview) were applied in order to distinguish interview as a research method from the test feature. We removed 63 records from the methodological analysis, either because they were non-empirical (n = 44, e.g., Pearson, 2019) or did not feature an abstract (n = 19), usually in the case of book chapters.
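The labelling logic described above was implemented in Excel formulae; the sketch below restates the same rules in Python purely for illustration. The keyword lists are short hypothetical stand-ins for the list in Supplementary Appendix B, and the regular expressions are assumptions about how the phrase mixed methods and its variations, and references to the interview as a test feature, might be detected.

```python
import re

# Hypothetical stand-ins for the methodological keyword list.
QUANT_KEYWORDS = ["correlation", "regression", "anova", "factor analysis"]
QUAL_KEYWORDS = ["interview", "thematic analysis", "case study", "focus group"]

MIXED = re.compile(r"\bmixed[- ]methods?\b")
# The interview as a test feature rather than as a research method.
TEST_INTERVIEW = re.compile(r"\b(?:proficiency|oral|speaking) interviews?\b")


def has_any(keywords, text):
    """True if any keyword begins a word somewhere in the text."""
    return any(re.search(r"\b" + re.escape(k), text) for k in keywords)


def label_approach(title, abstract):
    """Label a record as quantitative, qualitative, or mixed methods."""
    text = f"{title} {abstract}".lower()
    if MIXED.search(text):
        return "mixed methods"
    # Remove test-feature interviews before looking for interview as a method.
    text = TEST_INTERVIEW.sub(" ", text)
    quant = has_any(QUANT_KEYWORDS, text)
    qual = has_any(QUAL_KEYWORDS, text)
    if quant and qual:
        return "mixed methods"  # the source notes such cases were manually checked
    if quant:
        return "quantitative"
    if qual:
        return "qualitative"
    return "unlabelled"


print(label_approach("Examiner talk in the speaking interview",
                     "Correlation of ratings with task type."))
```

In the printed example, the phrase speaking interview is treated as a test feature, so the record is labelled quantitative rather than mixed methods.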

Publication disciplinary areas and types

The most prevalent academic journals publishing IELTS research were identified from the retrieved bibliometric records. As we found that studies were disparately situated across a large array of venues and to enable more meaningful comparison, we synthesised the relevant publications into four disciplinary areas, language assessment (e.g., Language Testing), applied linguistics/Teaching English to Speakers of Other Languages (TESOL, e.g., ELT Journal), education (e.g., Cogent Education), and other journals (e.g., Australian Journal of Social Issues). In a few instances where we could not identify a disciplinary area from the name of the publication (e.g., Alabe-Revista de Investigacion Sobre Lectura Y Escritura), we reviewed the stated aims and scope on the journal/report’s website.

Institutions, countries, and (co-)authorship

Candidate entries for the 10 most productive and impactful author-affiliated institutions and countries were initially identified using WoS and Scopus functionality (as these constituted by far the most prevalent indices that list IELTS research) and by manually reviewing the bibliometric records that we had created for sponsored research. Unlike previous studies that examined only first-author institutional affiliations (e.g., Lei & Liu, 2019; Yan & Zhang, 2023), we incorporated those of every listed author for a more comprehensive perspective. Excel formulae were created to obtain precise document and citation counts (e.g., for authors affiliated with Japanese institutions, counting the occurrences of Japan in the author affiliation field). Since our results show much IELTS research has been published over the last 10 years, we did not examine trends in the countries of author-affiliated institutions over time. We analysed the extent of author collaboration by manually labelling each study as no collaboration, domestic collaboration (which we took to mean authors either from the same or different institutions in one country), or international collaboration. In the event an author had multiple affiliations, we adopted that which was listed first. While we identified prevalent authors in the dataset, because few published widely, we opted not to present these findings.
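The collaboration labels and country counts described above can be sketched as follows. Collaboration was labelled manually in the study, so the function below only restates the stated definitions. Each author is represented by the country of their first-listed affiliation, as in the text. Counting each country once per document is an assumption, and the records are invented.

```python
from collections import Counter


def label_collaboration(author_countries):
    """Classify a document from the countries of its authors' affiliations."""
    if len(author_countries) <= 1:
        return "no collaboration"
    if len(set(author_countries)) == 1:
        # Same or different institutions within one country.
        return "domestic collaboration"
    return "international collaboration"


def documents_per_country(records):
    """Number of documents with at least one author affiliated in each country."""
    counts = Counter()
    for countries in records:
        counts.update(set(countries))  # assumption: one count per document
    return counts


# Hypothetical records: one list of author countries per document.
records = [
    ["Australia"],
    ["UK", "UK"],
    ["China", "UK"],
]

labels = [label_collaboration(r) for r in records]
country_counts = documents_per_country(records)
print(labels, dict(country_counts))
```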

Results and discussion

Figure 2 demonstrates trends in published IELTS research from the test’s inception in 1989 through to 2024. Research interest in IELTS was rather modest for the first 20 years of the test’s existence, only exceeding a cumulative total of 101 papers in 2008. This reflects the relatively small-scale nature of the testing operation in the 1990s (Davies, 2008) in light of the very different context of overseas student participation in Anglophone tertiary education (see Lomer, 2017). As the figure shows, the literature body prior to 2012 comprised similar degrees of both external and sponsored research (via IRRs and SiLT), although in instances of the latter, works were released in volumes (e.g., IELTS Collected Papers; Speaking and Writing) affecting their distribution in the figure. Two notable changes in research productivity were exhibited after 2012. First, there has been a marked increase in research interest overall, mirroring the growing size of the test-taking cohort. A second noteworthy trend is the clear divergence between external and sponsored research, such that by 2024, the latter comprised just 22.78% of all documents. This suggests that, as the IELTS test has grown in candidature, so too has its wider impact (Alsagoafi, 2018; Hamid & Hoang, 2018; Sinclair et al., 2019), engaging a growing body of researchers outside of the testing system itself.

Figure 2.

Cumulative trends in IELTS research document prevalence.

In addition to trends in the breadth of research as outlined in Figure 2, it is important to note tangible differences across one measure of research depth, a study’s length in words. Our analysis found that sponsored research averaged 19,376 words, significantly exceeding the length of documents classified as external research (7,868 words). Free(er) from the restrictions imposed by academic journals, IRR and SiLT authors were able to undertake and report on significantly more comprehensive investigations (see section “Research approaches”).

Research topics

Table 1 shows the 15 topic themes in order of prevalence across document titles and abstracts. It can be seen that by far the most common theme is Measurement and quality (n = 380). This is unsurprising given that it incorporates topics related to the interpretations and intended use(s) of test scores that are central issues in language testing more generally (Fulcher & Davidson, 2007), including validity (n = 126), test scores (n = 78), reliability (n = 45), and raters (n = 41). In addition to these topics common to language assessment more broadly were terms more idiosyncratic to IELTS, including band score(s) (n = 71), IELTS band(s) (n = 28), and IELTS examiner(s) (n = 26), which all characterise the scoring process. This theme was followed by Test-takers and test preparation (n = 271) and Academic and professional contexts (n = 270), which is also unremarkable given the high-stakes nature and impactful gatekeeping role of the test in academic contexts (Pearson, 2019). The prevalence of Language ability (n = 236) reflects the testing purpose and the importance of language ability within contexts of test score use, while also showing that questions surrounding the theoretical and operational definitions of ability from a global perspective have engaged researchers.

Table 1.

Prevalence and patterns of change across topic themes.

Topic theme	Total (n)	Normed frequency, 1989–2009	Normed frequency, 2010–2019	Normed frequency, 2020–2024	Normed change
Measurement and quality	380	68.91	56.52	57.72	–11.18
Test-takers and test preparation	271	30.25	40.22	50.41	20.15
Academic and professional contexts	270	54.62	43.48	34.55	–20.07
Language ability	236	43.70	36.23	34.15	–9.55
Language teaching and learning	226	16.81	38.04	41.06	24.25
Writing proficiency and assessment	181	26.05	25.72	32.11	6.06
Features of language	163	18.49	25.36	28.86	10.37
Non-linguistic constructs	144	20.17	26.45	19.11	–1.06
Test purposes	132	16.81	22.10	20.73	3.92
Speaking proficiency and assessment	129	21.85	21.01	18.29	–3.56
Test owners and developers	103	21.85	16.30	13.01	–8.84
Non-IELTS tests and assessments	90	10.08	15.94	13.82	3.74
Reading proficiency and assessment	69	6.72	10.51	13.01	6.29
Listening proficiency and assessment	46	3.36	7.97	8.13	4.77
Technology and innovation	21	0.84	3.26	4.47	3.63
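The final column of Table 1 can be approximated from the period columns. The Python sketch below assumes that the normed change is simply the 2020–2024 normed frequency minus the 1989–2009 normed frequency; this is an inference from the table values rather than a stated definition, and the function and variable names are hypothetical. Because it works from the rounded figures shown in the table, results may differ from the published column by 0.01.

```python
# Normed frequencies copied from Table 1 for three themes
# (1989-2009, 2010-2019, 2020-2024).
normed_frequency = {
    "Measurement and quality": (68.91, 56.52, 57.72),
    "Academic and professional contexts": (54.62, 43.48, 34.55),
    "Language teaching and learning": (16.81, 38.04, 41.06),
}


def normed_change(first_period, last_period):
    """Assumed definition: last-period minus first-period normed frequency."""
    return round(last_period - first_period, 2)


changes = {
    theme: normed_change(values[0], values[-1])
    for theme, values in normed_frequency.items()
}

# Themes ordered from largest decrease to largest increase.
ranked = sorted(changes, key=changes.get)
```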

With the exception of Test-takers and test preparation, these topics showed the largest normalised decreases in prevalence over the timeframe. For Measurement and quality, this may reflect broader moves in the field away from a predominantly psychometric orientation towards addressing the social and ethical consequences of language testing (Hamp-Lyons, 2000; McNamara & Roever, 2006). In the case of IELTS, researchers may have expanded their focus because there were no notable revisions to the testing system between 2008 and 2020, earlier revisions having been the focus of prior studies (e.g., Brown’s [2006] Candidate discourse in the revised IELTS Speaking Test and Yates et al.’s [2011] The assessment of pronunciation and the new IELTS Pronunciation scale). The 2023 revisions to the Writing band descriptors and the introduction of the One Skill Retake seem fruitful areas in which to further investigate measurement and quality issues, including the test’s validity. Alternatively, investigating the test’s usefulness, particularly its reliability and absence of bias, requires navigating access to data held by the co-owners (and probably publishing via the 18-month IELTS joint-funded research programme) and may thus be off-putting in the current publishing context, which prizes rapid research output.

Less common topic themes included Technology and innovation (n = 21), assessing both receptive skills (Listening proficiency and assessment [n = 46], Reading proficiency and assessment [n = 69]), and perhaps unsurprisingly, Non-IELTS tests and assessments (n = 90). Researchers may favour investigating the productive skills over the receptive ones because the former create reified artefacts (particularly in live tests) and because a larger number of facets require validation (i.e., tasks, rubrics, raters). It could also be that, as candidates report that the receptive skills are easier (Lloyd-Jones & Binch, 2012), less important (Merrifield & GBM & Associates, 2012), or more conducive to score gains (O’Loughlin & Arkoudis, 2009), they are less likely to constitute a problem or issue that typically triggers research. The low (albeit modestly rising) frequency of studies examining the mediating role of technology in IELTS is somewhat surprising given that the co-owners have instituted greater digitalisation of the testing system (although not to the degree of competitor tests such as the Duolingo English Test and the Pearson Test of English Academic), including computer-based testing, computer-mediated rating and examiner training, and the at-home IELTS Indicator Test that served to facilitate university admissions during the COVID-19 pandemic (Isbell & Kremmel, 2020). It is also the case that online networks and apps supporting candidates’ test preparation practices (both official and unofficial) have proliferated, constituting a potentially ripe area for future investigation.

The most sizeable increases in the proportion of research output over the time period were seen in Language teaching and learning (+24.25%), outstripping the related topic theme of Test-takers and test preparation (+20.15%) and reflecting the natural embedding of (notably, classroom-based) IELTS preparation within English language teaching more broadly. This is visible in the settings within which more pedagogically orientated research is situated, most commonly at private language teaching organisations (e.g., Allen, 2017; Brown, 1998; Green, 2007), where IELTS constitutes a “natural” progression route for many English learners (Ahern, 2009), and occasionally as a component of English language enhancement programmes at the tertiary level (e.g., Gan, 2009). The rising interest in Test-takers and test preparation demonstrates that stakeholder engagement with IELTS (especially by test-takers and teachers) extends well beyond test-taking itself. On the one hand, this may be considered encouraging, since it evinces responses to calls to position test candidates at the forefront of research, which is salient given that they have the most to lose and gain through testing (Hamp-Lyons, 2000). On the other, its prominence highlights the impact of the test on candidates, which has been criticised in several empirical works (e.g., Ahern, 2009; Alsagoafi, 2018; Hamid & Hoang, 2018; Sinclair et al., 2019).

Topic themes were cross-referenced with a document’s status as external or sponsored research, along with the overall methodological approach (quantitative, qualitative, or MM) to gain finer insights into publication patterns. There was a sharp divide in the occurrences of Non-IELTS tests and assessments, which were very low among sponsored studies (10.00%) in contrast to external studies (90.00%). This may stem from the co-owners’ funding priorities, which naturally lie with IELTS. However, the result is that research addressing one or more external tests lacks the comprehensive treatment typically provided in an IRR. In addition, funded research’s tendency to position IELTS in isolation rather than within the wider international language testing milieu may result in certain phenomena going under-explored in this format. These include candidates switching between standardised proficiency tests (perhaps because they are struggling to achieve the desired outcome by taking IELTS) and comparative insights across like-for-like tests (notably, other SELTs). It is also notable that external studies featured a more pronounced focus on Language teaching and learning (85.84%), with researchers outside of IELTS perhaps taking a broader view and investigating the test as a prominent feature of the broader English as a Foreign Language (EFL)/English as a Second Language (ESL) landscape.

For topic themes that intersected with research approaches, we found that qualitative methods were infrequent across all themes vis-à-vis quantitative and MM approaches. The highest proportion of qualitative studies was found within Speaking proficiency and assessment (22.58%), indicative of investigations of candidate performance in the IELTS Speaking test that used observation, post-hoc analysis of test-takers’ performance, and/or their oral reports (e.g., Seedhouse & Harris, 2011; Vincheh et al., 2024). Quantitative studies were common in investigations of Listening proficiency and assessment (56.52%) and Reading proficiency and assessment (38.24%), often to investigate the relationship between listening/reading performance and a non-linguistic construct (e.g., stakeholder attitudes, beliefs, emotions, etc., also common at 39.57%), operationalised quantitatively (e.g., Ha & Nguyen, 2023). When including all studies that featured an abstract, Non-IELTS tests and assessments (38.20%), Test purposes (31.06%), and Language ability (28.33%) constituted especially high proportions of studies with no identifiable research approach, often because such papers constituted review or opinion articles (e.g., Chalhoub-Deville & Turner, 2000; Hall, 2009; Pearson, 2019).

Research approaches

Tables 2 and 3 demonstrate the prevalence and impact of documents drawing upon the three research traditions and those where an approach could not be identified. Studies were most commonly underpinned by quantitative approaches (33.91%), reflecting the strong epistemological tradition of positivism that underpins language testing (Fulcher, 2014; McNamara & Roever, 2006). Within quantitative approaches (and covering a range of features between design and technique), experimental designs (7.18%), t-tests (7.02%), and regression analyses (4.52%) were notable. Quantitative approaches were well-represented in external research (38.41%), which may reflect the prominent influence of psychometrics and measurement (Hamp-Lyons, 2000; McNamara & Roever, 2006), perpetuated in initial training programmes, which may leave a lasting impression on practitioners (Yan & Fan, 2021). Citations per document in quantitative research (22.18) were notably lower than in MM papers (31.09). However, the accumulated AWCR value for quantitative works (524.79) exceeded that of MM research (484.66), indicating that more recent works within this tradition are well-cited.

Table 2.

Document prevalence and impact according to research tradition.^a

Tradition	Documents (n)	Documents (%)	Citations (n)	Citations per document	Accumulated AWCR
Qualitative	106	18.34%	1863	17.58	245.55
Quantitative	196	33.91%	4348	22.18	524.79
Mixed-methods	152	26.30%	4725	31.09	484.66
Not reported ^b	124	21.45%	3440	27.74	372.70

^a Excluding 44 non-empirical studies and 19 without abstracts.

^b In the title and abstract.
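The per-tradition figures in Table 2 can be reproduced with simple arithmetic. The Python sketch below recomputes the document shares and citations per document from the table totals. It also shows one common way of accumulating an age-weighted citation rate (AWCR), namely summing each document's citations divided by its age in years; the age convention used here (publication year counted as year one) is an assumption, and the example papers are hypothetical rather than drawn from the dataset.

```python
# Document and citation totals copied from Table 2.
table_2 = {
    "Qualitative": {"documents": 106, "citations": 1863},
    "Quantitative": {"documents": 196, "citations": 4348},
    "Mixed-methods": {"documents": 152, "citations": 4725},
    "Not reported": {"documents": 124, "citations": 3440},
}

total_documents = sum(row["documents"] for row in table_2.values())


def citations_per_document(row):
    """Mean citations per document, rounded as in Table 2."""
    return round(row["citations"] / row["documents"], 2)


def document_share(row):
    """Percentage of the documents included in Table 2."""
    return round(100 * row["documents"] / total_documents, 2)


def accumulated_awcr(papers, reference_year):
    """Sum of citations divided by document age in years.

    Assumption: age = reference_year - publication_year + 1, so a document
    published in the reference year has an age of one year.
    """
    return sum(
        citations / (reference_year - year + 1) for year, citations in papers
    )


# Hypothetical (publication year, citations) pairs, not taken from the dataset.
example_papers = [(2015, 40), (2024, 3)]
example_awcr = accumulated_awcr(example_papers, reference_year=2024)
```

Dividing by age is what allows a tradition with fewer citations per document to accumulate a higher AWCR when its well-cited works are more recent.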

Table 3.

Distribution of documents across publication types and time according to research tradition.

Tradition	External research	Sponsored research	1989–2009	2010–2019	2020–2024
Qualitative	17.73%	20.29%	14.74%	18.88%	19.23%
Quantitative	38.41%	19.57%	34.74%	33.33%	34.19%
Mixed-methods	20.23%	45.65%	25.26%	27.31%	25.64%
Not reported	23.64%	14.49%	25.26%	20.48%	20.94%

MM research was also common (26.30%), especially in sponsored studies (45.65%), probably because such studies tend to be more complex or comprehensive in scope than external research (which included many shorter conference papers). A notable MM approach investigated IELTS examiners’ practices and perspectives, for example, in the context of the 2008 revised Pronunciation scale (e.g., Isaacs et al., 2015). Another was students’ performance and perspectives, for instance, in IELTS Writing tasks (e.g., Phakiti, 2024). It was also the case that additional methods were integrated for their particular affordances, such as observations in classroom-based studies (e.g., of IELTS preparation, degree programmes) and for the rich data that they contribute (e.g., Lloyd-Jones & Binch, 2012). Authors also drew on MM approaches to enhance study quality, for example, determining whether large-scale survey data corroborates qualitative interview data (e.g., Dao et al., 2024), or to gain a comprehensive picture by triangulating evidence from different methods or data sources (e.g., Ma & Chong, 2022). MM studies were also notably more impactful when weighted against article frequency (31.09 citations per document) for reasons that require greater exploration (e.g., whether the triangulation of findings and completeness typically afforded by MM research translates to greater perceived study quality or utility).

The least prevalent tradition was qualitative research (18.34%), although it also showed the largest increase in prevalence (+4.49%), and interviews were found to be the second-most popular research method (25.09%). Other qualitative methods were far less frequent, including observation (5.30%) and focus group discussions (2.96%). Qualitative reports tended to focus on querying the perspectives of stakeholders, notably those of test-takers (e.g., Clark & Yu, 2020; Yang & Badger, 2015), but also included content analyses of test materials (e.g., Noori & Mirhosseini, 2021) and candidate responses (e.g., Estaji & Hashemi, 2022). It is also apparent that qualitative IELTS research is not as well cited (17.58 citations per document, AWCR = 245.55, with 34 studies having an AWCR value less than 1.00). However, it is our interpretation that the lower take-up of qualitative approaches independently of quantitative ones is not due to a perceived lack of value in the tradition itself (given the prevalence of MM approaches). Rather, as a high-stakes test concerned with reifying language proficiency, quantitative data in the form of test results (e.g., Allen, 2017), text-analytic descriptions of student speaking or writing (e.g., Riazi & Knox, 2013), and/or evidence of learners’/test-takers’ behaviours as measured through eye-tracking or keystroke logging (e.g., Bax, 2013) may be seen as integral or complementary to research. Such data may be available on a large scale from the co-owners, whereas qualitative data may need to be generated anew, requiring time and resources. In addition, systematic training in qualitative methods is less established in language testing programmes, and as such, practitioners may be more concerned with assessment theories and the psychometric qualities of tests (Yan & Fan, 2021), which are firmly grounded in the quantitative tradition.

The notable prevalence of studies with no explicitly stated methodological approach (21.45%) is due to authors omitting methodological information or characterising their studies in ways not captured by our keywords. We noticed a tendency for IELTS predictive validity research to fall within the latter category, with authors using terminology with additional meanings (e.g., relationship, outcomes) or approaches couched in less precise terms (“Scores from IELTS and the university’s in-house English Proficiency Test are analysed to determine the predictive ability . . .”). As such, it should not be interpreted that providing fewer methodological details correlates with higher impact or is considered desirable by readers of research. Indeed, we feel that there are opportunities for authors to enhance study visibility by employing conventional methodological keywords (e.g., qualitative, quasi-experimental study, factor analysis) in author-provided keywords and/or abstracts, particularly as a study’s overall approach or methods may constitute the reason for its retrieval.

Publications, authors, and countries

Consistent with the broader disciplinary area of language assessment (Aryadoust et al., 2020), a notable finding that became apparent during the screening process was the diffuse array of venues that publish IELTS research. This suggests that IELTS research, as in language assessment more widely, addresses a diffuse range of topics (Yang & Wang, 2025). While a large portion of research was concentrated within the IRR format (20.09%), the 495 documents that constituted external research were situated across 270 discrete publications, averaging a mere 1.83 documents per venue. Among these were both international (e.g., TESOL Quarterly) and domestic journals (e.g., Kasetsart Journal of Social Sciences). Many publications, particularly among the latter category, appear less visible, credible, and influential by virtue of not being indexed on Clarivate’s Core Collection. Indeed, only 46 were found to be indexed on the SSCI (Social Sciences Citation Index), 51 on the ESCI (Emerging Sources Citation Index), and 28 on the CPCI (Conference Proceedings Citation Index). This suggests that IELTS attracts early-career researchers, teacher researchers, or non-linguists, the latter group perhaps impacted by a need to engage with the test themselves. Diffusion across publications of varying robustness is also reflected in the notably uneven distribution of citations. The median number of GS citations was 10.00, with 498 being the maximum. At the same time, 96 studies were found to have zero citations, perhaps because the study was new (24 of these were published in 2024), or for reasons of lower visibility, relevance, contribution, and/or perceived quality (including the publication).

Tables 4 and 5 show trends in prevalence and citations across the 10 most impactful publications and disciplinary areas (by accumulated AWCR). Academic journals specialising in language assessment were, understandably, the most common venue for IELTS-related research, notably Language Testing in Asia (LTiA, n = 23), Language Testing (LT, n = 15), and Language Assessment Quarterly (LAQ, n = 11). One notable exception was Assessing Writing (AW), publishing just five studies. When compared with the total age of the journal, it was found that three of the four language assessment journals averaged fewer than one article per year that focused on IELTS (LAQ = 0.60, AW = 0.17, and LT = 0.40, the only language assessment journal in existence when IELTS was created). In fact, after Hamilton et al.’s (1993) analysis of L1 user performance in the 1989 version of the test, it was not until 2001 when the next article appeared in LT that focused on IELTS. The multitude of venues for external IELTS research may pose challenges for stakeholders, particularly novice researchers, those new to the topic, and non-academics, all of whom may need to (further) develop expertise in utilising online research indices to better ensure document retrieval.

Table 4.

The 10 most impactful publications by accumulated AWCR.

Publication	Documents (n)	Documents (%)	Publication period	Citations per document	Accumulated AWCR
IELTS Research Reports	131	20.09%	1998–2024	34.47	303.97
Language Testing	15	2.30%	1993–2024	79.07	138.21
Language Testing in Asia	23	3.53%	2011–2024	24.78	90.05
Language Assessment Quarterly	11	1.69%	2006–2024	52.82	68.47
Studies in Language Testing (SiLT)	15	2.30%	1996–2012	86.53	62.12
Studies in Second Language Acquisition	4	0.61%	2012–2020	87.00	58.35
ELT Journal	13	1.99%	1997–2019	38.46	40.95
Assessing Writing	5	0.77%	2005–2022	74.80	37.49
Applied Linguistics Review	6	0.92%	2013–2024	38.50	37.18
System	4	0.61%	2000–2024	70.75	36.75

Table 5.

Prevalence and impact across publication disciplinary areas by accumulated AWCR.

Publication type	Documents (n)	Documents (%)	Citations	Citations per document	Accumulated AWCR
Language assessment	250	38.34%	9815	39.26	799.27
Language assessment (external)	104	15.95%	4002	38.48	433.18
Language assessment (sponsored)	146	22.39%	5813	39.82	366.09
AL/TESOL	225	34.51%	4986	22.16	639.30
Education	103	15.80%	2224	21.59	335.04
Other	63	9.66%	482	7.65	80.09

Table 4 also shows that language assessment journals were among the most impactful when adjusted for the effects of time, especially LT and LTiA (accumulated AWCR = 138.21 and 90.05, respectively), and are thus desirable venues for authors to increase the exposure of IELTS research. Other noteworthy periodicals, both cross-sectionally as citations per document and longitudinally as AWCR, are impactful AL/TESOL publications, namely, Studies in Second Language Acquisition (SSLA, 87 citations per document/AWCR = 58.35), ELT Journal (38.46/40.95), and System (70.75/36.75). IELTS research studies in both SSLA and System were rare (n = 4) but well-cited, suggesting a boost provided by the journals’ high impact factors (2-year IF values of 4.2 and 4.9, respectively).

As shown in Table 5, IELTS does not only engage readers of language assessment periodicals but also of AL/TESOL journals, which constituted by far the most common type of academic periodical (n = 225). This reflects widespread interest in IELTS that stretches beyond scientific, technical, and inward concerns of what to test and how, which are of particular interest in language assessment (Green, 2014), encompassing a wider range of practitioner issues, such as washback to the teacher and learner and the social and ethical implications of the test. Alternatively, such journals may constitute a home for manuscripts that do not meet the more stringent requirements of LT, LAQ, and AW. AL/TESOL journals also offer a reasonable compromise for authors between article (22.16 citations per document) and time-weighted impact (accumulated AWCR = 639.30), especially when compared with sponsored language assessment research and education journals, where more impactful studies tend to be older (AWCR = 366.09 and 335.04, respectively).

The data also suggest that authors should avoid publications outside of language education, unless there is clear relevance to a readership in a disciplinary area (e.g., The consequences of English language testing for international health professionals and students: An Australian case study with 44 citations in the International Journal of Nursing Studies). However, it should be acknowledged that selecting an appropriate journal is not only a matter of its impact or article visibility. Authors likely opted for education journals (e.g., Higher Education) in articles relating to uses of IELTS for the purposes of student enrolment or chose general open access education publications (e.g., Cogent Education) owing to perceptions of lower entry requirements or increased article retrievability. Regardless, readers of IELTS research need to look beyond language assessment journals and employ varied indices of research to ensure they are not unwittingly missing out.

Countries of author-affiliated institutions

Given that the British Council, Cambridge English, and IDP are UK and Australian organisations, respectively, it is perhaps unsurprising, as shown in Table 6, that authors based at institutions in these countries contributed the first and third largest shares of research output (30.50% and 27.75% of documents, respectively). Such authors also produced the most impactful research when citation counts were age-weighted (437.22 and 356.66, respectively), indicating much relevant work in these country contexts has been done more recently. It is also noteworthy that authors in these two contexts contributed prominently to the IRR format, which may further explain the high citation counts, as such works are generally more impactful. Canada and New Zealand, where IELTS is also widely accepted as evidence of English language proficiency (Merrifield & GBM & Associates, 2012) and where many item writers are based (Read, 2022), were less well-represented in the dataset (7.34% and 4.36%), perhaps because the co-owners are not based in these countries. Authors in the United States unexpectedly contributed to 8.49% of documents, indicating that IELTS has a notable footprint there (e.g., Heitner et al., 2014) despite competition from domestic tests such as Test of English as a Foreign Language (TOEFL), Duolingo, and others.

Table 6.

Productivity (including by publication disciplinary area) and impact across countries of author institutions by accumulated AWCR.

Country of primary author affiliation	Documents (n)	Documents (%)	Citations (n)	Citations per document	Accumulated AWCR	Language assessment, external (n)	Language assessment, sponsored (n)	AL/TESOL (n)	Education (n)	Other (n)
United Kingdom	133	30.50%	1969	14.80	437.22	21	58	40	11	3
Australia	121	27.75%	926	7.65	356.66	26	64	17	11	3
Iran	127	29.13%	2473	19.47	303.81	25	0	66	27	7
United States	37	8.49%	898	24.27	176.60	9	9	16	3	0
Canada	32	7.34%	713	22.28	135.17	6	6	14	6	0
China	59	13.53%	1090	18.47	114.08	4	4	15	15	20
Japan	8	1.83%	147	18.38	65.12	2	4	2	0	0
New Zealand	19	4.36%	186	9.79	62.90	6	5	5	2	1
United Arab Emirates	22	5.05%	219	9.95	48.12	1	2	12	6	1
Vietnam	17	3.90%	336	19.76	42.48	2	3	7	0	5

Moving into EFL contexts, a cohort of researchers who have contributed a notable amount of work is based at Iranian institutions (29.13%). This is likely explained by the popularity of IELTS for the purposes of academic study (e.g., Erfani, 2012; Estaji & Banitalebi, 2023), and partly due to the unavailability of US-owned in-person competitor tests (e.g., TOEFL) (Saif et al., 2021). As with their UK and Australian counterparts, Iran-based authors were well-cited across the dataset, both in terms of citations per document (19.47) and accumulated AWCR (303.81). Nevertheless, such authors presented a contrasting profile to their UK and Australian peers in that they did not participate in co-owner-funded research. Instead, they approached IELTS from the outside (probably owing to local restrictions placed upon foreign funding), particularly through investigations of washback and impact published in language assessment journals (notably, the International Journal of Language Testing, based in Iran) and higher impact AL/TESOL periodicals (nearly all of which appeared to be international from the journal’s name or description). As is the case for authors affiliated with Chinese institutions, promotion, hiring, and funding for Iran-based academics are strongly tied to academic publication (especially in SSCI/ESCI, English-medium, international journals). IELTS, as a globally recognised and well-documented test, provides a legitimate and internationally relevant object of study that allows Iranian researchers to achieve greater visibility and meet professional benchmarks, even when direct access to proprietary data is limited.

Reflecting the importance of the Chinese test-taking candidature by virtue of high international student numbers at institutions based in Anglophone countries (Cambridge Assessment English, n.d.), authors at Chinese institutions (including those in Hong Kong and Macao) featured prominently (13.53%). As with Iran, such authors tended to publish in impactful international AL/TESOL periodicals, along with education and “other” publications (typically humanities and social science-themed conference proceedings). The latter finding may reflect authors’ attempts to build research networks and to develop a formative profile as a stepping stone to more impactful outlets. It was apparent that China-based researchers were not always concerned with the large overseas Chinese student contingent. Such scholars also investigated samples of test-takers within domestic institutions in China, Hong Kong, and Macao (e.g., Gan, 2009), including transnational education partnerships (e.g., Ma & Chong, 2022). This may also explain the interest of United Arab Emirates (UAE)-based researchers, where IELTS has long been used to screen nationals for enrolment onto tertiary programmes at domestic institutions (Garinger & Schoepp, 2013).

Author-affiliated institutions

Over the past three decades, researchers affiliated with a wide array of organisations have contributed research into IELTS. Reflecting its primary role as a measure of English language proficiency for tertiary-level study, it was found that 94.07% of documents were authored by at least one individual affiliated with a university. A small proportion of studies featured authors from private educational companies (e.g., Cambridge English, the British Council, and research centres, 5.30%) or government bodies (0.94%). Despite the incorporation of internal IELTS research via SiLT, authors directly affiliated with the IELTS co-owners featured infrequently. This reflects a dynamic whereby research into IELTS for public consumption is primarily undertaken by academics, with potentially insightful insider voices restricted to confidential internal publications. This contrasts with comparable testing organisations (e.g., LanguageCert), who provide much empirical data from internal studies to an external audience.

Table 7 highlights the 10 most impactful organisations that authors of IELTS research were affiliated with (by accumulated AWCR). Three institutions, the Islamic Azad University (IAU, n = 50), the University of Melbourne (n = 35), and the University of Bedfordshire (n = 28), produced a notable body of IELTS research outputs. The latter two host specialist language testing and assessment centres, while IAU contains a large English language department. That the University of Melbourne publishes SiLA (formerly Melbourne Papers in Language Testing) contributed little to the institution’s ranking, given that only five studies were retrieved, most of which were older papers. Since IELTS originates in Australia and the United Kingdom, it is not surprising that the works of authors at other Australian (i.e., Macquarie, Monash, Queensland, Sydney, Adelaide) and UK institutions (Lancaster, Bristol, Exeter, Roehampton) comprise 50% of the most prevalent institutions in IELTS research. Also of note are the small number of well-cited contributions from academics based at a select few institutions, including Lancaster University, UK (itself home to a renowned language assessment centre), Concordia University, Canada, and Michigan State University.

Table 7.

The 10 most impactful author-affiliated institutions by accumulated AWCR.

Author institution	Documents (n)	Citations (n)	Citations per document	Accumulated AWCR
University of Bedfordshire	28	1933	69.04	153.78
University of Melbourne	35	2570	73.43	139.46
Islamic Azad University	50	485	9.70	78.26
Concordia University	6	482	80.33	59.50
Lancaster University	5	795	159.00	55.24
University of Bristol	11	420	38.18	52.55
University of Exeter	13	248	19.08	51.69
Michigan State University	4	374	93.50	43.75
University of Queensland	10	293	29.30	37.71
Cambridge Assessment English^a	18	466	25.89	33.30

^a Formerly Cambridge University Press & Assessment, Cambridge ESOL, and UCLES.

Co-authorship

We found that IELTS research was a highly collaborative endeavour (63.65% of all documents), which may be explained by the interdisciplinary nature of such research (IELTS, 2025). A team approach allows researchers to draw upon a range of expertise, such as in psychometric testing, educational policy, language acquisition and learning, language teaching, and technology. Co-authorship was particularly prevalent in the case of sponsored research (74.66%). This reflects the reality where small teams of researchers band together to apply for an IELTS research grant to conduct and report on an in-depth study of the test. A notable majority of sponsored research collaborations (60.96% vs. 50.91% for external research) constituted domestically co-authored reports, for example, Chappell et al. (2015) and Rao et al. (2003). These often featured authors working within the same institution, likely due to institutional prerogatives to widen the range of research activities that contribute to global rankings. Rates of international collaboration were comparatively low, standing at 13.70% for sponsored research and 9.49% for external studies, represented in Sinclair et al. (2019) and Saif et al. (2021). There seems to be untapped potential for further collaboration in research, particularly as many of the issues and concerns faced by test users and takers (e.g., minimum language proficiency requirements, the effectiveness of IELTS preparation, test-taker anxiety) cut across national borders. Moreover, further cross-contextual international collaborations may serve to reveal how the test and context interact to influence washback, which may help shed further light on the complexity and potentially contradictory findings of research (e.g., Saif et al., 2021).

Conclusion

This bibliometric analysis examined 641 studies of IELTS, the majority of which constituted external research situated in 270 discrete publications located at the international, national, and regional levels. Many of these publications were not listed on an index such as the SSCI, ESCI, or CPCI. This dispersion may pose location and retrieval difficulties for readers of IELTS research, while also making it more challenging to identify high-quality research. The study revealed an explosion of interest in the test, which took off in a wide array of academic publications external to the IELTS co-owners after 2012, quickly surpassing sponsored research in the form of IRRs and SiLT, the traditional venues of research for public consumption since the 1990s.

The topic analysis revealed a wide range of concerns that both included and went beyond language testing principles (validity, reliability, etc.) as properties of a test. Indeed, the prevalence of topic themes such as Test-takers and test preparation, Academic and professional contexts, and Language teaching and learning indicates a conception of IELTS testing with both evidential and consequential bases (Messick, 1989). The most prevalent and impactful research approach, and that which addressed a wide variety of topics, was MMs, thanks in part to funded sponsored research, which, owing to its complexity and comprehensiveness, was often well-cited. In contrast, quantitative research featured impact measures (citations per document and AWCR) that fell short of MM studies (especially in older works) but were far in excess of qualitative approaches. The rarity and poor impact of qualitative approaches likely reflect the perceived necessity of incorporating quantitative data into investigations of IELTS, rather than weaknesses in qualitative methods themselves (since interviews were by far the most common research method featured). It could also be because many language assessment researchers have been trained in quantitative methods, owing to the strong positivist tradition in the field (Fulcher, 2014; McNamara & Roever, 2006).

We found that IELTS is a research concern that goes beyond features of the testing system and “inward” matters of measurement and quality. This is borne out by the fact that language assessment journals were neither the most frequent nor always the most well-cited venues for IELTS research (with the exception of the citations per document measure). The findings showed that the profile of those who undertake IELTS research has changed (while remaining strongly collaborative), with a diversification away from authors in Anglophone contexts (particularly where the co-owners and others involved in test development are based) to EFL/ESL settings where test-takers and users are situated, further embodying a wider view of the test’s validity.

Implications for language testing

The sharp divide in research depth and access to testing data between sponsored and external research highlights a significant structural imbalance in how language testing knowledge is produced and disseminated, with implications for other large-scale, high-stakes testing organisations (at both national and international levels). The uncovered disparity sustains a systemic asymmetry of information, where insiders hold access to large proprietary datasets and internal validation metrics that independent outsiders cannot replicate with small, local samples. Thus, the field risks a bifurcated knowledge base, where the technical truth of a test remains a closed-circuit conversation among insiders, with external researchers relegated to investigating the consequences of the test. To address this, the IELTS co-owners (and other organisations that publish sponsored investigations, e.g., ETS, LanguageCert) could move beyond existing selective funding models towards open-data initiatives. Releasing comprehensive anonymised datasets to the wider academic community would help bridge this divide, ensuring that validation is not just a privileged internal exercise but a robust, peer-verified public discourse.

While this study’s findings are anchored in the specific trajectory of IELTS, they offer insights that, we believe, generalise to other large-scale, high-stakes language tests. Notably, the rising prevalence of research into the social dimensions of IELTS is indicative of a broader epistemological shift in the field, where validity is seen not merely from a psychometric orientation but as a moral and ethical obligation to ensure fair and just outcomes for all stakeholders (Kunnan, 2018). Alternatively, this can be seen as evidence that, as a test grows in candidature and stakes, its research base will likely expand beyond psychometrics to address the complex, lived realities of its test-takers and users. Furthermore, the trade-off between stability and innovation observed in the testing system (where the lack of major test revisions over a 12-year period catalysed a richer exploration of social consequences) is a phenomenon likely applicable to other established testing systems. In contrast, the slow pace of research into the technological developments of IELTS does not generalise to competitor tests like the Pearson Test of English Academic or the Duolingo English Test because these systems rely foundationally on AI, automated scoring, and remote proctoring. Hence, for these tests, the mediating role of technology is treated as an inseparable and primary focus of validation rather than as the digitalisation of a legacy paper-based system.

Limitations

We are cognisant of a number of methodological limitations to bibliometric analyses and our own approach. First, we emphasise that citation data are a blunt instrument and should not be fully equated to the quality of the works concerned. Given that we drew on data from a range of indices and incorporated 159 documents that were not listed on the five queried indices, we used GS citation data, the only impact measure common to all documents. GS citation counts incorporate higher frequencies of citations from non-journal sources as well as citations not listed on more robust indices (e.g., WoS, Scopus), which should be considered when interpreting the results. Relatedly, as we did not undertake a methodological appraisal of the included studies, we acknowledge that their quality is uncertain, especially for those papers from the 145 publications not listed on the SSCI, ESCI, or CPCI.

The absence of author-provided keywords in sponsored research meant that we had to rely on titles and abstracts to identify how authors characterise their research, which is a partial measure (Pearson, 2024). Likewise, as in other bibliometric studies (e.g., Hyland & Jiang, 2021; Lei & Liu, 2019), the identification of research approaches, topics, and in our case, topic themes was limited by being derived from cross-referencing keywords with document titles and abstracts. Since these fields are typically short (with abstracts limited to 200–250 words), they cannot comprehensively characterise the respective study, potentially leading to an underrepresentation of approaches and themes. This issue is less pronounced in IRRs, whose abstracts were often 50% longer (albeit this variation in relation to external research is acknowledged as a further limitation). Such a constraint could be addressed through manual analysis of full article texts, although this is not typical in bibliometric analyses owing to the size of the retrieved literature bodies.

While we acknowledge the epistemic limitations associated with bibliometric analyses, we also argue that the present study’s findings constitute an important foundation for further meta-research into IELTS and/or language assessment research more generally. For instance, the identified topic themes could be taken up by future researchers seeking to undertake systematic or state-of-the-art reviews, for example, into IELTS’ validity, test-takers’ preparation practices, and test uses in academic contexts. Furthermore, future researchers could incorporate the identified topic or methodological keywords (perhaps combined with other approaches, especially manual analysis) in order to generate a robust list of terms to deductively or abductively characterise the literature body through a comprehensive methodological review (perhaps retrieving full article texts). We are also of the view that the uncovered patterns of author-affiliations and co-authorship serve as a call to continue to diversify and innovate in research, perhaps with a greater focus on comparative studies that shed light on the nature of particular topic themes across contexts (e.g., Saif et al., 2021). A further strand of research could qualitatively explore some of the salient highlighted trends, addressing important “why” and “how” questions. One starting point is exploring and explaining the increasing interest in IELTS after 2012: what motivates scholars to focus on the test, and what dimensions of the testing system are particularly engaging to both authors and readers (to see if these align with the patterns identified in the topic themes)?

We believe that this study, the first to synthesise over 30 years of research into IELTS, will also be of practical value to students and academics of varying levels of experience. The findings emphasise the need for those seeking to identify relevant IELTS research to consult a wide range of indices and publications. By implication, the variety of publications that incorporate IELTS research, and the frequent absence of SSCI, ESCI, or CPCI indexing among them, indicate that readers need to adopt sufficiently critical perspectives (since quality papers may appear in lower-ranked publications and vice versa). Finally, we urge readers to stay abreast of developments in the area, given that the analysis showed that much new IELTS research has been published in recent years.

Supplemental Material

Supplemental material, sj-pdf-1-ltj-10.1177_02655322261420501 for A bibliometric analysis of research into the International English Language Testing System (IELTS) by William S. Pearson and Minlin Minny Zou in Language Testing

Footnotes

Author contributions

William S. Pearson: Conceptualisation; Data curation; Formal analysis; Investigation; Methodology; Project administration; Resources; Validation; Visualisation; Writing – original draft; Writing – review & editing.

Minlin Minny Zou: Conceptualisation; Data curation; Formal analysis; Investigation; Methodology; Project administration; Resources; Validation; Visualisation; Writing – original draft; Writing – review & editing.

Declaration of conflicting interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The first author has received funding through his university to run the Hornby Scholarship programme, which is administered by the British Council. The British Council is a co-owner of IELTS. In addition, the first author served as a language assessment specialist in the area of writing for a company contracted by the British Council from 2018 up until during the time this study was undertaken.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

William S. Pearson

Supplemental material

Supplemental material for this article is available online.

References

Ahern (2009). “Like cars or breakfast cereal”: IELTS and the trade in education and immigration. TESOL in Context, 19(1), 39–51. https://doi.org/10.3316/ielapa.065328178845270

Alderson, J. C., & Banerjee (2001). State-of-the-art review: Language testing and assessment part 1. Language Teaching, 34(4), 213–236. https://doi.org/10.1017/S0261444801001707

Alderson, J. C., & Banerjee (2002). State-of-the-art review: Language testing and assessment part 2. Language Teaching, 35(2), 79–113. https://doi.org/10.1017/S0261444802001751

Allen (2017). Investigating Japanese undergraduates’ English language proficiency with IELTS: Predicting factors and washback (IELTS Partnership Research Papers, 2, pp. 1–67). IELTS Partners: British Council, Cambridge English Language Assessment, & IDP. https://ielts.org/researchers/our-research/research-reports/investigating-japanese-undergraduates-english-language-proficiency-with-ielts-predicting-factors-and-washback

Alsagoafi (2018). IELTS economic washback: A case study on English major students at King Faisal University in Al-Hasa, Saudi Arabia. Language Testing in Asia, 8(1), 5. https://doi.org/10.1186/s40468-018-0058-3

Amini Farsani, M. A., Jamali, H. R., Beikmohammadi, Ghorbani, B. D., & Soleimani (2021). Methodological orientations, academic citations, and scientific collaboration in applied linguistics: What do research synthesis and bibliometrics indicate? System, 100, 102547. https://doi.org/10.1016/j.system.2021.102547

Anthony (2018). AntConc [computer software] (3.5.7). Waseda University. https://www.laurenceanthony.net/software/antconc

Aryadoust, Zakaria, Lim, M. H., & Chen (2020). An extensive knowledge mapping review of measurement and validity in language assessment and SLA research. Frontiers in Psychology, 11, 1–29. https://doi.org/10.3389/fpsyg.2020.01941

Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing, 17(1), 1–42. https://doi.org/10.1177/026553220001700101

Barkaoui (2016). What changes and what doesn’t? An examination of changes in the linguistic characteristics of IELTS repeaters’ Writing Task 2 scripts (IELTS Research Report Series, 3, pp. 1–55). British Council, Cambridge English Language Assessment, & IDP: IELTS Australia. https://ielts.org/researchers/our-research/research-reports/what-changes-and-what-doesnt-an-examination-of-changes-in-the-linguistic-characteristics-of-ielts-repeaters-writing-task-2-scripts

Barrot, J. S. (2024). Trends in automated writing evaluation systems research for teaching, learning, and assessment: A bibliometric analysis. Education and Information Technologies, 29(6), 7155–7179. https://doi.org/10.1007/s10639-023-12083-y

Bax (2013). The cognitive processing of candidates during reading tests: Evidence from eye-tracking. Language Testing, 30(4), 441–465. https://doi.org/10.1177/0265532212473244

Brown (2006). Candidate discourse in the revised IELTS Speaking Test (IELTS Research Reports, Vol. 6, pp. 1–19). IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/candidate-discourse-in-the-revised-ielts-speaking-test

Brown, J. D. H. (1998). Does IELTS preparation work? An application of the context-adaptive model of language program evaluation (IELTS Research Reports, Vol. 1, pp. 20–37). IELTS Australia. https://ielts.org/researchers/our-research/research-reports/does-ielts-preparation-work-an-application-of-the-content-adaptive-model-of-language-program-evaluation

Cambridge Assessment English. (n.d.). A strategic partnership for language testing. https://www.cambridgeenglish.org/images/470637-strategic-partnership-for-language-testing.pdf

Chalhoub-Deville, & Turner, C. E. (2000). What to look for in ESL admission tests: Cambridge certificate exams, IELTS, and TOEFL. System, 28(4), 523–539. https://doi.org/10.1016/S0346-251X(00)00036-1

Chappell, Bodis, & Jackson (2015). The impact of teacher cognition and classroom practices on IELTS test preparation courses in the Australian ELICOS sector (IELTS Research Report Series, 6). British Council, Cambridge English Language Assessment, & IDP: IELTS Australia. https://ielts.org/researchers/our-research/research-reports/the-impact-of-teacher-cognition-and-classroom-practices-on-ielts-test-preparation-courses-in-the-australian-elicos-sector

Chong, S. W., & Plonsky (2023). A typology of secondary research in Applied Linguistics. Applied Linguistics Review, 15(4), 1569–1594. https://doi.org/10.1515/applirev-2022-0189

Clark, Spiby, & Tasviri (2021). Crisis, collaboration, recovery: IELTS and COVID-19. Language Assessment Quarterly, 18(1), 17–25. https://doi.org/10.1080/15434303.2020.1866575

Clark (2020). Beyond the IELTS test: Chinese and Japanese postgraduate UK experiences. International Journal of Bilingual Education and Bilingualism, 24(10), 1512–1530. https://doi.org/10.1080/13670050.2020.1829538

Crosthwaite, Ningrum, & Lee (2022). Research trends in L2 written corrective feedback: A bibliometric analysis of three decades of Scopus-indexed research on L2 WCF. Journal of Second Language Writing, 58, 100934. https://doi.org/10.1016/j.jslw.2022.100934

Dao, Nguyen, M. X. N. C., & Nguyen, H. V. (2024). “IELTS juniors” in Vietnam: Perceptions of learners, parents and IELTS preparation course providers (IELTS Research Reports Online Series, 3/24). British Council, IDP IELTS, & Cambridge University Press & Assessment. https://ielts.org/researchers/our-research/research-reports/ielts-juniors-in-vietnam-perceptions-of-learners-parents-and-ielts-preparation-course-providers

Davies (2008). Assessing academic English language proficiency: 40+ years of U.K. language tests. In Fox, Wesche, Bayliss, Cheng, Turner, C. E., & Doe (Eds.), Language testing reconsidered (pp. 73–86). University of Ottawa Press.

Dong, Gan, Zheng, & Yang (2022). Research trends and development patterns in language testing over the past three decades: A bibliometric study. Frontiers in Psychology, 13, 801604. https://doi.org/10.3389/fpsyg.2022.801604

Erfani, S. S. (2012). A comparative washback study of IELTS and TOEFL iBT on teaching and learning activities in preparation courses in the Iranian context. English Language Teaching, 5(8), 185–195. https://doi.org/10.5539/elt.v5n8p185

Estaji, & Banitalebi (2023). A study of test-taking strategies of Iranian IELTS repeaters: Any change in the strategy use? International Journal of Testing, 23(3), 205–230. https://doi.org/10.1080/15305058.2023.2195662

Estaji, & Hashemi (2022). Phraseological competence in IELTS academic writing task 2: Phraseological units and test-takers’ perceptions and use. Language Testing in Asia, 12(1), 34. https://doi.org/10.1186/s40468-022-00180-7

Fulcher (2014). Philosophy and language testing. In Kunnan, A. J. (Ed.), The companion to language assessment (pp. 1431–1451). John Wiley & Sons.

Fulcher, & Davidson (2007). Language testing and assessment: An advanced resource book. Routledge.

Gagen, & Faez (2024). The predictive validity of IELTS scores: A meta-analysis. Higher Education Research & Development, 43(4), 873–888. https://doi.org/10.1080/07294360.2023.2280700

Gan (2009). IELTS preparation course and student IELTS performance: A case study in Hong Kong. RELC Journal, 40(1), 23–41. https://doi.org/10.1177/0033688208101449

Garinger, & Schoepp (2013). IELTS and academic achievement: A UAE case study. TESOL Arabia Perspectives, 21(3), 7–14.

Green (2007). Washback to learning outcomes: A comparative study of IELTS preparation and university pre-sessional language courses. Assessment in Education: Principles, Policy & Practice, 14(1), 75–97. https://doi.org/10.1080/09695940701272880

Green (2014). Exploring language testing and assessment: Language in action. Routledge.

H. T., & Nguyen, D. T. B. (2023). Using structural equation modeling to examine the relationship between receptive vocabulary knowledge and receptive language proficiency: A new light on an old issue. International Journal of Applied Linguistics, 33(2), 382–398. https://doi.org/10.1111/ijal.12475

Hall (2009). International English language testing: A critical response. ELT Journal, 64(3), 321–328. https://doi.org/10.1093/elt/ccp054

Hamid, M. O., & Hoang, N. T. H. (2018). Humanising language testing. TESL-EJ, 22(1), 1–20.

Hamilton, Lopes, McNamara, T. F., & Sheridan (1993). Rating scales and native speaker performance on a communicatively oriented EAP test. Language Testing, 10(3), 337–353. https://doi.org/10.1177/026553229301000307

Hamp-Lyons (2000). Social, professional and individual responsibility in language testing. System, 28(4), 579–591. https://doi.org/10.1016/S0346-251X(00)00039-7

Sénécal, A.-M., Stansfield, & Suvorov (2025). A scoping review of research on second language test preparation. Language Testing, 42(1), 11–47. https://doi.org/10.1177/02655322241249754

Heitner, R. M., Hoekje, B. J., & Braciszewski, P. L. (2014). Keys to college: Tracking English language proficiency and IELTS test scores in an international undergraduate conditional admission program in the United States. In Connor-Linton, & Amoroso, L. W. (Eds.), Measured language: Quantitative approaches to acquisition, assessment, and variation (pp. 183–197). Georgetown University Press.

Hyatt (2013). Stakeholders’ perceptions of IELTS as an entry requirement for higher education in the UK. Journal of Further and Higher Education, 37(6), 844–863. https://doi.org/10.1080/0309877X.2012.684043

Hyland, & Jiang (2021). A bibliometric study of EAP research: Who is doing what, where and when? Journal of English for Academic Purposes, 49, 100929. https://doi.org/10.1016/j.jeap.2020.100929

IELTS. (2002). The IELTS handbook. University of Cambridge Local Examinations Syndicate, The British Council, IDP Australia.

IELTS. (2019). Guide for educational institutions, governments, professional bodies and commercial organisations. Cambridge Assessment English, The British Council, IDP Australia. https://takeielts.britishcouncil.org/sites/default/files/ielts-guide-for-organisations.pdf

IELTS. (2024a). IELTS continues to lead, support, and empower around the world. https://ielts.org/news-and-insights/ielts-trusted-by-millions-in-2023

IELTS. (2024b). IELTS one skill retake: Helping you show your full potential. https://ielts.org/take-a-test/booking-your-test/one-skill-retake

IELTS. (2025). Research funding: IELTS joint-funded research programme. https://ielts.org/researchers/funding-and-awards/research-funding

Ihlenfeldt, S. D., & Rios, J. A. (2023). A meta-analysis on the predictive validity of English language proficiency assessments for college admissions. Language Testing, 40(2), 276–299. https://doi.org/10.1177/02655322221112364

Ioannidis, J. P. A. (2018). Meta-research: Why research on research matters. PLOS Biology, 16(3), Article e2005468. https://doi.org/10.1371/journal.pbio.2005468

Isaacs, Trofimovich, & Muñoz Chereau (2015). Examining the linguistic aspects of speech that most efficiently discriminate between upper levels of the revised IELTS Pronunciation scale (IELTS Research Report Series, 4). British Council, Cambridge English Language Assessment, & IDP: IELTS Australia. https://ielts.org/researchers/our-research/research-reports/examining-the-linguistic-aspects-of-speech-that-most-efficiently-discriminate-between-upper-levels-of-the-revised-ielts-pronunciation-scale

Isbell, D. R., & Kremmel (2020). Test review: Current options in at-home language proficiency tests for making high-stakes decisions. Language Testing, 37(4), 600–619. https://doi.org/10.1177/0265532220943483

Jing, Wang, Chen, Shen, & Shadiev (2024). A bibliometric analysis of studies on technology-supported learning environments: Hot topics and frontier evolution. Journal of Computer Assisted Learning, 40(3), 1185–1200. https://doi.org/10.1111/jcal.12934

Kunnan, A. J. (2018). Evaluating language assessments. Routledge.

Lei, & Liu (2019). Research trends in applied linguistics from 2005 to 2016: A bibliometric analysis and its implications. Applied Linguistics, 40(3), 540–561. https://doi.org/10.1093/applin/amy003

(2022). Research trends of blended language learning: A bibliometric synthesis of SSCI-indexed journal articles during 2000-2019. ReCALL, 34(3), 309–326. https://doi.org/10.1017/S0958344021000343

Lloyd-Jones, & Binch (2012). A case study evaluation of the English language progress of Chinese students on two UK postgraduate engineering courses (IELTS Research Reports, Vol. 13, pp. 1–56). IDP: IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/a-case-study-evaluation-of-the-english-language-progress-of-chinese-students-on-two-uk-postgraduate-engineering-courses

Lomer (2017). Recruiting international students in higher education: Representations and rationales in British policy. Palgrave Macmillan.

Chong, S. W. (2022). Predictability of IELTS in a high-stakes context: A mixed methods study of Chinese students’ perspectives on test preparation. Language Testing in Asia, 12, 2. https://doi.org/10.1186/s40468-021-00152-3

MacDonald, J. J. (2019). Sitting at 6.5: Problematizing IELTS and admissions to Canadian universities. TESL Canada Journal, 36(1), 160–171. https://doi.org/10.18806/tesl.v36i1.1308

McNamara, & Roever (2006). Validity and the social dimension of language testing. Language Learning, 56, 9–42. https://doi.org/10.1111/j.1467-9922.2006.00379.x

Merrifield, & GBM Associates. (2012). The use of IELTS for assessing immigration eligibility in Australia, New Zealand, Canada and the United Kingdom (IELTS Research Report, Vol. 13, pp. 1–32). IDP: IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/the-use-of-ielts-for-assessing-immigration-eligibility-in-australia-new-zealand-canada-and-the-united-kingdom

Messick (1989). Validity. In Linn, R. J. (Ed.), Educational measurement (pp. 13–103). Macmillan.

Moore, & Morton (2005). Dimensions of difference: A comparison of university writing and IELTS writing. Journal of English for Academic Purposes, 4(1), 43–66. https://doi.org/10.1016/j.jeap.2004.02.001

Noori, & Mirhosseini, S.-A. (2021). Testing language, but what?: Examining the carrier content of IELTS preparation materials from a critical perspective. Language Assessment Quarterly, 18(4), 382–397. https://doi.org/10.1080/15434303.2021.1883618

O’Loughlin, & Arkoudis (2009). Investigating IELTS exit score gains in higher education (IELTS Research Report, Vol. 10, pp. 1–86). IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/investigating-ielts-exit-score-gains-in-higher-education

Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, Hoffmann, T. C., Mulrow, C. D., Shamseer, Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, Glanville, Grimshaw, J. M., Hróbjartsson, Lalu, M. M., Loder, E. W., Mayo-Wilson, McDonald, . . . Moher (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. The BMJ, 372, 1–9. https://doi.org/10.1136/bmj.n71

Pearson, W. S. (2019). Critical perspectives on the IELTS test. ELT Journal, 73(2), 197–206. https://doi.org/10.1093/elt/ccz006

Pearson, W. S. (2024). Research topics in applied linguistics as keywords from authors and keywords from abstracts: A bibliometric study. In Meihami, & Esfandiari (Eds.), A scientometrics research perspective in applied linguistics (pp. 113–134). Springer Nature. https://doi.org/10.1007/978-3-031-51726-6_5

Pearson, & Zou (2026, March 3). A bibliometric analysis of research into the International English Language Testing System (IELTS). https://osf.io/ef3kq

Phakiti (2024). Independent versus integrated writing tasks for IELTS Writing: A mixed methods study (IELTS Research Reports Online Series, 1/24). British Council, IDP IELTS, & Cambridge University Press & Assessment. https://ielts.org/researchers/our-research/research-reports/independent-versus-integrated-writing-tasks-for-ielts-writing

Pritchard (1969). Statistical bibliography or bibliometrics? Journal of Documentation, 25(4), 348–349.

Rao, McPherson, Chand, & Khan (2003). Assessing the impact of IELTS preparation programs on candidates’ performance on the General Training reading and writing test modules (IELTS Research Reports, Vol. 5, pp. 236–262). IELTS Australia. https://ielts.org/researchers/our-research/research-reports/assessing-the-impact-of-ielts-preparation-programs-on-candidates-performance-on-the-general-training-reading-and-writing-test-modules

Read (2022). Test review: The International English Language Testing System (IELTS). Language Testing, 39(4), 679–694. https://doi.org/10.1177/02655322221086211

Riazi, A. M., & Knox, J. S. (2013). An investigation of the relations between test-takers’ first language and the discourse of written performance on the IELTS Academic Writing Test, Task 2 (IELTS Research Report Series, 2). IDP: IELTS Australia. https://ielts.org/researchers/our-research/research-reports/an-investigation-of-the-relations-between-test-takers-first-language-and-the-discourse-of-written-performance-on-the-ielts-academic-writing-test-task-2

Saif, May, & Cheng (2021). Complexity of test preparation across three contexts: Case studies from Australia, Iran and China. Assessment in Education: Principles, Policy & Practice, 28(1), 37–57. https://doi.org/10.1080/0969594X.2019.1700211

Seedhouse, & Harris (2011). Topic development in the IELTS Speaking test (IELTS Research Reports, Vol. 12). IDP: IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/topic-development-in-the-ielts-speaking-test

Shi, & Aryadoust (2024). A systematic review of AI-based automated written feedback research. ReCALL, 36(2), 187–209. https://doi.org/10.1017/S0958344023000265

Sinclair, Larson, E. J., & Rajendram (2019). “Be a machine”: International graduate students’ narratives around high-stakes English tests. Language Assessment Quarterly, 16(2), 236–252. https://doi.org/10.1080/15434303.2019.1628238

Tavakoli, Jabbari, A. A., & Aghabagheri (2014). Acquisition of subjunctive mood by Iranian EFL learners: Adverbial clause of condition. Procedia—Social and Behavioral Sciences, 98, 688–694. https://doi.org/10.1016/j.sbspro.2014.03.469

Taylor (2009). Introduction. In Osbourne (Ed.), IELTS research report (Vol. 10, pp. vii–xvi). IELTS Australia.

Vincheh, M. H., Mirzaei, & Roohani (2024). A cognitive diagnostic approach to IELTS speaking test: Unveiling the subskills and test-takers’ perceptions. Language Testing in Asia, 14(1), 42. https://doi.org/10.1186/s40468-024-00311-2

Yan, & Zhang (2023). Trends and hot topics in linguistics studies from 2011 to 2021: A bibliometric analysis of highly cited papers. Frontiers in Psychology, 13, 1052586. https://doi.org/10.3389/fpsyg.2022.1052586

Yan, & Fan (2021). “Am I qualified to be a language tester?”: Understanding the development of language assessment literacy across three stakeholder groups. Language Testing, 38(2), 219–246. https://doi.org/10.1177/0265532220929924

Yang, & Badger (2015). How IELTS preparation courses support students: IELTS and academic socialisation. Journal of Further and Higher Education, 39(4), 438–465. https://doi.org/10.1080/0309877X.2014.953463

Yang, & Wang (2025). Current status and research trend of English language assessment: A bibliometric analysis. Language Testing in Asia, 15(1), 11. https://doi.org/10.1186/s40468-024-00317-w

Yates, Zielinski, & Pryor (2011). The assessment of pronunciation and the new IELTS Pronunciation scale (IELTS Research Reports, Vol. 12, pp. 1–46). IDP: IELTS Australia & British Council. https://ielts.org/researchers/our-research/research-reports/the-assessment-of-pronunciation-and-the-new-ielts-pronunciation-scale

Zhang (2019). A bibliometric analysis of second language acquisition between 1997 and 2018. Studies in Second Language Acquisition, 42(17), 199–222. https://doi.org/10.1017/S0272263119000573
