Abstract
The productivity of a specific research field hinges on the periodic examination of both the knowledge produced and the knowledge production activities. By harnessing the strength of traditional bibliometric analyses and a variety of natural language processing (NLP) techniques, this study portrayed a holistic landscape of higher education internationalisation (HEI) research that incorporated time and region through a spatial lens. The findings reveal the field's evolution into an established field, significant regional variations in research focus, and the expansion of networks for disseminating knowledge. These factors collectively contribute to a diverse ‘lived’ space of HEI research. However, the dominance of Western-centric key concepts, theories, and discourses highlights a homogenous ‘conceived’ space, pointing to an underlying tension between these spaces. Despite these challenges, opportunities for breakthroughs exist. Additionally, the study underscores the immense potential of NLP techniques in facilitating the exploration of how research fields evolve, further enriching our understanding of HEI.
Introduction
When society entered the twenty-first century, human connectivity and mobility accelerated to an unprecedented level. This phenomenon has resulted in profound changes in the way that higher education is conceived and practised worldwide (de Wit, 2002; Knight, 2004; Marginson, 2011; Scott, 2000). The new global context is compelling universities to reconsider their missions, goals, and responsibilities, as well as to develop innovative strategies to improve their relevance and competitiveness. Thus, internationalisation has become one of the key discourses in higher education and has attracted increasing attention from researchers in different parts of the globe (Gao, 2019; Liu & Gao, 2022). The first attempt to analyse the international development of higher education scientifically dates back to the mid-1970s (Cerych, 1974). In the 2000s, higher education internationalisation (HEI hereafter) began to emerge as a research field. Since then, the field has continued to grow and has approached maturity. The field's prosperity goes hand in hand with the increased complexity and expanded scope of HEI practices, which have encouraged more researchers to devote themselves to the field.
For a research field to achieve healthy and sustainable development, it is necessary to review and scrutinise its evolution process at different stages. This will also facilitate understanding the nature and features of knowledge production in the field (LiCausi & McFarland, 2022). Many such efforts have been made in mapping studies on HEI. For instance, Kuzhabekova et al. (2015) analysed 2,302 publications on international higher education to identify the global trend in the field. More recently, Ghani et al. (2022) and Saubert and Cooper (2023) replicated and extended this analysis to include more up-to-date publications. Other researchers have depicted different segments of the field from a disciplinary perspective, such as Sochan (2008) and Mackay et al. (2016) in nursing, and de Villiers and Hsiao (2017) in accounting. Still others have concentrated their reviews on specific topics, including international student mobility (Gümüş et al., 2020; Lo, 2019), internationalisation at home (Li & Xue, 2023; Robson et al., 2018), transnational higher education (Tran et al., 2023), and diaspora (Bamberger, 2022). Research on HEI in a particular nation or region has also been the focus of mapping exercises. Such reviews have covered Russia (Atsyor, 2023), Brazil (Barbosa et al., 2022), Australia (O’Dwyer, 2023), China (Xu, 2023; Zheng & Kapoor, 2021), and Southeast Asia (Grothaus & Zawacki-Richter, 2021).
Efforts to delineate the landscape of HEI research have been both valuable and varied, piecing together different segments of this complex field. Despite these efforts, previous practices often exhibit limitations in scope or methodological approach. Reviews focusing on singular themes or confined geographical areas, while insightful, may not capture the field's full thematic breadth or global diversity. Conversely, studies aiming for a comprehensive overview typically rely on bibliometric analyses or manual content analysis. Bibliometric methods, capable of processing large datasets, mainly gauge journal and author metrics, offering less insight into the content and thematic organisation of research. Manual content analysis, on the other hand, delves into publications’ themes, methodologies, and theoretical frameworks but is time-intensive and thus limited in scale. Addressing these limitations, this study aims to create a more holistic and nuanced map of HEI research. By integrating bibliometric capabilities with automated text analysis, this approach seeks to overcome the shortcomings of each method when used in isolation. Furthermore, this cartography is designed to track the field's evolution, providing insights into changes over time and across different regions, rather than merely presenting a static overview.
A Spatial Lens
Scholars who have attempted to map a research field have not normally articulated an explicit theoretical lens to guide their practices, with a few exceptions in which Burt's social network theory (e.g., Kuzhabekova et al., 2015), Foucault's theory of power and knowledge (e.g., Saubert & Cooper, 2023), and Bourdieu's field reproduction theory (e.g., Munoz-Najar Galvez et al., 2020) were applied. This study primarily rested on Lefebvre's (1991) spatial theory to portray the research space of HEI. Indeed, a spatial lens has been increasingly used to understand higher education (see, e.g., Marginson, 2022) and internationalisation of higher education (Singh et al., 2007). According to Lefebvre (1991), social space is both a product and a means of production, and simultaneously a result and cause, product and producer, with the key features of being relational, fluid, and open. This fits this study's purpose well: mapping the research space by understanding the product (academic publications on HEI), the producer (authors), and, most importantly, their interrelation.
Lefebvre (1991) argued that “… no space disappears in the course of growth and development: the worldwide does not abolish the local… The local does not disappear, for it is never absorbed by the regional, national or even worldwide level” (p. 86). Such recognition of the locality aligns perfectly with HEI. Understandings of, and approaches to, internationalisation are highly sensitive to the specific contexts of institutions and nations (de Wit, 2014; Gao, 2015; Marginson, 2011; Yang, 2002). Consequently, scholars from different parts of the world have various priorities in their HEI research agenda. Such divergences should not be omitted or masked in the cartography. In addition, Lefebvre (1991) used the metaphor of ‘flaky mille-feuille pastry’ to describe social space's structure (p. 86). An unlimited multiplicity of social spaces interpenetrates one another without mutually limiting boundaries. This metaphor is particularly useful when drawing the landscape of research spaces. HEI research on different themes, regions, and disciplines has formed its own overlapping spaces. HEI research as a whole is embedded within broader spaces such as higher education research, social science research, and knowledge production.
Lefebvre (1991) presented his theory of space as a ‘triple dialectic’: spatial practice, representation of space, and representational space, also known as ‘perceived’, ‘conceived’, and ‘lived’ spaces. Spatial practice is perceived space, which highlights the materiality of space, the particular locations and their characteristics, through which it is possible to identify flows and movements in the realm of daily routine. In research space, this pillar highlights both the venues of knowledge communication and circulation and researchers’ deployment of such venues. The venues’ structure and network embody the researchers’ relationships, the knowledge they produce, and the direction of knowledge flow. Academic books, journals, and conferences serve as the media for channelling research findings within the relevant international academic communities (Ghani et al., 2022). Researchers in the space select their preferred type of publication, publisher, and collaborator to disseminate the knowledge that they generate to their targeted communities.
Representation of space is conceived space, which is imaginative and occupies the dominant position in any society and reveals in what way space is conceptualised (Lefebvre, 1991). In research space, this pillar can be interpreted as the prevalent discourses, theories, and methodologies used in a particular field. The selection of a certain theory and methodology manifests researchers’ epistemological stances, which determines the way that they understand the phenomenon investigated (Munoz-Najar Galvez et al., 2020). Different epistemological perspectives have their own strengths in answering three fundamental questions: what the world (the phenomenon investigated) is, what it should be, and why it is like this?
The third pillar in Lefebvre's (1991) dialectic is representational spaces, the directly lived space of social practices, which are essentially qualitative, fluid, and dynamic and may be directional, situational, or relational (pp. 39, 245). The lived space cannot be separated from what is perceived and conceived, but is not necessarily bounded by the perceived and conceived spaces. It highlights the practical ‘I’ (p. 61), and can be appropriated. In a research space, this can be interpreted as researchers’ individual choice of topic, subject, unit of analysis, etc. As the lived space is dynamic and fluid, researchers’ focus and choice may change over time and location. Together with the theory and methodology that researchers employ, the discourse they join or follow, and the venue in which they choose to publish their findings, these choices describe and present a research space's state, form, and structure.
Methods
This study aimed to create a comprehensive mapping of the HEI research landscape, integrating established bibliometric methods with advanced Natural Language Processing (NLP) techniques. Bibliometric analysis offers insights into publication and citation trends, revealing the dynamics of knowledge production in a field. NLP, a suite of computational methods for analysing and interpreting human language, enhances the capacity to process textual data on a large scale, offering valuable tools for the sociology of science (Chowdhary, 2020; Daenekindt & Huisman, 2020; LiCausi & McFarland, 2022; Nadkarni et al., 2011). Increasingly utilised in the social sciences and humanities, NLP facilitates tasks such as document classification and the exploratory analysis of extensive text collections (Carron-Arthur et al., 2016; Székely & vom Brocke, 2017). Its application extends to education research (e.g., Daenekindt & Huisman, 2020; Munoz-Najar Galvez et al., 2020), including HEI, as evidenced by Xu and Huang's (2023) sentiment analysis of international doctoral students’ experiences. In this study, we applied specific NLP techniques including text classification, Named Entity Recognition (NER), and topic modelling. Text classification involves automatically categorising documents into predefined classes. NER identifies and classifies entities within texts, such as people, locations, and organisations. Topic modelling uncovers latent topics within document collections through patterns of word co-occurrences, assigning topic membership probabilities to each document (Blei et al., 2003; Lane et al., 2019; Saxton, 2018). An illustrative diagram detailing the methodology flow aligned with the theoretical framework is provided in Appendix A.
Data
To precisely map the HEI research landscape, the initial critical step involved compiling an accurate dataset. After evaluating commonly used databases in education research, including EBSCO, ERIC, and Web of Science, Scopus was chosen for its balance between the comprehensiveness of publication records and the completeness, accessibility, and consistency of the required metadata, such as publication year, author affiliations, and reference lists. To identify relevant publications within Scopus, we refined our search strategies through multiple iterations, based on in-depth domain knowledge, aiming to capture the most pertinent publications while minimising omissions. This process resulted in a dataset comprising 3,620 articles, books, book chapters, and referenced conference papers published in English from 1969 to 2024, as detailed in Appendix B.
Bibliometric Analysis
Informed by the spatial theory, several bibliometric analyses were performed in this study to illustrate HEI research's ‘perceived’ and ‘conceived’ spaces, respectively. The flow of researchers into and out of the field, these researchers’ collaboration patterns, and their level of activity were observed first, which highlighted the space's producers and their participation in producing the space. The general publication trend through time and across regions was then identified, together with the type of medium that researchers preferred for communicating their research findings. References cited in the dataset's 2,948 journal articles were examined to identify the prevalent theories that underpin research on HEI, which constitute the ‘conceived’ space.
Text Mining
A variety of text mining techniques were applied to explore the ‘conceived’ and ‘lived’ spaces within HEI research. To categorise the methodologies used in the studies within our dataset, text classification was employed, leveraging the GPT-4 API via Python. This approach involved providing the API with a prompt to infer the methodology from each study's abstract and assign it to one of four predefined groups: qualitative, quantitative, mixed, and not explicit. The GPT-4 model then assigned one of these labels to each abstract. To validate GPT-4's classification accuracy, we manually annotated a random sample of 100 abstracts and found that 88 per cent of the model's labels were correct. To identify the diversity of HEI research foci globally, we extracted geographic entities from each abstract using Stanza, an NLP toolkit with pre-trained NER models. This step facilitated the automatic tagging of geolocation names within the unstructured data. The prompts used and examples of the GPT-4 and Stanza results are documented in Appendix C.
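The validation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the prompt wording in `build_prompt` and the reply-parsing rule in `normalise_label` are hypothetical (the study's actual prompts are in its Appendix C), and no model call is made; the replies are supplied by hand to show how agreement with manual annotation might be computed.

```python
LABELS = ("qualitative", "quantitative", "mixed", "not explicit")


def build_prompt(abstract: str) -> str:
    """Hypothetical prompt wording; not the prompt used in the study."""
    return (
        "Classify the research methodology of the study described in this "
        "abstract as one of: " + ", ".join(LABELS) + ".\n\nAbstract: " + abstract
    )


def normalise_label(raw: str) -> str:
    """Map a free-text model reply onto one of the four predefined labels."""
    text = raw.strip().lower()
    # "not explicit" and "mixed" are checked first because such replies may
    # also mention the words "qualitative" or "quantitative".
    for label in ("not explicit", "mixed", "qualitative", "quantitative"):
        if label in text:
            return label
    return "not explicit"  # fall back when no label can be recognised


def agreement_rate(predicted, manual):
    """Share of items where the model label equals the manual annotation."""
    if len(predicted) != len(manual) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == m for p, m in zip(predicted, manual)) / len(manual)


# Hand-written stand-ins for model replies and manual annotations.
replies = ["Qualitative", "Mixed (qualitative and quantitative)",
           "quantitative.", "Not explicit"]
predicted = [normalise_label(r) for r in replies]
manual = ["qualitative", "mixed", "quantitative", "qualitative"]
print(agreement_rate(predicted, manual))  # 0.75
```

Keeping the label normalisation separate from the model call makes it possible to check the parsing rule against hand-written replies before any API cost is incurred.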
A Structural Topic Model (STM) was then fitted to discern topics within HEI research. The corpus underwent several pre-processing steps before STM execution, including the removal of short abstracts, elimination of stopwords, and extraction of n-gram terms. This processing resulted in a corpus with a vocabulary of 7,858 terms (244,242 tokens) appearing at least twice. Given the sub-field status of HEI within higher education research and the relatively small size of our dataset, we explored a range of topic numbers from 15 to 30. A model with 21 topics was selected for offering an optimal balance between parsimony, thematic inclusiveness, and analytical depth. This selection was justified by comparing three internal validity measures (Semantic Coherence, Exclusivity, and Held-out Evaluation), which indicated that the chosen model satisfied statistical criteria (Grimmer & Stewart, 2013; Munoz-Najar Galvez et al., 2020). We labelled the selected model's topics by referring to each topic's most probable terms and FREX terms. In this step, the ten highest-loading articles for each topic were also checked to improve the labels’ accuracy.
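The pre-processing steps listed above (dropping short abstracts, removing stopwords, extracting n-grams, and keeping terms that appear at least twice) might look roughly as follows. This is a simplified sketch, not the pipeline used in the study: the stopword list, the `min_words` threshold, the restriction to bigrams, and the use of total term counts for the frequency cut-off are all assumptions made for illustration.

```python
from collections import Counter

# Tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"the", "of", "in", "and", "a", "to", "on", "is", "for"}


def tokenise(text):
    """Lowercase, keep alphabetic tokens only, and drop stopwords."""
    words = "".join(ch.lower() if ch.isalpha() else " " for ch in text).split()
    return [w for w in words if w not in STOPWORDS]


def add_bigrams(tokens):
    """Append adjacent-word bigrams (joined by '_') to the unigram tokens."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def build_corpus(abstracts, min_words=5, min_term_count=2):
    """Return (documents as term lists, sorted vocabulary)."""
    docs = []
    for text in abstracts:
        if len(text.split()) < min_words:
            continue  # remove short abstracts
        docs.append(add_bigrams(tokenise(text)))
    counts = Counter(term for doc in docs for term in doc)
    vocab = {t for t, c in counts.items() if c >= min_term_count}
    # Keep only in-vocabulary terms in each document.
    return [[t for t in doc if t in vocab] for doc in docs], sorted(vocab)


abstracts = [
    "Student mobility in higher education is growing.",
    "Higher education policy and student mobility.",
    "Too short.",
]
docs, vocab = build_corpus(abstracts)
print(vocab)
```

Note that bigrams here are formed after stopword removal, so a phrase such as "mobility in higher" yields `mobility_higher`; whether n-grams should span removed stopwords is a design choice the source does not specify.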
Half of the 3,620 abstracts contained specific geolocation information captured by NER, allowing us to categorise them into ten regions. We then computed aggregated topic proportions for each region and standardised these proportions to ensure a consistent comparison scale. A linear regression model was applied to examine the evolution of these topics over time, incorporating estimation uncertainty through multiple sets of topic proportions. Additionally, to investigate the relationships between topics, we calculated the maximum a posteriori (MAP) correlation coefficients for the topic proportions, that is, the marginal correlation of the mode of the variational distribution.
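To make the aggregation and trend steps more concrete, the sketch below averages document-level topic proportions by region, standardises each topic across regions, and fits a simple least-squares slope over time. It rests on several assumptions not confirmed by the text: standardisation is taken to mean a z-score across regions, the regional aggregate is a plain mean, and the slope ignores the estimation uncertainty that the study incorporates. All function names are illustrative.

```python
from collections import defaultdict
from statistics import mean, pstdev


def regional_topic_means(doc_topics, doc_regions):
    """Average each topic's proportion over the documents tagged with a region."""
    grouped = defaultdict(list)
    for theta, region in zip(doc_topics, doc_regions):
        if region is not None:  # documents without a detected location are skipped
            grouped[region].append(theta)
    return {
        region: [mean(col) for col in zip(*rows)]
        for region, rows in grouped.items()
    }


def standardise_by_topic(region_means):
    """Z-score each topic across regions (assumed form of standardisation)."""
    regions = list(region_means)
    n_topics = len(next(iter(region_means.values())))
    scaled = {r: [0.0] * n_topics for r in regions}
    for k in range(n_topics):
        values = [region_means[r][k] for r in regions]
        mu, sigma = mean(values), pstdev(values)
        for r in regions:
            scaled[r][k] = (region_means[r][k] - mu) / sigma if sigma else 0.0
    return scaled


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (point estimate only)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


# Toy data: four documents, two topics, one document without a location.
doc_topics = [[0.8, 0.2], [0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]
doc_regions = ["East Asia", "East Asia", "Oceania", None]
scaled = standardise_by_topic(regional_topic_means(doc_topics, doc_regions))
print(scaled)
print(ols_slope([2000, 2010, 2020], [0.1, 0.2, 0.3]))
```

A positive slope for a topic's yearly mean proportion would correspond to a topic gaining prevalence; the thresholds separating "strong", "mild", and "declining" trends are not given in the text and are therefore left out.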
HEI's Research Space
The 3,620 publications on HEI identified were written by 6,226 authors, which indicated that research collaboration among scholars is a common practice in the space. A total of 842 researchers published at least twice and can thus be considered active participants. Further, we identified 73 researchers who published more than five times as key producers who shape the space. Of these key participants, 31 were from EU countries, 18 from East Asia, 7 from North America, and 6 from Oceania. Their regional distribution indicated the importance and relevance of HEI in the regional research agenda. A close look at these researchers’ activities revealed that 12 of them generated over ten publications, with a maximum of 26. This particular research space is dynamic, with researchers moving in and out. In the late 1990s, some scholars began to study HEI phenomena systematically and became the space's founders. These early entrants normally stayed in the space for a long while, over 20 years until recently. It is worth noting that this span is underestimated because many early studies on HEI were published as research reports by universities or research institutes that Scopus does not index, such as de Wit's (1997) and Knight's (1994, 1997) works. We also observed that not all participants remained in the space. Some left after just several years of engagement.
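The participation thresholds used above (at least two publications for active participants, more than five for key producers) amount to a simple count per author. The following sketch shows one way to compute them; it assumes that author names have already been disambiguated, which the text does not describe, and the function name is illustrative.

```python
from collections import Counter


def author_activity(publications):
    """publications: one list of author names per publication.

    Returns (publication count per author, active authors, key producers).
    """
    # set() guards against an author being listed twice on one publication.
    counts = Counter(a for authors in publications for a in set(authors))
    active = {a for a, n in counts.items() if n >= 2}  # published at least twice
    key = {a for a, n in counts.items() if n > 5}      # published more than five times
    return counts, active, key


# Toy data: author "A" has six publications, "B" two, "C" one.
pubs = [["A", "B"], ["A"], ["A", "C"], ["A"], ["A"], ["A", "B"]]
counts, active, key = author_activity(pubs)
print(len(counts), sorted(active), sorted(key))  # 3 ['A', 'B'] ['A']
```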
With the space thriving, an increasing number of researchers from different parts of the world committed to HEI study and generated new knowledge. Figure 1 displays the distribution of publications by region between 1974 and 2024. Before the 2000s, there were only a few regions engaged in HEI study. By the 2010s, all regions had established their position in the space, which signals HEI research's maturity. To examine the collaboration pattern between authors, we assigned each author to a specific region based upon their affiliation. Figure 2 shows that scholars tend to collaborate with peers in the same region. EU countries, North America, and East Asia are key nodes in the space, as researchers from these regions had developed collaborations with counterparts throughout the world. The voids in the current collaboration network (e.g., between non-EU European countries and Southeast Asia, and between Australia and Central Asia) also deserve notice, as they indicate opportunities to extend the space in the future.

Figure 1. Distribution of publications by region between 1974 and 2024.

Figure 2. Authors’ collaboration heat map across regions.
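The regional collaboration pattern summarised in Figure 2 can be thought of as a symmetric matrix of co-authorship counts between regions. The sketch below builds such counts from per-publication lists of author regions. Counting every pair of co-authors (rather than, say, counting each publication once per region pair) is an assumption made here for illustration; the source does not state which counting rule underlies the heat map.

```python
from collections import Counter
from itertools import combinations


def region_collaboration_counts(publications):
    """publications: one list of author regions per publication.

    Returns a Counter keyed by an alphabetically sorted (region, region) pair;
    same-region pairs form the diagonal of the heat map.
    """
    pairs = Counter()
    for regions in publications:
        # Single-author works yield no pairs and so add nothing.
        for a, b in combinations(regions, 2):
            pairs[tuple(sorted((a, b)))] += 1
    return pairs


# Toy data: regions of the authors on four publications.
pubs = [
    ["EU", "EU"],
    ["EU", "East Asia"],
    ["North America"],
    ["EU", "East Asia", "East Asia"],
]
matrix = region_collaboration_counts(pubs)
for pair, n in sorted(matrix.items()):
    print(pair, n)
```

Region pairs that never appear as keys correspond to the voids in the collaboration network noted in the text.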
The Perceived Space
Once the scope of, and participants in, the HEI research space were defined, we could proceed to the first pillar of Lefebvre's ‘triple dialectic’: the perceived space. Academic publishers and conferences served as the main venues for researchers’ spatial practice. Journal articles, books (incl. book chapters), and conference papers constitute the ‘places’ (Lefebvre, 1991, p. 288) for researchers to present and communicate their views and findings. The researchers made decisions about where and in what way to report their research, which defined their research practices’ ‘routine’ (p. 38) and these ‘places’ simultaneously. Our results revealed that journals are the most preferred venue to disseminate knowledge in the space, as nearly 70 per cent of the research outputs were journal papers. Before the mid-2000s, journals were the only ‘place’ that HEI scholars could appropriate for their research activities. With the growth of the space, venues became diversified. Books and conference papers constituted a considerable number of the publications in the space after the 2010s.
A scrutiny of the top 20 journals listed in Table 1, which published one-third of all articles on HEI, reveals more attributes of the perceived space. Apart from several journals that focus on HEI particularly (e.g., Journal of Studies in International Education, Research in Comparative and International Education, etc.), others suggest a more general coverage of educational issues (e.g., Studies in Higher Education, Tertiary Education and Management, etc.), through which HEI's research space is connected to broader spaces, such as higher education and education in general. Most of the top journals are high-quality venues, ranked in the first two quartiles by Scimago Journal & Country Rank (SJR), which indicates that the HEI research space has achieved a certain degree of prominence in the spaces in which it is embedded. Some journals are well-established, launched in the middle of the twentieth century, while others were founded recently. Notably, Frontiers of Education in China ceased publication in 2021. This implies that ‘places’ can be created and then may perish. Four of the 20 journals show particular regional foci, two on Europe and two on China. Lefebvre (1991) argued that the perceived space embodies the relation between local and global. Journal articles, books, and conference papers serve as routes and networks that assemble and mobilise knowledge of HEI around the globe and connect researchers with other academic communities in the space.
Table 1. Top Journals in HEI Research Space.
HEI, higher education internationalisation.
The Conceived Space
As Lefebvre (1991) argued, representation of space or the conceived space is imbued with ideological and political content, which has the capacity to translate and reproduce space, working ideologically to legitimate or contest particular practices in the space. We analysed the works of the 80 most-cited authors listed in the references of our 2,948 journal articles to identify the influential theories that underpin and frame research in HEI. The results revealed that the most-cited academic works are conceptual, defining the HEI phenomenon's key concepts, frameworks, and models. Examples include Knight's (1997) Internationalisation of Higher Education: A Conceptual Framework and de Wit's (1997) Strategies for the Internationalisation of Higher Education in Asia Pacific Countries. These works may not be considered strictly theoretical, or they can be categorised as micro-level theory, because they address the issue of HEI specifically and cannot be applied to explain the mechanisms of broad educational and social phenomena. Prevalent theories employed to inform investigations of HEI include Bourdieu's (1987, 2002) field and capital theories, Slaughter and Rhoades’s (2004) academic capitalism theory, Hofstede and Bond's (1984) cultural theory, and Foucault's (1982) power and knowledge theory.
GPT-4's classification of the methodology used in the 3,620 works indicated that the qualitative method dominated the HEI research space (55.4 per cent). This is not surprising, as the qualitative design is based upon a constructivist epistemology and explores what it assumes to be a socially constructed dynamic reality through a framework that is value-laden, flexible, descriptive, holistic, and context-sensitive (Creswell, 2007; Yilmaz, 2013). This approach fits well with the HEI phenomenon's nature, which is highly sensitive to multilevel social and political contexts (Liu & Gao, 2022; Marginson, 2011; Yang, 2002). Quantitative and mixed-method studies constituted 14.5 and 9.4 per cent, respectively. One-fifth of the publications failed to articulate the methodology used explicitly in the abstract. This result echoed Tight's (2012) finding that less than four-fifths of journal articles in higher education described their methodology clearly. With respect to publication type, journal articles were most likely to provide the research design (97.50 per cent), followed by conference papers (67.14 per cent). Table 2 documents the results of the analysis of the methodology used by publication type.
Table 2. Methodology Used in HEI Research by Publication Type.
Further, among the 21 topics generated by STM analysis (see Table 3 in the next section), Topic 7 presented some leading discourses in HEI, which include neoliberalism, decolonisation, post-colonialism, global citizenship, etc. It is interesting to note that both the prevalent theories and discourses identified originated in the Western context and then were used and followed by researchers to investigate and interpret HEI in different regions. Such a practice had profound implications for the knowledge produced in the research space, which will be revisited in the discussion section.
Table 3. Topic Descriptions.
The Lived Space
According to Lefebvre (1991), the representational space in which ‘inhabitants’ and ‘users’ live directly (p. 39) is alive and speaks, where the forgotten practical ‘I’ (p. 61) in epistemological space can play its role as agent. Thus, representational spaces need obey no rules of consistency or cohesiveness. To capture the lived space's diversity, we identified 21 topics covered by the available studies on HEI; to reflect the space's dynamics, we traced the change in these topics’ prevalence over time and location. A close examination of the topics listed in Table 3 suggested that the international dimension had been integrated into multiple aspects of higher education practices, such as curricula (T8), performance evaluation (T17), professional development (T2), etc. Moreover, internationalisation has been discussed in connection with broader issues, such as system reform (T6 and T20) and national capacity building (T4 and T10). The topics also varied in size. Some drew the attention of a large number of researchers and had been discussed extensively, including ‘Concepts and leading discourses in HEI’ (T7), ‘Internationalisation at home and student experience’ (T14), and ‘Performance evaluation’ (T17), while others tended to be under-explored (e.g., T16 ‘Individual engagement with internationalisation’, T6 ‘Higher education in transition’, and T19 ‘Employment of graduates’), which indicated gaps and opportunities for further research.
Nearly half of the topics displayed strong regional indications. For example, T1, ‘Research performance & doctoral education’ was often studied with respect to Japanese universities, and ‘Japan’ was an exclusive term for that topic, which suggests that it occurred less frequently in other topics. Similarly, T9, ‘Regional educational hub’ tended to be a priority in the HEI research agendas of Hong Kong, Singapore, and Malaysia. Other topics with a regional focus included T4, ‘HEI & country development’ (Africa/South Africa), T14, ‘Internationalisation at home and student experience’ (Canada), T18, ‘Regionalisation of HE’ (Europe/EU), etc. To observe the diversity of the regional HEI research agenda further, we calculated the topic distribution across regions, as Figure 3 illustrates. In addition, topics like ‘Internationalisation of curriculum’ (T8) and ‘Technology-based talent training’ (T12) displayed particular disciplinary connections with nursing and engineering, respectively. Gender issues tended to be prominent in T19, ‘Employment of graduates’.

Figure 3. Topic prevalence by region.
Figure 4 shows all topics’ evolution over time. Ten topics displayed a strong upward trend (Figure 4a), indicating growing attention from researchers. The increasing trend combined with a high average proportion indicated that some topics constituted key pillars of the HEI research space, such as ‘Concepts and leading discourses’ (T7), ‘Transnational academic mobility’ (T15), and ‘Performance evaluation’ (T17). Other topics became prominent in response to new HEI conditions, such as ‘Online education’ (T21) during the COVID-19 pandemic. Another three topics (T6, ‘HE in transition’, T12, ‘Technology-based talent training’, and T14, ‘Internationalisation at home and student experience’) maintained a stable evolution with a mild increase over the decades (Figure 4b).

Figure 4. Evolution of topic prevalence over time.
The remaining eight topics that lost ground over time deserve further examination (Figure 4c). Topics such as ‘Regional educational hub’ (T9) and ‘Regionalisation of HE’ (T18) used to be very prevalent in HEI research, and their decline may be attributable to their sensitivity to national or regional policies. Projects to build Singapore and Malaysia into regional hubs date back to the mid-2000s (Mok, 2008). The ERASMUS programme, which was designed to strengthen EU countries’ regional integration, was established in 1987 (Valiulis, 2013). Once these programmes approached maturity or new initiatives were launched, interest in these topics waned. Decreasing popularity does not necessarily indicate a lack of importance, but may suggest that a novel concept or model with extended scope has replaced the old one, as in the case of T8, ‘Internationalisation of curriculum’.
The evolution of HEI research as a whole over time is manifested in the change in topic diversity. A clear upward trend can be observed, as Figure 5 illustrates. This implies that more recent publications in HEI are characterised by greater topic diversity. With respect to the similarity between topics, Figure 6 visualises their correlation. The size of a node indicates the size of the topic, and the thickness of the line between nodes indicates the strength of the association between the topics. For example, T3, ‘Multilingual policy’ and T13, ‘English as medium of instruction’ are highly likely to co-occur in the same study. Some clusters are larger than others and aggregate more topics. The diagram also shows certain key nodes that connect a number of different topics. T2, ‘Teachers’ intercultural competence’ serves as one such node that is related closely to issues of Internationalisation of curriculum (T8), Online education (T21), Multilingual policy (T3), and Technology-based talent training (T12). In addition to those identified clusters, we noticed a few silos in the research space that were more likely to be investigated independently rather than combined with other topics. Such silos include T1, ‘Research performance & doctoral education’, T19, ‘Employment of graduates’, and T20, ‘University and HE system reform’. Linking these silos to other topics could be a source of novelty and innovation in the future production of the space.

Figure 5. Topic diversity through time.

Figure 6. Topic correlation network.
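The upward trend in topic diversity (Figure 5) depends on how diversity is measured, which the text does not specify. One common choice, used here purely as an assumption, is the Shannon entropy of each document's topic proportions, averaged by publication year. The function names are illustrative.

```python
import math
from collections import defaultdict


def shannon_entropy(theta):
    """Shannon entropy (natural log) of one document's topic proportions."""
    return -sum(p * math.log(p) for p in theta if p > 0)


def mean_diversity_by_year(doc_topics, years):
    """Average per-document entropy for each publication year."""
    by_year = defaultdict(list)
    for theta, year in zip(doc_topics, years):
        by_year[year].append(shannon_entropy(theta))
    return {y: sum(v) / len(v) for y, v in sorted(by_year.items())}


# Toy data: an early single-topic document and two later, more mixed ones.
doc_topics = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
years = [2000, 2020, 2020]
print(mean_diversity_by_year(doc_topics, years))
```

Entropy is zero when a document loads on a single topic and reaches its maximum, the logarithm of the number of topics, when all topics are equally represented, so a rising yearly mean would be consistent with publications drawing on a wider mix of topics.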
Discussion
By integrating bibliometric analyses with NLP techniques, this study has painted a comprehensive picture of the HEI research space, taking into account temporal and geographical dimensions. Our findings reveal the claimed transformation of HEI research from its nascent, sporadic beginnings in the 1990s to its recognition as a distinct field of study over four decades (de Wit, 2019, 2024). This evolution is evidenced by several factors. First, the emergence of a stable cohort of active researchers signifies the formation of a dedicated scholarly community. Furthermore, the field has broadened to include contributions from beyond traditional Western and Northern regions, showcasing increased inclusivity (Figure 1). There has also been a notable rise in inter-regional collaborations among researchers (Figure 2). The foundational knowledge structure of HEI, encompassing key concepts and analytical models, has been established. The ‘perceived’ space of HEI research has widened, with new platforms for knowledge dissemination emerging between the mid-2000s and 2010s (Table 1). Moreover, the range of topics covered has diversified (Figure 5), aligning with trends observed in prior mappings of the HEI field (e.g., Ghani et al., 2022; Kuzhabekova et al., 2015; Saubert & Cooper, 2023). The distribution of topics across different regions (Figure 3) highlights the diverse research agendas globally, reinforcing the notion that the concept of internationalisation varies significantly across contexts. Different countries have embraced internationalisation uniquely, at their own pace, and for various reasons (de Wit & Altbach, 2020; Gao, 2019; Marginson, 2011; Stein, 2021), underscoring the heterogeneous nature of HEI research's ‘lived’ space.
In sharp contrast to the lived space's plurality is the conceived space's monotony. Scholars such as de Wit (2014, 2019) and Stein (2021) have highlighted and expressed concerns over the tension between HEI research's conceived and lived spaces, advocating for the exploration of alternative HEI paradigms. Marginson (2023) has critiqued the pervasive endorsement of Knight's (1994, 2004) definitions and conceptualisations of HEI as indicative of a ‘discursive hegemony’ (p. 3). Lefebvre (1991) regards hegemony as a comprehensive influence exerted across society, encompassing culture and knowledge, and generally via human mediation. de Wit (2024) suggested that ‘it is the way scholars and practitioners have constructed internationalisation, through interpretation, strategy and activity that has resulted in the dominance of the Western paradigm’ (p. 9). Our research substantiates the presence of Western dominance within HEI research, with Western concepts, theories, and discourses prevailing across investigations irrespective of the specific local or national contexts. It seems customary for non-Western scholars to adopt and apply existing Western definitions, concepts, and paradigms to their empirical contexts without substantial critique. Our analysis reveals that Topic 7, ‘Concepts and leading discourses in HEI’, assumes greater significance in North America, the European Union, and Oceania than in other regions.
There are good reasons for scholars to express their unease about the tension between the conceived and lived spaces. Lefebvre (1991) emphasises the significance of representations of space in the formation of space, an influence often exerted subtly through prohibitions. The fundamental concepts of a phenomenon form an integral part of the knowledge and knowledge-generation activities within a field (Scholte, 2008). Marginson (2023) contends that ‘mapping cross-border practices using a rigid framework that correlates norms to scales, undermines that understanding’ (p. 12). He argues for the abandonment of ideological baggage in favour of neutral terminology, resonating with Lefebvre's (1991) caution against ideologies that foster abstract spatiality and fragmented representations of space. ‘Naturally, such ideologies do not present themselves for what they are; instead, they pass themselves off as established knowledge… and this denies the possibility of knowing on the other’ (Lefebvre, 1991, pp. 90, 107).
Although the tension is alarming, the relation between the dominant conceived space and the dominated lived space, as Lefebvre (1991) claimed, is complex and subtle, and no single force can master the space completely. To mitigate this tension, Marginson (2023) suggests the adoption of an open, explanatory, and relational framework to supersede the nation-centric, normative, and universal conceptualisation of HEI. The domain of HEI has been in a constant state of flux. Academics have persistently engaged in reflecting upon, rejuvenating, and critically assessing the foundational knowledge of HEI (de Wit, 2024), as evidenced by our findings that Topic 7, ‘Concepts and leading discourses in HEI’, not only represents the largest topic area but has also grown in relevance since its inception. Changes have indeed emerged, with researchers from the Global South proposing definitions of internationalisation that better align with their specific contexts (Heleta & Chasi, 2023).
This study also demonstrated the substantial potential of applying various NLP techniques to mapping the topography of research fields, illustrating that while NLP does not supplant in-depth qualitative reviews and analyses, it significantly complements them. The effective deployment of NLP in research inquiries fundamentally relies on human judgment and evaluation. Additionally, the challenges and gaps unveiled through text mining pave the way for future research directions. We encountered many technical challenges in collecting and processing data, which resulted in some limitations in the current study. Chief among these was the dataset's comprehensiveness. This issue of data availability constrained the breadth of our mapping endeavour, limiting our portrayal of the perceived space by excluding specific venues and forms of spatial practice. Future enhancements in prompt engineering could further refine the accuracy of text classification, such as determining the methodologies used in studies, and optimise the efficacy of NLP techniques in textual analysis. In addition to the technical constraints, the employment of Lefebvre's spatial theory in guiding this study's mapping practice is exploratory in nature. Though Lefebvre's spatial theory is powerful in explaining higher education at different scales, Marginson (2022) acknowledged the challenge for higher education studies of applying such a lens, moving from the suggestive and fluid theorisations of Lefebvre to concepts that can be operationalised as empirical observations. Despite these challenges, we believe that Lefebvre's poetic articulation aptly captures the interconnected, dynamic, and pervasive nature of social space, affording researchers considerable flexibility in adapting the theory to their specific investigative needs.
Conclusion
The vitality of any research field hinges on periodic reviews of both the knowledge it generates and the processes underpinning this generation. Because knowledge and knowing are understood as forms of social production shaped by contextual factors, individual positionality, and prevailing ideologies (Merton, 1973), these reviews are crucial. They illuminate the inner workings of knowledge structures, elucidate the mechanisms of knowledge production, and draw connections between knowledge, the act of knowing, and the wider socio-political context at various levels. HEI stands as a relatively young field when compared to longer-established ones. Its significance has surged alongside the globalisation process, characterised by an increase in cross-border flows of people and ideas. This field has already amassed a significant body of knowledge. Mapping exercises, akin to the one presented in this study, serve as valuable tools for monitoring and evaluating the activities of researchers within the field, including their modes and preferences for academic production, the evolution of conceptual frameworks, the development of new discourses and methodologies, and shifts in research focus across different times and locations. Furthermore, field mapping reveals gaps in existing knowledge and highlights areas where tensions suggest both challenges and opportunities. Notably, advancements in NLP techniques offer field researchers new avenues to leverage existing data more effectively and efficiently. The ability of NLP to mine text on a large scale empowers researchers to pose novel questions, test hypotheses, and substantiate arguments related to HEI with empirical evidence.
Supplemental Material
Supplemental material, sj-docx-1-jsi-10.1177_10283153241251924 for Mapping Higher Education Internationalisation as a Research Space via Natural Language Processing (NLP) Techniques by Yuan Gao, Xuechun Wang and Xu Liu in Journal of Studies in International Education
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
